NetApp Usable Capacity – Going, Going, Gone - Around the Storage Block Blog -
NetApp Usable Capacity – Going, Going, Gone

By Jim Haberkorn

I just returned from a business trip to India where I visited various NetApp customers.  At one customer the issue of NetApp usable capacity came up and it so reminded me of conversations I'd had with other NetApp customers and resellers that I feel it is worth reporting.   A little background:  NetApp usable capacity has been a running battle with NetApp for as long as I can remember.  And frankly, this long time controversy surprises me because every time I have a conversation with a knowledgeable NetApp customer and am able to develop some rapport, I always hear the same thing (usually said with a chuckle):  "yes, of course, NetApp has a usable capacity issue. We all know it."  I bring this up because there was an EMC blog a few months back that tackled this issue, and after being hit with a barrage of counter-points from a variety of NetApp sources, the EMC blogger finally said something to the effect, "you can argue against this all you want, but we at EMC sell into a lot of NetApp environments and we hear about NetApp usable capacity issues from customers all the time."  Obviously, EMC is hearing the same things from customers that we are.

The Indian NetApp customer first told me that he was running at 56% usable capacity - which seemed high to me because all our tests showed the real NetApp usable capacity to be in the low to mid 40% range or even lower if you follow every default and best practice to the letter.  But then the customer went on to explain that to achieve this 56% number he had to violate most NetApp best practices and had to take a noticeable hit on performance as well.  He said that he had to set the aggregate and volume space reservations to zero, as well as the LUN space reservation.  Also, he had to place the NetApp root volume on his data disks instead of leaving it in its own aggregate.  He did all that and still only reached 56% and he was not happy because all those space reservations are put in place by NetApp for a reason - either to protect performance or to protect access to data.  But now that he had bought NetApp there wasn't much he could do.    

Jim Haberkorn


Posted 11-24-2008 5:15 PM by CalvinZ

Comments

ChuckBrown wrote re: NetApp Usable Capacity – Going, Going, Gone
on 11-24-2008 6:36 PM

As a former NetApp customer, I've heard this claim a lot from HP and EMC, but never seen a concrete example with configuration info. I can't figure out how I would end up with such poor utilization.

Can you post an example of this or should we file this under other rumors like "I heard that if you mix pop rocks and soda, it will explode"?

Alex McDonald wrote re: NetApp Usable Capacity – Going, Going, Gone
on 11-25-2008 1:17 PM

We all have our stories of usable capacity or performance from a customer where <insert anecdotal evidence here>. Nice touch where they "chuckle", though. Adds that human dimension to the story.

Unfortunately, the facts tend to get in the way. blogs.netapp.com/.../how-long-is-a-s.html

Then we guaranteed space for VMware users; www.netapp.com/.../news-rel-20080930.html

Honestly, Jim, give over on the bash NetApp posts that you guys are pumping out. WAFL performance dissed by Avanade, your hot air and rhetoric dissed by some real figures. It's like watching EMC on the reruns channel.

Why not tell us about the HP storage offerings? What's the advantage of an EVA with usable space? Performance? VMware VDI implementations?

Go on, make me chuckle instead. Say something about HP's storage solutions.

Jim Haberkorn wrote re: NetApp Usable Capacity – Going, Going, Gone
on 11-25-2008 4:13 PM

Hi Chuck, thanks for your comments.  In terms of rumors, NetApp claims of usable capacity superiority are usually filed under the same category as Elvis sightings - you hear about them but when you go to investigate they are really hard to pin down.  Now on to your question:  If you want to figure out your true NetApp usable capacity use these rough figures and you won’t be far wrong:

 Subtract 32% for rightsizing, spares, parity, and aggregate space reservation taken out automatically by the OS

 Then subtract 10% from the aggregate as an additional user configurable space reservation (could go as high as 20% in some environments)

 Then subtract another 10% from the volume (could also be as high as 20%)

 Then subtract another 30% for LUN space reservation if you are in a block environment

 Then subtract another drive or two per filer for the root volumes (this depends on how you want to configure it)

 Notice that I did not bring up snap reserve space since I don’t consider that wasted space.

 Also, don’t forget to convert the Seagate base10 numbers into base2 as this is how the numbers are reported on the NetApp GUI.
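As a rough illustration of the base10-to-base2 point above, here is a minimal Python sketch. The 144 GB drive size is used purely as an example figure, and the function name is invented for illustration; this is not how any NetApp tool reports capacity, only the unit conversion itself.

```python
def decimal_gb_to_binary_gib(decimal_gb: float) -> float:
    """Convert a vendor-quoted base10 GB figure into base2 GiB."""
    total_bytes = decimal_gb * 10**9   # base10: 1 GB = 1,000,000,000 bytes
    return total_bytes / 2**30         # base2: 1 GiB = 1,073,741,824 bytes


if __name__ == "__main__":
    quoted_gb = 144  # example drive size only (assumption for illustration)
    binary_gib = decimal_gb_to_binary_gib(quoted_gb)
    shrink = 1 - binary_gib / quoted_gb
    print(f"{quoted_gb} GB (base10) is about {binary_gib:.1f} GiB (base2), "
          f"roughly {shrink:.1%} smaller on screen")
```

The conversion alone accounts for a difference of just under 7% between the number on the drive label and the number shown in a base2 display, before any reservation is applied.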

Now, you might ask, how did I learn all this? The answer is a little complex because you can’t find NetApp’s usable capacity policies described in full in any single document. Some of this I found in various NetApp white papers and some I learned from NetApp customers and from former NetApp employees. And then I’ve verified these findings on a NetApp filer I have access to. The whole issue of NetApp usable capacity is extremely complex. I believe there are three reasons for this:

 There is no single NetApp white paper that addresses the entire issue - even though I think NetApp customers would greatly appreciate one.  

 There are parts of the space reservation calculation that are done automatically without user intervention, and there are parts where the user has some control over it.  If NetApp is in a competitive deal they can simply set all the user configurable space reservations to zero which will work in most cases in the short term.  

 Many usable capacity issues are first manifested as performance problems and not directly as an issue with the filer running out of space.  The typical scenario is that a customer complains of a performance problem and the NetApp engineer then recommends adding more disks.  There is then a sliding relationship between usable capacity and performance.  If a customer wants to sacrifice performance they can achieve a better usable capacity and vice versa.    

I brought this issue up because I don’t think it ever reached a satisfying conclusion in the EMC blog I mentioned earlier.  The solution in my mind really begins with looking at the NetApp screen shots and seeing what space reservations happen automatically without user intervention.  And then adding on the user configurable space reservations.  Seems doable to me as long as all the space reservations and wasted space policies are clearly out on the table.  

Btw, thanks for the humor at the end of your comments.  I always appreciate a good chuckle and it helps to keep these discussions light.  

Jim Haberkorn

Alex McDonald wrote re: NetApp Usable Capacity – Going, Going, Gone
on 11-25-2008 11:26 PM

Jim, this is one long piece of unadulterated guff. If I subtracted all the paragraphs with meaningless numbers or unsubstantiated mumbo jumbo in them, I'd end up with one word; Elvis.

Please do your homework and post up the results where they can be properly analysed. Your hyperbole, and Karl Dohm's inability to understand why NetApp customers can and do run Exchange (link below), is giving your HP storage colleagues and HP itself an unattractive reputation for pointless, factless, mudslinging FUD blogging.  

blogs.netapp.com/.../mad-blog-hp-not.html

Jim Haberkorn wrote re: NetApp Usable Capacity – Going, Going, Gone
on 11-26-2008 8:27 AM

Hi Alex, thanks for your comments.

“Well, he should arm himself if he’s going to decorate his saloon with my friend” – Clint Eastwood after shooting the saloon owner – Unforgiven – 1992 Best Picture. And by the same token, if a company distributes a white paper like the Mercer/Wyman paper to all its customers attacking a competitor’s product they should expect to get blogged. It’s nothing personal, and don’t get me wrong, I’m not against all competitive white papers. I consider them a valid part of any company’s overall marketing effort. But you’ll notice that I’m not blogging EMC or IBM or any of the other companies that produce such papers – just NetApp. And I’ve already explained why in my previous blog – the NetApp paper was in a class by itself when it came to marketing spin. And again I ask, “if NetApp didn’t have a usable capacity problem why would it stoop to comparing the space required by its snapshots with the space required by its competitors’ full-copy clones in a cost-of-ownership paper?” I think the answer is clear.

I looked over the two sites you directed me to, and I am again struck by the strenuous angles that NetApp will take to try to prove that it does not have a usable capacity problem.  It reminds me a little bit of a man standing by a barn door and trying to prove that he has a horse inside by coming up with all these ingenious arguments when all that anyone is really asking him to do is to open the barn door and show us.  In the same way, the more NetApp continues to skirt around the usable capacity issue rather than just answering the questions directly, the more I am skeptical of its claims.  The pie chart you directed me to showed NetApp filers filled to 37% capacity with 34% used for overhead, 6% for snap space, and another 23% as free space.  I’m sorry, and I’m not trying to be difficult here, but for some reason I’m not finding a pie chart showing NetApp customers filling their filers up to 37% as settling the argument on NetApp usable capacity.  Also, one of the sites you directed me to was about the NetApp 50% capacity guarantee.  I’m surprised you sent me there.  I’ll cover that in my next blog.  

I have access to a NetApp filer and I have spoken to many NetApp customers and resellers over the last two years, and former NetApp employees, and I know that NetApp is well aware of all its issues with usable capacity.  In fact, I never used to question NetApp on the subject until a former NetApp employee saw a presentation I gave and pointed out that I was missing NetApp’s biggest weakness.  Since then I have researched it myself and have become convinced it is definitely an issue that NetApp tries hard to cover up.  

But let me make you an offer:  the real intent of my blog on the subject is to get to the bottom of the issue and I can’t really do it without NetApp’s help – so I need you to stay engaged with me on this subject.  In my previous response to Chuck Brown, I laid out what I believed about NetApp usable capacity including giving percentages – so if I’m wrong, I’ve given you a big, fat target to shoot at.  If I am in error then please correct me.  And, of course, I will try to reproduce your suggestions on the NetApp filer I have and verify them against NetApp white papers.  Sounds like an easy test to me.   Are you game?  

And though we are competitors, let’s try and keep this light and professional – though I don’t think anyone minds the occasional clever repartee.    

Jim Haberkorn      

Alex McDonald wrote re: NetApp Usable Capacity – Going, Going, Gone
on 11-26-2008 11:08 PM

My blog entry (blogs.netapp.com/.../how-long-is-a-s.html) that I pointed you to was over 7000 customer's systems running real life workloads storing real data, showing an average 66% usable space across FC SANs, iSCSI SANs and NAS, both CIFS and NFS.

Your response?

"I am again struck by the strenuous angles that NetApp will take to try to prove that it does not have a usable capacity problem."

Sheesh. You're attacking NetApp with your made up numbers, why shouldn't I prove them to be no more than figments of your fevered imagination?

You don't make a single substantive point in response. Just more of your stories from chuckling resellers, giggling customers et al, and more hand waving diversions. The sum total of your blog and every response you have made is

"NetApp Sucks. It's True Because Lots of People Say This To Me!"

So I'll not stay engaged with you on this. If you were a customer trying to solve a real problem, I'd be there like a shot. But you're not; you're just a sack of anecdotes thrown together in an attempt to sound like you know what you're talking about.

I await your analysis of our VMware 50% Guarantee. Please surprise me with some facts and critical analysis this time.

Jim Haberkorn wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-01-2008 1:15 PM

Thanks Alex for your response.  

If anyone wants to hear a constructive discussion from NetApp on their usable capacity issues then they need to get on a NetApp blog that is filled with all NetApp users.  You’ll then see how NetApp technical experts handle questions about performance problems with their NetApp filer.  The answer is to add disks, not for the purpose of increasing spindle count, but for the purpose of increasing free space.  And you’ll also get a feel for how complex NetApp space reservation rules are – sometimes even the experts have trouble explaining them. The issue is especially prevalent in SANs and less so in NAS environments. And this is why the NetApp internal study of over 7000 NetApp customers is less than convincing – first, by its sheer size the study was weighted with a very high % of legacy NetApp NAS customers.  Second, it only showed the NetApp customers as filling their filers up to 37% - hardly conclusive. Third:  the issue with NetApp usable capacity and free space are intertwined.  To show a configuration filled to 37% data and with 23% free space doesn’t answer the question of how many customers could successfully access that free space without having system problems.

Now, a word about the blog tactics of a sizable number of NetApp defenders. What can I say? It has been my experience, when reading various blogs that breathe even a word of criticism about NetApp, that the NetApp defenders go through the same unconstructive attack points – and almost in order. What is usually lost in all this is that NetApp invites controversy by the highly questionable statements it makes about its competitors in the white papers it hands out to customers and offers as a free download on its website.

For those of you who came in late on this discussion, this current discussion originally kicked off in a blog I wrote questioning the usable capacity facts NetApp published in a competitive white paper aimed at the HP EVA.  www.communities.hp.com/.../netapp-apparently-still-lags-in-cost-of-ownership.aspx

In its white paper NetApp based its claims against the EVA 100% on a small number of anonymous customer interviews.  So I hope those who may have questioned why I used an occasional customer or reseller anecdote to rebut those claims will understand now why I did so.  

And here is an example of what I mean by NetApp bringing its own troubles on itself.  If you believe one thing, you should believe that the major SAN vendors are technically very competitive.  If one serious competitor could claim and reasonably prove even a 10% disk utilization advantage over its competitors it would have a huge advantage.  A 20% advantage would be unheard of.  But in its study NetApp claims an astounding 100% advantage in acquired capacity for their FAS3070 over the EVA 8100.  In other words, they claim that HP customers have to buy twice the capacity of a NetApp customer for the same size database.  And here is a small but interesting point:  their report had the EVA, EMC CLARiiON CX3-80, and EMC DMX 3-950 customers all acquiring the identical 30.7TB for a 4TB database even though they all have a different RAID-5 stripe, different sparing schemes, and the CX uses vault drives.  Still think the NetApp report was based on real numbers?  

But consider this additional point from a truly independent source, which, though it may not be 100% conclusive, is nevertheless noteworthy: In its Q2-2008 StorageTracker, IDC reports that the average TB sold per SAN unit for the NetApp and HP products named in the NetApp study is 41.7TB and 20.8TB respectively. In other words, though the NetApp paper claims that its FAS3070 customers have to buy half the storage of an HP EVA8100 customer in a SAN environment, the IDC numbers show that NetApp customers are having to buy twice as much as HP per unit. So the question is ‘why do NetApp customers have to buy more storage for their SANs?’ Could it be because of usable capacity problems? And please, if anyone wants to claim it’s because NetApp, with a SAN business one quarter the size of HP’s, plays in bigger and more complex SANs – well, that one may be greeted by skepticism even by ardent NetApp supporters.

When we train our sales force in how to deal with competitors, we tell them not to overdo the competitive stuff and to mainly focus on the good features of our products. The reason why we can take that position is because most competitors will generally make claims about themselves and about us that are within the range of sanity. Against NetApp, however, I tell the sales force that they have no choice but to bring up the competitive stuff because if the NetApp claims about itself and its competitors are left unchallenged then the customer will get a totally distorted view of how things really are.

Jim Haberkorn

Alex McDonald wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-02-2008 1:13 PM

Thanks for replying. A number of unanswered questions remain, and I'd be grateful if you could address them.

First, the NetApp internal space utilisation study that you dismiss.

Do you have one for the EVA so that we can compare and contrast? Include as many legacy systems as you wish.

Second, the NetApp report. media.netapp.com/.../ar1038.pdf

The NetApp report you complain about; I agreed to the points you made (www.communities.hp.com/.../netapp-apparently-still-lags-in-cost-of-ownership.aspx first comment), but asked what Oliver Wyman (the study authors)  should have done as an alternative to full copy clones on the EVA; use snapshots? Your response; "Answer: it depends on what the customer is trying to achieve." Well, let's say he's trying to avoid full copy clones on an EVA. Say on an Oracle database or an Exchange system. Do you recommend the use of snapshots? Get as specific as you like. NetApp recommends them; snapshots are appropriate for all workloads because they have almost no impact on performance. Cite; SPC-1 benchmark with snapshots  www.storageperformance.org/.../a00062_NetApp_FAS3040-48hr-sustain_executive-summary.pdf

Lastly, the IDC report.

Your question "why do NetApp customers have to buy more storage for their SANs?" can be answered in a number of ways. Perhaps they can afford to. For instance, maybe the IDC report indicates that HP EVAs appear to be more costly per TB than the equivalent NetApp FAS?

Cleanur wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-03-2008 1:54 AM

Alex, your own document referenced above, media.netapp.com/.../ar1038.pdf, disproves that EVAs are more costly per TB than NetApp FAS; in fact it appears the reverse is true.

According to the report

"A NetApp solution deployed in a Fibre Channel SAN environment is 42% less expensive than a typical HP EVA solution."

"NetApp environments typically require 51% less primary disk space than typical EMC or HP environments. For instance, to store a 4 TB database, NetApp customers typically acquire 15 TB of primary and data protection storage space, while EMC and HP environments typically acquire more than 30 TB."

So if you accept the full copy clones didn't really create a level playing field, then it appears that an EVA with over double the storage capacity of a NetApp FAS is only 42% more costly. Working backwards, that suggests the EVA with equivalent capacity to the FAS would be considerably cheaper. All of that is without the space reservations required by the FAS which the report chose to ignore.
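The "working backwards" step can be sketched in a few lines of Python. This is only an illustration built on the two statements quoted from the report: the EVA solution cost is normalised to 1.0, the NetApp solution is taken as 42% less, and 30 TB is used as a lower bound for "more than 30 TB". It also assumes, as the comment does, that both quotes describe the same 4 TB database configuration, which the report excerpts here do not confirm.

```python
# Figures quoted from the report in the comment above; everything is relative.
EVA_RELATIVE_COST = 1.0            # normalised baseline (assumption for illustration)
NETAPP_RELATIVE_COST = 1.0 - 0.42  # "42% less expensive than a typical HP EVA solution"
NETAPP_ACQUIRED_TB = 15.0          # "NetApp customers typically acquire 15 TB"
EVA_ACQUIRED_TB = 30.0             # "more than 30 TB"; 30 is a conservative lower bound


def cost_per_tb(relative_cost: float, acquired_tb: float) -> float:
    """Relative cost per acquired TB (unitless, since costs are normalised)."""
    return relative_cost / acquired_tb


netapp_per_tb = cost_per_tb(NETAPP_RELATIVE_COST, NETAPP_ACQUIRED_TB)
eva_per_tb = cost_per_tb(EVA_RELATIVE_COST, EVA_ACQUIRED_TB)
ratio = netapp_per_tb / eva_per_tb

if __name__ == "__main__":
    print(f"NetApp relative cost per acquired TB: {netapp_per_tb:.4f}")
    print(f"EVA relative cost per acquired TB:    {eva_per_tb:.4f}")
    print(f"NetApp / EVA per-TB ratio:            {ratio:.2f}")
```

Under those assumptions the per-TB figure comes out higher for the NetApp configuration, and any EVA capacity above 30 TB widens the gap.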

Jim Haberkorn wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-03-2008 8:22 PM

Thanks for your comments, Alex.

When defending itself against criticisms that its usable capacity is the worst in the industry, NetApp still reminds me of a man trying to prove he has a horse by showing everyone a bag of oats.  So let me ask a rhetorical question: in a discussion on usable capacity what do you find the most convincing:  

a.  Vendor interviews from anonymous customers

b.  Vendor-collected customer data that can be neither proved nor disproved by the competition

c.  A short white paper stating the vendor’s usable capacity parameters that can then be verified by anyone who owns that vendor’s product.

If you are NetApp marketing the answer is definitely not ‘c’.  

NetApp considers white papers based on anonymous customer interviews and statistics based on unverifiable customer data as the most convincing arguments in a usable capacity discussion.  I see things a little differently.  I think the best proof would be a clear two page document that covered NetApp space lost to:

  RAID levels

  Spare drives

  Disk drive formatting and metadata

  Root volumes

  Space reserved for memory dumps

  Free space requirements for aggregates, volumes, and LUNs  and under what performance loads

  Relevant best practices  

End of story, end of controversy. Have you ever noticed that when HP or any of the other vendors get in a scrap over usable capacity they pull out their own technical white papers and share their best practices openly? At the end of the day most of the competitors end up being pretty close to each other and the entire episode tends to be a tempest in a teacup. That doesn’t mean there aren’t some differences between the major vendors – but it’s rarely enough to win or lose a deal. The exception is NetApp. I never see NetApp bloggers refer to their own technical white papers when discussing usable capacity. In my next blog on the NetApp 50% capacity guarantee I will show you why.

Now, in the spirit of true bloggery and in gratitude to Alex for reconsidering his pledge and continuing to engage with me, here are my answers to his questions – in reverse order.

a.  Regarding your point about the IDC numbers: Alex, you are giving me whiplash. First your own white paper says that NetApp customers buy half as much storage per unit as HP, then you defend the IDC numbers that say NetApp customers buy twice as much storage per unit as HP. I couldn’t help focusing on the fact that when given a choice between defending the IDC numbers or NetApp’s own numbers, you defended the IDC numbers.

b.  In regards your question about the customer who didn’t want to use full-copy clones:  I feel moved to stick with my original answer – that it depends on the customer situation whether I would suggest snapshots or clones.  If a customer said he didn’t want to use clones, I wouldn’t just roll over, but would ask him why.  Clones have advantages over snapshots.  For example, NetApp snaps sit on the same disks as the primary volume and are part of the same fault domain, but clones sit on entirely different disks.  If you use your snaps as a platform for backups to tape then clones have a considerable performance advantage and an availability advantage as well.  Plus, if the comparison is with NetApp and you buy them for their snapshot technology then you are stuck with their usable capacity problem as well.  

c.  You asked if we had a customer study to prove our usable capacity claims.  No, not that I’m aware of, but if you would like a nifty little one page document that tells you everything you need to know about EVA usable capacity and that you can use to verify our claims on any EVA, I’d be happy to send it to you.  

Now, onward to NetApp’s 50% capacity guarantee – what I like to refer to as NetApp’s ‘Shining’ moment - my next blog.  

Jim Haberkorn

Jim Haberkorn wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-03-2008 8:43 PM

Thanks Cleanur for pitching in. I must admit when I first read your comments I had to rush to the NetApp paper to verify what you said - and I'll be darned, but I think you are reading the paper correctly. Good catch, and this is another good lesson for aspiring bloggers - that once you start down the road of trying to defend an essentially indefensible position, you never have enough fingers to plug all the leaks - sorry for the mixed metaphor but I think you get my point.

Jim Haberkorn  

Alex McDonald wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-08-2008 2:43 PM

http://www.dedupecalc.com/ and http://www.secalc.com/ are both available to anyone that wishes to use them to calculate NetApp space requirements.

So that's (a) (b) and (c) we do, unlike HP.

IDC numbers show HP EVA as 1.6 times more expensive per shipped TB than NetApp FAS systems. But you knew that, because you and I have the same report. I think your unsupportable assertion that even you admit "may not be 100% conclusive" is called cherry picking.

The rest is just too many words to reply to (my, can you churn them out!). Point me at a specific question and I'll answer it.  

cleanur wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-08-2008 9:23 PM

Alex,

You still appear to be concealing that damned equine. Maybe it's just a horse of a different colour ;-)

Jim Haberkorn wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-09-2008 10:29 AM

Thanks Cleanur. I was happy to see that my horse analogy struck a chord. I love a good metaphor. With NetApp's 50% capacity guarantee - the subject of my next blog - I came up with the following metaphor but never used it. But here it is: NetApp claiming that its dedupe is responsible for the 50% capacity savings, when in fact almost all the savings is due to comparing its RAID-DP against RAID-1, is like two men having a race to the top of a building where one takes the stairs and the other takes the elevator. And then when the one in the elevator wins, he gives all the credit to his shoes. Jim Haberkorn

Jim Haberkorn wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-09-2008 11:35 AM

Thanks again Alex for your comments.  

For two years now I’ve been sparring with NetApp over the issue of usable capacity and up till a few weeks ago have been content to just let it be an issue that I train my sales force on and educate the NetApp customers and resellers that I visit. What made me decide to blog it was reading the recent EMC blog on the subject. Let me simply say that while I have to admit the EMC statements were provocative, in my opinion the NetApp defenders were in many cases inappropriately aggressive and in almost all cases misleading. I don’t know if the behavior of the NetApp bloggers was a sign of NetApp’s changing internal culture or just an aberration by a few people who couldn’t handle the notoriety of a blog. I don’t know. But I do know that I respect Alex for his increasingly civil responses as my blog has progressed. And I know that I respect every company that has the ability to compete hard in storage – this most competitive of all computer industries. Think about it: choose your industry - can anyone think of a vendors’ murderers’ row (baseball expression normally associated with the batting order of the 1927 Yankees – note: Lou Gehrig had 175 RBIs that year despite batting after Ruth who hit 60 home runs – but I digress) more lethal than HP, IBM, EMC, HDS…and yes, NetApp? What a lineup!

With that said, I have a few more points to make in regards NetApp usable capacity.  My latest blog on the subject is titled, “NetApp’s ‘Shining’ Moment – its Capacity Guarantee Program”, and it should be posted today or tomorrow at the latest.  Here is a sneak preview from that blog:  “NetApp has a huge usable capacity issue in many environments that it tries desperately to hide but at the same time seems driven to confess as if subconsciously trying to purge some unresolved guilt.”    

Finally, Alex, I checked out the two capacity calculators you directed my readers to.  Alex, those are toys.  That’s not what your architects use to calculate capacity when configuring a system.  They’re marketing toys.  C’mon.  

Jim Haberkorn

Jim Haberkorn wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-09-2008 11:52 AM

I just reread the answer I gave to Chuck way at the top of this blog and it occurred to me that I needed to clarify something. If you add up all the lost-space percentages that I gave, it might appear that you end up with 18% usable capacity - but that would be an erroneous conclusion. I just want to make it clear that, except for the initial 32% taken by the OS, all the rest of the lost space is taken from increasingly smaller subsets of the remaining raw storage. In other words, if the OS takes out 32% then the 10% aggregate loss that I mention comes out of what is left. And the volume loss comes out of what's left of that, and so on. The end result is you usually end up with usable capacity below 50% with NetApp and if you follow all their best practices to the letter you can easily find yourself down below 40%. Jim Haberkorn
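The difference between adding the percentages and applying each one to what is left can be sketched in Python. The four figures are the rough ones from the earlier reply to Chuck (32% taken by the OS, 10% aggregate reserve, 10% volume reserve, 30% LUN reservation); the root-volume drives are left out because they are counted in disks rather than percent. The function name is invented for illustration, and this is a sketch of the arithmetic only, not a NetApp sizing method.

```python
def compound_usable_fraction(reductions):
    """Apply each reduction to whatever remains after the previous one."""
    remaining = 1.0
    for reduction in reductions:
        remaining *= 1.0 - reduction
    return remaining


# Rough figures from the earlier reply: OS overhead, aggregate reserve,
# volume reserve, LUN space reservation (block environments only).
REDUCTIONS = [0.32, 0.10, 0.10, 0.30]

naive_fraction = 1.0 - sum(REDUCTIONS)                         # simple addition
compounded_fraction = compound_usable_fraction(REDUCTIONS)     # successive subsets
without_lun_reserve = compound_usable_fraction(REDUCTIONS[:3])  # e.g. a NAS setup

if __name__ == "__main__":
    print(f"Adding the percentages up:      {naive_fraction:.1%}")
    print(f"Applying them successively:     {compounded_fraction:.1%}")
    print(f"Successively, no LUN reserve:   {without_lun_reserve:.1%}")
```

With these inputs the additive reading gives 18%, while the successive reading gives about 38.6% with the LUN reservation and about 55.1% without it.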

Alex McDonald wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-09-2008 8:32 PM

"That’s not what your architects use to calculate capacity when configuring a system. They’re marketing toys.  C’mon."

You are right; we use a tool called Synergy. It's used for everything, including quoting usable vs raw. No, you can't have a copy; although I wish I could let you have access, as your last comment demonstrates you're in dire need of it.

Jim Haberkorn wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-10-2008 1:53 PM

Thanks for your comments, Alex.  In a future blog I’m going to talk about the usable capacity story NetApp tells customers during the sales pitch (and in blog responses) vs. the one that actually gets implemented during the installation vs. the one that the customer eventually has to live with forever after that.  That blog will address what I consider the most interesting question of all:  how does NetApp get away with this?  I have to admit when I first looked into this issue I thought I’d uncovered the first case of the Stockholm Syndrome playing out in a vendor/customer relationship, but I have since reached a slightly moderated conclusion.         Jim Haberkorn

Arne wrote re: NetApp Usable Capacity – Going, Going, Gone
on 12-18-2008 12:42 AM

Hi Jim,

sorry, I'm a bit late on this one, but I just found this blog entry and wanted to give you the NetApp usable capacity details that you asked for :-) I'm going to give an example based on the system configuration you've used for your recent NetApp Exchange performance test to give you a chance to verify these numbers on your system. The actual numbers should be what I'm stating here, but they might be off by a few GB on the actual system as I'm rounding to full numbers most of the time.

The configuration you've looked at was 20 disks with 144GB each, one Raid group with Raid-DP. The 144GB are in base10, often referred to as the marketing number, while the storage system operates in base2. So the actual number of bytes you can store on each disk is not 144GB. That's not specific to NetApp as this is caused by the disk vendors using the higher base10 "marketing number" instead of the number of bytes one can actually store on the disk. The real size of the disk in the NetApp system is 136000MB (as indicated by the sysconfig -r command), which basically is what is physically available on the disk.

Let's convert that to GB as I will calculate everything in GB from now on: 136000 / 1024 = 132.8GB

So the raw capacity is 20 x 132.8GB = 2656GB

The system is configured with Raid-DP (default and best practice) which "costs" us two disks, so the remaining capacity is 18 x 132.8GB = 2390GB. You can check this via the aggr show_space <aggr_name> -g command, where it is listed as total space.

The NetApp system stores an additional checksum of 8bytes for every 512 bytes of data, which is ~1.5% overhead, or 35GB, which leaves 2355GB.

The System takes a 10% overhead for WAFL metadata, which leaves us with 2120GB. (this is listed as WAFL reserve in the aggr show_space <aggr_name> -g output).

The aggregate has a default snapshot reserve of 5%. This is a default value that you should lower to 2% or so in your environment, but in order to prevent any arguing about dirty tweaks in my calculation I will just assume the default 5% value. This leaves us 2014GB usable space in the aggregate (reported as usable space in the aggr show_space <aggr_name> -g output).

While you could completely use that space, the NetApp best practice is to leave about 10-15% free space in the aggregate for best performance. Background: The WAFL system can write to any block, but of course it tries to find the best block (performance-wise). If you leave less than 10% free space on the system, you force WAFL to write to the few available blocks, i.e. it doesn't have a choice anymore and can't optimise the data layout while writing.

So let's assume 15% free space in the aggregate for best performance, which leaves us 1712 GB. Out of the 20 x 132.8 = 2656GB that's 64% usable capacity! Even if we take two hot spares into account, that's a total of 22 x 132.8GB = 2921GB, and we still have 58% usable capacity. And this really assumes a non-optimized setup, following all best practice recommendations, no tweaking etc. Depending on the customer requirements, the deployment could look a bit different (e.g. less aggr snap reserve, fewer spares), which would increase the usable capacity.
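For anyone who wants to replay this arithmetic, the steps above can be sketched in a few lines of Python. This is only a re-statement of the percentages quoted in this comment, applied in the same order; the names are invented for the sketch, and nothing here is read from a real system.

```python
# Replays the capacity walk-through from the comment above.
# All percentages are the ones quoted in the comment, not measured values.
DISK_GB = 132.8      # per-disk size used in the comment (136000MB / 1024, rounded)
TOTAL_DISKS = 20
PARITY_DISKS = 2     # RAID-DP
HOT_SPARES = 2

# (label, fraction removed at this step), in the order the comment applies them
STEPS = [
    ("block checksums", 0.015),
    ("WAFL reserve", 0.10),
    ("aggregate snapshot reserve", 0.05),
    ("free space kept for performance", 0.15),
]


def usable_gb(disk_gb, data_disks, steps):
    """Apply each reduction in turn to the capacity of the data disks."""
    remaining = disk_gb * data_disks
    for _label, fraction in steps:
        remaining *= 1 - fraction
    return remaining


usable = usable_gb(DISK_GB, TOTAL_DISKS - PARITY_DISKS, STEPS)
share_without_spares = usable / (TOTAL_DISKS * DISK_GB)
share_with_spares = usable / ((TOTAL_DISKS + HOT_SPARES) * DISK_GB)

print(f"usable: {usable:.0f}GB")
print(f"share of 20 disks: {share_without_spares:.1%}")
print(f"share of 22 disks: {share_with_spares:.1%}")
```

The result lands within a GB or two of the figures quoted above, because the comment rounds at every step. Changing an entry in STEPS (for example a 2% snapshot reserve) shows how sensitive the final percentage is to each setting.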

All this doesn't even take into account that in almost all environments the customer benefits from thin provisioning, dedupe, flexclones, etc. which "increases" the usable capacity quite a bit.

A few hints at what you should have configured differently on your system:

There is no reason to have a separate aggregate for the root volume in your setup. The system by default has one aggregate with RAID-DP and 3 disks that holds the root volume. Increase that aggregate by the number of disks you have in your system and place all your volumes in it. The aggregate is just a way of - well... aggregating disks :-) It is not used to separate data, that's what volumes are for.

As you didn't use snapshots in your test environment, turn off snap reserve. It wouldn't be fair to compare a NetApp system that is configured for snapshots with an HP system that doesn't even have a usable (i.e. no performance impact) snapshot implementation ;-) So run the following commands to turn that off:

snap reserve <volume name> 0

vol options <volume name> nosnap on

vol options <volume name> fractional_reserve 0

You're more than welcome to verify these numbers on your system. As I assume that you have a valid support contract for your NetApp system, all the information in this post is also available to you on the NetApp NOW website (search the knowledge base for an article called "From disk size to aggregate capacity").

To sum it up: Even in a non-optimized setup with double parity protection (not even available on HP systems) and two spare disks (which one could consider overkill in a 20 disk environment) a NetApp customer still ends up with 58% usable capacity (which is roughly what I would expect from an HP or EMC array in a similar configuration) and can use the advanced space saving technologies such as dedupe (not even available on HP systems) to get even more out of his system. Are you still sure NetApp has a usable capacity issue?

Peter wrote re: NetApp Usable Capacity – Going, Going, Gone
on 03-30-2009 7:28 AM

Jim:

This has been a great blog. Can you please lab Arne's solution and report back your findings?

I'm just trying to figure out why people purchase anything but NetApp; it does everything and seems to have no downside.

Jim Haberkorn wrote re: NetApp Usable Capacity – Going, Going, Gone
on 04-01-2009 9:05 AM

Arne, thanks for your response, and apologies for the delay. We missed your comment.

And thanks for admitting that NetApp aggregates require an extra 10-15% space reservation for Exchange environments and 10% for WAFL overhead. You’d be amazed at how difficult it is to get NetApp enthusiasts to admit even that much. Second, the goal of your response was to prove that NetApp usable capacity was roughly on a par with other competitor arrays. Don’t worry, I won’t report to NetApp marketing that you are contradicting their white papers claiming a 30%+ usable capacity advantage in Exchange environments over the competition (even without dedupe). However, I do believe you just proved a point I made in a previous blog that NetApp enthusiasts always follow the same pattern: If they think you don’t know anything about their technology they will claim their usable capacity is far and away the best in the industry. If they think you might know something about it, then they will reduce their claim and admit to having just parity with the competition. And then if you keep pushing, they will retreat into saying that their other strengths outweigh their usable capacity disadvantages.

I have to admit, for reasons that I think will be obvious below, I was tempted to not take your response seriously. Also, please note, we are very aware that Seagate numbers are in base10 and NetApp numbers in base2, and we convert everything to base2 when discussing NetApp.

Here is the information you forgot to mention in your example: The math in your example only works if you are talking about a single-node NetApp configuration in an Exchange Environment. How realistic is that? Did you forget the other node in the cluster? I think most people would consider that to be a fairly important detail to have left out. We actually did your theoretical configuration on a cluster and got 48.66% usable capacity for NetApp and that was with volume space reservation and snap reservation set to zero. That same config on an EVA was 73.48% usable capacity. Think about it: you admit you have to set aside 10% extra space for WAFL and 15% extra in the aggregate for performance, a whopping 25% total – something no other array has to do – how could you then keep a straight face while arguing that NetApp is on a par with other arrays in usable capacity? As far as spares are concerned, every aggregate needs a spare. Maybe there is some manual way to override that, but if so, we can’t seem to find it – the NetApp system forces us to have a spare with every aggregate. Therefore, if you have a cluster you’ll have two spares as a minimum regardless, even if you only have twenty disks in the entire system. By glossing over that, you implied that somehow we had deliberately over-configured the spares in our example to make NetApp look bad. We didn’t. We don’t have to use tricks to make our point. NetApp has a serious usable capacity problem as compared to the competition.  
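For readers following the percentages in this exchange, here is a small Python sketch of two pieces of arithmetic. It assumes the two reserves are applied one after the other, as in the calculation earlier in this thread; whether they add or compound depends on what each percentage is taken of. The function name is made up for the example, and the two usable-capacity percentages are simply the ones quoted in this comment.

```python
def combined_reduction(*fractions):
    """Total fraction removed when each reserve is taken from what is left."""
    remaining = 1.0
    for fraction in fractions:
        remaining *= 1 - fraction
    return 1 - remaining


# 10% WAFL reserve followed by 15% free space kept for performance
sequential = combined_reduction(0.10, 0.15)
additive = 0.10 + 0.15

# Usable-capacity percentages quoted in this comment
netapp_pct = 48.66
eva_pct = 73.48
ratio = eva_pct / netapp_pct

print(f"applied in sequence: {sequential:.1%}, simply added: {additive:.0%}")
print(f"EVA usable share is {ratio:.2f}x the NetApp share for the same disks")
```

Applied in sequence, the two reserves remove slightly less than their simple sum, which is a small difference next to the gap between the two quoted usable-capacity figures.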

Best regards,

Jim

Jim Haberkorn wrote re: NetApp Usable Capacity – Going, Going, Gone
on 04-02-2009 9:19 AM

Hi Peter, thanks for prodding me to respond to Arne's question. It came in just as I left on a month-long vacation, and I thought someone else had covered it.

Best regards,

Jim

Les wrote re: NetApp Usable Capacity – Going, Going, Gone
on 10-01-2009 4:43 PM

It's easy to work around the number of spares, but to be honest, if you take into account that an aggregate can consist of multiple RAID sets, 1 spare per 16TB sounds like good practice for any storage system. As NetApp is thin provisioned, the 15% free space mentioned is the space you would want free in provisioned space for the server anyway... so to call this wasted space is not fair.

The numbers above talk about dual disk protection. Sure, NetApp has a 10% WAFL space penalty... but I can create 16-drive RAID sets with a 2-drive RAID penalty that can withstand two disk failures with no performance impact... and hold months of backup snapshot data online... again without performance penalty... which simply results in a lower disk footprint...

Without going near de-dupe.... NetApp was slow in updating best practice to represent real-world deployments, sure... but get over it... those old complex reservation issues are yesterday's FUD. Look at the total solution.

For customers who are not interested in backup, don't want dual disk protection, have data that does not de-dupe (movies?), and don't want thin provisioning, there might be a 10% WAFL penalty with NetApp... however, if they live in the real world and care about any of the above, then I suggest NetApp is more efficient.

Geert wrote re: NetApp Usable Capacity – Going, Going, Gone
on 10-02-2009 6:36 PM

Amen...!

PaulC aka Mrbios wrote re: NetApp Usable Capacity – Going, Going, Gone
on 01-25-2010 8:49 AM

Some things I don't like about NetApp OnTap v7.x

The complexity. The basic Filer web GUI isn't too bad. But having to run maintenance to fix a bug that NetApp refuses to acknowledge: running out of inodes and being forced to manually purge them with sis:

rsh netapp01 sis start -s /vol/auto > /tmp/sisstart.log

This process takes 3.5 days to complete on a 280GB volume.  The volume is online and available for use unless you ran out of inodes.

Also, there's the file limit.  Currently our Max Files are set to 13.8 million for the entire volume and we are close to that limit.  Is 13.8M file limit enterprise?  We are already looking at a different solution like Sun’s ZFS file system and that is after running NetApp for about 2 years.

Also, speed.  When I ran a simple rsh ls -all \vol\auto > index.txt it took more than 3 hours to complete a simple file list.  And that was for a single directory and its sub folders.

When I mounted the NFS volume as CIFS and right-clicked on the folder from XP to see total files, it never finished - and that was after running the entire weekend.

Once you load up NetApp the performance really declines. The tech did not recommend increasing Max Files or inodes.

Complexity: the more I learn about NetApp, the less I like about it. It is a strange hybrid between Unix / Linux and its proprietary modified OS. How many admins have the time to learn all this just to operate a storage volume? When you have a problem things can get very very complicated very quickly. There’s too much black magic inside NetApp.

dedup = raid6 big deal. I'm looking forward to using a brand new Dell EqualLogic unit at a different account. I'm tired of NetApp.

Sincerely,

Error (write error extracting inode 12528435, name <name unknown>: Stale NFS file handle)
