VMware Linked Clones & NetApp Partial Writes Fun


Double Black Diamond
NetApp ONTAP writes data in 4k blocks. As long as the writes to the NetApp are in 4k increments, all is good.

Let me step back. Given all the fun that I've experienced recently, I am going to alter my topic schedule and bring this one forward while it is hot.

One more step back: the environment consists of VMware ESX hosts getting their storage via NFS from NetApp arrays. When data is not written to a storage system in increments matching its block size, misalignment occurs. For NetApp, that block size is 4k. If I write 32k, that breaks nicely into eight 4k blocks. If I write 10k, that doesn't, as it ends up being two and a half.
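
To make the arithmetic concrete, here is a minimal Python sketch of that division (the 4096-byte block size is NetApp's; the write sizes are the ones from the example above):

    BLOCK_SIZE = 4096  # NetApp block size in bytes

    def blocks_for_write(size_bytes):
        """How many 4k blocks a write spans, assuming it starts on a block boundary."""
        return size_bytes / BLOCK_SIZE

    for size_kb in (32, 10):
        size = size_kb * 1024
        fit = "clean" if size % BLOCK_SIZE == 0 else "partial"
        print(f"{size_kb}k write -> {blocks_for_write(size)} blocks ({fit})")

    # 32k write -> 8.0 blocks (clean)
    # 10k write -> 2.5 blocks (partial)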

The misalignment problem has been well documented. VMware has a doc. NetApp has a doc. Other vendors (e.g. HP/3PAR and EMC) reference alignment problems in their docs. The problem is well known – and easily googled. With misalignment, more read & write operations are required because the blocks being written do not line up with the blocks on storage.

And yay! VMware addresses it in the VMFS-5 file system by making the blocks 1MB in size. That divides up nicely. And Microsoft, with Windows 2008, changed the default starting offset of the first partition, which helped alignment.
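
For the record, the 1MB VMFS-5 block really does divide up nicely into 4k blocks – a one-liner in the same vein as the sketch above:

    # 1MB VMFS-5 block vs. 4k NetApp block: no remainder, no partial blocks
    print((1024 * 1024) // 4096, (1024 * 1024) % 4096)  # prints: 256 0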

So, all our problems are gone, right??

NO.

VMware introduced linked clones, which have a grain size of 512 bytes (see Cormac Hogan's blog).

Once this issue is discovered, you end up reading more of Cormac's blog, then maybe some of Duncan Epping's, and maybe some of Jack McLeod's, not to mention Knowledge Base articles from both VMware & NetApp. The recommended solution is to use VAAI and let the NetApp handle clones on the backend. And these 512-byte writes are technically "partial" writes, not "misaligned" ones.

If everything is aligned, then each partial write requires 1 disk read (of a 4k block), a merge to wedge the 512 bytes into the right spot in that block, and 1 write back out. If misalignment exists, the 512 bytes can straddle two blocks, and it takes twice the I/O operations, as the sketch below shows.
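
Here is a small sketch of that cost model; the offsets are made up for illustration, and the op counts assume one read plus one write per 4k block touched:

    BLOCK_SIZE = 4096

    def blocks_touched(offset, length):
        """Count the 4k blocks a write at (offset, length) overlaps."""
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        return last - first + 1

    # offset 0: aligned, the 512 bytes land inside one 4k block.
    # offset 3840: misaligned, the 512 bytes straddle a block boundary.
    for offset in (0, 3840):
        n = blocks_touched(offset, 512)
        print(f"offset {offset}: {n} block(s) -> {2 * n} I/O ops")

    # offset 0: 1 block(s) -> 2 I/O ops
    # offset 3840: 2 block(s) -> 4 I/O ops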

However, if you look at nfsstat -d, you'll notice a whole bunch of packets in the 0-511 byte range. Wait! I have partial writes, but those show up in the 512-1k bucket. What are all these?

At this point, I don't entirely know (gee Jim, great piece – no answers), but according to VMware KB 1007909, VMware's NFS client does 84-byte (84!?) writes for NFS locking. Given the count in my 0-511 byte bucket, NFS locking can't account for all of those – but what does this do to NetApp's 4k blocks?
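
One thing is certain arithmetically: 84 bytes can never fill a 4k block, and 4096 is not a multiple of 84, so even back-to-back lock writes never line up with a block edge – by the cost model above, every one of them is another partial write eating a read-modify-write. A quick sketch:

    BLOCK_SIZE = 4096
    LOCK_WRITE = 84  # per VMware KB 1007909

    print(BLOCK_SIZE % LOCK_WRITE)   # 64 -> 84-byte writes never tile a 4k block
    print(BLOCK_SIZE // LOCK_WRITE)  # 48 lock writes fit in one block, with slack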

Jim – 08/30/13
@itbycrayon

View Jim Surlow's profile on LinkedIn (I don’t accept general LinkedIn invites – but if you say you read my blog, it will change my mind)
