NetApp de-dupe, flexclone, fragmentation, and reallocation

Since NetApp has its own filesystem, WAFL, disk layouts and fragmentation behave differently than on traditional filesystems. Before NetApp introduced de-duplication, fragmentation was less common. Let me deviate for a moment and describe some notions of WAFL:

WAFL stands for “Write Anywhere File Layout.” The notion is that if inodes can be saved alongside part of the file, then there doesn’t need to be a file allocation table or inode table at one part of the volume and the data at another.

Another goal was to minimize disk head seeks, so data in a RAID set is striped across all the disks in the set at essentially the same sectors. If I have 4 data disks and a parity disk (keeping it simple here), the stripe of data would be 25% on one disk, 25% on the next, and so on, with parity on the last. Assuming each disk has 100 sectors, numbered 0-99 (for ease of calculation), I could put data on disk one at sectors 0-24, on the next disk at 25-49, on the next at 10-34, and on the next one at 95-99 and 0-3, then 88, then 87, then 5-10, etc. Well, that would be rather slow as the head jumps around the disk platter searching for those. NetApp tries to keep all the data in one spot on each disk.
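To put rough numbers on the head-jumping problem, here is a minimal sketch. It is a toy model, not how WAFL or any real drive accounts for seeks: the `head_travel` function and the sector runs are made up for illustration, with the scattered runs loosely following the last disk in the example above and the contiguous run holding the same number of sectors in one spot.

```python
def head_travel(sector_runs, start=0):
    """Total distance the head moves when visiting each (first, last) run in order.

    Toy model: distance is measured in sectors on a single 100-sector disk.
    """
    position = start
    travel = 0
    for first, last in sector_runs:
        travel += abs(first - position)  # seek to the start of the run
        travel += last - first           # sweep across the run itself
        position = last
    return travel


# 17 sectors scattered around the disk (hypothetical layout).
scattered = [(95, 99), (0, 3), (88, 88), (87, 87), (5, 10)]
# The same 17 sectors laid down in one spot.
contiguous = [(0, 16)]

print("scattered :", head_travel(scattered))
print("contiguous:", head_travel(contiguous))
```

In this toy model the scattered layout makes the head travel more than twenty times as far to read the same amount of data, which is the cost the contiguous placement is trying to avoid.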

So, data is now assembled in nice stripes and if the writes are large, then a file takes up nice consecutive blocks on the stripe. The writes are in 4k blocks, but with a much larger file, they line up nice and sequential.

From their Technical Report on the subject, they state that the goal is to line up data for nice sequential reads. So, when a read request takes place, a group of blocks can be pre-fetched in anticipation of serving up a larger sequence.
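A rough way to see why sequential layout helps pre-fetch is sketched below. This is only a toy model under my own assumptions: the `reads_needed` function, the window of 4 blocks, and the block locations are all hypothetical and do not describe NetApp's actual read-ahead logic.

```python
def reads_needed(block_locations, window=4):
    """Count disk reads when every read also pre-fetches the next blocks on disk.

    block_locations: on-disk location of each block of the file, in file order.
    window: how many consecutive on-disk blocks one read brings into cache
            (a made-up value for illustration).
    """
    cached = set()
    reads = 0
    for location in block_locations:
        if location not in cached:
            reads += 1
            # One read pulls in this block plus the ones right after it.
            cached.update(range(location, location + window))
    return reads


# An 8-block file laid out sequentially on disk.
sequential = list(range(100, 108))
# The same 8-block file with its blocks spread around the disk.
fragmented = [100, 412, 413, 950, 101, 102, 730, 731]

print("sequential:", reads_needed(sequential))
print("fragmented:", reads_needed(fragmented))
```

With the sequential layout, every pre-fetched block is one the file actually needs next. With the fragmented layout, much of each pre-fetch is wasted and more trips to disk are required.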

Well, with NetApp FlexClone (virtual clones) and NetApp ASIS (de-duplication), the data gets a bit more jumbled. (Same with writes of non-contiguous blocks). Here’s why:

If I have data that doesn’t live together, holes get created. Speaking of non-contiguous blocks, I’ve worked in areas with Oracle databases over NFS on NetApp. When Oracle does a write of a group of blocks, those blocks may not be logically grouped (like a text file might be). So, after some of the data is aged off, holes are created, as block 1 might be kept, but not blocks 2 & 3.

With FlexClone, the snapshot is made writeable, and future writes to the volume are then tracked separately and may get aged off separately. The notion of de-duplication makes the point more clearly. NetApp de-dupe is post-process, so it accepts an entire write of a file and then later compares the contents of that file with other files on the system. It will then remove any duplicated blocks.

So, if I write a file and blocks 2, 3, & 4 match another file and blocks 1 & 5 do not, when the de-dupe process sweeps the file, it is going to remove blocks 2, 3, & 4. So, that stripe now has holes, and that will impact the pre-fetch operation on the next read.
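The five-block example can be sketched in a few lines. This is only an illustration under stated assumptions: SHA-256 stands in for whatever fingerprint the real de-dupe process uses, and `fingerprint`, `dedupe_sweep`, and `contiguous_runs` are hypothetical helpers, not NetApp functions.

```python
import hashlib

BLOCK_SIZE = 4096  # 4k blocks, as in the post


def fingerprint(block: bytes) -> str:
    """Stand-in fingerprint for a block (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


def dedupe_sweep(new_blocks, existing_fingerprints):
    """Return one flag per block: True if it stays in place, False if it is a duplicate."""
    return [fingerprint(b) not in existing_fingerprints for b in new_blocks]


def contiguous_runs(kept):
    """Return (first, last) runs of blocks still in place, numbered from 1."""
    runs = []
    start = None
    for i, is_kept in enumerate(kept):
        if is_kept and start is None:
            start = i
        elif not is_kept and start is not None:
            runs.append((start + 1, i))  # last kept index is i - 1, so 1-based i
            start = None
    if start is not None:
        runs.append((start + 1, len(kept)))
    return runs


# Blocks already on the system, belonging to some other file.
other_file = [bytes([c]) * BLOCK_SIZE for c in b"BCD"]
existing = {fingerprint(b) for b in other_file}

# The new file: blocks 2, 3, & 4 match the other file, blocks 1 & 5 do not.
new_file = [bytes([c]) * BLOCK_SIZE for c in b"ABCDE"]

kept = dedupe_sweep(new_file, existing)
print("kept in place:", kept)
print("remaining runs:", contiguous_runs(kept))
```

What was one run of five consecutive blocks is now two single-block runs with a three-block hole in between.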

In the past, I only ran the reallocate command when I added disks to an aggregate, but now, while searching through performance-enhancing methods on NetApp, I’ve realized the impact of de-dupe on fragmentation.

Jim – 09/16/13

View Jim Surlow's profile on LinkedIn (I don’t accept general LinkedIn invites – but if you say you read my blog, it will change my mind)