Problem calculating workloads on Storage, in this case NetApp

Double Black Diamond

With a centralized storage array, there can be front-side limitations (from the array out to the hosts or clients) and back-side limitations (the actual disks in the storage array).

The problem that occurs is that, from the storage array's point of view, the workloads at any given moment in time are random, and the details of those workloads are invisible from the array.  So, how to alleviate load on the array has to be determined from the client side, not the storage side.

Take, for example, a VMware environment with NFS storage on a NetApp array:

Each ESX host has some number of VMs and each ESX host is mounting the same export from the NetApp array.


Let IA(t) = the storage array's front-side IOPS load at time t.
Let hn(t) = the IOPS generated by ESX host n at time t, where n = 1 … N and N = the number of ESX hosts.


The array's front-side IOPS load at time t equals the sum of the IOPS loads of each ESX host at time t.

IA(t) = Σn hn(t)


An ESX host's IOPS load at time t equals the sum of the IOPS of each VM on the host at time t.

hn(t) = Σm VMm(t), where m runs over the VMs on host n


A VM's IOPS load at time t equals the sum of the read IOPS and write IOPS of that VM at time t.

VM(t) = R(t) + W(t)
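
As a concrete illustration of the three sums above, here is a minimal Python sketch; the function names and the per-VM numbers are mine and purely illustrative, not anything reported by ESX or the array.

```python
# Minimal sketch of the aggregation above. Function names and the per-VM
# numbers are illustrative only.

def vm_iops(reads: float, writes: float) -> float:
    """VM(t) = R(t) + W(t) for one VM at instant t."""
    return reads + writes

def host_iops(vm_loads):
    """hn(t) = sum of the IOPS of each VM on host n at time t."""
    return sum(vm_loads)

def array_front_side_iops(host_loads):
    """IA(t) = sum of the IOPS load of each ESX host at time t."""
    return sum(host_loads)

# Two hypothetical hosts, each with two VMs (read IOPS, write IOPS):
host1 = host_iops([vm_iops(120, 80), vm_iops(40, 10)])    # 250
host2 = host_iops([vm_iops(300, 150), vm_iops(25, 25)])   # 500
print(array_front_side_iops([host1, host2]))              # 750
```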


The read IOPS are composed of well-formed reads and not-well-formed reads.  “Well-formed reads” are reads which will not incur a penalty on the back side of the storage array.  “Not-well-formed reads” will generate anywhere between 1 and 4 additional IOs on the back side of the storage array.

Let r1 = well-formed reads (no additional IO on the back side of the array).

Let r2 = reads which cause 1 additional IO on the back side of the array.

Let r3 = reads which cause 2 additional IOs on the back side of the array.

Let r4 = reads which cause 3 additional IOs on the back side of the array.

Let r5 = reads which cause 4 additional IOs on the back side of the array.

Then

R(t) = ar1(t) + br2(t) + cr3(t) + dr4(t) + er5(t)

Where a+b+c+d+e = 100% and a>0, b>0, c>0, d>0, e>0

and, with the write IOPS split into the analogous classes w1 through w5,

W(t) = fw1(t) + gw2(t) + hw3(t) + iw4(t) + jw5(t)

Where f+g+h+i+j = 100% and f>0, g>0, h>0, i>0, j>0

Now for the back-side IOPS (I’m ignoring block size here, which would just add a factor of array block size divided by IO block size into the equation).  The difference is accounting for the additional IOs:

R(t) = ar1(t) + 2br2(t) + 3cr3(t) + 4dr4(t) + 5er5(t)

and

W(t) = fw1(t) + 2gw2(t) + 3hw3(t) + 4iw4(t) + 5jw5(t)
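
To see what the amplification does to the numbers, here is a small Python sketch under one reading of the formulas above: R(t) is the total front-side read IOPS and a through e are the fractions falling into classes r1 through r5. The mix fractions are hypothetical; as noted below, the array cannot know them in advance.

```python
# Front-side vs. back-side read load under one reading of the formulas above.
# The mix fractions are hypothetical -- the array cannot know them in advance.

def front_side_reads(total_reads, mix):
    """a*r1 + b*r2 + c*r3 + d*r4 + e*r5, with a+b+c+d+e = 100%."""
    assert abs(sum(mix) - 1.0) < 1e-9
    return sum(frac * total_reads for frac in mix)           # equals total_reads

def back_side_reads(total_reads, mix):
    """a*r1 + 2b*r2 + 3c*r3 + 4d*r4 + 5e*r5: each class costs 1..5 back-side IOs."""
    return sum(mult * frac * total_reads
               for mult, frac in zip(range(1, 6), mix))

mix = [0.70, 0.15, 0.10, 0.04, 0.01]      # hypothetical a, b, c, d, e
print(front_side_reads(1000, mix))         # ~1000 IOPS hit the front of the array
print(back_side_reads(1000, mix))          # ~1510 IOs land on the back-side disks
```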

Since the array cannot predetermine the values for a through j, it cannot determine the effect of the additional IO.  Likewise, it cannot determine whether the hosts are going to be sending sequential or random IO.  It will trend toward random given n machines writing concurrently, since the likelihood of n-1 systems being quiet while 1 sends sequential IO is low.

Visibility into these workload behaviors therefore has to come from the host side.


Jim – 10/01/14

@itbycrayon

View Jim Surlow's profile on LinkedIn (I don’t accept general LinkedIn invites – but if you say you read my blog, it will change my mind)

IOPS, Spinning Disk, and Performance – What’s the catch?

Black Diamond
For a quick introduction – IOPS means Input/Output operations per second.  Every hard drive has a certain IO performance.  So, forgive the oversimplification: add additional disks and one gets additional IOPS, which means one gets better performance.

Now, generally speaking, I hate IOPS as a performance characteristic.  I hate them because IOPS can be read or write, sequential or random, and of different IO sizes.  Unless one is trying to tune for a specific application and is dedicating specific disk drives to that application, the measurement breaks down, because the description of the assumed utilization lacks accuracy.  For instance, assume the workload is random reads & writes, but then the backups kick off and that turns into a huge sequential read for a long duration.

But, I digress.

Every hard drive has an IOPS rating, whether SAS, SATA, or FibreChannel, 7200, 10000, or 15000 RPM (see Wikipedia for a sample).  When a RAID set is established, drives of the same geometry (speed and size) are put together to stripe the data across the drives.  For simplicity’s sake, let’s say one uses a RAID5 set with 6 drives: the capacity of 1 drive is used for error (parity) checking and 5 for data.  And continuing the example, assume these are 1TB (terabyte) drives with 100 IOPS per drive.  So, one has 5 TB of capacity and 500 IOPS.  [Let’s imagine these are read IOPS and not writes, so I don’t have to get into parity calculations, etc.]  If I could add a drive to the RAID set, then I get another TB and another 100 IOPS.  Nice and linear.
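
Here is a quick back-of-the-envelope version of that example in Python, following the post’s simplification that both capacity and (read) IOPS scale with the data drives; the drive size and per-drive IOPS are the illustrative figures above, not a vendor rating.

```python
# Back-of-the-envelope version of the example above, following the post's
# simplification that capacity and read IOPS both scale with the data drives.
# 1TB drives at ~100 IOPS each are illustrative figures only.

DRIVE_TB = 1.0
DRIVE_IOPS = 100.0

def raid5_usable_tb(total_drives):
    return (total_drives - 1) * DRIVE_TB      # one drive's worth of capacity goes to parity

def raid5_read_iops(total_drives):
    return (total_drives - 1) * DRIVE_IOPS    # same simplification for IOPS

for drives in (6, 7, 8):
    print(drives, raid5_usable_tb(drives), raid5_read_iops(drives))
# 6 drives -> 5.0 TB / 500.0 IOPS; each extra drive adds 1 TB and 100 IOPS
```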

And my IOPS per TB are constant.  [Again, to simplify, I’m going to assume the added drive falls in the same RAID set, so I don’t have to consider more parity drive space.]  So, none of this should be earth-shaking.

The huge implication here is: to increase performance, add more disks.  The more disks, the more IOPS; everyone’s happy.  However, that assumes that consumption (and, more importantly, IOPS demand) has not increased.  The graph on the right looks consistent with the graphs that we saw earlier.

The problem here is that if one adds disks, which adds capacity, and then that capacity is consumed at the same IO rate as the original disk space, the performance curve looks like the graph on the left.  If I’m consuming 100 IOPS per TB and I have 5 TB, that is 500 IOPS of demand.  So, I add a 1TB disk and now I have 600 IOPS with 5TB of used capacity on 6TB of disk.  I can spread that out and, yippee, those 5TBs can get 120 IOPS.  But if I also say, “hey, I got another TB of disk space,” and then consume it, then I’m back where I started and am still constrained at 100 IOPS/TB.  So, what good is this?
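
To make the before/after explicit, here is the same arithmetic as a tiny sketch (the 100 IOPS per consumed TB demand figure is the illustrative one used above):

```python
# The arithmetic from the paragraph above: demand runs at 100 IOPS per
# consumed TB (illustrative figure).

def iops_per_tb(available_iops, consumed_tb):
    return available_iops / consumed_tb

print(iops_per_tb(500, 5))   # before the new disk: 100 IOPS per consumed TB
print(iops_per_tb(600, 5))   # disk added, capacity not yet consumed: 120
print(iops_per_tb(600, 6))   # new capacity consumed at the same rate: back to 100
```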

The assumption is that one is adding to a heterogeneous array, i.e. multi-purpose (maybe multi-user or multi-system).  By being multi-purpose, the usage curve should hopefully become more normalized.  If the usage is more homogeneous, e.g. everyone who needs fast performance gets moved from the slow array to the fast array, that just means the fast users are competing with other fast users.

Just like on the NASCAR track for time trials: if I have one race car start and then send another race car when the 1st is halfway around the track, I’m probably not going to have contention.  If one customer wants high performance in the evening and the other during the business day, I probably have no contention.

However, on race day after the start, all the cars are congested and some can’t go as fast as they want because someone is slow in front of them – gee, and we moved them off the freeway onto the race track for just this reason.  On the storage array, this is like everyone running end-of-the-month reports, well, at the end of the month.

I need another analogy for the heterogeneous use.  Imagine a road that one guy uses daily, but his neighbor uses only monthly.  However, the neighbor still needs use of a road, so he pays for the consumption as well.  Overall, there may not be conflict over the road resource – as opposed to if both used it daily.

So, yes, overall, adding disks does add performance capacity.  And without knowing usage characteristics, the generality of adding disks still holds.  Why?  Because no one complains that the disks are going too fast; they only complain when they are too slow.  There is still the mindset that one buys disk for capacity and not for performance.  And then, once performance is an issue, the complaints start.  So, adding disks to a random workload means that the bell curve should get smoother overall.  This won’t end all the headaches, but it should minimize them by minimizing the number of potential conflicts.

Let me know what you think
Jim
@itbycrayon