Problem calculating workloads on Storage, in this case NetApp

Double Black Diamond

With a centralized storage array, there can be front-side limitations (outside of the array to the host or client) and back-side limitations (the actual disk in the storage array).

The problem is that, from the storage array’s point of view, the workload at any given moment is random, and the details of the individual workloads are invisible to the array.  So how to alleviate load on the array has to be determined from the client side, not the storage side.

Take, for example, a VMware environment with NFS storage on a NetApp array.

Each ESX host has some number of VMs and each ESX host is mounting the same export from the NetApp array.

 

Let I_A(t) = the storage array’s front-side IOPS load at time t.
Let h_i(t) = the IOPS generated by ESX host i at time t, for i = 1..n, where n = number of ESX hosts.

 

The array’s front-side IOPS load at time t equals the sum of the IOPS loads of the ESX hosts at time t.

I_A(t) = Σ h_i(t), summed over i = 1..n

 

An ESX host’s IOPS load at time t equals the sum of the IOPS of each VM on that host at time t.

h(t) = Σ VM_k(t), summed over the VMs k on that host

 

A VM’s IOPS load at time t equals the sum of the read IOPS and write IOPS on that VM at time t.

VM(t) = R(t) + W(t)

 

The read IOPS are composed of well formed reads and not well formed reads.  “Well formed reads” are reads which will not incur a penalty on the back side of the storage array.  “Not well formed reads” will generate anywhere between 1 and 4 additional IOs on the back side of the storage array.

Let r1 = Well formed IOs

Let r2 = IOs which cause 1 additional IO on the back side of the array.

Let r3 = IOs which cause 2 additional IOs on the back side of the array.

Let r4 = IOs which cause 3 additional IOs on the back side of the array.

Let r5 = IOs which cause 4 additional IOs on the back side of the array.

Then

R(t) = ar1(t) + br2(t) + cr3(t) + dr4(t) + er5(t)

Where a+b+c+d+e = 100% and a>0, b>0, c>0, d>0, e>0

and, with w1 through w5 defined the same way for writes,

W(t) = fw1(t) + gw2(t) + hw3(t) + iw4(t) + jw5(t)

Where f+g+h+i+j = 100% and f>0, g>0, h>0, i>0, j>0

Now for the back-side IOPS.  (I’m ignoring block size here, which would just add a factor of array block size divided by block size into the equation.)  The difference from the front side is the additional IOs: each class of IO is weighted by the total number of back-side IOs it generates.

R_back(t) = ar1(t) + 2br2(t) + 3cr3(t) + 4dr4(t) + 5er5(t)

and

W_back(t) = fw1(t) + 2gw2(t) + 3hw3(t) + 4iw4(t) + 5jw5(t)
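
To make the weighting concrete, here is a minimal Python sketch of the front-side sums and the back-side amplification above. It assumes one reading of the equations: a front-side read (or write) rate is split into five classes by fractions that sum to 100%, and a class that causes k additional IOs costs k+1 IOs on the back side. The function names and all of the numbers are hypothetical, not measurements from a real array.

```python
import math
from typing import Sequence

# Back-side IOs generated by one front-side IO of each class:
# class 1 is well formed (1 IO), class 5 causes 4 additional IOs (5 total).
BACKEND_COST = (1, 2, 3, 4, 5)


def back_side_iops(front_iops: float, mix: Sequence[float]) -> float:
    """Scale a front-side read (or write) rate by its class mix.

    `mix` holds the fractions (a, b, c, d, e) or (f, g, h, i, j),
    expressed as values between 0 and 1 that sum to 1.
    """
    if len(mix) != len(BACKEND_COST):
        raise ValueError("mix needs one fraction per IO class")
    if not math.isclose(sum(mix), 1.0):
        raise ValueError("fractions must sum to 100%")
    weight = sum(cost * share for cost, share in zip(BACKEND_COST, mix))
    return front_iops * weight


def array_front_side_iops(hosts: Sequence[Sequence[float]]) -> float:
    """Sum VM IOPS per host, then sum the hosts to get the array's load."""
    return sum(sum(vm_iops) for vm_iops in hosts)


# Hypothetical mix: 1000 read IOPS, 60% of them well formed.
reads_back = back_side_iops(1000, (0.60, 0.20, 0.10, 0.05, 0.05))
# Two hypothetical hosts, with three and two VMs.
front = array_front_side_iops([[300, 150, 50], [400, 100]])
print(reads_back, front)
```

With this made-up mix, 1000 front-side reads become roughly 1750 back-side IOs. The array only ever sees the front-side total, which is why the mix has to be measured on the host side.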

Since the array cannot predetermine the values for a through j, it cannot determine the effects of an additional amount of IO.  Likewise, it cannot determine whether the host(s) are going to be sending sequential or random IO.  The aggregate will trend toward random, given n machines writing concurrently: the likelihood of n-1 systems being quiet while 1 is sending sequential IO is low.

Visibility into host-side behavior, gathered from the host side, is required.

 

Jim – 10/01/14

@itbycrayon

View Jim Surlow's profile on LinkedIn (I don’t accept general LinkedIn invites – but if you say you read my blog, it will change my mind)

NetApp cDOT ssh key config via CLI

Double Black Diamond

I had previously posted on how to configure SSH keys on 7-mode.  I’ve been remiss in covering SSH keys for cDOT (NetApp’s clustered Data ONTAP).

Before I get to the steps, let me list the assumptions:

  1. The steps below will be for a non-root user
  2. Root/Administrator privs are available to the user who is setting this up.
  3. The SSH key for the non-root user has already been generated on the client system.
  4. The SSH public key can be copied and pasted from a window displaying the key file (e.g. xterm or Notepad) into a shell window logged in to the filer’s CLI (e.g. xterm or PuTTY)

The methodology is fairly simple (provided one has the admin privs):

  1. Log in to the filer via CLI with appropriate privileges.
  2. # go to the security/login section
    • security login
  3. # allow ssh for the user
    • create -username <username> -application ssh -authmethod publickey
  4. # enter the public key
    • publickey create -username <username> -publickey "ssh-rsa <public-key> <username>@<ssh client hostname>"
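
Pasting a long key by hand is where this usually goes wrong, so here is a small Python sketch that builds the two command lines from a public key line. It is only a convenience for generating text to paste. It writes the commands fully qualified (`security login create` and `security login publickey create`) so they do not depend on the current CLI directory. The function name, user name and key below are hypothetical placeholders, and your ONTAP version may require additional parameters (for example a vserver or role).

```python
from typing import List


def cdot_ssh_key_commands(username: str, public_key_line: str) -> List[str]:
    """Return the two CLI lines that enable key-based SSH for `username`."""
    # Collapse the trailing newline and any stray whitespace from the key file.
    key = " ".join(public_key_line.split())
    if len(key.split()) < 2:
        raise ValueError("expected '<key type> <key data> [comment]'")
    if '"' in key:
        raise ValueError("key must not contain double quotes")
    return [
        f"security login create -username {username} "
        "-application ssh -authmethod publickey",
        f'security login publickey create -username {username} -publickey "{key}"',
    ]


# Hypothetical user and a truncated placeholder key, for illustration only.
for line in cdot_ssh_key_commands("jim", "ssh-rsa AAAAB3Nza jim@client01\n"):
    print(line)
```

In practice the key line would be read from the client's public key file (commonly `~/.ssh/id_rsa.pub`) rather than typed in.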

Jim – 09/29/14


NetApp 7-mode ssh key config via CLI w/o NFS or CIFS

Double Black Diamond

Configuring a NetApp filer to use SSH keys can be convoluted when the root volume holding /etc is neither NFS exported nor CIFS shared.

Before I get to the steps, let me list the assumptions:

  1. The steps below will be for a non-root user
  2. Root/Administrator privs are available to the user who is setting this up.
  3. The SSH key for the non-root user has already been generated on the client system.
  4. The SSH public key can be copied and pasted from a window displaying the key file (e.g. xterm or Notepad) into a shell window logged in to the filer’s CLI (e.g. xterm or PuTTY)

Basically, the trick is to set up the empty user directories, since there isn’t a command to create directories.  Obviously, with NFS or CIFS, the directories could be made fairly easily.

  1. Log in to the filer via CLI with appropriate privileges.
  2. # go into advanced mode
    • priv set advanced
  3. # find an empty directory using ls – in some cases, /home/http may be empty.
    • ls /home/http
  4. # check ndmpd status
    • ndmpd status
  5. # if ndmp is not on, turn it on.
    • ndmpd on
  6. # When using ndmpcopy, the shortcut of dropping /vol/<root volume> does not work for the destination
    • ndmpcopy /home/http /vol/<root volume>/etc/sshd/<username>
      ndmpcopy /home/http /vol/<root volume>/etc/sshd/<username>/.ssh
  7. # Create the text file with wrfile, paste the key(s) from your other window, and then ctrl-c
    • wrfile /vol/<root volume>/etc/sshd/<username>/.ssh/authorized_keys
  8. # if ndmpd was off, turn it off.
    • ndmpd off
  9. # ndmpcopy leaves a restore_symboltable file in each destination directory.  For cleanliness, remove them.
    • rm /vol/<root volume>/etc/sshd/<username>/restore_symboltable
    • rm /vol/<root volume>/etc/sshd/<username>/.ssh/restore_symboltable
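
Since the same paths get typed several times, here is a rough Python sketch that prints the command sequence above for a given root volume and user. It only generates text to paste into the filer CLI and does not talk to the filer. The function name and the example values (`vol0`, `jim`) are hypothetical, and `/home/http` is assumed to be the empty directory found in step 3.

```python
from typing import List


def sevenmode_ssh_key_commands(
    root_volume: str,
    username: str,
    empty_dir: str = "/home/http",
    ndmpd_was_on: bool = False,
) -> List[str]:
    """Return the 7-mode CLI lines from the steps above, in order."""
    user_dir = f"/vol/{root_volume}/etc/sshd/{username}"
    ssh_dir = f"{user_dir}/.ssh"
    cmds = ["priv set advanced", f"ls {empty_dir}", "ndmpd status"]
    if not ndmpd_was_on:
        cmds.append("ndmpd on")
    cmds += [
        # The destination needs the full /vol/<root volume> path.
        f"ndmpcopy {empty_dir} {user_dir}",
        f"ndmpcopy {empty_dir} {ssh_dir}",
        # wrfile waits for pasted input; finish with ctrl-c.
        f"wrfile {ssh_dir}/authorized_keys",
    ]
    if not ndmpd_was_on:
        cmds.append("ndmpd off")  # put ndmpd back the way it was found
    cmds += [
        f"rm {user_dir}/restore_symboltable",
        f"rm {ssh_dir}/restore_symboltable",
    ]
    return cmds


# Hypothetical root volume and user name.
for line in sevenmode_ssh_key_commands("vol0", "jim"):
    print(line)
```

Run `ndmpd status` first and pass `ndmpd_was_on=True` if it was already on, so the generated list leaves it on.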

Short Cut (if a user has already been set up, their SSH keys and directory structure can be copied, which saves some steps).
Warning: Technically, the permissions (unix or Windows ACLs) are going to follow with the ndmpcopy, so there is a security risk here, if /etc is NFS mounted or CIFS shared. Keep that in mind.

  1. # check ndmpd status
    • ndmpd status
  2. # if ndmp is not on, turn it on.
    • ndmpd on
  3. # When using ndmpcopy, the shortcut of dropping /vol/<root volume> does not work for the destination
    • ndmpcopy /vol/<root volume>/etc/sshd/<existing user with ssh keys> /vol/<root volume>/etc/sshd/<new ssh username>
  4. # Replace the copied key file with wrfile: paste the new user’s key(s) from your other window, and then ctrl-c
    • wrfile /vol/<root volume>/etc/sshd/<new ssh username>/.ssh/authorized_keys
  5. # if ndmpd was off, turn it off.
    • ndmpd off
  6. # ndmpcopy leaves a restore_symboltable file in the destination directory.  For cleanliness, remove it.
    • rm /vol/<root volume>/etc/sshd/<new ssh username>/restore_symboltable

Jim – 11/18/13


VMware Linked Clones & NetApp Partial writes fun


Double Black Diamond
NetApp OnTap writes data in 4k-blocks. As long as the writes to NetApp are in 4k increments, all is good.

Let me step back. Given all the fun that I’ve experienced recently, I am going to alter my topic schedule to bring this topic forward while it is hot.

One more step back: The environment consists of VMware ESX hosts getting their storage via NFS from NetApp storage. When data is not written to a storage system in increments matching its block size, misalignment occurs. For NetApp, that block size is 4k. If I write 32k, that breaks nicely into 8 4k blocks. If I write 10k, it doesn’t, as it ends up being 2 and a half blocks.
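
The arithmetic can be sketched in a few lines of Python. This is only an illustration of the block math, assuming a 4096-byte block and byte offsets counted from the start of a file; the function names are made up, and it says nothing about how OnTap actually lays out data.

```python
BLOCK_SIZE = 4096  # bytes, the 4k block size discussed above
KIB = 1024


def blocks_touched(offset: int, length: int, block_size: int = BLOCK_SIZE) -> int:
    """Count the blocks that a write of `length` bytes at `offset` lands on."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return last - first + 1


def is_aligned(offset: int, length: int, block_size: int = BLOCK_SIZE) -> bool:
    """True when the write starts and ends exactly on block boundaries."""
    return offset % block_size == 0 and length % block_size == 0


print(blocks_touched(0, 32 * KIB), is_aligned(0, 32 * KIB))      # 8 True
print(blocks_touched(0, 10 * KIB), is_aligned(0, 10 * KIB))      # 3 False
print(blocks_touched(512, 32 * KIB), is_aligned(512, 32 * KIB))  # 9 False
```

The last line is the misaligned case: the same 32k write, shifted by 512 bytes, lands on 9 blocks instead of 8, and two of them are only partly covered.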

The misalignment problem has been well documented. VMware has a doc. NetApp has a doc. Other vendors (e.g. HP/3PAR and EMC) reference alignment problems in their docs. The problem is well known – and easily googled. With misalignment, more read & write operations are required because the block being written by the host is not aligned with the underlying block on the storage.

And yay! VMware addresses it in their VMFS-5 file system by making the blocks 1MB in size. That will divide up nicely. And Microsoft, with Windows 2008, changed the starting block, which helped alignment.

So, all our problems are gone, right??

NO.

VMware introduced linked clones, which have a grain size of 512 bytes (see Cormac Hogan’s blog).

Once this issue is discovered, you end up reading more of Cormac’s blog, and then maybe some of Duncan Epping’s, and maybe some of Jack McLeod’s, not to mention Knowledge Base articles from both VMware & NetApp. The recommended solution is to use VAAI and let the NetApp handle clones on the backend. And these 512-byte writes are technically “partial” writes and not “misaligned”.

If everything is aligned, then a partial write requires 1 disk read operation (of 4k), an instruction to wedge the 512 bytes into the right place in the 4k block, and 1 write back out. If misalignment exists, the 512 bytes can straddle two 4k blocks, and then it requires twice the IO operations.
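
Here is a minimal Python sketch of that read-modify-write counting. It assumes a deliberately simple model: a block that a write covers completely costs 1 write, and a block that is only partly covered costs 1 read plus 1 write. The function name is hypothetical, and real arrays cache and coalesce, so treat the counts as the worst case rather than as what OnTap does.

```python
from typing import Tuple


def rmw_ops(offset: int, length: int, block_size: int = 4096) -> Tuple[int, int]:
    """Return (reads, writes) needed to apply one write under the simple model."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    end = offset + length
    first = offset // block_size
    last = (end - 1) // block_size
    reads = writes = 0
    for block in range(first, last + 1):
        block_start = block * block_size
        block_end = block_start + block_size
        fully_covered = offset <= block_start and end >= block_end
        if not fully_covered:
            reads += 1  # must read the old 4k before wedging the new bytes in
        writes += 1
    return reads, writes


print(rmw_ops(0, 512))     # (1, 1): partial write inside one block
print(rmw_ops(3840, 512))  # (2, 2): the same write straddling two blocks
print(rmw_ops(0, 4096))    # (0, 1): a full, aligned block needs no read
```

The second call shows the doubling: 512 bytes starting at byte 3840 cross the 4096-byte boundary, so two blocks have to be read and written back.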

However, if you look at nfsstat -d, you’ll notice that there are a whole bunch of packets in the 0-511 range. Wait! I have partial writes, and those show up in the 512-1k range. What are all these?

At this point, I don’t entirely know (gee Jim, great piece – no answers), but according to VMware KB 1007909, VMware NFS is doing 84-byte (84!?) writes for NFS locking. Given the count in my 0-511 byte range, NFS locking can’t account for all of those – but what does this do to NetApp’s 4k blocks?

Jim – 08/30/13