Simplivity Storage – hunting VMs when low on storage.

With HPE Simplivity, each VM has storage mirrored on another specific host in the cluster. As an example, in a 3 host VMware cluster, a VM on ESXi host #1 will have a mirrored copy of the storage on ESXi host #2, but no other data on ESXi host #3. Unlike other HyperConverged Infrastructure products in the marketplace, Simplivity mirrors the VM as opposed to striping the VM across other nodes in the cluster.

This has its advantages and disadvantages.

One result is that each node in the cluster ends up with different storage consumption, as the blend of VMs on a host varies. For example:

  • ESXi Host #1 could have VMs: a, d, e, f
  • ESXi Host #2 could have VMs: a, b, c, e
  • ESXi Host #3 could have VMs: b, c, d, f

As a result, at any given time, 2 nodes will be closer to full than the remainder. “Closer” could be significant or insignificant, but the point is relevant to the script that I’ll include below.

The Simplivity storage environment is controlled by what are called Omnistack Virtual Controllers (“OVCs” for short). Each ESXi host has a dedicated OVC, but the OVCs in a VMware cluster work together.

(I don’t know the Simplivity interface from Hyper-V, so what follows is VMware specific.) From vSphere, it is cumbersome to determine where Simplivity stores copies of a VM. Clearly, if ESXi Host 1 is nearing full, one will see a VMware alarm and one could go to the host and see the VMs with a primary copy of storage on that host – but what about a VM with a secondary (redundant mirrored) copy? There would have to be some deciphering of the DRS rules.

Simplivity offers a command line tool through the OVC which will list all the VMs in the cluster and where the primary and secondary copies are stored and the VM size: dsv-balance-manual

This tool has two drawbacks. First, one can see only the size of the VM, not the size of the VM AND all its associated backups. Second, it does not report any remote backups copied from another cluster to the hosts.

When storage runs tight, removing backups is the most likely path forward. Having a total size for a VM which would include its associated backups would be very helpful – but with de-duplication and compression across VMs on the host, and backups expiring at different intervals, this becomes very difficult to calculate and would only be valid in real time.

The first step to gaining space would be to search the backups on the cluster to determine if there are any backups which lack an expiration date, and remove as necessary.

The second step might be to identify which VMs are shared on the 2 most full hosts.
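
To make the second step concrete, here is a minimal sketch using the 3-host example layout from earlier. The "host vm" listing and the function names are made up for illustration; nothing here talks to Simplivity.

```shell
# Hypothetical layout from the example above: one "host vm" pair per line.
placement() {
  cat <<'EOF'
1 a
1 d
1 e
1 f
2 a
2 b
2 c
2 e
3 b
3 c
3 d
3 f
EOF
}

# Print the VMs that have a copy on both hosts given as arguments.
shared_vms() {
  placement | awk -v h1="$1" -v h2="$2" '
    $1 == h1 { on1[$2] = 1 }
    $1 == h2 { on2[$2] = 1 }
    END { for (vm in on1) if (vm in on2) print vm }
  ' | sort
}

# If hosts 1 and 2 are the fullest, these are the VMs worth looking at.
shared_vms 1 2
```

With that layout, hosts 1 and 2 share VMs a and e, so trimming the backups of those two VMs frees space on both of the full hosts at once.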

It is fairly easy to eyeball the output of dsv-balance-manual, but when one runs it often and there are many VMs, human error can kick in. I wrote the following CLI command pipeline to do this:

node=(`sudo /var/tmp/build/dsv/dsv-balance-show --ShowNodeIndex | sed 's/\(.B \)/ \1/;s/\.\([0-9][0-9]\) TB/\10 GB/' | awk '/^\| Node [0-9]/ {print $3,$15}' | sort -nr +1 | awk 'BEGIN {z=10^6}{b=a;a=$1;if ($1 < z) {z=$1}} END {print a-z+1,b-z+1}'`) ; cl=(`svt-federation-show | awk -F"\| " '/Alive/ {if ($3 ~ /^[a-zA-Z]/) {x=$3};{a=x} ; if ($4 ~ /^[a-zA-Z]/) {y=$4};{b=y} ; print a,b,$9}' | grep \`ifconfig eth0 | awk '/inet/ {print $2}'\``) ; sudo /var/tmp/build/dsv/dsv-balance-manual --datacenter ${cl[0]} --cluster ${cl[1]} > /dev/null ; awk -F, -v n=${node[0]} -v m=${node[1]}  '/\]/ {offset=2 ; if (($(n+offset) ~ /s|p/) && ($(m+offset) ~ /s|p/)) print $(NF-2),$(NF-1)}' /tmp/balance/replica_distribution_file_${cl[0]}.csv | sort -n 

Had I known it would be so long, I probably would have written a script, but then the script would have to be pushed to all OVCs on all the hosts, and with the Simplivity upgrade procedure, the OVCs would be wiped out and the scripts would have to be re-created.

Documentation:

This needs to be run on an OVC in the cluster that is short on space (i.e. won’t work across clusters).

Create the node array – 2 entries with the 2 nodes in the cluster with the least available space.

node=(`

Run the Simplivity command as root to determine which nodes lack space. The output reports space remaining (as opposed to consumed).

sudo /var/tmp/build/dsv/dsv-balance-show --ShowNodeIndex

Add a space between the storage figure and its unit label, and convert TB to GB by removing the decimal point and appending a zero (this relies on the TB figure having exactly two decimal places).

| sed 's/\(.B \)/ \1/;s/\.\([0-9][0-9]\) TB/\10 GB/'
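
A quick dry run of those two substitutions on made-up input shows what they do. The sample lines below are an assumption; I am not reproducing the real dsv-balance-show layout here, only the "1.25TB" style of figure the sed expects.

```shell
# Same two sed substitutions as in the pipeline, wrapped for a dry run.
normalize_units() {
  sed 's/\(.B \)/ \1/;s/\.\([0-9][0-9]\) TB/\10 GB/'
}

# Hypothetical sample lines; the real dsv-balance-show layout may differ.
printf '%s\n' '| Node 1 | 1.25TB |' '| Node 2 | 830GB |' | normalize_units
```

The first line comes out as "| Node 1 | 1250 GB |" and the second as "| Node 2 | 830 GB |". Note that neither substitution uses the g flag, so only the first size on each line is touched.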

Find the lines with only details about the nodes (throw out the headers) and only print the node index and the space remaining (the label above is now discarded).

| awk '/^\| Node [0-9]/ {print $3,$15}'

Sort numerically by space remaining in descending order, so the nodes with the least space end up last.

| sort -nr +1

Print the last 2 entries and reset the node number so that it counts from 1 – the output from the earlier Simplivity command, depending on retired equipment, might not start from 1. (Unsure how this behaves if run on a cluster with fewer than 2 hosts – but one would not need to run this script if there was only 1 host).

| awk 'BEGIN {z=10^6}{b=a;a=$1;if ($1 < z) {z=$1}} END {print a-z+1,b-z+1}'

This completes the array. The contents of the array are 2 numbers reflecting the nodes which have the least space available.

`);
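
The sort and renumbering steps can be tried in isolation. This is a sketch on invented numbers: the "node-index free-GB" pairs are hypothetical, and `sort -k2,2nr` stands in for the older `sort -nr +1` form, which newer versions of sort may reject.

```shell
# Renumber as in the pipeline: shift node indexes so the lowest becomes 1,
# then print the last two rows (the two nodes with the least free space).
two_fullest() {
  sort -k2,2nr |
    awk 'BEGIN {z=10^6} {b=a; a=$1; if ($1 < z) {z=$1}} END {print a-z+1, b-z+1}'
}

# Hypothetical "node-index free-GB" pairs; indexes start at 2 here to
# mimic a cluster where node 1 was retired.
printf '%s\n' '2 500' '3 900' '4 200' | two_fullest
```

This prints "3 1": node 4 (200 GB free) becomes column 3 and node 2 (500 GB free) becomes column 1. As far as I can tell, the shift only compensates for retired nodes below the lowest index; a gap in the middle of the numbering would still misalign the columns.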

Set the datacenter and cluster variables. This is a lot of code to look up what is already known, but it reduces the risk of misspellings. Set the “cl” array (cluster).

cl=(`

Run the Simplivity command to show all the nodes in the federation.

svt-federation-show

Find only the lines that include nodes. Given the output from above, if the datacenter field (#3) is empty, print what was in the line before and if the cluster field (#4) is empty, print what was in the line before. Finally, only print datacenter, cluster, and management IP.

| awk -F"\| " '/Alive/ {if ($3 ~ /^[a-zA-Z]/) {x=$3};{a=x} ; if ($4 ~ /^[a-zA-Z]/) {y=$4};{b=y} ; print a,b,$9}'

Filter the output above, using the output of the command that follows as the search pattern.

| grep \`

Determine the management IP of this OVC.

ifconfig eth0 | awk '/inet/ {print $2}'\`

Finalize the array with datacenter, cluster, and IP (the latter won’t be used).

`) ;
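
The "print what was in the line before" trick is easier to see on a small table. The sketch below is not the pipeline's own awk: the column layout (host, datacenter, cluster, state, management IP) is an assumption and does not match the real svt-federation-show output, and the function name is made up.

```shell
# Fill down blank datacenter/cluster cells, then keep the row for one IP.
# Hypothetical rows: | host | datacenter | cluster | state | mgmt IP |
lookup_cluster() {
  awk -F'|' -v ip="$1" '
    function trim(s) { gsub(/^ +| +$/, "", s); return s }
    /Alive/ {
      if (trim($3) != "") dc = trim($3)   # remember the last datacenter seen
      if (trim($4) != "") cl = trim($4)   # remember the last cluster seen
      if (trim($6) == ip) print dc, cl    # exact match on the management IP
    }'
}

printf '%s\n' \
  '| ovc1 | DC1 | ClusterA | Alive | 10.0.0.1 |' \
  '| ovc2 |  |  | Alive | 10.0.0.2 |' | lookup_cluster 10.0.0.2
```

The second row has blank datacenter and cluster cells, so it inherits "DC1 ClusterA" from the row above. Matching the IP inside awk also sidesteps a quirk of grep, where 10.0.0.1 would match 10.0.0.10 as well.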

Run the Simplivity command to list the VMs, adding the datacenter and cluster information so that it can run unattended. Dump the output to /dev/null, as an output file will be left behind.

sudo /var/tmp/build/dsv/dsv-balance-manual --datacenter ${cl[0]} --cluster ${cl[1]} > /dev/null ;

Parse the output file: /tmp/balance/replica_distribution_file_<cluster name>.csv. Use the 2 variables, n & m, to represent the 2 nodes to search for. In the CSV, 2 other data points precede the node columns (hence the offset of 2), and each node column holds a “p” or “s” for a primary or secondary copy. The number of nodes determines the number of columns. The 2nd to last column, $(NF-1), is the VM name. The 3rd to last column, $(NF-2), is the VM size.

awk -F, -v n=${node[0]} -v m=${node[1]} '/\]/ {offset=2 ; if (($(n+offset) ~ /s|p/) && ($(m+offset) ~ /s|p/)) print $(NF-2),$(NF-1)}' /tmp/balance/replica_distribution_file_${cl[0]}.csv

Then sort the output numerically in ascending fashion, so the largest VMs are at the bottom.

| sort -n 
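
Here is the final filter run against a few invented rows, to show what comes out. The CSV rows are hypothetical: I only kept the properties the awk relies on (a “]” somewhere on the line, 2 leading columns, one column per node, then size and VM name near the end).

```shell
# Print "size name" for VMs holding a copy (p or s) on both node columns.
shared_on_nodes() {
  awk -F, -v n="$1" -v m="$2" '
    /\]/ {
      offset = 2
      if ($(n+offset) ~ /s|p/ && $(m+offset) ~ /s|p/) print $(NF-2), $(NF-1)
    }' | sort -n
}

# Hypothetical rows: id, [tag], one column per node, size, VM name, trailer.
printf '%s\n' \
  '1,[x],p,s,,120,vm-a,end' \
  '2,[x],,p,s,40,vm-b,end' \
  '3,[x],s,,p,300,vm-d,end' \
  '4,[x],p,s,,15,vm-e,end' | shared_on_nodes 1 2
```

Only vm-e and vm-a have copies on both node 1 and node 2, so the output is "15 vm-e" followed by "120 vm-a", with the largest VM at the bottom.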

Given all this, it could be turned into a proper script, which would be cleaner and tidier.

Notes from #NTAPInsight 2014

After a partial week at NetApp 2014 Insight US, here are my thoughts:
(full disclosure:  I was a presenter of one session at the conference)
  1. Keynote thought
  2. OnTap 8.3 announcement
  3. Hybrid Cloud
    1. Data is state-ful, unlike (cloud) computing
  4. Data locality
  5. Different UNIX variants – Different Cloud
  6. Laundry services similar to cloud computing (Jay Kidd / NA CTO)
Tom Mendoza (NetApp Vice Chairman) was fantastic in his keynote. He focused on culture and wanting to build a culture of trust & candor. CIOs understand every company is going to have issues; the question is whether the customer’s CIO trusts the vendor to be there when there is a problem.
Lots of talk about OnTap 8.3 – though the fact that it is RC1 and not GA is disappointing. Didn’t hear anyone mention that 8.3 is a Release Candidate. 8.3 provides full feature parity with 7-mode. There was little discussion about 7-mode, except for how to move off 7-mode (7-mode transition tool). 7-mode transition still appears to be a large effort. For 7MTT, the key term is “tool”.
The key focus in the keynotes was “Hybrid Cloud”. One of the key takeaways is the need for data locality. The data is ‘state-ful’ as opposed to cloud computing which is ‘stateless’ – in the sense that the compute resource need can be metered, but data cannot. So, when moving from on-prem to cloud, data would have to be replicated completely between the two. Or more so, if you are working between clouds, or maybe between clouds in different countries, the full data set has to be replicated. The concern is that government entities (Snowden effect) will require data to be housed in respective countries. This now becomes the digital equivalent of import/export laws and regulations.
With the notion of different clouds, it reminds me of all the different UNIX variants. We had Solaris boxes and we had HP-UX boxes and we had DEC boxes and we struggled moving data between them. Some were big endian, some little endian. So, binaries were incompatible.
Finally and irreverently, during Jay Kidd’s (NetApp CTO) presentation, my mind wandered while thinking about cloud computing analogies. Never noticed before how metered cloud computing is so much like washing machines at the laundromat – pay per use.

 

Jim – 10/30/14 @itbycrayon View Jim Surlow's profile on LinkedIn (I don’t accept general LinkedIn invites – but if you say you read my blog, it will change my mind)

Shellshock / Bashbug quick check

Given the latest news on the Shellshock aka Bashbug vulnerability, I modified a public command line check.
Backstory:  Unix systems (includes Linux & the Mac OS, OSX) have shells for their command line windows.  Bash is common.  A vulnerability was found and this has fairly large implications.   More detail is available online:

My modification to the command line script is:

Jim – 09/26/14 @itbycrayon

Lack of Tech Workforce Diversity in Silicon Valley – my $0.02

Earlier today, a Wall St. Journal tech blog published stats showing that a large majority of workers at well known Silicon Valley tech companies are white or Asian. This follows news over the last several weeks of tech companies acknowledging this.
The question is:  Is this a problem?
And the next:  If so, can it be solved?
And lastly:  If so, what is the one solution or what are the multiple solutions to the problem?
I’d argue that it is a problem. The world is in a knowledge economy and the more Americans that can participate in the knowledge economy, the better for America. The lack of diversity reflects a lack of participation in the field and thus portions of the country not participating in the economy as fully as possible.
Yes, there is extrapolation going on here – large companies predominantly housed in Silicon Valley are being used as a proxy for all tech, and tech as a proxy for the best portions of the economy in the nation.
But, when they say that small companies grow the economy, it isn’t someone selling stamps or vitamins, it is companies that have venture capital like the beginnings of Facebook and such.
Tech companies start with some tech guys with an idea.  They borrow.  Then they go for venture capital.  Venture Capitalists want to ensure that the plan is sound and/or that they have some proven leadership.  The companies try to staff up with the best staff they can.
Meanwhile, the tech companies are in fierce competition for talent (except when they collude to keep wages down). So, tech companies in Silicon Valley have glorious headquarters and are willing to shuttle staff down from San Francisco.
So, when selecting candidates from college, what would tech companies look for? Graduates with STEM degrees, of course. And what does that diversity look like? According to this site, in 2011 75% of grads in Comp Sci were White or Asian.
In addition, among those who start college pursuing STEM degrees, underrepresented minorities are less successful in completing those programs than others. And this can be tied to how they perform in high school. Minorities are known not to perform as well. In 2013 it was said, “This year only 15 percent of blacks and 23 percent of Latinos met or exceeded the SAT benchmark for college and career readiness.”
So, this does not really seem to be a problem with the tech companies.  You don’t hear how NFL teams aren’t recruiting enough from the Ivy League.  Going back to the question:  Is this a problem?  Yes.  More specifically, is it the tech companies’ problem?  No.
Can the problem of minority participation in tech be solved? Maybe. It needs to be done in earlier years. In high school and earlier, logic and cause & effect need to be taught. Taking on the subject of the problem with public schools is beyond this blog, but the point is that the diversity outcomes in tech are the result of issues that arise long before candidates reach employers.

Off soapbox,

Jim – 06/19/14 @itbycrayon

Notes from #EMCWorld 2014

After days at EMC World, here’s what sticks out in my mind:
  1. 3rd Platform Paradigm
  2. ViPR (& VMAXce)
  3. Electricity model
  4. Just like school
  5. The Venetian
  6. VMware
  7. Backups
  8. Other folks

Side note:  There was talk of SAP HANA, Hadoop, & Pivotal.  The place that I typically play in is Storage and not that space, so I’m going to ignore their emphasis there.

  1. 3rd Platform – This is at least the 2nd year that EMCWorld has mentioned this, and I really like the tie-in to the concepts from the Innovator’s Dilemma, i.e. that there are technology waves and that mobile users and devices are the 3rd wave of consumers, following the 2nd wave, PC users, and the 1st wave, mainframes. Emphasis on this, in my opinion, is a strength.

  2. ViPR (& VMAXce) – Lots of talk about ViPR 2.0, their abstraction layer, & coding into storage. One of my unknowns was how EMC ViPR compares with VMware Orchestrator. Turns out, ViPR needs to talk northbound into the VMware layer or OpenStack layer, while VMware Orchestrator would need to talk southbound to ViPR or directly to arrays. Last year, there were lots of mentions of VMAXce; this year I didn’t hear anything. VMAXce makes it easier to provision on predetermined tiers and such (Cloud provisioning portal, etc). If EMC struggled getting that right, how will smaller firms deal with their own coding to do ViPR?

  3. Electricity model for storage (Utility model) – Dovetailing with the above, one presenter made analogies to the Electricity model. Right now, it is a utility at 110/220V. Before AC was standard, however, companies needed their own electricity generation (Singer sewing machines as an example). So, if the future is to scale, we need less complexity and thus fewer options. This ties back to ViPR and also ties into performance-based criteria for storage, not just capacity. Redundancy & Data Protection would be considered as givens. But, as we can see from #2, it doesn’t appear as simple as it sounds.

  4. Just like school – Spoke to one new attendee at EMCworld and he compared it to school: racing between classes, information overload. I think one can add parties at night, a concert midweek, and sleeping in or bailing on the last day of the week (attendance at breakfast this AM was lower and projected to be so).

  5. Las Vegas Venetian / Sands Convention Center – Every year, I’m blown away by the logistics of feeding 15,000 people for breakfast and lunch. Herd everyone in for breakfast, clean up, truck the stuff out, truck the lunch stuff in, prep the buffet tables, replace table cloths and herd the lunch crowd in. Very impressive. (I said in #4 that breakfast this AM, following last night’s concert, was projected to be lower; it appeared they cut the dining area in half.)

  6. VMware – I always find it interesting that at EMCworld, EMC touts VMware as an integral part of their company (I think they own 80%). When you talk to VMware staff, they sound independent, as they wish to be storage agnostic. However, it seems that the cultural differences between the two and other barriers are coming down a bit more. Seems that VMware is really more a part of EMC than it used to be.

  7. Backups – During one of the backup presentations, there was a nice slide on the data protection spectrum: Continuous Availability (immediate, w/ VPlex), Replication (seconds, w/ RPAs), Snapshots (minutes, w/ array based), Backups (hours, w/ Avamar or Networker), Archive (days, w/ Atmos). Seeing the spectrum mapped to those point solutions added clarity.

  8. Other folks in town – While EMCWorld was at the Venetian, I saw that NetApp was at the Aria & I heard that Symantec was at Caesar’s Palace. I guess the week in Vegas was a temporary tech conference.

Jim – 05/08/14 @itbycrayon