
How often do you upgrade your storage array software?

Upgrades are scary!

Having managed and implemented upgrades on highly available systems such as the old Sun StorageTek line of rebranded HDS USP/VSP arrays back in the day, I can tell you that we did not take upgrades lightly!

Unless there was a very compelling reason for an upgrade, the line taken was always “if it ain’t broke, don’t fix it”, but then we were looking after storage in a massively high security environment where even minor changes were taken very seriously indeed. When it came to storage we didn’t have or need anything very fancy at all, just some high performance LUNs cut from boat loads of small capacity 15K drives, a bit of copy on write snappage to a set of 3rd party arrays and some dual site synchronous replication. Compared to some of the features and configurations of today, that’s actually pretty minimal!


Now this approach meant that the platform was very stable. Great! It also meant that because we only did upgrades once in a blue moon, the processes were not what you might call streamlined, and the changes made by each upgrade were typically numerous, thereby running a pretty decent risk of something breaking. It was also key to check the compatibility matrix for every release to ensure that the 3rd party arrays would continue to function.

They say that software is eating the world. I’d say it seems the same could be reasonably said for the hardware storage vendors we saw at Storage Field Day 8, as they seem to mostly be moving towards more Agile development models. Little and often means lower risk for every upgrade as there are fewer changes. New features and improvements can be released on a more regular basis (especially those taking advantage of flash technologies which are changing by the minute!). A significant number of the vendors we saw had internal release cycles of between 2 and 4 weeks and public release cycles of 2-8 weeks!

In the case of one vendor, Pure Storage, they are not only releasing code every couple of weeks, but customers have obviously taken this new approach on board with vigour! Around 91% of Pure’s customer base is currently using an array software version 8 months old or less. An impressive stat indeed!

This is Hardware. Software runs on it...


This sounds like a relatively risky approach, but they mitigate it to a great extent by using the telemetry data uploaded every 30 seconds to their Pure1 SaaS management platform from customer arrays, building up a picture of both individual customers and their customer base as a whole. They then use their fingerprint engine to proactively pre-check every customer array to find out which may be susceptible to any potential defect in a new software release. Arrays which pass this pre-check have the upgrades rolled out remotely by Pure Storage engineers on a group by group basis to minimise risk. Obviously this is also done in conjunction and agreement with customers’ change windows etc. You wouldn’t expect your controllers to start failing over without any notice! 🙂
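To make the pre-check and group by group rollout a little more concrete, here is a minimal sketch of how I imagine the general pattern working. To be clear, this is not Pure's code or API: the `Array` class, the trait strings, the idea of a fingerprint as a set of traits, and the `wave_size` parameter are all my own assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Array:
    """Hypothetical view of one customer array, built from uploaded telemetry."""
    name: str
    traits: set = field(default_factory=set)  # made-up examples: {"fw-4.1", "sas-shelf"}


def precheck(arrays, defect_fingerprints):
    """Split arrays into (eligible, held_back).

    Assumption: a fingerprint is a non-empty set of traits, and an array
    matches it when the array shows every trait in that set.
    """
    eligible, held_back = [], []
    for array in arrays:
        at_risk = any(fp <= array.traits for fp in defect_fingerprints)
        (held_back if at_risk else eligible).append(array)
    return eligible, held_back


def rollout_waves(eligible, wave_size):
    """Chunk the eligible arrays into waves, to be upgraded one wave at a time."""
    return [eligible[i:i + wave_size] for i in range(0, len(eligible), wave_size)]
```

Any array matching a known defect fingerprint is simply held back, and everything else is upgraded in small waves, so a problem missed by the pre-check should only ever hit one group at a time. A real system would also need to factor in each customer's change windows, which I have left out here.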

If I’m honest I am torn in two about this approach. The ancient storage curmudgeon in me says an array should just sit in the corner of the room quietly ticking away with minimal risk to availability and data durability (at least to known bugs anyway!). This new style of approach means that it doesn’t matter how many redundant bits of that rusty tin you have, as Scott D Lowe said last week:

That said, we need to be realistic; we don’t live in ye olde world any more. Every part of the industry is moving towards more agile development techniques, driven largely by customer and consumer demand. If the “traditional” storage industry doesn’t follow suit, it risks being left behind by newer technologies such as SDS and hyperconvergence.

There is one other key benefit to this deployment method which I haven’t mentioned of course; those big scary upgrades of the past now become minor updates, and the processes we wrap around them as fleshy sacks of water become mundane. That does sound quite tempting!

Perhaps upgrades aren’t that scary any more?

I’d love to hear your opinions either way; feel free to fire me a comment on Twitter!

Further Reading
Some of the other SFD8 delegates have their own takes on the presentation we saw. Check them out here:

Dan Frith: http://www.penguinpunk.net/blog/pure-storage-orange-is-the-new-black-now-what/

Scott D. Lowe: http://www.enterprisestorageguide.com/overcoming-new-vendor-risk-pure-storages-techniques

Pure1 Overview at SFD8

Disclaimer/Disclosure: My flights, accommodation, meals, etc, at Storage Field Day 8 were provided by Tech Field Day, but there was no expectation or request for me to write about any of the vendors’ products or services and I was not compensated in any way for my time at the event.

Tech Startup Spotlight – Hedvig


After posting this comment last week, I thought it might be worth following up with a quick post. I’ll be honest and say that until Friday I hadn’t actually heard of Hedvig, but I was invited along by the folks at Tech Field Day to attend a Webex with this up and coming distributed storage company, who have recently raised $18 million in their Series B funding round, having only come out of stealth in March 2015.

Hedvig are a “Software Defined Storage” company, but in their own words they are not YASS (Yet Another Storage Solution). Their new solution has been in development for a number of years by their founder and CEO Avinash Lakshman, the guy who invented Cassandra at Facebook as well as Amazon Dynamo, so a chap who knows about designing distributed systems! It’s based around a software only distributed storage architecture, which supports both hyper-converged and traditional infrastructure models.

It’s still pretty early days, but it has apparently been tested with up to 1000 nodes in a single cluster, with about 20 Petabytes, so it would appear to be reasonably scalable! 🙂 It’s also elastic, as it is designed to be able to shrink by evacuating nodes, as well as grow by adding more. When you get to that kind of scale, power can become a major part of your cost to serve, so it’s interesting to note that both x86 and ARM hardware are supported in the initial release, though none of their customers are actually using the latter as yet.

In terms of features and functionality, so far it appears to have all the usual gubbins such as thin provisioning, compression, global deduplication, multi-site replication with up to 6 copies, etc; all included within the standard price. There is no specific HCL from a hardware support perspective, which in some ways could be good as it’s flexible, but in others it risks being a thorn in their side for future support. They will provide recommendations during the sales cycle though (e.g. 20 cores / 64GB RAM, 2 SSDs for journalling and metadata per node), but ultimately it’s the customer’s choice on what they run. Multiple hypervisors are supported, though I saw no mention of VAAI support just yet.

The software supports auto-tiering via two methods, with hot blocks being moved on demand, and a 24/7 background housekeeping process which reshuffles storage at non-busy times. All of this is fully automated with no need for admin input (something which many admins will love, and others will probably freak out about!). This is driven by their philosophy of requiring as little human intervention as possible. A noteworthy goal in light of the modern IT trend of individuals often being responsible for concurrently managing significantly more infrastructure than our technical forefathers! (See Cats vs Chickens).
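For anyone who finds code easier to follow than prose, the two tiering paths might look something like the toy model below. This is purely my own sketch of the general idea, not how Hedvig implement it: the `TieredStore` class, the `"ssd"`/`"hdd"` tier names, the read-count heuristic and the `hot_threshold` value are all assumptions.

```python
from collections import Counter


class TieredStore:
    """Toy two-tier placement model: 'ssd' is the fast tier, 'hdd' the slow one."""

    def __init__(self, hot_threshold=3):
        self.hot_threshold = hot_threshold  # assumed promotion trigger (reads)
        self.tier = {}                      # block id -> "ssd" or "hdd"
        self.hits = Counter()               # reads since the last housekeeping pass

    def write(self, block):
        # New blocks land on the slow tier in this simplified model.
        self.tier.setdefault(block, "hdd")

    def read(self, block):
        """On-demand path: promote a block as soon as it becomes hot."""
        self.tier.setdefault(block, "hdd")
        self.hits[block] += 1
        if self.hits[block] >= self.hot_threshold:
            self.tier[block] = "ssd"
        return self.tier[block]

    def housekeeping(self, busy):
        """Background path: when the system is quiet, demote blocks that went cold."""
        if busy:
            return []
        demoted = [b for b, t in self.tier.items()
                   if t == "ssd" and self.hits[b] < self.hot_threshold]
        for b in demoted:
            self.tier[b] = "hdd"
        self.hits.clear()  # start a fresh observation window
        return demoted
```

The on-demand path reacts straight away to a block heating up, while the housekeeping pass only does its reshuffling when `busy` is false, which mirrors the "non-busy times" behaviour described above.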

Where things start to get interesting though is when it comes to the file system itself. It seems that the software can present block, file and object storage, but the underlying file system is actually based on key-value pairs. (Looks like Jeff Layton wasn’t too far off with this article from 2014) They didn’t go into a great deal of detail on the subject, but their architecture overview says:

“The Hedvig Storage Service operates as an optimized key value store and is responsible for writing data directly to the storage media. It captures all random writes into the system, sequentially ordering them into a log structured format that flushes sequential writes to disk.”
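As they didn’t go into the detail, here is a minimal sketch of the general log-structured key-value technique that quote describes, just to show the shape of the idea. None of this is Hedvig’s actual design: the `LogStructuredKV` class, the `flush_at` buffer size, sorting by key before the flush, the record layout and the in-memory `BytesIO` standing in for a disk are all assumptions of mine.

```python
import io
import struct


class LogStructuredKV:
    """Toy key-value store: buffer random writes, flush them as one sequential run."""

    def __init__(self, log=None, flush_at=4):
        self.log = log if log is not None else io.BytesIO()  # stands in for the disk
        self.flush_at = flush_at  # assumed buffer size, in entries
        self.buffer = {}          # pending writes, latest value per key
        self.index = {}           # key -> (offset, length) of the value in the log

    def put(self, key, value):
        """Accept a write for any key (str) with a bytes value, in any order."""
        self.buffer[key] = value
        if len(self.buffer) >= self.flush_at:
            self.flush()

    def flush(self):
        """Append the buffered entries to the end of the log, in key order."""
        self.log.seek(0, io.SEEK_END)
        for key in sorted(self.buffer):
            k, v = key.encode(), self.buffer[key]
            # Record layout (assumed): key length, value length, key, value.
            self.log.write(struct.pack(">II", len(k), len(v)) + k)
            self.index[key] = (self.log.tell(), len(v))
            self.log.write(v)
        self.buffer.clear()

    def get(self, key):
        if key in self.buffer:  # newest data may not have been flushed yet
            return self.buffer[key]
        if key not in self.index:
            return None
        offset, length = self.index[key]
        self.log.seek(offset)
        return self.log.read(length)
```

The point to notice is that the log is only ever appended to: an overwrite of an existing key just lands further along the log and the index is repointed, so the underlying media sees sequential writes however random the incoming workload is. A real implementation would also need compaction to reclaim the stale entries, which this sketch ignores.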

Supported Access Protocols
Block – iSCSI and Cinder
File – NFS (SMB coming in future release)
Object – S3 or SWIFT APIs

Working for a service provider, my first thought is generally a version of “Can I multi-tenant it securely, whilst ensuring consistent performance for all tenants?”. Neither multi-tenancy of the file access protocols (e.g. attaching the array to multiple domains for different security domains per volume) nor storage performance QoS is possible as yet; however, I understand that Hedvig are looking at these in their roadmap.

So, a few thoughts to close… Well they definitely seem to be a really interesting storage company, and I’m fascinated to find out more as to how their key-value filesystem works in detail. I’d suggest they’re not quite there yet from a service provider perspective, but for private clouds in the enterprise market, mixed hypervisor environments, and big data analytics, they definitely have something interesting to bring to the table. I’ll certainly be keeping my eye on them in the future.

For those wanting to find out a bit more, they have an architectural white paper and datasheet on their website.
