
Pure Storage Diversity – Time for the All Flash Vendor to go All File

It was only a couple of weeks ago that I was saying to some colleagues that, now Pure have finished with the whole IPO business, they needed to diversify their portfolio a bit beyond the straightforward AFA.

I am very pleased to say they must have pre-read my mind and that’s exactly what they’ve announced today! 🙂

Not only is their new Pure FlashBlade platform designed to provide pretty much every file type (and object) you might require for your applications and users, it is also scale-out, which is a key feature I am looking for more and more these days when researching new products for my customers.

[Image: Pure FlashBlade]

Not only is this a really interesting change in direction for Pure, but I see it as a pretty nifty bit of kit in and of itself! You would hope so, as Pure have been working on it in secret for the past two and a half years… 😮

For starters, Pure have mixed and matched both Intel and ARM chips on every single blade, with different computational tasks being assigned to different chips, and a bit of FPGA technology thrown in for good measure. The latter is primarily used as a programmable data mover between the different elements of the blade, so as future flash technology becomes available, the FPGA can simply be re-coded instead of requiring a total redesign / replacement with every generation. This will enable Pure to change out their flash as often as every 6 months in their production plants, taking maximum advantage of falling prices in the NAND market.

The design embeds the ARM processors alongside the FPGAs, which effectively gives you the software overlay / management function, along with other low-intensity, multi-threaded processes. The significant computational power of the Intel chips, particularly for single-threaded workloads, rounds out the compute. From a nerdy technologist's standpoint, all I can say is schweeeet!

The numbers they are suggesting are pretty impressive too! Each 4U appliance is capable of scaling out linearly, with the following stats:

  • Up to 15x 8TB or 52TB blades, for a maximum of 1.6PB per 4U chassis
  • Up to 15GB/sec throughput per chassis, though I believe this is 4K 100% read, and real numbers might be around 1/3 of this.
  • 40Gbps Ethernet out, with 2x 10Gbps per blade, connected to a Broadcom-based, custom, resilient backplane / switch layer within each chassis. Scaling to multiple chassis would require you to provide ToR switch ports for east-west traffic between chassis.
  • Overlaying this is Pure’s custom SDN code, which securely separates internal and external traffic, and uses multicast for auto-discovery of new devices.
  • Integrated DRAM and NV-RAM on every blade, along with PCIe access to the NAND.

The blades themselves look something like this:

[Image: FlashBlade blade]

In terms of protocols, it will support NFSv3 out of the box at GA, with SMB and object storage over S3 supported shortly afterwards. My understanding is that initial S3 support will be limited to basic commands (PUT, GET, etc.), with more advanced feature support in the pipeline. The initial release seems to be primarily targeted at the filer market, with object being the underlying architecture, but not the main event. As this support is built out later, the object offering could become more compelling.
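For a rough sense of what that basic S3 support means in practice, here is a minimal sketch using Python and boto3 against a generic S3-compatible endpoint. The endpoint URL, credentials and bucket name are placeholders of my own invention, not anything FlashBlade-specific:

```python
import boto3

# Placeholder endpoint and credentials -- substitute your own
# S3-compatible target; nothing here is FlashBlade-specific.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example.internal",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Basic PUT: upload a small object
s3.put_object(Bucket="demo-bucket", Key="hello.txt", Body=b"hello, object world")

# Basic GET: read it back
obj = s3.get_object(Bucket="demo-bucket", Key="hello.txt")
print(obj["Body"].read())  # b'hello, object world'
```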

The data itself is distributed and protected through the use of N+2 erasure coding across however many blades are in the chassis; for example, an 8-blade system would be configured as EC 6+2. As the number of blades in the system increases, garbage collection cycles are used to redistribute data onto the new capacity, though I am still not 100% sure how this will work when your existing blades are almost full. The compute within each blade, however, acts independently of the storage and can access data resources across the chassis, so from the moment an additional blade is added, you have immediate access to its compute capacity for data processing.

My only query on this would be why Pure did not offer the ability to choose between erasure coding, which is ideal for lower performance requirements, and replicas, which would be handier for very low latency use cases. If you are putting it all on flash in the first place, instead of a hybrid model, there may be times when you want to keep that latency as low as possible.
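To put some rough numbers on that trade-off, here is a back-of-the-envelope sketch (my own arithmetic, not anything from Pure) comparing the usable capacity fraction of N+2 erasure coding with that of simple replicas:

```python
def ec_usable_fraction(blades, parity=2):
    """Usable fraction of raw capacity for an N+parity stripe
    across the given number of blades (e.g. 8 blades -> EC 6+2)."""
    return (blades - parity) / blades

def replica_usable_fraction(copies):
    """Usable fraction when keeping N full copies of the data."""
    return 1 / copies

for blades in (8, 15):
    print(f"{blades} blades, EC {blades - 2}+2: "
          f"{ec_usable_fraction(blades):.0%} usable")  # 75% and 87%

print(f"2-way replicas: {replica_usable_fraction(2):.0%} usable")  # 50%
print(f"3-way replicas: {replica_usable_fraction(3):.0%} usable")  # 33%

# EC wins heavily on capacity efficiency, and the overhead shrinks as
# blades are added; replicas avoid the encode/decode work on the I/O
# path, which is where the low-latency argument comes from.
```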

The software platform they have designed to manage this is called Elasticity, and to reduce the need to learn yet another interface, it looks very similar to the existing Pure management interfaces:

[Image: Elasticity management interface]

A metadata engine with search functionality will be coming later, which will allow you to gain insights into the types of data you hold, and may potentially be able to delve into the content of that data to find things such as social security numbers, etc. There are few details available on this at the time of writing.

As with the other Pure platforms, telemetry data is sent back to base on a regular basis, and Pure take care of all of the proactive maintenance and alerting for you. All of this data is presented through their Pure1 portal, which is pretty fully featured and intuitive.

I have to say I am genuinely surprised to see Pure come out with a solution with such completely bespoke hardware, when the entire industry is going in the direction of commodity + software, but the end result looks really promising. The sooner they can get SMB (not CIFS!) into the product the better, as this will allow them to begin competing properly with the likes of NetApp on the filer front.

As with many new products we tend to see on the market, the data services are not all there yet, but at the rate Pure do code releases, I don’t imagine it will be long before many of those RFP check boxes will be getting checked!

GA is expected during the second half of 2016.

Disclaimer/Disclosure: My accommodation, meals and event entry to Pure Accelerate were provided by Pure Storage, and my flights were provided by Tech Field Day, but there was no expectation or request for me to write about any of the products or services and I was not compensated in any way for my time at the event.

How often do you upgrade your storage array software?

Upgrades are scary!

Having managed and implemented upgrades on highly available systems such as the old Sun StorageTek line of rebranded HDS USP/VSP arrays back in the day, I can tell you that we did not take upgrades lightly!

Unless there was a very compelling reason for an upgrade, the line taken was always “if it ain’t broke, don’t fix it”, but then we were looking after storage in a massively high-security environment where even minor changes were taken very seriously indeed. When it came to storage we didn’t have or need anything very fancy at all: just some high-performance LUNs cut from boatloads of small-capacity 15K drives, a bit of copy-on-write snappage to a set of 3rd-party arrays, and some dual-site synchronous replication. Compared to some of the features and configurations of today, that’s actually pretty minimal!

Updates

Now, this approach meant that the platform was very stable. Great! It also meant that because we only did upgrades once in a blue moon, the processes were not what you might call streamlined, and the changes made by each upgrade were typically numerous, thereby running a pretty decent risk of something breaking. It was also key that we checked the compatibility matrix for every release, to ensure the 3rd-party arrays would continue to function.

They say that software is eating the world. The same could reasonably be said of the hardware storage vendors we saw at Storage Field Day 8, as they seem mostly to be moving towards more agile development models. Little and often means lower risk with every upgrade, as there are fewer changes each time. New features and improvements can be released on a more regular basis (especially those taking advantage of flash technologies, which are changing by the minute!). A significant number of the vendors we saw had internal release cycles of between 2 and 4 weeks, and public release cycles of 2-8 weeks!

In the case of one vendor, Pure Storage, not only are they releasing code every couple of weeks, but customers have obviously taken this new approach on board with vigour! Around 91% of Pure’s customer base is currently running an array software version that is 8 months old or less. An impressive stat indeed!

[Image: This is Hardware. Software runs on it…]

This sounds like a relatively risky approach, but they mitigate it to a great extent by using the telemetry data uploaded every 30 seconds from customer arrays to their Pure1 SaaS management platform, building up a picture of both individual customers and the customer base as a whole. They then use their fingerprint engine to proactively pre-check every customer array and find out which may be susceptible to any potential defect in a new software release. Arrays which pass this pre-check have the upgrades rolled out remotely by Pure Storage engineers on a group-by-group basis to minimise risk. Obviously this is also done in conjunction and agreement with customers’ change windows, etc. You wouldn’t expect your controllers to start failing over without any notice! 🙂
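As a purely illustrative sketch of how that kind of fingerprint-driven pre-check and grouped rollout might work (every name and rule below is my own invention, not Pure's actual implementation):

```python
# Hypothetical sketch of a fingerprint-style rollout pre-check.
# None of these names reflect Pure's actual code or telemetry schema.

KNOWN_DEFECT_FINGERPRINTS = [
    # e.g. a defect that only bites arrays on one version with replication on
    lambda array: array["version"] == "4.5.1" and array["replication_enabled"],
]

def eligible_for_upgrade(array):
    """An array passes the pre-check if no known-defect fingerprint matches."""
    return not any(fp(array) for fp in KNOWN_DEFECT_FINGERPRINTS)

def plan_rollout(fleet, group_size=50):
    """Filter the fleet by the pre-check, then split the survivors into
    groups so the upgrade can proceed in small, low-risk waves."""
    passed = [a for a in fleet if eligible_for_upgrade(a)]
    return [passed[i:i + group_size] for i in range(0, len(passed), group_size)]
```

The real engine presumably matches against far richer telemetry, but the principle is the same: screen out at-risk arrays first, then upgrade the rest in small waves.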

If I’m honest, I am torn in two about this approach. The ancient storage curmudgeon in me says an array should just sit in the corner of the room quietly ticking away, with minimal risk to availability and data durability (at least from known bugs, anyway!). This new style of approach means that it doesn’t matter how many redundant bits of that rusty tin you have, as Scott D. Lowe pointed out last week.

That said, we need to be realistic: we don’t live in ye olde world any more. Every part of the industry is moving towards more agile development techniques, driven largely by customer and consumer demand. If the “traditional” storage industry doesn’t follow suit, it risks being left behind by newer technologies such as SDS and hyperconvergence.

There is one other key benefit to this deployment method which I haven’t mentioned, of course: those big, scary upgrades of the past now become minor updates, and the processes that we fleshy sacks of water wrap around them become mundane. That does sound quite tempting!

Perhaps upgrades aren’t that scary any more?

I’d love to hear your opinions either way; feel free to fire me a comment on Twitter!

Further Reading
Some of the other SFD8 delegates have their own takes on the presentation we saw. Check them out here:

Dan Frith: http://www.penguinpunk.net/blog/pure-storage-orange-is-the-new-black-now-what/

Scott D. Lowe: http://www.enterprisestorageguide.com/overcoming-new-vendor-risk-pure-storages-techniques

Pure1 Overview at SFD8

 
Disclaimer/Disclosure: My flights, accommodation, meals, etc, at Storage Field Day 8 were provided by Tech Field Day, but there was no expectation or request for me to write about any of the vendors products or services and I was not compensated in any way for my time at the event.
