Tag Archive for EMC

Does Cloud Provide Infinite Storage Capacity and Retention?


I wrote last week about the challenges of long-term retention of data, and some of the architectural considerations and decisions we take in designing long-term backup or archive solutions. The follow-up question therefore is, does the cloud provide infinite storage capacity and retention?

“Cloud Integration”

One of the key themes which I have been seeing of late with many (if not all!) modern storage solutions, is some form of cloud integration. It seems to me that many vendors are trying to ensure they can tick the “cloud integration” check box in an RFP or RFI!

I recall one time at a previous organisation, our storage team did an RFP asking for an array which was capable of doing file presentation. The response in the RFP was “Yes”, but when this was dug into a bit further (after the fact), it turned out that this was only possible with an HA pair of custom vendor file gateways. In other words, not much better than building your own file server!

Anyway back to the point, this “RFP checkbox” mentality means that some vendors have a very tight cloud integration with multiple target replication options (such as DC to DC, DC to Cloud, Cloud to DC, Cloud to Cloud, etc), whilst others provide little more than lip service to cloud integration.

The best suggestion I can make in this scenario is to push your vendor for either a demo, a PoC, or a software copy of their array, if they have one. That way you can be absolutely sure that what is claimed, is indeed what you are looking for!

One Possible Solution… EMC Unity

One solution which I believe falls more and more into the cloudy camp with each code release is the new EMC Unity array range, on which we were given a briefing at the recent Storage Field Day 13 event.

What I found particularly interesting was that the arrays are natively capable of up to 256 redirect-on-write snapshots per volume. That sounds like a lot, but if you take one every 5 minutes then you will run out pretty fast! By utilising the EMC Cloud Tiering Appliance (a totally separate management interface today, which I really hope EMC fully integrate into the Unity pretty quickly, as multiple panes of glass are no fun for anyone!), we can utilise any S3-compatible storage to provide UNLIMITED snapshots.

This is pretty cool if you have to provide very granular restoration points for your application data, as well as the ability to move those copies off-site at relatively low cost into a near-infinite data storage facility!
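To put the 256-snapshot limit into context, here is a quick back-of-the-envelope sketch. The function name is my own, and the only inputs are the figures quoted above (256 snapshots per volume, one snapshot every 5 minutes). It simply assumes that once the limit is reached, the oldest snapshot has to go to make room for the next one.

```python
from datetime import timedelta


def retention_window(max_snapshots: int, interval: timedelta) -> timedelta:
    """How far back the oldest retained snapshot reaches once the limit is hit.

    Assumes a fixed snapshot interval and that the oldest snapshot is
    dropped to make room for each new one (an illustrative assumption).
    """
    return max_snapshots * interval


# 256 native snapshots, one every 5 minutes
print(retention_window(256, timedelta(minutes=5)))  # 21:20:00

# The same 256 snapshots taken hourly instead
print(retention_window(256, timedelta(hours=1)))  # 10 days, 16:00:00
```

In other words, at a 5-minute cadence the native limit covers less than a single day, which is why pushing older snapshots out to S3-compatible storage is so attractive.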

Sadly, you can’t currently run VMs directly from those snapshots in the cloud, but bearing in mind that EMC has a software-only version of the Unity already available, I have a sneaking suspicion that there will be some engineering talent working on this as we speak! This would potentially provide the ability to snap and replicate your entire estate natively to S3 buckets in the cloud, then restore very quickly locally within that IaaS platform. Let’s hope I’m right!

Want to Know More?

EMC’s sessions on Unity, Scale-IO and Isilon were recorded and are now available to stream online:

Some of the other SFD13 delegates had their own thoughts on the session and EMC in general. You can find them here:

Disclaimer/Disclosure: My flights, accommodation, meals, etc, at Storage Field Day 13 were provided by Tech Field Day / Gestalt IT, but there was no expectation or request for me to write about any of the vendors’ products or services and I was not compensated in any way for my time at the event.

Storage Field Day 13 (SFD13) – Preview


For those people who haven’t heard of Tech Field Day, it’s an awesome set of events run by the inimitable Stephen Foskett. The event enables tech vendors and real engineers / architects / bloggers (aka delegates) to sit down and have a conversation about their latest products, along with technology and industry trends.

Ever been reading up on a vendor’s website about their technology and had some questions they didn’t answer? One of the roles of the TFD delegates is to ask the questions which help viewers to understand the technology. If you tune in live, you can also post questions via Twitter to the delegates, who will happily ask them on your behalf!

As a delegate it’s an awesome experience as you get to spend several days visiting some of the biggest and newest companies in the industry, nerding out with like-minded individuals, and learning as much from the other delegates as you do from the vendors!

So with this in mind, I am very pleased to say that I will be joining the TFD crew for the fourth time in Denver, for Storage Field Day 13, from the 14th-16th of June!

As you can see from the list of vendors, there are some really interesting sessions coming up! Having previously met with Primary Data, it will be great to catch up with them and find out about how they have improved in the past couple of years. We also use quite a selection of DellEMC products at my organisation, so it will be really good to meet them and get the latest updates.

Lastly, I am particularly keen to hear from SNIA, the Storage Networking Industry Association, about future trends and some of the most cutting-edge developments in the industry.

SFD13 Sounds great! How do I tune in?

If you want to tune in live to the sessions, see the following link:
Storage Field Day 13

If for any reason you can’t make it live, have no fear! All of the videos are posted on YouTube and Vimeo within a day or so of the event.

I Like Big Files and I Cannot Lie

You other vendors, can’t deny,
When an array walks in with an itty bitty waste [-ed capacity],
And many spindles in your face
You get sprung, want to pull up tough,
‘Cause you notice that storage was stuffed!

Ok… I’ll stop now! I’m just a bit sad and always wanted an excuse to use that as a post opener! 🙂

There is a certain, quite specific type of customer whose main requirements revolve around the storage of large data sets consisting of thousands to millions of huge files. Think media / TV / movie companies, video surveillance or even PACS imaging and genomic sequencing. Ultimately we’re talking petabyte-scale capacities – more than your average enterprise needs to worry about!

How you approach storage of this type of data is worlds apart from your average solution!

The Challenges of “Chunky” Data

Typical challenges involve having multiple silos of your data across multiple locations, with different performance and workload characteristics. Then you have different storage protocols for different applications or phases in their data processing and delivery. Each of those silos then requires different skills to manage, and different capacity management regimes.


On top of that, for the same reason as we moved away from parity groups in arrays to wide striping, these silos then have IO and networking hotspots, wasted capacity (sometimes referred to as trapped white space) and wasted performance, which cannot be shared across multiple systems.

Finally (and arguably most importantly), how do you ensure the integrity, resilience, and durability of this data, as by its very nature, it typically requires long-term retention?

Ideal Solution

What you really need is a single storage system which can not only scale to multi-petabyte capacities with multiple protocols, but is also reasonably easy to manage, even when each admin is looking after a very large amount of capacity.

You then need to ensure that data can also be protected against accidental or malicious file modification or deletion.

Finally, you need the system to be able to replicate additional copies to remote sites, as backing up petabytes of data is simply unrealistic! Similarly, you may want multiple replicas or additional pools outside of your central repository which all replicate back to the mothership, for example for ROBO or multi-site solutions where editing large files needs to be done locally.
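As a rough illustration of why a traditional full backup at this scale is so painful, here is a back-of-the-envelope sketch. The 10Gbit/s link, the efficiency factor and the function name are purely my own assumptions for the sake of the example, and I am using a decimal petabyte (10^15 bytes).

```python
SECONDS_PER_DAY = 86_400
PETABYTE = 10**15  # decimal petabyte, in bytes


def transfer_days(capacity_bytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Days needed to copy capacity_bytes over a link of link_gbps Gbit/s.

    efficiency is the fraction of line rate actually achieved; 1.0 is a
    best case which real networks will not sustain.
    """
    bits = capacity_bytes * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / SECONDS_PER_DAY


# One full copy of 1PB over a dedicated 10Gbit/s link at line rate
print(round(transfer_days(PETABYTE, 10), 1))  # 9.3

# The same copy if only half of the line rate is achieved
print(round(transfer_days(PETABYTE, 10, efficiency=0.5), 1))  # 18.5
```

Even in this best case a single full copy takes over a week, which is why replicating changes continuously is a far more realistic approach than trying to back everything up.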

As my good friend Josh De Jong said recently:

Of course, the biggest drawback of using this approach is that you have one giant failure domain. If something somehow manages to proverbially poison your “data lake”, that’s a hell of a lot of data to lose in one go!

DellEMC Isilon

During our recent Tech Field Day 12 session at DellEMC, I was really interested to see how the DellEMC Isilon scale-out NAS system was capable of meeting many of these requirements, especially as this is a product which can trace its heritage all the way back to 2001! In fact, their average customer on Isilon is around 1PB in size, and their largest customer is using 144PB! Scalability, check!

The Isilon team also confirmed that around 70% of their 8,000+ customers trust the solution sufficiently to not use any external backup solution, trusting in SnapshotIQ, SyncIQ and in some cases SmartLock, to protect their data. That’s a pretty significant number!

One thing I am not so keen on with the Isilon (and to be fair, many other “traditional” /  old guard storage vendor offerings) is the complexity and breadth of the licensing; almost all of the interesting features each have to have their own license. If the main benefit to the data lake is simplicity, then I would far rather have a single price with perhaps one or two uplift options for licenses, than an a la carte menu.

In addition, the limit of 50 security domains provides some flexibility for service providers, but then limits the size of your “data lake” to 50 customers. It would be great to see this limit increased in future.

The Tekhead Take

Organisations looking to retain data in these quantities need to weigh up the relative risks of using a single system for all storage, versus the cost and complexity of multiple silos. Ultimately it is down to each individual organisation to work out what most closely matches their requirements, but for the convenience of a single large repository of all of your data, the DellEMC Isilon still remains a really interesting proposition.

Further Info

You can catch the full Isilon session at the link below:
Dell EMC Presents at Tech Field Day 12

Further Reading

Some of the other TFD delegates had their own takes on the presentation we saw. Check them out here:

Disclaimer: My flights, accommodation, meals, etc at Tech Field Day 12 were provided by Tech Field Day, but there was no expectation or request for me to write about any of the vendors’ products or services.

Tech Field Day 12 (TFD12) – Preview


For those people who haven’t heard of Tech Field Day, it’s an awesome event run by the inimitable Stephen Foskett. The event enables tech vendors and real engineers / architects / bloggers (aka delegates) to sit down and have a conversation about their latest products, along with technology and industry trends.

Ever been reading up on a vendor’s website about their technology and had some questions they didn’t answer? One of the roles of the TFD delegates is to ask the questions which help viewers to understand the technology. If you tune in live, you can also post questions via Twitter to the delegates, who will happily ask them on your behalf!

As a delegate it’s an awesome experience as you get to spend several days visiting some of the biggest and newest companies in the industry, nerding out with like-minded individuals, and learning as much from the other delegates as you do from the vendors!

So with this in mind, I am very pleased to say that I will be joining the TFD crew for the third time in San Jose, for Tech Field Day 12, from the 15th-16th of November!

Tech Field Day 12 (TFD12) Vendors

As you can see from the list of vendors, there are some truly awesome sessions coming up! Having previously visited Intel and Cohesity, as well as written about StorageOS, it will be great to catch up with them and find out about their latest innovations. DellEMC are going through some massive changes at the moment, so their session should be fascinating. Finally, I haven’t had the pleasure of visiting Rubrik, DriveScale or Igneous to date, so those sessions should be very interesting indeed!

That said, if there was one vendor I am probably most looking forward to visiting at Tech Field Day 12, it’s Docker! Container adoption is totally changing the way that developers architect and deploy software, and I speak to customers regularly who are now beginning to implement them in anger. It will definitely be interesting to find out about their latest developments.

If you want to tune in live to the sessions, see the following link:
Tech Field Day 12

If for any reason you can’t make it live, have no fear! All of the videos are posted on YouTube and Vimeo within a day or so of the event.

Finally, if you can’t wait for November, pass the time by catching some of the fun and highlights from the last event I attended:

Storage Field Day 9 – Behind the Curtain
