Last week, Chris Porter and I did a presentation at the London VMUG on AWS for VMware admins, a simple beginners guide with a few gotchas and tips we’ve picked up along our journeys to the public cloud.
The session itself was pretty well received, and we were lucky enough to have standing room only attendance (it was a very small room! 🙂 ).
To open up proceedings and get a feel for our audience, we ran a quick survey of the 50 or so VMware professionals who were in attendance, and the results were pretty interesting.
To the question “Have you played with AWS before?”, over half of the room raised their hands, though with a caveat from one attendee who asked if creating the account qualifies! (You know who you are!)
“Do you have any AWS experience on the job?” saw a number of hands drop, and perhaps 30-40% of the room said yes.
What was most interesting was that to the penultimate question “Are you using AWS in Production today?”, only a single hand in the room remained raised!
Finally, we asked “Are you planning to do any AWS certs in the next 12 months?” and the resounding answer from some 60-70% of the audience was a clear yes!
So what conclusions can we draw from this straw poll?
Seemingly most of the attendees were either using AWS for dev/test or simply playing with it in the work lab. However, with a clear intention to complete certs this year, the future is certainly looking a lot more cloudy for your average VMware administrator, either in their current organisation, or perhaps even elsewhere!
Anyway, enough jibber jabbing! You can find a copy of the slide deck below:
You other vendors can’t deny,
When an array walks in with an itty bitty waste[-ed capacity]
And many spindles in your face,
You get sprung, want to pull up tough,
‘Cause you notice that storage was stuffed!
Ok… I’ll stop now! I’m just a bit sad and always wanted an excuse to use that as a post opener! 🙂
There is a certain, quite specific type of customer whose main requirements revolve around the storage of large data sets consisting of thousands to millions of huge files. Think media / TV / movie companies, video surveillance or even PACS imaging and genomic sequencing. Ultimately we’re talking petabyte-scale capacities – more than your average enterprise needs to worry about!
How you approach storage of this type of data is worlds apart from your average solution!
The Challenges of “Chunky” Data
Typical challenges involve having multiple silos of your data across multiple locations, with different performance and workload characteristics. Then you have different storage protocols for different applications or phases in their data processing and delivery. Each of those silos then requires different skills to manage, and different capacity management regimes.
On top of that, for the same reason as we moved away from parity groups in arrays to wide striping, these silos then have IO and networking hotspots, wasted capacity (sometimes referred to as trapped white space) and wasted performance, which cannot be shared across multiple systems.
Finally (and arguably most importantly), how do you ensure the integrity, resilience, and durability of this data, as by its very nature, it typically requires long-term retention?
What you really need is a single storage system which can not only scale to multi-petabyte capacities with multiple protocols, but is reasonably easy to manage, even with a high capacity-to-admin ratio.
You then need to ensure that data can also be protected against accidental, or malicious file modification or deletion.
Finally, you need the system to be able to replicate additional copies to remote sites, as backing up petabytes of data is simply unrealistic! Similarly, you may want multiple replicas or additional pools outside of your central repository which all replicate back to the mothership, for example for ROBO or multi-site solutions where editing large files needs to be done locally.
Of course, the biggest drawback of using this approach is that you have one giant failure domain. If something somehow manages to proverbially poison your “data lake”, that’s a hell of a lot of data to lose in one go!
During our recent Tech Field Day 12 session at DellEMC, I was really interested to see how the DellEMC Isilon scale-out NAS system was capable of meeting many of these requirements, especially as this is a product which can trace its heritage all the way back to 2001! In fact, their average customer on Isilon is around 1PB in size, and their largest customer is using 144PB! Scalability, check!
The Isilon team also confirmed that around 70% of their 8,000+ customers trust the solution sufficiently to not use any external backup solution, trusting in SnapshotIQ, SyncIQ and in some cases SmartLock, to protect their data. That’s a pretty significant number!
One thing I am not so keen on with the Isilon (and to be fair, many other “traditional” / old guard storage vendor offerings) is the complexity and breadth of the licensing; almost all of the interesting features require their own license. If the main benefit of the data lake is simplicity, then I would far rather have a single price with perhaps one or two uplift options, than an a la carte menu.
In addition, the limit of 50 security domains provides some flexibility for service providers, but then limits the size of your “data lake” to 50 customers. It would be great to see this limit increased in future.
The Tekhead Take
Organisations looking to retain data in these quantities need to weigh up the relative risks of using a single system for all storage, versus the costs and complexity of multiple silos. Ultimately it is down to each individual organisation to work out what closest matches their requirements, but for the convenience of a single large repository for all of your data, the DellEMC Isilon still remains a really interesting proposition.
Disclaimer: My flights, accommodation, meals, etc at Tech Field Day 12 were provided by Tech Field Day, but there was no expectation or request for me to write about any of the vendors products or services.
As someone who follows the storage industry with interest, this episode was a great insight into the history and decisions which have led us to where we are with flash storage today, as well as some fascinating facts and figures.
I don’t want to preempt or spoil the episode, but for example did you know:
Nanometer-scale production actually means working at the scale of millionths of a millimeter
Fabrication plants cost around $8 billion each to build, and the machines involved in creating the chips cost about $100 million each!
It takes 3 months to create a single chip from the raw materials!
These and other interesting facts can be found in the episode below – I highly recommend you check it out (and don’t forget to subscribe to their podcast)!
I listen to 20+ podcasts on a regular basis (the one and only good thing about commuting to the office every day!). I need to do an updated article on them, but in the meantime, here is a list of some of my recommendations:
Don’t panic, you’re not imagining things! You did indeed read that title right! If everything goes to plan, Chris Porter and I will be taking the January 2017 London VMUG to a whole new place, with a session on AWS!
Yes, that’s an AWS session at a VMUG! 😮
For those people who have been living in a bunker on the Isle of Arran for the past few years, AWS has been taking the IT industry by storm. So much so that at VMworld 2016, VMware announced their new product “VMware Cloud on AWS”!
Whatever the reasons that VMware have decided to do this (and I’m not going to go into my opinions of that right now), it leaves VMware admins in a position where even if they aren’t already doing some AWS today, the likelihood of them doing so in the near future has just jumped by an order of magnitude!
Meanwhile in a parallel universe…
What’s the session about then?
The session is planned to be a quick intro to the key features of AWS, some tips on how to learn more and get certified, as well as some of Chris’s and my experiences of working with and designing for AWS (which is rather different to doing things in VMware, for sure!).
Hopefully it should be a pretty interesting session, especially if you haven’t had much exposure to AWS yet!
What else can you see at the event?
As always, there will be many awesome speakers at the London VMUG event. Ricky El-Qasem is even doing two sessions by himself!
There will also be a load of other sessions, so check out the agenda below:
Wrapping up the event there will also be the traditional vBeers at the Old Bank of England (194 Fleet St, London EC4A 2LT), so make sure you hang around after and join us for what is often the best part of the day!
Lastly, thanks very much to the LonVMUG sponsors, Rubrik, iLand and Stormagic, without whom it would not be possible to hold these events!
I’m in! How do I register?
You can register for the event at the London VMUG workspace here: