Amazon AWS Tips and Gotchas – Part 8 – AWS EC2 Reserved Instances

Continuing in this series of blog posts taking a bit of a “warts and all” view of a few Amazon AWS features, below are a handful more tips and gotchas when designing and implementing solutions on Amazon AWS, including AWS EC2 Reserved Instances.

For the first post in this series with a bit of background on where it all originated from, see here:
Amazon #AWS Tips and Gotchas – Part 1

For more posts in this series, see here:
Index of AWS Tips and Gotchas

AWS Tips and Gotchas – Part 8

Reserved Instances are a great way to save yourself some money on instances you know you will require for a significant period of time (12-36 months). One really cool fact which AWS don’t shout about enough, in my opinion, is that reserved instances can actually be shared across consolidated billing accounts!

If you wanted to, you could purchase all of your reserved instances from your primary consolidated billing account; however, doing this has some potentially unexpected results:

  1. Reserved instances don’t just provide you with a better price, they also provide you with the guaranteed ability to spin up an instance of your chosen type, regardless of how busy the AZ in question actually is.
    If there is an AZ outage, other AWS customers will scramble to spin up additional instances in other AZs in the same region, either manually or via ASGs, and this has the potential to starve the compute resources for one or more instance types!
    Yes, that’s right, even AWS does not have infinite compute resources! By using reserved instances, you are still guaranteed to be able to run yours regardless of the available capacity for on-demand instances. They are truly reserved.
    If, however, you centralise your reserved instances into your consolidated billing (CB) account, you will get the reservation pricing benefits at the top of the account tree, but you won’t get the capacity reservations, as these are account specific.
  2. Reserved instances are specific to individual Availability Zones, so ensure you spread these evenly across your AZs to avoid wasting them (you are, of course, designing your apps to be resilient across AZs, right?) and to give yourself maximum reserved coverage in the unlikely event of a full AZ outage. There is a quick way to check your current spread in the sketch after this list.
  3. And finally… Reserved instances are a commercial tool applied after the fact, not against a specific instance. When using consolidated billing for reserved instances, the reservations are therefore effectively split evenly across all accounts. If you actually want to report back to each business unit / account owner on their billing, including reserved instances, this could be tricky.
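
If you do want to sanity-check how your reservations are spread across AZs (per point 2 above), something like the following boto3 sketch would do it. This is illustrative only: it assumes boto3 is installed with credentials and a region configured, and you would need to run it in each account, since the capacity reservations are account specific.

```python
# A minimal sketch (not production code): summarise active Reserved Instances
# per Availability Zone, to sanity-check the spread described in point 2.
from collections import Counter

import boto3

ec2 = boto3.client("ec2")

# Only 'active' reservations provide pricing and capacity benefits.
response = ec2.describe_reserved_instances(
    Filters=[{"Name": "state", "Values": ["active"]}]
)

per_az = Counter()
for ri in response["ReservedInstances"]:
    # Zonal RIs carry an AvailabilityZone; those without one reserve no capacity.
    az = ri.get("AvailabilityZone", "no AZ (no capacity reservation)")
    per_az[az] += ri["InstanceCount"]

for az, count in sorted(per_az.items()):
    print(f"{az}: {count} reserved instance(s)")
```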

Find more posts in this series here:
Index of AWS Tips and Gotchas

What I read on my holidays – Uber Geek Edition!

Having only started in my new role at the start of July, I was fortunate enough to sneak in a cheeky week off work at the end of the kids’ summer holidays. My wife and I have done a fair bit of travelling in the past, but being parents of young children, we do not currently go in for big sightseeing tours. My ears can only survive hearing "my feet hurt" and "I need a wee" so many times before I give in to temptation and leave the kids by the side of the road!…

As I would prefer not to go to prison, instead we had a pretty chilled out week at a resort and I was able to get a wee bit of reading in; which was nice!

Typically I like to vary my reading between something for enjoyment, followed by something educational, then rinse and repeat. The former is generally some kind of fiction, especially science fiction / fantasy / humour.

IMO, Terry Pratchett was a true genius and is my favourite author by a huge margin; he managed to achieve all three of these categories, and then some! Unfortunately, Terry passed away in March last year, leaving millions of fans deeply saddened. The two fiction books below were in fact originally recommended by him, and I would certainly echo this recommendation!

  • OpenStack Explained – Giuseppe Paternò
    • I was fortunate enough to see Giuseppe present on OpenStack at this year’s Tech Unplugged event in London (see the playlist of YouTube vids here; Giuseppe’s session recording is here), at the end of which he gave everyone a copy of his book for nothing, except the ask that we donate some money to charity for it. Very honourable indeed!

      I suggest if you do download the ebook from the above link, you do the same for your favourite charity! If you are struggling to choose one, I suggest Willen Hospice, who provided amazing care to a family member of mine recently (Donation Link Here).

      Anyway, the session was excellent, and Giuseppe gave some insights into the growing adoption of OpenStack in the enterprise today. In fact, it led me to tweet as much at the time!

      Giuseppe’s book is a great intro to all of the basic elements of OpenStack and what they do; well worth the cost of a donation for a download!

  • The Leaky Establishment – David Langford (or eBook here)
    • As an ex-press officer in the civil nuclear industry, Pratchett described this as the book he should have written!
      This satirical black comedy centres on our hero, Roy Tappen, who accidentally smuggles a “pit” (i.e. a nuclear warhead core!) out of the nuclear weapons research facility he (regrettably) works in!

      Needless to say, his wife is none too impressed with him keeping a multi-megatonne explosive source in the house, and hilarity ensues as Roy plots to smuggle it back into work!

      Parts of this book had me in stitches; well worth a read!

  • The OpenStack Cookbook – Kev Jackson & Cody Bunch
    • I currently have the second edition of their book, so it’s not 100% up to date, but as I was on holiday I wasn’t actually running through the labs specifically. Instead, I read the main content in each section to get a better understanding of how the OpenStack components connect together.

      The book is very well researched and written, with clear and easy-to-follow instructions for building your own OpenStack homelab. I will definitely be upgrading to the third edition when it comes time to build my own lab!

  • The Evolution Man, Or, How I Ate My Father – Roy Lewis
    • This is one of the strangest books I have read in a long time, but a really enjoyable read! Originally written in 1960, it is a story about a tribe of cavemen of the Pleistocene era, trying to pass through multiple evolutionary leaps within a single generation. It covers everything from their discovery of fire and cooking to improved hunting techniques and the domestication of animals, but ultimately it is a story about the friction between progress and those who wish to avoid it!

      You might be wondering how the author manages any compelling dialogue with prehistoric tribespeople? The good news is, that’s the best bit!

      All of the characters speak as if out of the pages of a 1920s period drama, or perhaps even the drawing room of Charles Darwin himself! The juxtaposition of the characters and their dialogue is really what makes the book so special, in my opinion.

      AFAIK this isn’t available in eBook format, but in this case, I think good old-fashioned print just adds to the anachronistic experience! 🙂

  • The Second Machine Age – Erik Brynjolfsson & Andrew McAfee
    • This book blends analysis of the history of technical innovation with economics. It’s not my usual type of read, but it turned out to be fascinating on multiple levels.

      The geek in me enjoyed reading about the developments in technology and analyses of how they impacted the modern world, along with the predictions about where and how the authors believe technology will change our future.

      The parent in me took away a lot of great ideas about how to advise and guide my children when they get to the age where they need to start thinking about their careers and university choices. One of the key recommendations made in the book was on how people can remain valuable knowledge workers in the new machine age: “work to improve the skills of ideation, large-frame pattern recognition, and complex communication instead of just the three Rs”. If you want to understand this more, either for your children or yourself, I definitely recommend you read this book!

So what’s next on my list, I hear you ask? (Well, maybe not, but I’m going to tell you anyway!)… The Tin Men by Michael Frayn (another Pratchett recommendation), most likely followed by Google’s recent Site Reliability Engineering publication.

Time to try something new – how about podcasting?

This old blogging malarkey is getting a bit old hat, isn’t it? Well, according to some (many?) people, podcasting is the new blogging (or so I hear on the grapevine, hanging around the water cooler and / or when grafting down at the old rumour mill)!

Well, I’m not sure I quite believe it, but either way I am a massive fan of podcasts and have steadily increased my consumption to the point where I am now subscribed to almost twenty! There is huge value in spending the many hours per week of commuting and other mundane tasks simultaneously learning, and frankly the time passes a lot quicker for it!

As it happens, an innocent Twitter conversation late one evening with some of the chaps from the London VMUG and Open Homelab led to a suggestion that we should have a go at creating some vocal content! Well, one thing led to another, and we have subsequently given birth to a disturbing-looking love child!

The idea of the show is pretty simple; it spun out of the Open Homelab project, as we all like to have a great gab about labs and studying. That subject forms the molten core of the show, with a different key subject for each (hopefully monthly, if we pull finger) episode. Around this we will wrap a mantle of other interesting bits and bobs (content TBC, but perhaps stretching to the business of IT and one or two discussions on key news items of interest!), finally surrounded by a hard crust of technical and ‘humerical’ linguistics, or indeed whatever else comes out of the minds of myself or my co-hosts, Gareth Edwards, Kev Johnson and Amit Panchal!

As such, you can find the virginal fruit of our labours linked below:

Open TechCast – Ep.1:- The NEW beginning…

We massively appreciate any and all constructive feedback, so please fire us a message on Twitter with any comments, give our new Open TechCast Twitter account a follow, or if you have time, you could even leave us a wee review on iTunes or Stitcher!

Finally, thanks very much to Gareth and Kev, who have done the vast majority of the organising for the cast so far! 🙂

VulcanCast Follow Up – A few thoughts on 60TB SSDs

So last week I was kindly invited to share a ride in Marc Farley‘s car (not as dodgy as it sounds, I promise!).

The premise was to discuss the recent announcements around Seagate’s 60TB SSD, Samsung’s 30TB SSD, their potential use cases, and how on earth we can protect the quantities of data which will end up on these monster drives?!

Performance

As we dug into a little in the VulcanCast, many use cases will present themselves for drives of this type, but the biggest challenge is that the IOPS density of the drives is not actually very high. On a 60TB drive with 150,000 read IOPS (my guess, though not confirmed, is ~100,000 or fewer write IOPS), the average IOPS per GB is actually only a little higher than that of 15K SAS drives. When you start adding deduplication and compression into the mix, if you are able to achieve around 90-150TB of effective capacity per drive, you could easily be looking at IOPS/GB performance approaching smaller 10K SAS devices!
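
To put some rough numbers on that, here is the back-of-the-envelope arithmetic. The 150,000 read IOPS figure is the one quoted above; the effective capacities after dedupe / compression are purely my guesses, so treat the output as illustrative only.

```python
# Quick IOPS-density arithmetic for the figures discussed above.
def iops_per_gb(iops: int, capacity_tb: float) -> float:
    return iops / (capacity_tb * 1000)  # using 1TB = 1000GB for simplicity

for label, effective_tb in [("raw 60TB", 60),
                            ("~90TB effective", 90),
                            ("~150TB effective", 150)]:
    print(f"{label}: {iops_per_gb(150_000, effective_tb):.2f} read IOPS/GB")
```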

The biggest benefit, of course, is that you achieve this performance in a minuscule footprint by comparison to any current spindle type. Power draw is orders of magnitude lower than 10/15K, and by my estimates at least 4x lower than using NL-SAS / SATA at peak, and way more at idle. As such, a chunk of the additional cost of using flash for secondary-tier workloads could be soaked up by your space and power savings, especially in high-density environments.

In addition, the consistency of the latency will open up some interesting additional options…

SAS bus speeds could also end up being a challenge. Modern storage arrays often utilise 12Gb/s SAS to interconnect the shelves and disks, which gives you multiple SAS channels over which to transfer data. With over half a PB of usable storage in just a dozen drives (which could be 1PB with compression and dedupe), that’s a lot of storage to stick on a single channel! In the long term, faster connectivity methods such as NVMe will help, but in the short term we may even have to see some interesting scenarios with one controller (and channel) for every few drives, just to ensure we don’t saturate bandwidth too easily.
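
As a rough illustration of why that is, here is a quick sketch; the lane efficiency and per-drive sustained throughput are my assumptions, not vendor figures.

```python
# Rough sketch of how few big SSDs it takes to saturate a SAS back-end.
SAS3_LANE_GBPS = 12               # 12Gb/s SAS, per lane
LANE_EFFICIENCY = 0.8             # allow for 8b/10b encoding overhead
WIDE_PORT_LANES = 4               # typical 4-lane wide port

port_mb_s = SAS3_LANE_GBPS * 1000 / 8 * LANE_EFFICIENCY * WIDE_PORT_LANES
drive_mb_s = 1000                 # guessed sustained throughput per SSD

print(f"4-lane wide port: ~{port_mb_s:.0f} MB/s usable")
print(f"Drives needed to saturate it: ~{port_mb_s / drive_mb_s:.1f}")
```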

Use Cases

For me, the biggest use cases for this type of drive are going to be secondary storage workloads which require low(ish) latency, a reasonable number of predominantly Read IOPS, and consistent performance even when a little bit bursty. For example:

  • Unstructured data stores, such as file / NAS services where you may access data infrequently, possibly tiered with some faster flash for cache and big write bursts.
  • Media storage for photo and video sites (e.g. Facebook, but there are plenty of smaller ones such as Flickr, Photobox, Funky Pigeon, Snapfish, etc.); indeed, the same types of organisations we discussed at the Storage Field Day roundtable session on high performance object storage. Obviously one big disadvantage here would be the inability to dedupe / compress very much, as you typically can’t expect high ratios for media content, which then has the effect of pushing up the cost per usable GB.
  • Edge cache nodes for large media streaming services such as Netflix, where maximising capacity and performance in a small footprint to go in other providers’ data centres is pretty important, whilst being able to provide consistent performance for many random read requests.

For very large storage use cases, I could easily see these drives replacing 10K SAS and, if the price can be brought down sufficiently, starting to edge into competing with NL-SAS / SATA for highly dedupable (is that a word?) data types in a few years.

Data Protection

Here’s where things start to get a little tricky… we are now talking about protecting data in such massive quantities that the failure of just two drives within a short period has the potential to cause the loss of many hundreds of terabytes of data. At the same time, adding additional drives for protection (at tens of thousands of dollars each) comes with a pretty hefty price tag!

Unless you are buying a significant number of drives, the cost of your “N+1”, RAID, erasure coding, etc. is going to be so exorbitant, you may as well buy a larger number of small drives so you don’t waste all of that extra capacity. As such, I can’t see many people using these drives in quantities of fewer than 12-24 per device (or perhaps per RAIN set in a hyper-converged platform), which means even with a conservatively guesstimated cost of $30k per drive, you’re looking at the best part of $360-$720k for your disks alone!

Let’s imagine, then, the scenario where you have a single failed drive, and 60TB of your data is now hanging in the balance. Would you want to replace that drive in a RAID set and, based on the write rates suggested so far, wait 18-24 hours for it to resync? I would be pretty nervous doing that myself…

In addition, we need to consider the rate of change of the data. Let’s say our datastore consists of 12x 60TB drives; we probably have about 550TB or more of usable capacity. Even with a rate of change of just 5%, we need to be capable of backing up around 27TB from that single datastore per night just to keep up with the incrementals! If we were to use a traditional backup solution against something like this, achieving that in a typical 10-hour backup window would require a consistent 6Gbps, never mind any full backups!
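
Here is that arithmetic spelled out, in case you want to plug in your own capacity, change rate and backup window:

```python
# The incremental backup arithmetic from above, spelled out.
usable_tb = 550
daily_change_rate = 0.05              # 5% rate of change
backup_window_hours = 10

changed_tb = usable_tb * daily_change_rate
throughput_gbps = changed_tb * 1e12 * 8 / (backup_window_hours * 3600) / 1e9

print(f"Nightly incrementals: ~{changed_tb:.1f}TB")
print(f"Sustained throughput needed: ~{throughput_gbps:.1f}Gbps")
```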

Ok, let’s say we can achieve these kinds of backup rates comfortably. Fine. Now, what happens if we have a failure of a shelf, parity group or pool of disks? We’ve probably just lost 250+TB of data (excluding compression or dedupe) which we now need to restore from backup. Unless you are comfortable with an RTO measured in days to weeks, you might find that the restore time for this, even over a 10Gbps network, is not going to meet your business requirements!
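
Again, a quick back-of-the-envelope check, generously assuming you can actually sustain full line rate on the link for the whole restore:

```python
# Restore time for a 250TB loss over a 10Gbps network (best case).
restore_tb = 250
link_gbps = 10

seconds = restore_tb * 1e12 * 8 / (link_gbps * 1e9)
print(f"~{seconds / 86400:.1f} days at a sustained {link_gbps}Gbps")
```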

This leaves us with the conundrum of how we increase the durability of the data against disk failures, and how we minimise the rebuild time in the event of media failure, whilst still keeping costs reasonably low.

Today, the best option seems to me to be the use of erasure coding. In the event of the loss of a drive, the data is then automatically rebuilt and redistributed across many or all of the remaining drives within the storage device. Even with, say, 12-24 drives in a “small” system, this would mean data being rebuilt back up to full protection in 30-60 minutes, instead of 18-24 hours! That said, this assumes the connectivity on the array bus / backplane is capable of handling the kinds of bandwidth generated by the rebuilds, and that this doesn’t have a massive adverse impact on the array processors!
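
To illustrate why the distributed rebuild is so much quicker, here is a rough comparison; the sustained per-drive write rate is purely my guess, as real figures are not confirmed.

```python
# Rough rebuild-time comparison: single-spare resync vs erasure-coded rebuild.
failed_drive_tb = 60
per_drive_write_mb_s = 800        # guessed sustained write rate per SSD

# Classic RAID-style resync: one replacement drive absorbs every write.
resync_hours = failed_drive_tb * 1e6 / per_drive_write_mb_s / 3600

# Erasure-coded rebuild: surviving drives each rewrite a slice in parallel.
surviving_drives = 23             # e.g. a 24-drive system with one failure
distributed_minutes = resync_hours * 60 / surviving_drives

print(f"Single-drive resync: ~{resync_hours:.0f} hours")
print(f"Distributed rebuild: ~{distributed_minutes:.0f} minutes")
```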

The use of “instant restore” technologies, where you can mount data directly from the backup media to get up and running ASAP, then move the data transparently in the background, also seems to me to be a reasonable mitigation. In order to maintain a decent level of performance, this will likely also drive the use of flash in the data protection storage tiers, as well as production.

The Tekhead Take

Whatever happens, the massive quantities of data we are beginning to see, and the drives we plan to store them on, are going to lead us to new (as yet not even invented) forms of data protection. We simply can’t keep up with the rates of growth without them!

VulcanCast

Catch the video here:

The video and full transcript are also available here:
Huge SSDs will force changes to data protection strategies – with @alexgalbraith
