Amazon AWS Tips and Gotchas – Part 2 – AWS EBS & RDS MS SQL

Continuing in this series of blog posts taking a bit of a “warts and all” view of a few Amazon AWS features, below are a handful more tips and gotchas when designing and implementing solutions on Amazon AWS, including EBS and MS SQL on RDS.

For the first post in this series with a bit of background on where it all originated from, see here:
http://tekhead.it/blog/2016/02/amazon-aws-tips-and-gotchas-part-1/

For more posts in this series, see here:
Index of AWS Tips and Gotchas

AWS Tips and Gotchas – Part 2 – EBS & RDS
  1. You cannot increase the size of EBS volumes without stopping the instance. If you are designing a scale-out / high availability solution then this is not a big issue, as you should be able to take some downtime on any individual node. That downtime is going to be fairly significant though, and the larger the volume, the more downtime you will incur. In summary, the process looks like this:
    • Stop the instance
    • Snapshot the volume
    • Create a new volume from the snapshot, with your new larger size
    • Detach the old volume
    • Attach the new volume and start the instance back up

    This is one of those features which is bread and butter for a vSphere or Hyper-V admin, and could be done online in seconds with the vast majority of guest operating systems.

    I think it really highlights the key difference between designing for AWS Cloud, and a traditional enterprise virtual infrastructure. In a solution where most of your hosts are ephemeral, this should not be a big issue. If you try to take a traditional enterprise approach, you may find yourself in hot water, having to take service downtime to make simple changes.

    I suggest where possible / appropriate, avoid using EBS and use alternative options such as S3 which can scale on demand.

    UPDATE 13th Feb 2017: Amazon have just released Elastic Volumes, which allow you to scale up EBS volumes on demand! Yay! More info here:
    Amazon EBS Update – New Elastic Volumes Change Everything

  2. Similar to resizing EBS volumes, you cannot hot-resize an instance, or indeed resize it / change its type in place. In order to change instance type you need to detach any EBS volumes (including root volumes if you wish to maintain them too), terminate the instance, create a new one and re-attach your volumes.
    Obviously you cannot re-attach a root volume if you are using instance storage (ephemeral) for this, so make sure you use EBS backed volumes if you want to maintain your root volumes for any scale-up elements of your solutions which cannot simply be re-created from a bootstrap script.
  3. If your application depends on Microsoft SQL, you are going to be in for a fairly unpleasant surprise! It is not currently possible to resize MS SQL volumes on Amazon RDS once they have been deployed! At all. Full stop. Nada.

    The recommendation from AWS is to deploy your estimated future capacity requirement from day one! Not very cloudy at all…

    Your only growth option when you hit your initial capacity limit is to migrate all the data to a new RDS instance and take some application downtime to fail over.

    This can be minimised by using things like log shipping from the source instance to get the target as close to up-to-date as possible, but you will still need to shut down and swing your applications, and frankly it’s a risky headache which would be better avoided if possible, and certainly not something you want to be doing on a regular basis.

    Probably best to design for your estimated growth, and add a percentage on top.
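
For anyone wanting to script the stop / snapshot / re-create / swap procedure from tip 1, here is a minimal sketch in Python. It assumes `ec2` is a boto3 EC2 client (e.g. `boto3.client("ec2")`). The function name, its parameters and any IDs you pass in are my own illustrative choices, not anything AWS prescribes. Treat it as an outline of the steps rather than production code, as it has no error handling or rollback.

```python
def resize_ebs_volume(ec2, instance_id, volume_id, device, new_size_gib):
    """Replace an attached EBS volume with a larger copy; return the new volume ID.

    `ec2` is assumed to behave like a boto3 EC2 client.
    """
    # 1. Stop the instance (this is where the downtime starts).
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # 2. Snapshot the existing volume and wait for the snapshot to complete.
    snapshot_id = ec2.create_snapshot(VolumeId=volume_id)["SnapshotId"]
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # 3. Create a larger volume from the snapshot, in the same Availability Zone.
    volume = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]
    new_volume_id = ec2.create_volume(
        SnapshotId=snapshot_id,
        Size=new_size_gib,
        AvailabilityZone=volume["AvailabilityZone"],
    )["VolumeId"]
    ec2.get_waiter("volume_available").wait(VolumeIds=[new_volume_id])

    # 4. Detach the old volume.
    ec2.detach_volume(VolumeId=volume_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

    # 5. Attach the new volume under the same device name and start back up.
    ec2.attach_volume(VolumeId=new_volume_id, InstanceId=instance_id, Device=device)
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[new_volume_id])
    ec2.start_instances(InstanceIds=[instance_id])
    return new_volume_id
```

Bear in mind that the guest file system still needs to be extended inside the OS afterwards to make use of the extra space, and that the old volume and the snapshot are left in place for you to clean up once you are happy with the result.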

Tech Startup Spotlight – Hedvig

After posting this comment last week, I thought it might be worth following up with a quick post. I’ll be honest and say that until Friday I hadn’t actually heard of Hedvig, but I was invited along by the folks at Tech Field Day to attend a Webex with this up-and-coming distributed storage company, who recently raised $18 million in their Series B funding round, having only come out of stealth in March 2015.

Hedvig are a “Software Defined Storage” company, but in their own words they are not YASS (Yet Another Storage Solution). Their new solution has been in development for a number of years by their founder and CEO Avinash Lakshman, the guy who invented Cassandra at Facebook as well as Amazon Dynamo, so a chap who knows about designing distributed systems! It’s based around a software-only distributed storage architecture, which supports both hyper-converged and traditional infrastructure models.

It’s still pretty early days, but apparently it has been tested with up to 1000 nodes in a single cluster and about 20 petabytes, so it would appear to be reasonably scalable! 🙂 It’s also elastic, as it is designed to be able to shrink by evacuating nodes as well as grow by adding more. When you get to that kind of scale, power can become a major part of your cost to serve, so it’s interesting to note that both x86 and ARM hardware are supported in the initial release, though none of their customers are actually using the latter as yet.

In terms of features and functionality, so far it appears to have all the usual gubbins such as thin provisioning, compression, global deduplication, multi-site replication with up to 6 copies, etc; all included within the standard price. There is no specific HCL from a hardware support perspective, which in some ways could be good as it’s flexible, but in others it risks being a thorn in their side for future support. They will provide recommendations during the sales cycle though (e.g. 20 cores / 64GB RAM, 2 SSDs for journalling and metadata per node), but ultimately it’s the customer’s choice on what they run. Multiple hypervisors are supported, though I saw no mention of VAAI support just yet.

The software supports auto-tiering via two methods, with hot blocks being moved on demand, and a 24/7 background housekeeping process which reshuffles storage at non-busy times. All of this is fully automated with no need for admin input (something which many admins will love, and others will probably freak out about!). This is driven by their philosophy of requiring as little human intervention as possible, a noteworthy goal in light of the modern IT trend of individuals often being responsible for concurrently managing significantly more infrastructure than our technical forefathers! (See Cats vs Chickens).

Where things start to get interesting though is when it comes to the file system itself. It seems that the software can present block, file and object storage, but the underlying file system is actually based on key-value pairs. (Looks like Jeff Layton wasn’t too far off with this article from 2014.) They didn’t go into a great deal of detail on the subject, but their architecture overview says:

“The Hedvig Storage Service operates as an optimized key value store and is responsible for writing data directly to the storage media. It captures all random writes into the system, sequentially ordering them into a log structured format that flushes sequential writes to disk.”
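
To give a feel for what that quote describes, here is a toy sketch in Python of a log-structured key-value store. To be clear, this is purely my own illustration of the general technique and not Hedvig’s implementation; the class name, the `flush_threshold` parameter and the in-memory `bytearray` standing in for the disk are all assumptions. Writes to arbitrary keys are buffered in arrival order and then appended sequentially to the log, with an index remembering where the latest value for each key lives.

```python
class LogStructuredStore:
    """Toy key-value store that turns random writes into sequential appends."""

    def __init__(self, flush_threshold=4):
        self._buffer = []          # pending (key, value) writes, in arrival order
        self._log = bytearray()    # append-only log, standing in for the disk
        self._index = {}           # key -> (offset, length) of the latest flushed value
        self._flush_threshold = flush_threshold

    def put(self, key, value):
        """Capture a write for any key; flush once enough writes are pending."""
        self._buffer.append((key, bytes(value)))
        if len(self._buffer) >= self._flush_threshold:
            self.flush()

    def flush(self):
        """Append all pending writes to the end of the log in one sequential pass."""
        for key, value in self._buffer:
            self._index[key] = (len(self._log), len(value))
            self._log += value
        self._buffer.clear()

    def get(self, key):
        """Return the newest value for a key, checking unflushed writes first."""
        for pending_key, value in reversed(self._buffer):
            if pending_key == key:
                return value
        offset, length = self._index[key]
        return bytes(self._log[offset:offset + length])


store = LogStructuredStore(flush_threshold=2)
store.put("block-9", b"aaa")   # "random" keys arrive in any order...
store.put("block-2", b"bb")    # ...and are flushed as one sequential append
store.put("block-9", b"cccc")  # an overwrite is appended too, never updated in place
```

Note that the overwritten value for `block-9` stays in the log as garbage. A real system would need some form of compaction to reclaim that space, which is left out here.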

Supported Access Protocols
Block – iSCSI and Cinder
File – NFS (SMB coming in future release)
Object – S3 or SWIFT APIs

Working for a service provider, my first thought is generally a version of “Can I multi-tenant it securely, whilst ensuring consistent performance for all tenants?”. Neither multi-tenancy of the file access protocols (e.g. attaching the array to multiple domains for different security domains per volume) nor storage performance QoS is possible as yet; however, I understand that Hedvig are looking at both for their roadmap.

So, a few thoughts to close… Well, they definitely seem to be a really interesting storage company, and I’m fascinated to find out more as to how their key-value filesystem works in detail. I’d suggest they’re not quite there yet from a service provider perspective, but for private clouds in the enterprise market, mixed hypervisor environments, and big data analytics, they definitely have something interesting to bring to the table. I’ll certainly be keeping my eye on them in the future.

For those wanting to find out a bit more, they have an architectural white paper and datasheet on their website.
