In the spirit of these new short-form blog posts (see TekBytes: A Blogging Experiment) it’s probably appropriate that I write a quick post on a new short-form podcasting project I am working on: CloudSpotting!
My day job is as a Solutions Architect at Rackspace, where I’m fortunate enough to work for one of the most tech-agnostic global service providers around! A typical week could include me talking about or designing solutions based on VMware, Hyper-V, AWS, Azure, GCP, OpenStack, or even just plain old dedicated servers! Add to that a swathe of security, networking and storage “stuff”, and it all adds up to a pretty healthy mix.
My colleague Sai Iyer and I thought it would be fun to share some of our learnings and experiences in designing and operating customer solutions. What better way (we thought!) than an easy-to-consume 30-minute monthly podcast for architects and engineers… In the first episode, we discuss scaling applications for peak periods and the insane growth of Kubernetes adoption! We already have episodes planned on phishing, cyber kill-chains, encryption, automation & DevOps along with a host of other topics, so watch this space!
Just to be clear though – No Kool-Aid, just cool tech! 🙂
For those of you who are also regular Open TechCast listeners, this doesn’t mean I am changing lanes in any way; there will just be more of my dulcet tones available on your favourite podcatcher (which may or may not be a good thing!).
Where can I find it?
If you want to catch the first episode, just search for “CloudSpotting” on iTunes or Stitcher, or catch the show on SoundCloud.
Continuing in this series of blog posts taking a bit of a “warts and all” view of a few Amazon AWS features, below are a handful more tips and gotchas when designing and implementing solutions on Amazon AWS. This week, we talk about the latest feature of AWS, EFS (aka Elastic File System).
Amazon AWS Tips and Gotchas – Part 10 – EFS (Elastic File System)
A big challenge when designing highly available web infrastructures has historically been how to provide a centralised content store for static content without wasting resources.
A classic model for this is a pair of web / file servers with either rsync or Gluster to replicate the content between them. In the Windows world, this would be something like either a WSFC (failover cluster) or perhaps something evil like a DFS replicated share. This means that not only are you wasting money on multiple virtual machines / instances just to serve file content, but you also add significant risk and complexity in the replication and failover between these machines.
Enter AWS EFS! At a simple level, EFS is basically an NFS (v4.1) share within the AWS cloud, which is replicated across all AZs in any one region. No need for managing and replicating between instances, or indeed paying for EC2 instances just to create file shares! Great!
As this is still a relatively immature product, there are still a few “features” to be aware of:
There is no native EFS backup solution (yet!). I’m sure this will come very soon. As we have re:Invent coming up, it wouldn’t surprise me if something came out then. In the meantime, your main methods would be either to use Data Pipeline to back up to another EFS store or potentially mount EFS and back up through an EC2 instance using your own tools or scripts. I would be concerned about backing up EFS to EFS (if in the same region), as this is putting all your eggs in one basket. Hopefully, AWS will provide other target options in the future.
There is no native encryption of EFS data as yet. If you need this right now, you could achieve it by simply pre-encrypting the data in your application first, before it is written to EFS. Alternatively, just hold your breath as AWS have already stated that: “Amazon EFS does not currently provide the option to encrypt data at rest, but we will offer this option soon”.
If you have less than about 100GB stored, then due to the way the performance burst credits work you may not get the performance you need. The more you buy, the more performance you get, so don’t short-change your app for the sake of a few dollars!
“Amazon EFS uses a credit system to determine when file systems can burst. Each file system earns credits over time at a baseline rate that is determined by the size of the file system, and uses credits whenever it reads or writes data”.
In early testing, it has been seen that very small filesystems can lead to IO starvation and performance issues. I would recommend you start with 100GB as a minimum (subject to your workload requirements of course!). This is still pretty cheap at only about $30-33 a month; a lot less than even a pair of EC2 instances, never mind the complexity reduction benefits. KISS!
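To put some rough numbers on the sizing advice above, here is a small back-of-the-envelope sketch. The price per GB is simply what the "$30 a month for 100GB" figure implies; the baseline rate (50 KiB/s per GiB stored) and the 100 MiB/s burst ceiling are my assumptions based on the figures AWS published for EFS around this time, so check the current documentation before relying on them. The function names are made up for illustration.

```python
# Rough EFS sizing arithmetic. All constants are assumptions, not
# authoritative AWS figures; adjust them to match current documentation.
BASELINE_KIB_PER_SEC_PER_GIB = 50.0  # assumed baseline throughput earned per GiB stored
BURST_MIB_PER_SEC = 100.0            # assumed burst ceiling for small file systems
PRICE_USD_PER_GB_MONTH = 0.30        # implied by roughly $30 a month for 100GB


def baseline_mib_per_sec(size_gib: float) -> float:
    """Throughput the file system can sustain indefinitely at this size."""
    return size_gib * BASELINE_KIB_PER_SEC_PER_GIB / 1024.0


def burst_time_fraction(size_gib: float) -> float:
    """Share of the time the file system could run at full burst speed.

    Credits are earned at the baseline rate and spent at the burst rate,
    so over the long run the sustainable fraction is baseline / burst.
    """
    return min(1.0, baseline_mib_per_sec(size_gib) / BURST_MIB_PER_SEC)


def monthly_cost_usd(size_gb: float) -> float:
    """Storage cost per month at the assumed price."""
    return size_gb * PRICE_USD_PER_GB_MONTH


if __name__ == "__main__":
    for size in (10, 100, 1024):
        print(
            f"{size:>5} GiB: baseline {baseline_mib_per_sec(size):6.2f} MiB/s, "
            f"burst {burst_time_fraction(size) * 100:5.1f}% of the time, "
            f"about ${monthly_cost_usd(size):.2f} a month"
        )
```

Under these assumptions a 10GB file system only sustains about half a MiB/s, which goes some way to explaining the IO starvation seen on very small filesystems.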
Of course, the more caching you can do on that content, e.g. using CloudFront as a CDN, the lower the IO requirements on your EFS store.
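For the do-it-yourself backup route mentioned earlier (mounting EFS on an EC2 instance and using your own scripts), a minimal sketch might look something like the following. It just writes a timestamped tarball of a directory, using only the Python standard library. The paths `/mnt/efs` and `/backup` are hypothetical, as is the `backup_directory` helper; it assumes the EFS share is already mounted, and where the archive finally ends up (S3, another region, etc.) is up to you.

```python
import tarfile
import time
from pathlib import Path


def backup_directory(source_dir: str, dest_dir: str) -> Path:
    """Write a timestamped .tar.gz of source_dir into dest_dir.

    source_dir is assumed to be an already-mounted EFS share; dest_dir
    should live somewhere other than the share being backed up.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"efs-backup-{stamp}.tar.gz"

    # Store paths relative to the share's top-level directory name.
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(source), arcname=source.name)
    return archive


if __name__ == "__main__":
    # Hypothetical mount point and backup target.
    print(backup_directory("/mnt/efs", "/backup"))
```

Run from cron on a small instance, this covers the "own tools or scripts" option; just remember that reading the whole share for a backup also eats into your burst credits.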
Last week I managed to catch up with the guys from StorageOS, a new container-based storage company, headquartered in London. I found out about them at a London Storage Beers event a few weeks ago, and my first question was, what the hell is container-based storage, and how does it work?!
They started from the premise (yes that’s actually the correct use of the word premise!), that if you want to build a storage system FOR containers, what better way to do it than to build it FROM containers. StorageOS therefore offer what they describe as “full enterprise storage array functionality, delivered by software, on a pay-as-you-go basis”. They also plan to offer a free-forever Developer tier, which includes everything except HA functionality which you would obviously need for production usage!
You can deploy this StorageOS software anywhere from bare metal to containers:
It’s software, so it runs anywhere!
Appliances for some of the larger clouds are in the works, but will not be available on day zero.
They can then consume any back-end storage, from SSDs, HDDs and virtual drives, to EBS volumes, object stores, etc. You then pool all of the capacity from all devices into a capacity pool, which is deduped, encrypted, and available across all nodes, and carve out volumes to present to systems like Docker through their own native Docker driver, or (slightly oddly) iSCSI / FC!!! They even have VAAI support in development!
Overall, I think it’s a pretty interesting product. At first look it feels a bit like a traditional array in a container package, much as if you had containerised an enterprise app and then just used it as a traditional array with some container plugins, instead of being very targeted and container-specific. StorageOS do have an OS driver to let you mount their volumes direct from containers, but there are other things out there today which do that anyway (e.g. Flocker).
I would say their messaging is a little inconsistent at the moment, and adding things like FC integration early on feels a bit odd if they’re positioning themselves as a container play. They do however state clearly that they’re targeting enterprises and want to make the on-boarding process as simple and frictionless as possible. I do worry that this “all things to all people” approach could be a wee bit risky at this early stage, and being more laser-focused in the short to medium term would allow them to differentiate more.
The founders were very specific when they stated that they were building a clustered array with synchronous remote replicas, not a distributed storage array. Async replication is coming, which will be critical to maintaining performance in a hybrid cloud or multi-cloud setup. I really like the fact that you can stretch the same hybrid storage environment between your on-premises and cloud infrastructure using a single storage solution. This same solution can actually be used to span multiple public clouds as well, providing a resilient storage solution between say AWS and Azure, all of which is deduped and encrypted of course! This could be very interesting indeed, as customers look to protect their workloads from large public outages!
Finally, the StorageOS software is built (as you would expect these days) with APIs at the heart of everything. Even the modern GUI is really just based on API calls to the back end.
The Tekhead Take
Anyway, enough gabbing… It’s still early days, but the storage experience of the founders is certainly solid! Who better than ex-storage admins to provide a product that works well for storage admins?! I’d say there’s a good chance of this becoming a pretty cool product in the future, so definitely one to watch!