Tag Archive for AWS

Index of Tekhead.it Blog Posts on Amazon AWS

I wrote my first blog post on AWS in February 2016 and the series is growing pretty quickly, so I thought it was worthwhile indexing all of the current posts and providing an updated list as this grows.

Hopefully, this should make these posts a little easier for people to find in the future!

Anyway, enough gabbing, on with the posts and links:


Podcasts

I was kindly invited by Scott Lowe to join him on the Full Stack Journey podcast, to discuss learning AWS and cloud architecture. The episode can be accessed here:

AWS Certification

Certified SysOps Administrator

AWS Tips and Gotchas Series

Random AWS and Cloud Related Posts

Also, just in case I forget to keep this page updated:
http://tekhead.it/blog/category/aws/

AWS Tips and Gotchas Blog Posts

Does a Serverless Brexit mean goodbye to infrastructure management problems?

Last week I was able to get myself along to the London CloudCamp event at the Crypt on the Green, for an evening with the theme of “We’ve done cloud, what’s next?”. For those of you unfamiliar with the event, CloudCamp is an “unconference” where early adopters of Cloud Computing technologies exchange ideas. As you can probably guess from the theme title, many of the discussions were around the concept of “serverless” computing.

So, other than being something which seems to freak out my spell check function, what is “serverless” then?

I think Paul Johnston of movivo summed it up well, as “scaling a single function / object in your code instead of an entire app”, which effectively means a microservices architecture. In practical terms, it’s really just another form of PaaS, where you upload your code to a provider (such as AWS Lambda), and they take care of managing all of the underlying infrastructure including compute, load balancing, scaling, etc, on your behalf.

The instances then simply act upon events (i.e. they are event driven), which could be anything from an item hitting a queue, to a user requesting a web page, and when not required, they are not running. AWS currently supports a limited subset of languages, specifically Node.js, Java, and Python.
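To make the event-driven idea a bit more concrete, here is a minimal sketch of a function in the shape AWS Lambda uses for Python, i.e. a handler that receives an event and a context. The "name" field and the greeting are made up purely for illustration; real event payloads depend entirely on whatever triggers the function.

```python
def handler(event, context):
    """Entry point invoked once per event; nothing runs between events."""
    # "name" is a hypothetical field, not part of any real AWS event schema.
    name = event.get("name", "world")
    return {"statusCode": 200, "body": "Hello, {}!".format(name)}


if __name__ == "__main__":
    # Local smoke test: simulate the platform delivering a single event.
    print(handler({"name": "CloudCamp"}, None))
```

The point is simply that the unit you deploy is one function, and the provider decides when and where it runs.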

There are of course other vendors who provide similar platforms, including Google Cloud Functions, IBM Bluemix OpenWhisk, etc. They tend to support a similarly small pool of languages, however some are more agnostic and will even allow you to upload Docker containers as well. Iron.io also allows you to do serverless using your own servers, which seems a bit of an oxymoron! 🙂

Anyway, the cool thing about serverless is that you can therefore “vote to leave” your managed or IaaS infrastructure (yes, I know, seriously tenuous connection!), and just concentrate on writing your applications. This is superb for developers who don’t necessarily have the skills or the time to manage an IaaS platform once it has been deployed.

The Case for Remain

Much like the Brexit vote however, it does come with some considerations and challenges, and you may not get exactly what you expected when you went to the polling booth! For example:

  • You may believe you are now running alone, but you are ultimately still dependent on actual servers! However, you no longer have access to those servers, so basic things like logging and performance monitoring suddenly become a lot trickier.
  • Taking this a step further, testing and troubleshooting become more challenging. When a fault occurs, how can you trace exactly where it occurred? This is further exacerbated if you are integrating with other SaaS and PaaS platforms, such as Auth0 (IAM), Firebase (DB), etc. This is already a very common architectural pattern for serverless designs.
    You therefore need to start introducing centralised logging and error trapping systems which will allow you to see what’s actually going on, which of course sounds a lot like infrastructure management again!
  • It’s still early days for serverless, so things like documentation and support are a lot more scarce. If you plan to be an early serverless adopter, you had better know your technical onions!
  • As with any microservices architecture, with great flexibility, comes great complexity! Instead of managing just a handful of interacting services, you could now be managing many hundreds of individual functions. You can understand each piece easily, but looking at the big picture is not so simple!
  • Another level of complexity is in billing of course. Serverless services such as AWS Lambda charge you per 100ms of compute time, and per 1 million requests. If you are paying for a server and some storage, even in a cloud computing model, it’s reasonably easy to understand how much your bill will be at the end of the month.
    Paying for transactions and processing time however could potentially provide a few nasty surprises, especially if you come under heavy load or even a DoS attack.
  • Finally, the biggest and most obvious concern about serverless is vendor lock-in. Indeed this is potentially the ultimate lock-in as once you pick a vendor and write your application specific to their cloud, moving that bad boy is going to mean some major refactoring and re-writes!
    As long as that vendor’s pricing is competitive, this shouldn’t matter too much (after all, every single vendor involves lock-in to some varying degree), but if that vendor manages to take the lion’s share of the market they could easily change that pricing and you are almost powerless to react (at least not without significant additional investment).
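The billing point above is easier to see with some numbers. Below is a rough sketch of a bill made up of a per-request charge plus compute time charged in 100ms blocks. The two prices are placeholders I have picked for illustration, not real AWS rates, and I am assuming each invocation is rounded up to the next 100ms.

```python
import math

# Placeholder prices for illustration only; NOT real AWS rates.
PRICE_PER_MILLION_REQUESTS = 0.20
PRICE_PER_100MS = 0.000002


def estimate_monthly_cost(requests, avg_duration_ms,
                          price_per_million_requests,
                          price_per_100ms):
    """Estimate a month's bill from request count and average duration."""
    # Assumption: duration is billed in 100ms blocks, rounded up per call.
    blocks_per_request = math.ceil(avg_duration_ms / 100.0)
    request_charge = requests / 1000000.0 * price_per_million_requests
    compute_charge = requests * blocks_per_request * price_per_100ms
    return request_charge + compute_charge


if __name__ == "__main__":
    # A quiet month versus a month with a flood of (possibly hostile) traffic.
    for label, requests in (("normal", 2000000), ("flood", 500000000)):
        cost = estimate_monthly_cost(requests, 250,
                                     PRICE_PER_MILLION_REQUESTS,
                                     PRICE_PER_100MS)
        print(label, round(cost, 2))
```

The bill scales linearly with traffic, so the same code that costs very little in a quiet month can cost hundreds of times more under a flood of requests.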

The Case for Leave

If you understand and mitigate (or ignore!) the above however, serverless can be quite a compelling use case. For example:

  • From an environmental perspective, you will probably never find a more efficient or greener computing paradigm. It minimises the number of extraneous operating systems, virtual or physical machines required, as this is truly multi-tenant computing. Every serverless host could undoubtedly be run at 70-90% utilisation, rather than the 10-50% you typically see in most enterprise DCs today! If you could take every workload in the world and switch it to serverless overnight, based on those efficiency levels, how many data centres, how much power and how many thousands of tonnes of metals could you save? Greenpeace should be refactoring their website as we speak!
  • Although you do have to introduce a number of tools to help you track what is actually going on with your environment, you can move away from doing a whole load of the mundane management tasks such as patching, OS management etc, and move up the stack to spend your resources on more productive and creative activities; actually adding business value (Crazy idea! I thought in IT we just liked patching for a living?)!
  • The VM sprawl we have today would be reduced as workloads are rationalised. That said, you just end up with replacing this with container or function sprawl, which is even harder to manage! 🙂
  • You gain potentially massive scalability for your applications. Instead of scaling entire applications, you just scale the bottleneck functions, which means your application becomes more efficient overall. Definitely time to read The Goal by Goldratt and understand the Theory of Constraints before you go down this route!
  • Finally you can potentially see significant cost savings. If there are no requests, then there is no charge! If you were running some form of event driven application or trigger, instead of paying tens or hundreds of pounds per month for a server, you might only be paying pennies! Apply this to dev/test platforms which might only be needed to run workloads for a few hours a day, or production platforms which only need to process transactions when customers are actually online, and it really starts to add up, even more than auto-scaling IaaS platforms.
    Taking that a step further, if you are running a startup, why pay hundreds or thousands a month for compute you “might” need but which often sits idle, rather than throwing your functions into a scalable platform which will only charge you for actual use? I know where I would be putting my money if I were a VC…
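As a back-of-an-envelope check on the cost argument above, the sketch below works out the monthly request count at which pay-per-use costs the same as a fixed server bill. All of the figures are hypothetical, and it assumes a flat cost per request with no free tier or storage charges.

```python
def break_even_requests(server_cost_per_month, cost_per_request):
    """Requests per month at which pay-per-use matches a fixed server bill."""
    if cost_per_request <= 0:
        raise ValueError("cost_per_request must be positive")
    return server_cost_per_month / cost_per_request


def cheaper_option(requests, server_cost_per_month, cost_per_request):
    """Return which model is cheaper for a given monthly request volume."""
    serverless_cost = requests * cost_per_request
    return "serverless" if serverless_cost < server_cost_per_month else "server"


if __name__ == "__main__":
    # Hypothetical figures: a 100-per-month server, 0.00001 per request.
    print(break_even_requests(100.0, 0.00001))
    print(cheaper_option(1000, 100.0, 0.00001))
    print(cheaper_option(50000000, 100.0, 0.00001))
```

Below the break-even volume the idle server is money wasted; above it, the fixed bill starts to look like the better deal.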

Closing Thoughts

Serverless is a really interesting technology move for the industry which (as always) comes with its own unique set of benefits and challenges. I can’t see it ever being the de facto standard for everything (for the same reasons we still use mainframes and physical servers today), however there are plenty of brilliant use cases for it. If devs and startups are comfortable with the vendor lock-in and other risks, why wouldn’t they consider using it?

StorageOS – An array based on containers? It’s like storage for millennials!

Last week I managed to catch up with the guys from StorageOS, a new container-based storage company, headquartered in London. I found out about them at a London Storage Beers event a few weeks ago, and my first question was, what the hell is container-based storage, and how does it work?!

They started from the premise (yes that’s actually the correct use of the word premise!), that if you want to build a storage system FOR containers, what better way to do it than to build it FROM containers. StorageOS therefore offer what they describe as “full enterprise storage array functionality, delivered by software, on a pay-as-you-go basis”. They also plan to offer a free-forever Developer tier, which includes everything except HA functionality which you would obviously need for production usage!

So the good news is, today (Monday 20th June 2016) StorageOS are announcing the release of their Beta at DockerCon, so you can now download and test out their new storage platform.

The StorageOS Stack

 

You can deploy this StorageOS software anywhere from bare metal to containers:

It’s software, so it runs anywhere!

Appliances for some of the larger clouds are in the works, but will not be available on day zero.

They can then consume any back-end storage, from SSD, HDDs and virtual drives, to EBS volumes, object stores, etc. You then pool all of the capacity from all devices into a single capacity pool, which is deduped, encrypted, and available across all nodes, and carve out volumes to present to systems like Docker through their own native Docker driver, or (slightly oddly) iSCSI / FC!!! They even have VAAI support in development!

Overall, I think it’s a pretty interesting product. At first look it feels a bit like a traditional array in a container package, much as if you had containerised an enterprise app and then simply used it as a traditional array with some container plugins, rather than something very targeted and container-specific. StorageOS do have an OS driver to let you mount their volumes direct from containers, but there are other things out there today which do that anyway (e.g. Flocker).

I would say their messaging is a little inconsistent at the moment, and adding things like FC integration early on feels a bit odd if they’re positioning themselves as a container play. They do however state clearly that they’re targeting enterprises and want to make the on-boarding process as simple and friction-less as possible. I do worry that this “all things to all people” approach could be a wee bit risky at this early stage, and being more laser focused in the short to medium term would allow them to differentiate more.

The founders were very specific when they stated that they were building a clustered array with synchronous remote replicas, not a distributed storage array. Async replication is coming, which will be critical to maintaining performance in a hybrid cloud or multi-cloud setup. I really like the fact that you can stretch the same hybrid storage environment between your on-premises and cloud infrastructure using a single storage solution. This same solution can actually be used to span multiple public clouds as well, providing a resilient storage solution between say AWS and Azure, all of which is deduped and encrypted of course! This could be very interesting indeed, as customers look to protect their workloads from large public outages!

Finally, the StorageOS software is built (as you would expect these days) with APIs at the heart of everything. Even the modern GUI is really just based on API calls to the back end.

The Tekhead Take

Anyway, enough gabbing… It’s still early days, but the storage experience of the founders is certainly solid! Who better than ex-storage admins to provide a product that works well for storage admins?! I’d say there’s a good chance of this becoming a pretty cool product in the future, so definitely one to watch!

You can find a link to their website and beta sign up here:
http://storageos.com/index.php/product/

Amazon AWS Tips and Gotchas – Part 4 – Direct Connect & Public / Private VIFs

Continuing in this series of blog posts taking a bit of a “warts and all” view of a few Amazon AWS features, below are a handful more tips and gotchas when designing and implementing solutions on Amazon AWS, specific to Direct Connect.

For the first post in this series with a bit of background on where it all originated from, see here:
Amazon #AWS Tips and Gotchas – Part 1

For more posts in this series, see here:
Index of AWS Tips and Gotchas

Tips and Gotchas – Part 4
10. VPC Private / Public Access Considerations

If you have gone out and bought a shiny new Direct Connect to your AWS platform, you might reasonably assume that all of the users and applications on your MPLS will automatically start using this for accessing S3 content and other AWS endpoints. Unfortunately, this is not so simple!

At a high level, here is a diagram showing the two primary Direct Connect configurations, Public and Private:

More Info on Direct Connect here:
AWS Direct Connect by Camil Samaha

A key point to note about Direct Connect is that it supports multiple VIFs per 1Gbps or 10Gbps link:

If you are not a giant enterprise and don’t need this kind of bandwidth, you can buy single VIFs from your preferred network provider, but you will pay for it on a per-VIF basis, and as such Direct Connect access to multiple VPCs and to public endpoints will bump up your costs a bit.

The question therefore becomes, what is the cost-effective and simple solution to access service endpoints (such as S3 in the examples below), when you also want to access your private resources in your own VPCs?

The answer is not always straightforward if you are on a tight budget.

Accessing S3 via your Direct Connect

As I understand it, the S3 endpoint acts very much like VPC peering, only it is from your VPC to S3, and is therefore subject to similar restrictions. Specifically, the S3 endpoint documentation has a very key statement:

“Endpoint connections cannot be extended out of a VPC. Resources on the other side of a VPN connection, a VPC peering connection, an AWS Direct Connect connection, or a ClassicLink connection in your VPC cannot use the endpoint to communicate with resources in the endpoint service.”

Basically this means for every VPC you want to communicate with directly from your MPLS, you need another VIF, and hence another connection from your service provider. If you want to access S3 services and other AWS public endpoints directly, you will also need an additional connection dedicated to that. This assumes your requirements are not enough to justify buying a 1Gbps / 10Gbps pipe for your sole use, and that you are using a partner to deliver it. If you can buy 1Gbps or above then you can subdivide your pipe into multiple VIFs for little / no extra cost.
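The arithmetic above is simple, but it helps when budgeting. Here is a small sketch, assuming the partner-delivered model described above, where each connection carries a single VIF. The function name and parameters are my own invention for illustration.

```python
def partner_connections_needed(vpc_count, need_public_endpoints):
    """Count single-VIF partner connections for a given design.

    One private VIF per VPC reached directly from the MPLS, plus one
    public VIF if S3 and other public endpoints must also be reached
    over Direct Connect.
    """
    if vpc_count < 0:
        raise ValueError("vpc_count cannot be negative")
    return vpc_count + (1 if need_public_endpoints else 0)


if __name__ == "__main__":
    # Three VPCs plus direct access to S3 and other public endpoints.
    print(partner_connections_needed(3, True))
```

If you own a full 1Gbps or 10Gbps link instead, the same count tells you how many VIFs to carve out of the one pipe rather than how many connections to buy.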

Here are four example / potential solutions for different use cases, but they are definitely NOT all recommended or supported.

  • Assuming you are using a Private VIF, then by default, the content in S3 is actually accessed over the internet (e.g. using HTTPS if your bucket is configured as such):
    This may come as a surprise to people, as you would expect to buy a connection and access any AWS service.
  • If you have a Direct Connect from your MPLS into Amazon as a Public connection / VIF you can then route to the content over your Direct Connect, however this means you are bypassing your VPC and going straight into Amazon.
    This is a bit like having a private internet connection, so accessing VPCs etc securely would still require you to run an IPsec VPN over the top of your “public” connection. This will work fine and will mean you can maximise the utilisation of the bandwidth on your Direct Connect, and reduce your Direct Connect costs by sharing one between all VPCs. This is OK, but frankly not brilliant as you are ultimately still depending on VPNs to secure your data. If you want very secure, private access to your VPCs, you should really just spend the money! 🙂
  • If you have a Direct Connect from your MPLS into Amazon as a Private connection / VIF, you could proxy the connectivity to S3 via an EC2 instance. The content is requested by your instance using the standard S3 API and forwarded back to your clients. This means your EC2 instance is now a bottleneck to your S3 storage, and if you want to avoid it becoming a SPoF, you need at least a couple of them.
    It is worth specifically noting that although technically possible, this method would be strictly against all support and recommendations from AWS! S3 Endpoints and VPC peers are for accessing content from your VPCs, they are NOT meant to be transitive.
  • Lastly, Amazon’s primary recommended method is to run multiple VIFs, mixing both public and private. The biggest downside here is that each VIF will likely have a specific amount of bandwidth associated with it and you will have to procure multiple connections from your provider (unless you are big enough to need to buy a minimum of 1 Gbps!).
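The four options above boil down to a simple decision about which path your S3 traffic takes from the MPLS. The sketch below encodes that decision as I understand it from the descriptions above. The function and its return strings are purely illustrative and are not anything AWS provides.

```python
def s3_path(has_private_vif, has_public_vif, uses_ec2_proxy=False):
    """Return the path S3 traffic from the MPLS takes for a given design."""
    if has_public_vif:
        # Options 2 and 4: public endpoints are routed over Direct Connect.
        return "direct connect (public VIF)"
    if has_private_vif and uses_ec2_proxy:
        # Option 3: technically possible, but against AWS recommendations.
        return "direct connect (private VIF) via EC2 proxy (unsupported)"
    # Option 1: a private VIF alone does not carry S3 traffic.
    return "internet"


if __name__ == "__main__":
    print(s3_path(has_private_vif=True, has_public_vif=False))
    print(s3_path(has_private_vif=True, has_public_vif=True))
```

The first case is the one that tends to surprise people: buying a Private VIF on its own changes nothing about how you reach S3.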

As this scales to many accounts, many VPCs and many VIFs, things also start to get a bit complex when it comes to routing (especially if you want many or all of the VPCs in question to be able to route to each other), and I will cover that in the next post.

Until then…

Find more posts in this series here:
http://www.tekhead.org/tag/awsgotchas/

Amazon AWS Tips and Gotchas – Part 5 – Managing Multiple VPCs
