Tag Archive for architecture

Amazon AWS Tips and Gotchas – Part 5 – Managing Multiple VPCs

Continuing in this series of blog posts taking a bit of a “warts and all” view of a few Amazon AWS features, below are a handful more tips and gotchas when designing and implementing solutions on Amazon AWS, based around VPCs and VPC design.

For the first post in this series with a bit of background on where it all originated from, see here:
Amazon #AWS Tips and Gotchas – Part 1

For more posts in this series, see here:
Index of AWS Tips and Gotchas

AWS Tips and Gotchas – Part 5

11. Managing Multiple VPCs & Accounts

Following on from the previous post, let us assume that instead of just talking about public service endpoints (e.g. S3, Glacier, etc.), we are now talking about environments with multiple VPCs, possibly multiple accounts, and potentially Direct Connect added on top.


Why would you do this? Well, there are numerous reasons for logically separating things such as your dev/test and production environments, from both a security and a compliance perspective. The question people sometimes get hung up on is why you would want more than one account at all. As it happens, some AWS customers run many tens or even hundreds of accounts! Here are a few examples:

  • The simplest answer to this is so that you can avoid being “CodeSpaced” by keeping copies of your data / backups in a second account with separate credentials!
  • Separation of applications which have no direct interaction, or perhaps minimal dependencies, to improve security.
  • Running separate applications for different business units in their own accounts to make for easier LoB billing.
  • Allowing different development teams to securely work on their own applications without risking impact to any other applications or data.
  • With the mergers and acquisitions growth strategy which many companies adopt, it is fairly common these days for companies to be picked up and bring their AWS accounts and resources with them.
  • Lastly, a very common design pattern for compliance is to use a separate account to gather all of your CloudTrail and other audit logs in a single account, inaccessible to anyone except your security team, and therefore secure from tampering.
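As an aside on that last bullet, the dedicated audit account pattern usually boils down to a log bucket in the audit account with a policy granting the CloudTrail service write access to a per-account prefix. The snippet below is a rough boto3 sketch of that idea; the bucket name and account IDs are placeholders, so adjust to suit.

```python
import json

import boto3

# Hypothetical bucket in the dedicated audit account, plus placeholder IDs for
# the accounts that will ship their CloudTrail logs into it.
LOG_BUCKET = "example-org-audit-logs"
SOURCE_ACCOUNTS = ["111111111111", "222222222222"]

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AWSCloudTrailAclCheck",
            "Effect": "Allow",
            "Principal": {"Service": "cloudtrail.amazonaws.com"},
            "Action": "s3:GetBucketAcl",
            "Resource": f"arn:aws:s3:::{LOG_BUCKET}",
        },
        {
            "Sid": "AWSCloudTrailWrite",
            "Effect": "Allow",
            "Principal": {"Service": "cloudtrail.amazonaws.com"},
            "Action": "s3:PutObject",
            # One prefix per source account keeps each account's logs separate.
            "Resource": [f"arn:aws:s3:::{LOG_BUCKET}/AWSLogs/{acct}/*" for acct in SOURCE_ACCOUNTS],
            "Condition": {"StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}},
        },
    ],
}

# Applied from within the audit account, so nobody outside it can tamper with the logs.
boto3.client("s3").put_bucket_policy(Bucket=LOG_BUCKET, Policy=json.dumps(policy))
```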

The great thing is that with consolidated billing, you can have as many accounts as you like whilst still receiving a single monthly bill for your organisation!

We will now look at a few examples of ways to hang your VPCs and accounts together; for the purposes of this post, you can effectively consider the two to be interchangeable in the majority of cases.

Scenario A – Lots of Random VPC Peering and a Services VPC

This option is OK for small solutions but definitely does NOT scale, and is also against best practice recommendations from AWS. As mentioned in the previous section, transitive peering is not possible unless you are somehow proxying the connections, so if you are looking to add Direct Connect to this configuration, it simply isn’t going to fly.

Imagine that all of the blue dotted arrows in the following diagram were VPC peering connections! Aaaaargh!

[Diagram: multiple VPCs with a peering connection between every pair]
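Each of those arrows is a peering connection that has to be requested, accepted on the other side, and then routed in both directions. Purely as an illustration (not a recommendation!), here is roughly what a single peering looks like with boto3; all of the VPC, route table and CIDR values are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# Request a peering connection from VPC A to VPC B (IDs are placeholders).
# Add PeerOwnerId="<account-id>" if the peer VPC lives in another account.
pcx = ec2.create_vpc_peering_connection(
    VpcId="vpc-0aaa1111",
    PeerVpcId="vpc-0bbb2222",
)["VpcPeeringConnection"]["VpcPeeringConnectionId"]

# The owner of the peer VPC has to explicitly accept the request...
ec2.accept_vpc_peering_connection(VpcPeeringConnectionId=pcx)

# ...and BOTH sides then need routes adding to every relevant route table.
ec2.create_route(RouteTableId="rtb-0aaa1111",
                 DestinationCidrBlock="10.2.0.0/16",   # VPC B's CIDR
                 VpcPeeringConnectionId=pcx)
ec2.create_route(RouteTableId="rtb-0bbb2222",
                 DestinationCidrBlock="10.1.0.0/16",   # VPC A's CIDR
                 VpcPeeringConnectionId=pcx)
```

Now multiply that by every pair of VPCs in the diagram above and you can see why it gets old very quickly.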

Scenario B – Bastion Server in Services VPC

If each of your VPCs is independent, and you only need to manage them remotely (i.e. you are not passing significant traffic between many different VPCs, or from AWS to your MPLS), then a services VPC with a bastion server may be a reasonable option (hub and spoke):

http://docs.aws.amazon.com/AmazonVPC/latest/PeeringGuide/peering-configurations-full-access.html

In this example, you could push a Direct Connect VIF into VPC A and, via your bastion server, manage servers in each of your other VPCs. This would NOT be appropriate if servers or clients on premises needed to access those resources directly, however; it is better suited to the scenario where each VPC hosts some form of internet-facing production or dev/test platform, and this is effectively your management connection in via the back door.

You might also potentially aggregate all of your security logs etc into the bastion VPC.

[Diagram: hub-and-spoke design with a bastion server in a central services VPC, peered to each of the other VPCs]

Scenario C – Full Mesh

This is like a neater version of Scenario A. Holy moly! Can you imagine trying to manage, support or troubleshoot this?

[Diagram: a full mesh of VPC peering connections]

Even something as simple as managing your subnets and route tables would become a living, breathing nightmare! Then what happens every time you want to add another VPC? shudder

If you require this level of inter-VPC communication, then my first question would be why you are splitting the workloads across so many interdependent VPCs, and where the business benefit in doing so lies. Better to look at rationalising your architecture than to try to maintain something like this.
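To put some rough numbers on the management overhead, here is a quick back-of-the-envelope calculation (assuming, purely for illustration, a handful of route tables per VPC) showing how quickly a full mesh gets out of hand:

```python
# Back-of-the-envelope view of full-mesh overhead: n*(n-1)/2 peering connections,
# plus a route to every other VPC's CIDR in every route table you maintain.
def mesh_overhead(vpcs: int, route_tables_per_vpc: int = 3):
    peerings = vpcs * (vpcs - 1) // 2
    route_entries = vpcs * (vpcs - 1) * route_tables_per_vpc
    return peerings, route_entries

for n in (5, 10, 20, 50):
    peerings, routes = mesh_overhead(n)
    print(f"{n:>3} VPCs: {peerings:>5} peerings, {routes:>6} route entries")

# 50 VPCs: 1225 peerings and 7350 route entries to keep consistent. No thank you!
```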

Scenario D – Lollipop Routing

If you absolutely must allow every VPC to talk to most or even every other VPC, and the quantity of VPCs is significant, then it may be worthwhile looking at something more scalable and easier to manage.

This one is more scalable from a management perspective, but if I am honest, I am not massively keen on it! It feels a bit like AWS absolving themselves of all responsibility when it comes to designing and supporting more complex network configurations. It could potentially also work out rather expensive as you could end up needing a fairly hefty amount of Direct Connect bandwidth to support the potential quantity of traffic at this scale, as well as adding a load of unnecessary latency.

I would prefer that AWS simply allowed some form of auto configured mesh with a simple tag/label assigned to each VPC to allow traffic to route automatically. If only such a technology existed or could be used as a design template!?! (sarcasm mode off – MPLS anyone?)

I am confident that, at the rate AWS are developing new services, automation of VPC peering won’t be miles off, as suggested by the word “presently” in the following slide from an AWS presentation available on SlideShare from last July (2015):

[Slide: AWS presentation noting that transitive routing between peered VPCs is not “presently” supported]

In the meantime, we are left with something that looks a bit like this:

[Diagram: “lollipop” routing – each VPC attached via its own VIF, with inter-VPC traffic hairpinning through the customer network over Direct Connect]

When reaching this kind of scale, there are also a few limitations you want to be aware of:

[Image: AWS VPC and Direct Connect service limits]

And Finally… NOTE: Direct Connect is per-Region

When you procure a Direct Connect, you are not procuring a connection to “AWS”; you are procuring a connection to a specific region. If you want to be connected to multiple AWS regions, you will need to procure connections to each region individually.

To an extent I can see that this makes some logical sense: if AWS allowed access through one region to the others, and the region you connect to suffered a major issue, you could end up losing access to all regions.

What would be good, though, would be the ability to connect to two regions, which would then provide you with region-resilient access to the entire AWS network of regions. Whether this will become a reality remains to be seen, but I have heard rumblings that there may be some movement on this in the future.

Wrapping Things Up

As you can see, getting your VPC peering and Direct Connect working appropriately, especially at scale, is a bit of a minefield.

I would suggest that if you are seriously looking at using Direct Connect and need some guidance, you could do worse than have a chat with your ISP, MSP or hosting provider of choice. They can help you to work out the solution which best fits your business’s requirements!

Find more posts in this series here:
Index of AWS Tips and Gotchas

Further Reading

Here are links to a few resources used in the writing of this post, worthwhile reading if you want to understand the subject more thoroughly:

Amazon AWS Tips and Gotchas – Part 6 – AWS Dedicated VPCs

7 Reasons Why You Should Read The Phoenix Project

The Phoenix Project

I began reading The Phoenix Project with no preconceptions, other than having been told that it is a great book, and hearing it mentioned many times on Eric Wright‘s GC On Demand podcast.

Written by Gene Kim, Kevin Behr, and George Spafford, it is told as a first-person narrative from the perspective of Bill, a midrange technology manager who is promoted into a senior IT management role for a business in jeopardy. Through his experiences, and with a guiding hand from another key character, we work through the problems facing the business, the IT department and the individuals within.

The story is told in an easy to read, informal style, and I made quick work of it over the course of just a few days. I really enjoyed it on numerous levels:

  1. I recognised every single character in the book as somebody I have worked with (or indeed currently work with!). I guarantee you will feel the same!
  2. The book was pretty well written, and the story arc itself was compelling. I was really rooting for Bill to succeed in his endeavours! (But did he? You will have to read the book to find out!)
  3. The authors obviously have a great sense of humour! Quotes such as “Show me a dev who isn’t crashing production systems, and I’ll show you one who can’t fog a mirror. Or more likely, is on vacation.” had me laughing out loud on the train in front of other passengers!
  4. The book is approachable and not elitist. You could pick it up as a cable monkey or an IT director (or maybe even a Sales person!!!), and relate to the concepts and methods described.
  5. I learned a huge amount about different methods for handling and improving processes around WIP (Work in Progress), such as the Theory of Constraints or the use of Kanban boards (I am currently testing this with my pre-sales customer workloads using Trello, but I’m told Kanbanize is also very good). Resilience Engineering (think Netflix Simian Army) and numerous other techniques are also covered, along with the overarching “Three Ways” (very Zen!).
  6. I actually picked up a few key tips which could be applied directly to my pre-sales design and requirements gathering workshops with my customer stakeholders.
  7. Finally, it didn’t feel “preachy”, which is always a risk when trying to sell an idea / concept as your main theme, and I was initially concerned that the book would be ramming DevOps culture down my neck throughout. This could not be further from the truth, and the full DevOps concepts do not come into play until the story is almost complete. There are many lessons to be learned throughout the story which could be applied to any organisation!

[Image: The Phoenix Project cover]

Here are another few choice quotes from The Phoenix Project, both humorous and insightful:

“The only thing more dangerous than a developer is a developer conspiring with Security. The two working together gives us means, motive, and opportunity.”

“How can we manage production if we don’t know what the demand, priorities, status of work in process, and resource availability are?”

“You just described ‘technical debt’ that is not being paid down. It comes from taking shortcuts, which may make sense in the short-term. But like financial debt, the compounding interest costs grow over time. If an organization doesn’t pay down its technical debt, every calorie in the organization can be spent just paying interest, in the form of unplanned work.”

“On the other hand, if a resource is ninety percent busy, the wait time is ‘ninety percent divided by ten percent’, or nine hours. In other words, our task would wait in queue nine times longer than if the resource were fifty percent idle.”
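That last quote describes a simple queueing rule of thumb: wait time scales with the ratio of busy time to idle time. Purely as an illustration of the point being made, here it is in a few lines of Python:

```python
# The rule of thumb from the quote above: relative wait time = busy% / idle%.
def wait_ratio(utilisation: float) -> float:
    return utilisation / (1.0 - utilisation)

for busy in (0.50, 0.80, 0.90, 0.95, 0.99):
    print(f"{busy:.0%} busy -> wait ratio {wait_ratio(busy):.0f}")

# 90% busy -> wait ratio 9, the "nine hours" in the quote (vs. 1 at 50% busy).
```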

In case you feel I haven’t been positive enough about The Phoenix Project yet, I would say that this book should be provided as mandatory training to every person working in every IT department today, from the guys plugging in cables to the CIO!

If you do read and enjoy the book, I highly recommend also reading The Goal by Eliyahu M. Goldratt. I was a little surprised, to say the least, to find that it is a very similar story, following a similar arc and featuring some almost identical characters to The Phoenix Project. That said, I am halfway through it at the moment and still thoroughly enjoying it, though I am not too worried about missing the movie version!

[Image: The Goal by Eliyahu M. Goldratt cover]

The Goal delves even deeper into the Theory of Constraints and explains some of the tools we can use to mitigate, bypass or remove constraints in a system. All of these tools and methods can be applied as easily to IT as they can to production lines, which (without stating the bleeding obvious) is exactly the point of The Phoenix Project!

Anyway, if you want to do yourself a favour, both in terms of your career development and in terms of enjoying a really compelling story and a thoroughly decent book, you could do a lot worse than spending £5 on the Kindle edition of The Phoenix Project!

Where To Get Them

For anything technical, I like to buy ebooks these days, both for portability and because I won’t be chopping down trees needlessly. Both of the above titles are available very inexpensively on Kindle.

And Finally…

Sincerest apologies for one of the most click bait-y blog titles I’ve ever posted! Even worse than this one. Honestly, I feel ashamed!

I’ll get my coat…

Amazon AWS Tips and Gotchas – Part 4 – Direct Connect & Public / Private VIFs

Continuing in this series of blog posts taking a bit of a “warts and all” view of a few Amazon AWS features, below are a handful more tips and gotchas when designing and implementing solutions on Amazon AWS, specific to Direct Connect.

For the first post in this series with a bit of background on where it all originated from, see here:
Amazon #AWS Tips and Gotchas – Part 1

For more posts in this series, see here:
Index of AWS Tips and Gotchas

Tips and Gotchas – Part 4
10. VPC Private / Public Access Considerations

If you have gone out and bought a shiny new Direct Connect to your AWS platform, you might reasonably assume that all of the users and applications on your MPLS will automatically start using this for accessing S3 content and other AWS endpoints. Unfortunately, this is not so simple!

At a high level, here is a diagram showing the two primary Direct Connect configurations, Public and Private:

[Diagram: AWS Direct Connect public and private VIF configurations]

More info on Direct Connect here:
AWS Direct Connect by Camil Samaha

A key point to note about Direct Connect is that it supports multiple VIFs per 1Gbps or 10Gbps link:

If you are not a giant enterprise and don’t need this kind of bandwidth, you can buy single VIFs from your preferred network provider, but you will pay for them on a per-VIF basis; as such, Direct Connect access to multiple VPCs plus the public endpoints will bump up your costs a bit.

The question therefore becomes, what is the cost-effective and simple solution to access service endpoints (such as S3 in the examples below), when you also want to access your private resources in your own VPCs?

The answer is not always straightforward if you are on a tight budget.

Accessing S3 via your Direct Connect

As I understand it, the S3 endpoint acts very much like VPC peering, only it is from your VPC to S3, and is therefore subject to similar restrictions. Specifically, the S3 endpoint documentation has a very key statement:

“Endpoint connections cannot be extended out of a VPC. Resources on the other side of a VPN connection, a VPC peering connection, an AWS Direct Connect connection, or a ClassicLink connection in your VPC cannot use the endpoint to communicate with resources in the endpoint service.”

Basically this means that for every VPC you want to communicate with directly from your MPLS, you need another VIF, and hence another connection from your service provider. If you want to access S3 and other AWS public endpoints directly, you will also need an additional connection dedicated to that. This assumes your requirements are not enough to justify buying a 1Gbps / 10Gbps pipe for your sole use, and that you are using a partner to deliver the connectivity. If you can buy 1Gbps or above, then you can subdivide your pipe into multiple VIFs for little / no extra cost.
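For completeness, creating the S3 endpoint itself is trivial; it is the restriction quoted above (no access from outside the VPC) that causes the headache. A rough boto3 sketch, with placeholder VPC and route table IDs and an assumed region of eu-west-1:

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# Attach a gateway endpoint for S3 to the route tables of the subnets that need
# it (IDs are placeholders). Traffic to S3 from inside the VPC then stays on the
# AWS network - but, per the quote above, only from inside the VPC.
ec2.create_vpc_endpoint(
    VpcId="vpc-0aaa1111",
    ServiceName="com.amazonaws.eu-west-1.s3",
    RouteTableIds=["rtb-0aaa1111", "rtb-0ccc3333"],
)
```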

Here are four potential solutions for different use cases, though they are definitely NOT all recommended or supported.

  • Assuming you are using a Private VIF, then by default the content in S3 is actually accessed over the internet (e.g. using HTTPS, if your bucket is configured as such).
    This may come as a surprise to people, as you would expect to buy a connection and be able to access any AWS service over it.
  • If you have a Direct Connect from your MPLS into Amazon as a Public connection / VIF, you can then route to the content over your Direct Connect; however, this means you are bypassing your VPC and going straight into Amazon.
    This is a bit like having a private internet connection, so accessing your VPCs securely would still require you to run an IPsec VPN over the top of your “public” connection. This will work fine, lets you maximise the utilisation of the bandwidth on your Direct Connect, and reduces your costs by sharing one connection between all VPCs. It is OK, but frankly not brilliant, as you are ultimately still depending on VPNs to secure your data. If you want very secure, private access to your VPCs, you should really just spend the money! 🙂
  • If you have a Direct Connect from your MPLS into Amazon as a Private connection / VIF, you could proxy the connectivity to S3 via an EC2 instance. The content is requested by your instance using the standard S3 API and forwarded back to your clients. This means your EC2 instance is now a bottleneck to your S3 storage, and if you want to avoid it becoming a SPoF, you need at least a couple of them.
    It is worth specifically noting that although technically possible, this method would be strictly against all support and recommendations from AWS! S3 endpoints and VPC peers are for accessing content from your VPCs; they are NOT meant to be transitive.
  • Lastly, Amazon’s primary recommended method is to run multiple VIFs, mixing both public and private. The biggest downside here is that each VIF will likely have a specific amount of bandwidth associated with it, and you will have to procure multiple connections from your provider (unless you are big enough to need to buy a minimum of 1Gbps!). A rough sketch of creating both VIF types via the API follows this list.
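As promised above, here is a hedged boto3 sketch of provisioning one private and one public VIF on an existing Direct Connect connection. The connection ID, VLANs, ASN, addressing and prefixes are all placeholders; in reality your provider and AWS will dictate most of these values.

```python
import boto3

dx = boto3.client("directconnect", region_name="eu-west-1")

# Private VIF: lands on a Virtual Private Gateway attached to a single VPC.
dx.create_private_virtual_interface(
    connectionId="dxcon-example1",               # placeholder connection ID
    newPrivateVirtualInterface={
        "virtualInterfaceName": "vpc-a-private-vif",
        "vlan": 101,
        "asn": 65000,                            # your BGP ASN
        "virtualGatewayId": "vgw-0aaa1111",      # VGW attached to the target VPC
    },
)

# Public VIF: advertises AWS public endpoints (S3 et al) to you over BGP instead.
dx.create_public_virtual_interface(
    connectionId="dxcon-example1",
    newPublicVirtualInterface={
        "virtualInterfaceName": "public-endpoints-vif",
        "vlan": 102,
        "asn": 65000,
        "amazonAddress": "198.51.100.1/30",      # example addressing only
        "customerAddress": "198.51.100.2/30",
        "routeFilterPrefixes": [{"cidr": "203.0.113.0/24"}],  # your public prefixes
    },
)
```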

As this scales to many accounts, many VPCs and many VIFs, things also start to get a bit complex when it comes to routing (especially if you want many or all of the VPCs in question to be able to route to each other), and I will cover that in the next post.

Until then…

[Diagram: AWS Direct Connect VIF networking]

Find more posts in this series here:
http://www.tekhead.org/tag/awsgotchas/

Amazon AWS Tips and Gotchas – Part 5 – Managing Multiple VPCs

Secondary can be just as important as Primary

There can be little doubt these days that the future of the storage industry for primary transactional workloads is All Flash. Finito, that ship has sailed, the door is closed, the game is over, [insert your preferred analogy here].

Now I can talk about the awesomeness of All Flash until the cows come home, but the truth is that flash is not now, and may never be, as inexpensive for bulk storage as spinning rust! I say “may” as technologies like 3D NAND are changing the economics of flash systems. Either way, I think it will still be a long time before an 8TB flash device is cheaper than 8TB of spindle. This is especially true for storing content which does not dedupe or compress easily, such as the two key types of unstructured data which are exponentially driving global storage capacities through the roof year on year: images and video.

With that in mind, what do we do with all of our secondary data? It is still critical to our businesses from a durability and often availability standpoint, but it doesn’t usually have the same performance characteristics as primary storage. Typically it’s also the data which consumes the vast majority of our capacity!

[Image: AFA backups]

Accounting needs to hold onto at least 7 years of their data, nobody in the world ever really deletes emails these days (whether you realise it or not, your sysadmin is probably archiving all of yours in case you do something naughty, tut tut!), and woe betide you if you try to delete any of the old marketing content which has been filling up your arrays for years! A number of my customers are also seeing this data growing at exponential rates, often far exceeding business forecasts.

Looking at the secondary storage market from my personal perspective, I would probably break it down into a few broad groups of requirements:

  • Lower performance “primary” data
  • Dev/test data
  • Backup and archive data

As planning for capacity is becoming harder, and business needs are changing almost by the day, I am definitely leaning more towards scale-out solutions for all three of these use cases nowadays. Upfront costs are reduced and I have the ability to pay as I grow, whilst increasing performance linearly with capacity. To me, this is key for any secondary storage platform.

One of the vendors we visited at SFD8, Cohesity, actually targets all of these workload types with their solution, and I believe they are a prime example of where the non-AFA part of the storage industry will move in the long term.

The company came out of stealth last summer and was founded by Mohit Aron, a rather clever chap with a background in distributed file systems. Part of the team who wrote the Google File System, he went on to co-found Nutanix as well, so his CV doesn’t read too bad at all!

Their scale-out solution utilises the now ubiquitous 2u, 4-node rack appliance physical model, with 96TB of HDDs and a quite reasonable 6TB of SSD, for which you can expect to pay an all-in price of about $80-100k after discount. It can all be managed via the console, or a REST API.

[Image: Cohesity CS2000 Series appliance]

2u or not 2u? That is the question…

That stuff is all a bit blah blah blah though of course! What really interested me is that Cohesity aim to make their platform infinitely and incrementally scalable; quite a bold vision and statement indeed! They do some very clever work around distributing data across their system, whilst achieving a shared-nothing architecture with a strongly consistent (as opposed to eventually consistent), 2-phase commit file system. Performance is achieved by first caching data on the SSD tier, then de-staging this sequentially to HDD.
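For anyone unfamiliar with the term, two-phase commit is the classic way to make a write atomic and strongly consistent across multiple nodes: every node is asked to prepare and vote before anything becomes visible, and a single “no” rolls the whole thing back. The toy sketch below is purely a generic illustration of the protocol, and is in no way Cohesity’s actual implementation:

```python
from typing import Protocol


class Participant(Protocol):
    """Anything that can vote on, commit, or abort a transaction (a node, a disk...)."""
    def prepare(self, txn_id: str) -> bool: ...
    def commit(self, txn_id: str) -> None: ...
    def abort(self, txn_id: str) -> None: ...


def two_phase_commit(txn_id: str, participants: list) -> bool:
    # Phase 1: every participant durably prepares the write and votes yes/no.
    votes = [p.prepare(txn_id) for p in participants]

    if all(votes):
        # Phase 2a: unanimous yes - make the write visible everywhere.
        for p in participants:
            p.commit(txn_id)
        return True

    # Phase 2b: any no vote - roll back everywhere, so a reader can never
    # observe a half-applied write. This is what buys strong consistency.
    for p in participants:
        p.abort(txn_id)
    return False
```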

I suspect that being infinitely scalable will be difficult to achieve, if only because you will almost certainly end up bottlenecking at the networking tier (cue boos and jeers from my wet-string-loving colleagues). In reality most customers don’t need infinite scale, as this just creates one massive fault domain. Perhaps a better aim would be to scale massively, but cluster into large pods (perhaps by layer 2 domain) and intelligently spread or replicate data across these fault domains for customers with extreme durability requirements?

Lastly, they have a load of built-in data protection features in the initial release, including instant restore and file-level restore, the latter achieved by cracking open VMDKs for you and extracting the data you need. More mature features, such as SQL or Exchange object-level integration, will come later.

Cohesity Architecture

As you might have guessed, Cohesity’s initial release appeared to be just that: an early release with a reasonable number of features on day one. Not yet the polished article, but plenty of potential! They have already begun to build on this with the second release of their OASIS software (Open Architecture for Scalable Intelligent Storage), and I am pleased to say that next week we get to go back and visit Cohesity at Storage Field Day 9 to discuss all of the new bells and whistles!

Watch this space! 🙂

To catch the presentations from Cohesity at SFD8, you can find them here:
http://techfieldday.com/companies/cohesity/

Further Reading
I would say that more than any other session at SFD8, the Cohesity session generated quite a bit of debate and interest among the guys. Check out some of their posts here:

Disclaimer/Disclosure: My flights, accommodation, meals, etc, at Storage Field Day 8 were provided by Tech Field Day, but there was no expectation or request for me to write about any of the vendors products or services and I was not compensated in any way for my time at the event.
