Archive for Cloud

Docker – State of the Nation (aka Observations of a Brit)


It may surprise you to learn that Docker is actually quite old now (at least in Startup terms!), having released the first version of their very cool software in March 2013!

Throughout that time, Docker (the company) have moved at a fairly rapid pace in terms of feature and bug releases, with an average of a point release about every quarter and minor releases every month (or more)!

Whilst sitting here awaiting my flight to VMworld Europe 2017, where there are MANY MANY MANY (MANY!) sessions on Docker, Photon, Kubernetes, etc. on the session schedule, I am prompted to consider Docker’s rise to popularity, and to finish off a post I began a few months back after Tech Field Day 12!

Well come on Galbraith… get on with it then!

My experience in the UK IT industry over the last (nearly) 15 years has taught me a few things, one of which is that whenever a new technology begins serious adoption in the US, it usually becomes popular in the UK within 2-3 years. That said, this number has been squeezed down a little in the past few years as companies move towards more agile development and deployment methods. Fail fast is becoming the mantra of many more organisations, though some people I speak to still wake up with night sweats at even the thought!

The first time a customer asked me about Docker in the UK was over 3 years ago, yet in all that time, people I talk to outside of the social media bubble many of us live in have been virtually silent about it; that is, until now. Docker is becoming a weekly conversation topic with a lot of the organisations I talk to, with many people wanting to jump on board the bandwagon. The switch from an operating system-centric view of the world to a more application- and service-oriented (or should that be microservice-oriented?) view of the world is becoming far more prevalent in my experience.

Drivers to Docker Adoption

So what is it about this Docker stuff which seems to be catching the attention of people I talk to? A few common themes I hear are:

Automation of code deployment pipelines (CI/CD) to increase business agility
I think this is probably the number one driver to Docker adoption for people I talk to. Automation of CI/CD pipelines has become so common now, it is almost becoming the norm. Yes, it is tricky to do this with more traditional applications, but it certainly isn’t impossible. Using containers as the delivery mechanism for your application provides very consistent and repeatable outcomes. I mean, you can even get Oracle DB in a container now?!?!
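
To make that a little more concrete, here is a very rough sketch of what a build-and-smoke-test step in such a pipeline could look like, using the Docker SDK for Python. The image tag, build path and port mapping are purely illustrative placeholders rather than anything from a real pipeline.

```python
# Rough sketch of a CI pipeline step: build an image from the checked-out repo
# and run a throwaway container as a quick smoke test.
# The image tag, build path and port mapping are placeholders.
import docker

client = docker.from_env()

# Build the image from the Dockerfile in the current directory
image, build_logs = client.images.build(path=".", tag="myapp:build-123")

# Run a disposable container and map the app port for a quick check
container = client.containers.run(
    "myapp:build-123",
    detach=True,
    ports={"8080/tcp": 8080},
)

try:
    container.reload()
    print("Smoke-test container state:", container.status)
finally:
    container.stop()
    container.remove()
```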

That said, once you dockerise your applications there are many further challenges you will run into, including something as simple as how to apply your current security tooling, policies and procedures to these new environments.

Maturity of the platform
The Docker code base and third party ecosystem has finally reached a point of maturity where many of the networking and storage issues of the past are beginning to reduce to within acceptable risk boundaries.

Improved cross-industry support
Following this maturing of the platform, a swathe of vendors have put their names behind the Docker ecosystem; from VMware to OpenStack, AWS to Azure, Google to Cloud Foundry, everyone is getting on board! You no longer have to buy support direct from Docker (the company), but can instead get it from your cloud vendor, along with a managed orchestration tier too, such as Docker Swarm, Kubernetes or Mesos!

Because Cloud
Yes, you can Dockerise your existing applications for use on premises, but many organisations I speak to are using Docker as a method to allow their developers to write code on premises, test in their dev environments on prem or in the cloud, then deploy in a consistent fashion to their brand spanking new production cloud platforms. PaaS solutions such as Azure WebApps and AWS Elastic Beanstalk are becoming a good option for customers who just want to write code, but for those who want that little bit more control, Docker gives them flexibility and consistency.

CIO/CTO CV Padding
I hate to play the cynic, but I think there is definitely a significant percentage of CIOs/CTOs who are doing “digital transformations through containerisation and cloud” specifically to pad out their CVs and help them get a better gig.

This is otherwise known as a “Resume-driven IT Strategy”!

I am aware of one CIO who deliberately went to a cloud platform, even though it was significantly more expensive than a traditional managed hosting solution of a similar spec, and despite their business case and steady workload drawing few, if any, discernible benefits from the use of cloud.

When I hear people refer to technologies such as VMware vSphere as “Legacy”, it really drives home to me the shift we are all going through, yet again, in the industry. This, though, is another reason CIOs/CTOs/Heads of IT tell me they want cloud and containers. That said, I still struggle to find a single person who doesn’t have at least one physical server in their infrastructure, so just like the mainframe before it, I don’t think the hypervisor is going away any time yet!

The Tekhead Take

As expected, the couple of years’ lag between US and UK adoption of containers was apparent, but now is most definitely the time! For reasons both positive and negative, Docker has become part of the information technology zeitgeist in the UK…

Want to Know More?

I was fortunate enough to meet with the product team from Docker at Tech Field Day 12 towards the end of last year. It was a really interesting session which covered many of the enterprise networking and security features recently introduced to the platform, along with Docker’s new support offerings. I highly recommend checking it out!

Docker Presents at Tech Field Day 12

Some of the other TFD12 delegates had their own thoughts on the session and Docker as a whole. You can find them here:

Disclaimer/Disclosure: My flights, accommodation, meals, etc, at Tech Field Day 12 were provided by Tech Field Day / Gestalt IT, but there was no expectation or request for me to write about any of the vendors products or services and I was not compensated in any way for my time at the event.

Does Cloud Provide Infinite Storage Capacity and Retention?


I wrote last week about the challenges of long-term retention of data, and some of the architectural considerations and decisions we take in designing long-term backup or archive solutions. The follow-up question therefore is, does the cloud provide infinite storage capacity and retention?

“Cloud Integration”

One of the key themes which I have been seeing of late with many (if not all!) modern storage solutions, is some form of cloud integration. It seems to me that many vendors are trying to ensure they can tick the “cloud integration” check box in an RFP or RFI!

I recall one time at a previous organisation, our storage team did an RFP asking for an array which was capable of doing file presentation. The response in the RFP was “Yes”, but when this was dug into a bit further (after the fact), it turned out that this was only possible with an HA pair of custom vendor file gateways. In other words, not much better than building your own file server!

Anyway, back to the point: this “RFP checkbox” mentality means that some vendors have very tight cloud integration with multiple target replication options (such as DC to DC, DC to Cloud, Cloud to DC, Cloud to Cloud, etc), whilst others provide little more than lip service to cloud integration.

The best suggestion I can make in this scenario is to push your vendor for either a demo, a PoC, or a software copy of their array, if they have one. That way you can be absolutely sure that what is claimed, is indeed what you are looking for!

One Possible Solution… EMC Unity

One solution I believe falls more and more into the cloudy camp with each code release, is the new EMC Unity arrays, for which we were provided a briefing at the recent Storage Field Day 13 event.

What I found particularly interesting was that the arrays are natively capable of up to 256 redirect-on-write snapshots per volume, which sounds like a lot, but if you take one every 5 minutes then you will run out pretty fast! By utilising the EMC Cloud Tiering Appliance (a totally separate management interface today, which I really hope EMC fully integrate into the Unity pretty quickly, as multiple panes of glass are no fun for anyone!), we can utilise any S3-compatible storage to provide UNLIMITED snapshot retention.
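
To put that 256 snapshot limit into context, a quick back-of-the-envelope calculation (mine, not EMC’s) shows just how quickly it runs out at a 5-minute interval:

```python
# Back-of-the-envelope: how long does a 256-snapshot limit last
# at a 5-minute snapshot interval?
max_snapshots = 256
interval_minutes = 5

retention_minutes = max_snapshots * interval_minutes
print(retention_minutes / 60)  # 1280 minutes, i.e. roughly 21.3 hours
```

In other words, a little over 21 hours of rolling retention before the oldest snapshots start being recycled, hence the appeal of tiering them off to cheap S3-compatible storage.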

This is pretty cool if you have to provide very granular restoration points for your application data, as well as the ability to off-site at relatively low cost in a near-infinite data storage facility!

Sadly, you can’t currently run VMs directly from those snapshots in the cloud, but bearing in mind that EMC has a software only version of the Unity already available, I have a sneaking suspicion that there will be some engineering talent working on this as we speak! This would potentially provide the ability to snap and replicate your entire estate natively to S3 buckets in the cloud, then restore very quickly locally within that IaaS platform. Let’s hope I’m right!

Want to Know More?

EMC’s sessions on Unity, Scale-IO and Isilon were recorded and are now available to stream online:

Some of the other SFD13 delegates had their own thoughts on the session and EMC in general. You can find them here:

Disclaimer/Disclosure: My flights, accommodation, meals, etc, at Storage Field Day 13 were provided by Tech Field Day / Gestalt IT, but there was no expectation or request for me to write about any of the vendors products or services and I was not compensated in any way for my time at the event.

Startup Spotlight: Re-skill, Pivot or Get Squashed


The subject matter of this post is a startup of sorts and was triggered by a conversation I had with an industry veteran a few months back. By veteran of course, I mean an old bugger! 😉

It is an entity which begins its journey sourcing a target market in the tech industry and spends day and night pursuing that market to the best of its ability.

It brings in resources to help meet the key requirements of the target market; some of those resources are costly, and others not so much.

Occasionally it hits a bump in the road with funding and potentially needs to find other sources of investment, and may go through several rounds of funding over the course of a number of years. Eventually it gets to a point where the product is of a decent quality and market value.

Then it does a market analysis and discovers that the market has shifted, and if the entity does not pivot or indeed re-skill, it will become irrelevant within a few short years.

Eh?

I am of course talking about the career of an IT professional.

Though I may be slightly exaggerating about becoming irrelevant quite so fast, we certainly all made the choice to follow a career in one of the fastest-moving industries on the planet. We have no choice but to continue to develop and maintain our knowledge in order to keep driving our careers forward.

As a self-confessed virtual server hugger with a penchant for maintaining a pretty reasonable home lab, I enjoy understanding the detailed elements of a technology, how they interact, and acknowledging where the potential pitfalls are. The cloud, however, is largely obfuscated in this respect; to the point where many cloud companies will not even divulge the location of their data centres, never mind the equipment inside them and configuration thereof!

Obfuscation

That said, those of you with a keen eye may have noticed a shift in my twitter stream in the past year or so, with subjects tending towards a more public cloudy outlook… Talking to a huge range of customers in various verticals on a regular basis, it feels to me that a great many organisations are right on the tipping point between their current on-premises / dedicated managed services deployment models, and full public cloud adoption (or at the very least hybrid!).

It’s hard to believe that companies like AWS have actually been living and breathing public cloud for over ten years already; that’s almost as long as my entire career! In that time they have grown from niche players selling a bit of object storage, to the Behemoth-aaS they are today. To a greater or lesser extent (and for better or worse!), they are now the yardstick upon which many cloud and non-cloud services are measured. This is also particularly the case when it comes to cost, much to the chagrin of many across the industry!

To me, this feels like the optimum time for engineers and architects across our industry (most definitely including myself) to fully embrace public and hybrid cloud design patterns. My development has pivoted predominantly towards technologies which are either native to, or which support public cloud solutions. Between family commitments, work, etc, we have precious little time to spend in personal development, so we need to spend it where we think we will get the most ROI!


So what have I been doing?

Instead of messing about with my vSphere lab of an evening, I have spent recent months working towards certified status in AWS, Azure, and soon, GCP. This has really been an eye opener for me around the possibilities of designs which can be achieved on the current public cloud platforms; never mind the huge quantity of features these players are likely to release in the coming 12 months, or the many more after that.

Don’t get me wrong, of course, everything is not perfect in the land of milk and honey! I have learned as much in these past months about workloads and solutions which are NOT appropriate for the public cloud, as I have about solutions which are! Indeed, I have recently produced a series of posts covering some of the more interesting AWS gotchas, and some potential workarounds for them. I will be following up with something similar for Azure in the coming months.

Taking AWS as an example, something which strikes me is that many of the services are not 100% perfect and don’t have every feature and nerd knob under the sun available. Most seem to have been designed to meet the 80/20 rule and are generally good enough to meet the majority of design requirements more than adequately. If you want to meet a corner use case or a very specific requirement, then maybe you need to go beyond native public cloud tooling.

Perhaps the same could be said about the mythical Full Stack Engineer?

Good Enough

Anyhow, that’s enough rambling from me… By no means does this kind of pivot imply that everything we as infrastructure folks have learned to date has been wasted. Indeed I personally have no intention to drop “on premises” skills and stop designing managed dedicated solutions. For the foreseeable future there will likely be a huge number of appropriate use cases, but in many, if not most cases I am being engaged to look at new solutions with a publicly cloudy mindset!

Indeed, as Ed put it this time last year:

Downtime sucks! Designing Highly Available Applications on a Budget


Downtime sucks.

I write this whilst sitting in an airport lounge, having been disembarked from my plane due to a technical fault. I don’t really begrudge the airline in question; it was a plumbing issue! That is a physical failure of the aircraft and just one of those things (unless I find out later they didn’t do the appropriate preventative maintenance, of course)! Sometimes failures just happen, and I would far rather it was a plumbing issue than an engine issue!

What is not excusable, however, is if the downtime is easily preventable; for example, if you are designing a solution which has no resilience!

This is obviously more common with small and medium sized businesses, but even large organisations can be guilty of it! I have had many conversations in the past with companies who have architected their solutions with significant single points of failure. More often than not, this is due to the cost of providing an HA stack. I fully appreciate that most IT departments are not swimming in cash, but there are many ways to work around a budgetary constraint and still provide more highly available, or at least “Disaster Resistant”, solutions, especially in the cloud!

Now obviously there is High Availability (typically within a single region or Data Centre), and Disaster Recovery (across DCs or regions). An ideal solution would achieve both, but for many organisations it can be a choice between one and the other!

Budgets are tight, what can we do?

Typically HA can be provided at either the application level (preferred) or, if not, at the infrastructure level. Many solutions to improve availability are relatively simple and inexpensive. For example:

  • Building on a public cloud platform (and assuming that the application supports load balancing), why not test running twice as many instances with half the specification each? In most cases, unless there are significant storage quantities in each instance, the cost of scaling out this way is minimal.
    If there is a single instance, split it out into two instances, immediately doubling your availability. If there are two instances, what about splitting into 4? The impact of a node loss is then only 25% of the overall throughput capacity for the application, and can even bring down the cost of HA for applications where the +1 in N+1 is expensive!
  • Again in cloud, if there are more than two availability zones in a region (e.g. on AWS), then take advantage of them! If an application can handle 2 AZs, then the latency of adding a third shouldn’t make much, if any, difference, and costs will only increase slightly with a small amount of extra inter-AZ bandwidth or per-AZ services (e.g. NAT gateways).
    Again, in this scenario the loss of an AZ will only take out 33% of the application servers, not 50%, so it is possible to reduce the number of servers which are effectively there for failover only.
  • If you can’t afford to run an application as multi-AZ or multi-node, consider putting it in an auto-scaling group or scale-set with a minimum and maximum of 1 server. That way, if an outage occurs or, in the case of AWS, an entire AZ goes down, an instance will automatically be regenerated in an alternative AZ (see the sketch below).
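
As a rough illustration of that last point, the sketch below (using boto3, with placeholder subnet IDs and a pre-created launch configuration, so treat it as indicative only) creates a “self-healing single instance” by pinning an Auto Scaling group to a minimum and maximum of one, spread across multiple AZs.

```python
# Rough sketch (boto3): a "self-healing single instance" - an Auto Scaling
# group with min/max/desired of 1, spread across subnets in multiple AZs,
# so a failed instance (or a failed AZ) gets replaced automatically.
# The launch configuration name and subnet IDs below are placeholders.
import boto3

autoscaling = boto3.client("autoscaling", region_name="eu-west-1")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="single-instance-app",
    LaunchConfigurationName="single-instance-app-lc",  # created beforehand
    MinSize=1,
    MaxSize=1,
    DesiredCapacity=1,
    # Subnets in different AZs - if one AZ dies, the replacement
    # instance is launched in another.
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",
    HealthCheckType="EC2",
    HealthCheckGracePeriod=300,
)
```
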
What if my app doesn’t like load balancers?

If you have an application which cannot be load balanced, you probably shouldn’t be thinking about running it in the cloud (not if you have any serious availability requirements anyway!). It amazes me how many business critical applications and services are still running in single servers all over the world!

  • If your organisation is dead set on using cloud for a SPoF app, then making it as ephemeral as possible can help. Start by splitting the DBs from the apps, as these can almost always be made HA by some means (e.g. master/slave replication, mirroring, log shipping, etc). Failover nodes also often don’t attract a license fee from many vendors (e.g. MS SQL), so always check your license documentation to see what you can achieve on the cheap.
  • Automate! If you can deploy application server(s) from a script (see the sketch after this list), even if the worst happens, the application can be redeployed very quickly, in a consistent fashion.
    The trend at the moment is moving towards a more agile deployment process and automated CI/CD pipelines. This enables companies to recover from an outage by rebuilding their environments and redeploying code rapidly (as long as they have a replica of the data or a highly available datastore!).
  • If it’s not possible to script or image the code deployment, then taking regular backups (and snapshots where possible) of application servers, and testing them often is an option! If you don’t want to go through the inflexible, unreliable and painful nightmare of doing system state restores, then take image-based backups (supported by the vast majority of backup vendors nowadays). Perhaps even syncing of application data to a warm standby server which can be brought online reasonably swiftly, or even use an inexpensive DR service such as Azure Site Recovery, to provide an avenue of last resort!
  • If maybe cloud isn’t the best place to locate your application, then provide HA at the infrastructure layer by utilising the HA features of your favourite hypervisor!
    For example, VMware vSphere will have an instance back up and running within a minute or two of the failure of a host using the vSphere HA feature (which comes with every edition except Essentials!). On the assumption/risk that the power cycle does not corrupt OS, applications or data, you minimise exposure to hardware outages.
  • If the budget is not enough to buy shared storage and all VMs are running on local storage in the hypervisor hosts (I have seen this more than you might imagine!), then consider using something like vSphere Replication or Hyper-V Replicas to copy at least one of each critical VM role to another host, and if there are multiple instances, then spread them around the hosts.
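
To illustrate the “Automate!” point above, here is one possible sketch (boto3 again, with a placeholder AMI, key pair, security group and bootstrap script) of redeploying an application server from code; run the same script twice and you get the same server twice.

```python
# Rough sketch (boto3): rebuild an application server from a script, so a
# lost server can be recreated quickly and consistently.
# The AMI ID, instance type, key pair, security group and bootstrap
# commands are all placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# Bootstrap script passed as user data - installs and starts the app on boot
user_data = """#!/bin/bash
yum install -y myapp
systemctl enable --now myapp
"""

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="t2.small",
    MinCount=1,
    MaxCount=1,
    KeyName="ops-keypair",
    SecurityGroupIds=["sg-0123456789abcdef0"],
    UserData=user_data,
)

print("Replacement instance:", response["Instances"][0]["InstanceId"])
```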

Finally, make sure that, whatever happens, there is some form of DR, even if it is no more than a holding page or application notification and a replica or off-site backup of critical data! Customers and users would rather see something telling them that you’re working to resolve the problem than a spinning wheel and a timeout! Even something with limited functionality or performance is better than nothing!
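
On the holding page front, one inexpensive option is a static page on object storage. This is just an indicative sketch with a placeholder bucket name, not a full solution; you would still need a public-read bucket policy and some form of DNS failover pointing at it.

```python
# Rough sketch (boto3): a cheap holding page hosted as an S3 static website,
# ready to be switched to via DNS failover if the main site dies.
# The bucket name is a placeholder and must be globally unique.
# (A bucket policy allowing public read access is also needed - omitted here.)
import boto3

s3 = boto3.client("s3", region_name="eu-west-1")
bucket = "example-holding-page-bucket"

s3.create_bucket(
    Bucket=bucket,
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
)

# Upload a minimal "we'll be right back" page
s3.put_object(
    Bucket=bucket,
    Key="index.html",
    Body=b"<html><body><h1>We'll be back shortly!</h1></body></html>",
    ContentType="text/html",
)

# Serve the same page for every request, including errors
s3.put_bucket_website(
    Bucket=bucket,
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "index.html"},
    },
)
```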

TL;DR: High Availability on a Budget

There are a million and one ways to provide more highly available applications; these are just a few. The point is that providing highly available applications is not as expensive as you might initially think.

With a bit of elbow grease, a bit of scripting and regular testing, even on the smallest budgets you can cobble together more highly available solutions for even the crummiest applications! 🙂

Now go forth and HA!
