Tekhead.it

Downtime sucks! Designing Highly Available Applications on a Budget

Downtime sucks.

I write this whilst sitting in an airport lounge, having been disembarked from my plane due to a technical fault. I don’t really begrudge the airline in question; it was a plumbing issue! This is a physical failure of the aircraft in question and just one of those things (unless I find out later they didn’t do the appropriate preventative maintenance of course)! Sometimes failures just happen and I would far rather it was just a plumbing issue, not an engine issue!

What is not excusable, however, is if the downtime is easily preventable; for example, if you are designing a solution which has no resilience!

This is obviously more common with small and medium sized businesses, but even large organisations can be guilty of it! I have had many conversations in the past with companies who have architected their solutions with significant single points of failure. More often than not, this is due to the cost of providing an HA stack. I fully appreciate that most IT departments are not swimming in cash, but there are many ways to work around a budgetary constraint and still provide more highly available, or at least “Disaster Resistant”, solutions, especially in the cloud!

Now obviously there is High Availability (typically within a single region or Data Centre), and Disaster Recovery (across DCs or regions). An ideal solution would achieve both, but for many organisations it can be a choice between one and the other!

Budgets are tight, what can we do?

Typically HA can be provided at either the application level (preferred), or if not, then at the infrastructure level. Many solutions to improve availability are relatively simple and inexpensive. For example:

What if my app doesn’t like load balancers?

If you have an application which cannot be load balanced, you probably shouldn’t be thinking about running it in the cloud (not if you have any serious availability requirements anyway!). It amazes me how many business critical applications and services are still running on single servers all over the world!

Finally, make sure that whatever happens there is some form of DR, even if it is no more than a holding page or application notification and a replica or off-site backup of critical data! Customers and users would rather see something telling them that you’re working to resolve the problem than get a spinning wheel and a timeout! If you can provide something with limited functionality or performance, it’s better than nothing!
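To make the "better than nothing" idea a little more concrete, here is a minimal sketch in Python of that fallback order: try the full service, then anything with limited functionality, and only then serve a static holding page. Everything in it is hypothetical and purely for illustration (the `handle_request` function, the `primary` and `degraded_replica` stand-ins and the `HOLDING_PAGE` text); in real life the backends would be calls to your actual application tiers, and this logic would more likely live in your load balancer, CDN or DNS failover configuration than in your own code.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical static holding page. In practice this could be a single
# HTML file hosted somewhere cheap and separate from the main stack.
HOLDING_PAGE = "<h1>Sorry, we are having problems</h1><p>We are working to fix it.</p>"


def handle_request(backends: Sequence[Callable[[], str]]) -> Tuple[int, str]:
    """Try each backend in order of preference.

    Returns (200, body) from the first backend that answers, or
    (503, HOLDING_PAGE) if every backend fails.
    """
    for call_backend in backends:
        try:
            return 200, call_backend()
        except Exception:
            # This backend is unavailable; move on to the next best option.
            continue
    return 503, HOLDING_PAGE


def primary() -> str:
    # Stand-in for the full application, simulated here as being down.
    raise ConnectionError("primary site is down")


def degraded_replica() -> str:
    # Stand-in for a limited-functionality replica (e.g. read-only).
    return "read-only catalogue"


if __name__ == "__main__":
    print(handle_request([primary, degraded_replica]))
    print(handle_request([primary]))
```

The point of the ordering is that users only ever see the holding page when there is genuinely nothing better left to offer, and even then they get an explanation rather than a timeout.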

TL;DR: High Availability on a Budget

There are a million and one ways to provide more highly available applications; these are just a few. The point is that providing highly available applications is not as expensive as you might initially think.

With a bit of elbow grease, a bit of scripting and regular testing, even on the smallest budgets you can cobble together more highly available solutions for even the crummiest applications! 🙂

Now go forth and HA!
