
Load balancing is an integral component of any service-oriented architecture (SOA). When deploying microservices, which can be thought of as a specialized subset of SOA, the need is even more apparent.

Let’s say I have a service called Tim’s Petstore running locally and exposed at /Petstore on this domain. It becomes popular. It turns out that thousands of people dig the way I purvey pets.

The first problem is that my server response times slow because my lone web application can’t handle the traffic. My second problem becomes clear when I try to implement quick hacks to squeeze as much performance as I can out of the service. To deploy changes into production, I have to temporarily stop running my app. This necessitates downtime, however brief. Then, when there are connectivity issues specific to one server, the fact that my app is a single point of failure becomes apparent, and I’m left scrambling to bolster uptime. Some of my customers, steadfast in their desire to buy pets but flaky in their allegiance to my website, move to my competitor’s petstore at http://gdpotter.com. His pets are inferior, to be sure, but his site is reliable!

These three points form the basis of the argument for why high availability is always necessary. Whether it is a small side project, a fledgling business, or a lesser-known service deployed by a large company, you can often get away without highly available, load-balanced services until, one day, your world comes crashing down. The implications, obviously, scale with the business criticality of a service, but you would not believe how many products and websites became popular simply because the competitors that formerly overshadowed them failed miserably when it came to availability.

There are more, many more, reasons why load balancing is great. But when it comes to implementing software-based load balancers (the term is somewhat of a misnomer, and mainly refers to manually configured open source solutions), you encounter a quandary. If deploying a load balancer solves the single point of failure problem, what’s to stop that load balancer from becoming a single point of failure itself?

Thus the question: do you need to load balance your load balancers? What is the omega, biblically speaking, in this equation? In other words: where is it safe to stop, what additional technologies are required, and why is it so?

High availability is relative to each component in your architecture. When people say that their service is highly reliable, they might just be referring to running HAProxy in front of the server-side application. On some of the apps I’ve worked on, it’s sufficient to stop there. The application itself is no longer a single point of failure and so can maintain uptime through change windows and some software bugs by having at least one redundant instance. The weak link then becomes HAProxy, but it’s much less likely to fail because it rarely (read: never) changes, unlike the software it balances.
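To make the idea concrete, here is a minimal sketch of what a balancer in front of redundant app instances does: rotate through the backends and skip any that are marked down. This is not how HAProxy is implemented or configured. The class name, method names, and backend addresses are all made up for illustration.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in rotation, skipping any marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._rotation = cycle(self.backends)

    def mark_down(self, backend):
        # Called when a health check fails or an instance is stopped for a deploy.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Only known backends can be brought back into rotation.
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


# Hypothetical addresses for two copies of the same app.
balancer = RoundRobinBalancer(["app-1:8080", "app-2:8080"])
balancer.mark_down("app-1:8080")  # e.g. stopped during a change window
picks = [balancer.next_backend() for _ in range(3)]
```

With app-1 marked down, every pick lands on app-2, which is the whole point: one instance can be taken out for a deploy while the other keeps serving traffic.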

Databases can be highly available, too. Databases are generally less likely to fail than software, but an outage, or a loss of data, can potentially be worse than the application being unavailable. For relational databases where data throughput, accessibility, and integrity are paramount, using something like a MySQL cluster is often the best solution. For others with a higher (but not in any way lax) tolerance for failure, basic master/slave replication works. For those who care even less, a single MySQL database with frequent backups is enough to get the job done. If you’re lucky enough to use a NoSQL database, cluster scaling is typically a natural extension of how the technology works and needs little extra machinery.

Another important point is that redundant instances running on the same server have their availability bound by that server. More generally, redundancy is bounded by whatever the instances share: the server, switch, network, or even geographical region. This, along with optimal data transfer routes, is why companies with consumer-facing apps spread their data centers across the globe.
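A bit of arithmetic shows why the shared dependency matters. The sketch below assumes failures are independent, which real outages often are not, and the availability figures are invented for illustration rather than measured from anything.

```python
def parallel_availability(instance_availability, copies):
    """Probability that at least one of `copies` independent instances is up."""
    return 1 - (1 - instance_availability) ** copies


def behind_shared_dependency(shared_availability, instance_availability, copies):
    """Redundant instances only help while the thing they all share is up."""
    return shared_availability * parallel_availability(instance_availability, copies)


# Illustrative numbers only: an app instance that is up 99% of the time,
# running on servers that are up 99.5% of the time.
APP = 0.99
SERVER = 0.995

# Two app instances on ONE server: capped by that server (about 0.9949).
same_server = behind_shared_dependency(SERVER, APP, copies=2)

# One app instance on each of TWO servers: each server+app pair is a unit,
# and the pairs are redundant with each other (about 0.9998).
separate_servers = parallel_availability(SERVER * APP, copies=2)
```

Under these assumptions, two instances on one server can never beat the server's own 99.5%, while spreading the same two instances across two servers gets close to 99.98%.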

It turns out that solving the SPOF load balancer problem is easy and only takes one additional technology. First of all, to have highly available load balancers, you do indeed need another load balancer. Then you have to set up keepalived. Keepalived has a configuration file that lives on each server that runs a load balancer. The keepalived instances check on each other at a regular interval (using VRRP) and move a shared virtual IP address to a healthy load balancer when the active one fails. As a result, you are insulated from the failure of any single load balancer. If you implement every one of these steps, your application will be highly available in the purest sense.
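The failover decision can be sketched in a few lines. Keepalived typically does this by floating a shared virtual IP between load balancer hosts using VRRP, where the healthy host with the highest priority holds the address. The code below is only a toy simulation of that election rule. It is not keepalived's configuration format or API, and the node names and priorities are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoadBalancerNode:
    name: str
    priority: int          # higher wins, in the spirit of VRRP priorities
    healthy: bool = True   # result of the most recent health check


def elect_vip_holder(nodes):
    """Return the healthy node with the highest priority, or None if all are down."""
    candidates = [node for node in nodes if node.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda node: node.priority)


# Two hypothetical load balancer hosts sharing one virtual IP.
lb1 = LoadBalancerNode("lb1", priority=150)
lb2 = LoadBalancerNode("lb2", priority=100)
pair = [lb1, lb2]

holder_before = elect_vip_holder(pair).name   # lb1 holds the virtual IP

lb1.healthy = False                           # lb1 fails its health check
holder_after = elect_vip_holder(pair).name    # the virtual IP moves to lb2
```

Clients only ever talk to the virtual IP, so they do not need to know which physical load balancer is answering at any given moment.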

In my next post on this topic, I will show how easy it is to set up a highly available, bulletproof microservice using Docker.

Load Balancing Load Balancers