If you’ve spent much time around data centers, you’re likely familiar with load balancing. If you haven’t, it’s a vital concept for understanding how to keep your infrastructure available to all of your users while also making efficient use of your computing resources. Even if you are familiar with load balancing, there are some recently released tools and related concepts worth covering that can help you share workloads across your servers — and even across the country.
Load balancing is the distribution of computer workloads across multiple resources, whether that means computers, clusters, servers, network links, or storage drives. The goal is to maximize the use of available resources, avoid overloading one particular node with more work than it can handle, add redundant components, and encourage faster response times. In other words, load balancing shares loads around in order to achieve higher performance.
These goals can be achieved with hardware or software. Load balancing is similar to another concept, channel bonding, but channel bonding combines physical network interfaces at the data link layer (layer 2 of the OSI model), while load balancing typically operates at the network layer (layer 3) and above.
For a common use scenario, imagine we are delivering a web-based application from virtualized cloud servers. This application includes a database, a website, and an FTP service. With software load balancing, a program monitors the external network ports for incoming traffic and forwards requests to the backend servers running the workload. If a backend server does not reply, traffic is redirected and additional instances can be brought online to meet demand. Backup nodes might also be used, which are kept inactive except in the case of failure.
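The forwarding decision described above can be sketched in a few lines. This is a minimal illustration, not a real balancer: the backend addresses are hypothetical, and `is_healthy()` is a placeholder for whatever probe (a TCP connect, an HTTP request) a real load balancer would use.

```python
# Minimal sketch of a software load balancer's forwarding decision,
# with a hypothetical backend pool and a placeholder health check.
backends = ["10.0.1.10", "10.0.1.11"]   # hypothetical active backend servers
standby = ["10.0.1.20"]                 # backup node, idle unless needed

def is_healthy(server):
    # Placeholder: a real balancer would probe a port or an HTTP endpoint.
    return True

def pick_backend():
    healthy = [s for s in backends if is_healthy(s)]
    if not healthy:
        # All active backends failed: fall back to the standby pool.
        healthy = [s for s in standby if is_healthy(s)]
    if not healthy:
        raise RuntimeError("no healthy backends available")
    return healthy[0]
```

A production balancer would also remove failed servers from rotation and re-add them once health checks pass again, rather than checking on every request.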
Before cloud resources, each of these instances ran on its own physical server. With virtual servers, it became much easier to scale out the solution, even running dozens of instances on a single physical server. Load balancing virtual environments did come with its own share of problems to address, however.
There are occasions when it is important to send the same client to the same server in order to maintain that client's state within the application. The most common example is a shopping cart in e-commerce, where routing a user to a new server may cause them to lose their saved items. Load balancers can be configured with session persistence (sometimes called sticky sessions) to keep requests from a single client on the same server.
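One simple way to implement session persistence is to pin each client, identified by source IP, to a server on first contact and reuse that mapping for later requests. The sketch below assumes hypothetical server names and an in-memory session table; real balancers often use cookies or hashing instead, so the table survives restarts.

```python
# Sketch of source-IP session persistence: once a client is assigned a
# server, subsequent requests from that client go to the same server.
servers = ["app-1", "app-2", "app-3"]   # hypothetical backend names
sessions = {}                            # maps client IP -> pinned server
next_idx = 0                             # round-robin pointer for new clients

def assign(client_ip):
    global next_idx
    if client_ip not in sessions:
        # New client: pin it to the next server in rotation.
        sessions[client_ip] = servers[next_idx % len(servers)]
        next_idx += 1
    return sessions[client_ip]
```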
Fault isolation is one of those problems. If a single node or instance fails, the shared network, storage, or compute resources could be compromised, which means all the virtual machines depending on them, each with their own instances, could be compromised as well. Performance can also drop simply because the load on a single physical server has reached its limit.
Redundancy is therefore key for all infrastructure components, and you should also run your virtualized instances across different physical servers so your load balancer can shift work to a new physical host if necessary.
Finally, when configuring your load balancing rules to autoscale, you must consider how to control where and when new instances are placed in order to maximize efficiency and avoid overloading a single server or storage unit. Left to defaults, a load balancer could place each instance on its own machine, wasting resources. Or it could do the opposite, packing instances so densely that a single point of failure results in downtime.
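A placement rule that avoids both extremes can be sketched as a simple anti-affinity policy: spread new instances across hosts, but cap how many land on any one host. The host names and cap below are hypothetical, just to illustrate the idea.

```python
# Sketch of an anti-affinity placement rule: put each new instance on the
# physical host with the fewest instances, capped so no host is overloaded.
hosts = {"host-a": 0, "host-b": 0, "host-c": 0}  # hypothetical instance counts
MAX_PER_HOST = 4                                 # hypothetical density cap

def place_instance():
    host = min(hosts, key=hosts.get)  # least-loaded host first
    if hosts[host] >= MAX_PER_HOST:
        raise RuntimeError("cluster full: add capacity before scaling out")
    hosts[host] += 1
    return host
```

Spreading instances this way limits the blast radius of a single host failure while still keeping hosts reasonably full.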
One similar concept in VMware vSphere virtualization is vMotion, which allows the live migration of virtual machines between cloud resource pools and even between data centers. The VMs move with their configuration settings intact, so you don’t need to reconfigure network or storage.
vMotion in fact contains some load balancing technology itself, spreading the network traffic generated by moving VMs across different network adapters. This helps reduce the time it takes to move a VM, especially one with a large memory configuration.
There are three common load balancing methods:

Round robin: this method simply assigns incoming requests in sequential order, where request 1 goes to server 1, request 2 to server 2, and so on, cycling back to the first server.

Least connections: this configuration sends each incoming request to the server that currently has the fewest active connections, which generally corresponds to the lightest load.

IP Hash: this method hashes the IP address of incoming traffic and uses the result to choose a server, so requests from the same address consistently land on the same backend.
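The three methods above can be sketched side by side. This is a toy illustration with hypothetical server addresses, not a production scheduler; real balancers track connections as they open and close.

```python
import hashlib
from itertools import cycle

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backend pool

# Round robin: hand out servers in a repeating sequential order.
rr = cycle(servers)
def round_robin():
    return next(rr)

# Least connections: pick the server with the fewest active connections.
active = {s: 0 for s in servers}  # connection counts, updated by the balancer
def least_connections():
    return min(active, key=active.get)

# IP hash: hash the client address so the same client maps to the same server.
def ip_hash(client_ip):
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

Note the trade-off: round robin is the simplest, least connections adapts to uneven request costs, and IP hash gives a form of session persistence for free, since a given client always hashes to the same server.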
While load balancing helps keep your applications available to users and scales out additional resources automatically to meet demand, it must be set up carefully in order to avoid failure. Green House Data offers managed load balancing services for all cloud environments and can also assist in setting them up for colocated infrastructure.