Load balancing is a critical tool for managing the distribution of work across a pool of servers. It is most often used for resiliency, so that your services keep serving when a server fails, without users being impacted or, in some cases, even knowing there was a failure in the first place. It also means that when the load on one server becomes too high, the excess can be distributed to the others.

The pool is typically made up of virtual or physical servers. If one member of the pool fails while requests are being directed to it, those requests are sent to the remaining servers instead. Spreading requests this way also helps mitigate bottlenecks caused by resource usage on any single server.
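
The failover behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer is implemented: the Server class, the healthy flag, the route function, and the server names are all hypothetical, and a real balancer would set the health flag from periodic health checks rather than by hand.

```python
from dataclasses import dataclass


@dataclass
class Server:
    # Hypothetical pool member; "healthy" stands in for a real health check.
    name: str
    healthy: bool = True


def route(request_id: int, pool: list[Server]) -> str:
    """Pick a healthy server for the request, skipping failed ones."""
    healthy = [s for s in pool if s.healthy]
    if not healthy:
        raise RuntimeError("no healthy servers in the pool")
    # Spread requests across whichever servers are still healthy.
    return healthy[request_id % len(healthy)].name


pool = [Server("app-1"), Server("app-2"), Server("app-3")]
pool[1].healthy = False  # simulate app-2 failing

# Requests that would have reached app-2 now go to the remaining servers.
assignments = [route(i, pool) for i in range(4)]
print(assignments)
```

With app-2 marked as failed, every request lands on app-1 or app-3, which is the "users never notice" property the pool is meant to provide.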

Load balancing is crucial in cloud environments and in any server setup where high usage is expected.

Common Types of Load Balancing Methods

  • Round Robin – This is among the most common load balancing techniques because of its simplicity. Traffic is directed to the next server in the list, and that server then moves to the back of the rotation, so each server receives an equal share of requests in turn. This method works best when the servers have similar compute capacity.
  • Least Connections – This method sends new traffic to the server with the fewest active connections. It works well because it helps even out the load across servers with varying workloads and computational power.
  • Least Response Time – This method tracks how long each node takes on average to process its traffic and sends new requests to the fastest responders. As a result, it uses the more powerful nodes in the pool as often as it can.
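
To make the three methods concrete, here is a small Python sketch of each selection rule. It is an illustration under stated assumptions rather than a production implementation: the Node class, its active_connections and avg_response_ms fields, and the sample numbers are all made up for the example. The least response time rule here uses only the average response time, as described above; some real balancers may also factor in connection counts.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical pool member with the metrics each method looks at.
    name: str
    active_connections: int = 0
    avg_response_ms: float = 0.0


def round_robin(pool):
    """Return an iterator that hands out nodes in a fixed rotation."""
    return itertools.cycle(pool)


def least_connections(pool):
    """Return the node currently holding the fewest active connections."""
    return min(pool, key=lambda n: n.active_connections)


def least_response_time(pool):
    """Return the node with the lowest average response time."""
    return min(pool, key=lambda n: n.avg_response_ms)


# Made-up sample metrics for three nodes.
pool = [
    Node("a", active_connections=12, avg_response_ms=40.0),
    Node("b", active_connections=3, avg_response_ms=95.0),
    Node("c", active_connections=7, avg_response_ms=22.0),
]

rotation = round_robin(pool)
rr_picks = [next(rotation).name for _ in range(5)]  # rotation wraps around
lc_pick = least_connections(pool).name
lrt_pick = least_response_time(pool).name
print(rr_picks, lc_pick, lrt_pick)
```

Note that the three rules can disagree on the same pool: round robin ignores the metrics entirely, least connections favors the idle node, and least response time favors the fastest one.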