Load Balancing

  • Distributes incoming requests across multiple servers so that no single server is overloaded
  • Examples:
    • HAProxy
    • NGINX
  • Types:
    • Layer 4: (Transport)
      • Uses: source and destination IP addresses and ports in the packet header
    • Layer 7: (Application)
      • Uses: contents of the request headers, message body, and cookies
  • Advantages:
    • Preventing requests from going to unhealthy servers
    • Preventing overloading resources
    • Helping eliminate single points of failure
    • SSL termination: requests are decrypted and responses are encrypted at the load balancer, not on each server
    • Hence there is no need to install an X.509 certificate on each server
  • Disadvantages:
    • The load balancer itself can become a single point of failure unless it is made redundant
    • Added complexity in implementation
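
The difference between the two types can be sketched in a few lines. This is only an illustration: the `Request` class, the `/static/` path rule, and the `session` cookie name are hypothetical, and CRC32 is used just as a convenient stable hash. Layer 4 decides from addresses and ports alone, while Layer 7 can look inside the request.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Request:
    # Hypothetical request model: the first four fields are visible at Layer 4,
    # the last two only at Layer 7.
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    path: str = "/"
    cookies: dict = field(default_factory=dict)


def pick_layer4(req, servers):
    """Choose a server using only addresses and ports."""
    key = f"{req.src_ip}:{req.src_port}->{req.dst_ip}:{req.dst_port}".encode()
    return servers[zlib.crc32(key) % len(servers)]


def pick_layer7(req, pools):
    """Choose a server by inspecting the request content (assumed rules)."""
    if req.path.startswith("/static/"):
        return pools["static"][0]
    if "session" in req.cookies:
        # Keep one session on one application server.
        pool = pools["app"]
        return pool[zlib.crc32(req.cookies["session"].encode()) % len(pool)]
    return pools["app"][0]
```

For example, `pick_layer7` sends a request for `/static/logo.png` to the static pool, which a Layer 4 balancer cannot do because it never reads the path.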

Health Check

  • The load balancer periodically checks the health of each server
  • If a server fails the health check, it is removed from the pool
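
A minimal sketch of one health-check round, under some assumptions not stated in the notes: `probe` is a hypothetical callable that returns True for a healthy server, a server is removed only after `max_failures` consecutive failed checks, and a recovered server is put back into the pool.

```python
def run_health_checks(servers, probe, pool, failures, max_failures=3):
    """Run one round of checks and update `pool` (a set) in place.

    `failures` maps each server to its count of consecutive failed checks.
    """
    for server in servers:
        if probe(server):
            failures[server] = 0
            pool.add(server)          # assumed: healthy servers rejoin the pool
        else:
            failures[server] = failures.get(server, 0) + 1
            if failures[server] >= max_failures:
                pool.discard(server)  # unhealthy: stop routing to it
```

A real load balancer would call this on a timer; setting `max_failures=1` removes a server on its first failed check.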

Load Balancing Algorithms

  • Static methods
    • Round Robin method
      • Suitable when all servers have equal specs
      • Suitable when there are few persistent connections
    • Weighted Round Robin method
      • Each server is assigned a weight based on its specs; servers with a higher weight receive more requests
    • IP Hash
      • The server is chosen by hashing the IP address of the request, so the same client reaches the same server
  • Dynamic methods
    • Least connection method
      • Picks the server with the fewest active connections
      • Suitable when there is a large number of persistent connections
    • Least response time method
      • Picks the server with the fewest active connections and the lowest response time
    • Least bandwidth method
      • Picks the server currently serving the least traffic, measured in Mbps
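
The methods above can each be sketched as a small selection function. These are simplified illustrations, not how HAProxy or NGINX implement them. The function names are made up for this example, the weighted variant simply repeats each server `weight` times per cycle, and in `least_response_time` the tie-break order (connections first, then response time) is an assumption.

```python
import itertools
import zlib


def round_robin(servers):
    """Static: hand out servers in a fixed rotation."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Static: each server appears `weight` times per cycle."""
    expanded = [s for s, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def ip_hash(client_ip, servers):
    """Static: the same client IP always maps to the same server."""
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]


def least_connections(active):
    """Dynamic: `active` maps server -> current active connections."""
    return min(active, key=active.get)


def least_response_time(active, avg_ms):
    """Dynamic: fewest connections, ties broken by lowest response time."""
    return min(active, key=lambda s: (active[s], avg_ms[s]))


def least_bandwidth(mbps):
    """Dynamic: `mbps` maps server -> traffic currently served in Mbps."""
    return min(mbps, key=mbps.get)
```

The static functions need no runtime state about the servers, while the dynamic ones depend on counters that the load balancer must keep up to date.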

Load Balancer Fault Tolerance

  • A second load balancer can be added to form a cluster
  • Each LB monitors the health of the other
  • When one fails, the other takes over
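
One way to picture the takeover is an active/standby pair driven by heartbeats. This is a sketch under assumptions: the names `lb1` and `lb2`, the `timeout_s` threshold, and the rule that a recovered node stays on standby are all illustrative choices, and timestamps are passed in explicitly to keep the example deterministic.

```python
class FailoverPair:
    """Two load balancers; one is active, the other is on standby."""

    def __init__(self, timeout_s=3.0):
        self.timeout_s = timeout_s
        self.active = "lb1"
        self.standby = "lb2"
        self.last_seen = {"lb1": 0.0, "lb2": 0.0}

    def heartbeat(self, name, now):
        """Record that `name` was seen alive at time `now` (seconds)."""
        self.last_seen[name] = now

    def check(self, now):
        """Swap roles if the active LB is silent and the standby is alive."""
        active_dead = now - self.last_seen[self.active] > self.timeout_s
        standby_alive = now - self.last_seen[self.standby] <= self.timeout_s
        if active_dead and standby_alive:
            self.active, self.standby = self.standby, self.active
        return self.active
```

Calling `check` regularly returns the load balancer that should currently receive traffic.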

Load Balancers in Cloud

Consistent Hashing

  • A way of evenly distributing load across cache servers, such as those in a CDN
  • Each request ID must be mapped to one of the servers
  • The request ID generally encapsulates the user ID, so it is unlikely to change for a given user
  • This lets us cache data for a particular user on a particular server
  • For the cache to stay useful, a request must keep hitting the same server even when servers are added; consistent hashing keeps most of this mapping stable as the number of servers changes
  • It does not work well for databases
  • Uses
    • Distributed Caches
    • Load Balancers
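
To see why a plain `hash % N` mapping is not enough, the sketch below counts how many request IDs change server when N grows from 4 to 5. The helper names and the `user-<i>` IDs are made up for illustration, and SHA-256 stands in for whatever hash function is actually used.

```python
import hashlib


def h(key):
    """Stable integer hash of a string key."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


def moved_fraction(keys, n_before, n_after):
    """Fraction of keys whose server changes under simple modulo hashing."""
    moved = sum(1 for k in keys if h(k) % n_before != h(k) % n_after)
    return moved / len(keys)


keys = [f"user-{i}" for i in range(10_000)]
print(moved_fraction(keys, 4, 5))
```

With a uniform hash, a key keeps its server only when `h % 4 == h % 5`, which holds for 4 of every 20 hash values, so roughly 80% of the keys move and their cached data is lost.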

Implement consistent hashing

  • Choose a ring size M and a hash function h()
  • Create a circular ring of points [0, 1, ..., M-1]
  • Map each server to the point h(server_id) % M on the ring
  • For each request, compute h(request_id) % M and mark that point on the ring
    • Go clockwise from that point to find the nearest h(server_id) % M
    • The request is served by that nearest clockwise server
  • With N servers, each server should ideally receive about 1/N of the load
  • Adding or removing a server should change the assignment only for requests near that server on the ring, not for the rest
  • In practice the server points are rarely evenly spaced, so we may end up with a skewed distribution
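
The steps above can be sketched with a sorted list and binary search. This is a minimal illustration, not a production implementation: the class name `HashRing`, the default M of 2**32, and the use of SHA-256 for h() are assumptions, and each server gets exactly one point on the ring as in the description.

```python
import bisect
import hashlib


class HashRing:
    def __init__(self, servers=(), m=2**32):
        self.m = m
        self._points = []  # sorted server positions on the ring
        self._owner = {}   # position -> server_id
        for server_id in servers:
            self.add(server_id)

    def _h(self, key):
        """h(key) % M, using SHA-256 as the assumed hash function."""
        return int(hashlib.sha256(key.encode()).hexdigest(), 16) % self.m

    def add(self, server_id):
        point = self._h(server_id)
        if point in self._owner:
            raise ValueError(f"ring position collision for {server_id!r}")
        bisect.insort(self._points, point)
        self._owner[point] = server_id

    def remove(self, server_id):
        point = self._h(server_id)
        self._points.remove(point)
        del self._owner[point]

    def lookup(self, request_id):
        """Return the nearest server clockwise from the request's point."""
        if not self._points:
            raise LookupError("no servers on the ring")
        point = self._h(request_id)
        # First server position >= point; wrap to the start past the end.
        i = bisect.bisect_left(self._points, point) % len(self._points)
        return self._owner[self._points[i]]
```

After `ring.add("s4")` on a three-server ring, the only request IDs whose `lookup` result changes are those now served by `s4`. Counting `lookup` results per server over many IDs also shows how uneven the shares can be with one point per server.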