Home
The load balancing service dynamically manages the list of machines behind a given DNS alias. Presenting several nodes behind a single name allows services to scale and improves availability. It is one of the technologies that enable the deployment of large-scale applications on cloud resources.
The Domain Name System (DNS) is a naming system for computers and other resources, and it is essential to the Internet. It is an Internet-standard protocol that allows names to be used instead of IP addresses. Load balancing is an advanced function that DNS can provide: requests are distributed across several machines that run the same service and share the same DNS name.
The LBD DNS Load Balancer was developed as a cost-effective way to handle applications that accept the DNS timing constraints and do not require affinity (also known as persistence or sticky sessions). It is currently (March 2019) used by 761 services on the site, with two small VMs acting as LBD master and slave. Each alias member node runs a Simple Network Management Protocol (SNMP) agent configured to communicate with the lbclient program. The purpose is to provide a load metric that the LBD server uses to determine which subset of the alias members has its IP addresses presented. Here is an overview of the service:
The LBD server periodically retrieves a load metric from the lbclient on each alias member node using SNMP. It uses this information to update the A (IPv4) and AAAA (IPv6) records of the delegated DNS zone that corresponds to the alias, using Dynamic DNS (see RFC 2136). The polling period ("polling_interval") is 5 minutes by default.
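The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the LBD implementation: the function name `select_members`, the `best_hosts` parameter, and the rule that a lower positive metric means a less loaded node are all assumptions made for the example.

```python
def select_members(metrics: dict[str, int], best_hosts: int = 2) -> list[str]:
    """Return up to best_hosts available nodes, least loaded first.

    metrics maps a node name to the load metric reported by its lbclient.
    Assumption for this sketch: a lower positive value means a less loaded node.
    """
    # A metric of 0 or below means the node is not available.
    available = {host: load for host, load in metrics.items() if load > 0}
    # Sort by load, then by name so that ties are resolved deterministically.
    ranked = sorted(available, key=lambda host: (available[host], host))
    return ranked[:best_hosts]


if __name__ == "__main__":
    # Hypothetical node names and metric values.
    polled = {"node-a": 10, "node-b": 0, "node-c": 5, "node-d": -1}
    print(select_members(polled))
```

The addresses of the returned nodes are the ones that would then be written to the A and AAAA records of the alias.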
The LBD slave behaves like the LBD master, i.e. it periodically retrieves a load metric from the alias member nodes. However, it only updates the delegated DNS zone when it loses contact with the LBD master. It detects this by trying to fetch a "heartbeat" file from a web server on the LBD master.
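The slave's decision can be illustrated with a small sketch. It is written under stated assumptions rather than taken from the LBD code: the heartbeat file is assumed to contain a Unix timestamp, and the names `slave_should_update`, `fetch_heartbeat` and `max_age_seconds` are hypothetical.

```python
import time
from typing import Callable, Optional


def slave_should_update(
    fetch_heartbeat: Callable[[], str],
    max_age_seconds: float = 600.0,
    now: Optional[float] = None,
) -> bool:
    """Decide whether the slave should take over the DNS updates.

    fetch_heartbeat is any callable that returns the heartbeat file content
    and raises OSError when the master's web server cannot be reached.
    Assumption for this sketch: the file holds a Unix timestamp.
    """
    now = time.time() if now is None else now
    try:
        heartbeat_time = float(fetch_heartbeat().strip())
    except (OSError, ValueError):
        # Unreachable master or unreadable heartbeat: treat as lost contact.
        return True
    # A heartbeat that is too old is also treated as lost contact.
    return now - heartbeat_time > max_age_seconds
```

Passing the fetch function in as a parameter keeps the decision logic independent of the HTTP client in use, so it can be exercised without a running master.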
The lbclient provides a built-in load metric. Alternative load metrics can be configured by combining several Collectd metrics and constants. Health monitoring checks can also be configured so that an alias member is taken out of the alias when a certain condition is triggered. A typical example is a check of the Roger state, so that the node is taken out when the appstate is not 'production'. In addition to the built-in checks, you can configure further ones using Collectd metrics, or use the return code of an arbitrary program (or script) as a check. If the node is in a working state, the load metric is an integer greater than 0. A load metric of 0 or lower means that the machine is not available.
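How checks and metrics might combine into a single number is sketched below. This is not the lbclient's actual formula or configuration syntax: the weighted sum, the value -1 for a failed check, and the names `run_check` and `load_metric` are assumptions chosen for illustration.

```python
import subprocess
import sys


def run_check(command: list[str]) -> bool:
    """Run an arbitrary program as a check; return code 0 means it passed."""
    return subprocess.run(command, capture_output=True).returncode == 0


def load_metric(
    check_results: list[bool],
    weighted_values: list[tuple[float, float]],
    constant: float = 0,
) -> int:
    """Combine check results and metric values into one integer.

    weighted_values holds (weight, value) pairs, for example values read
    from Collectd. Assumption for this sketch: any failed check yields -1.
    """
    if not all(check_results):
        return -1  # Not available: the node should leave the alias.
    load = constant + sum(weight * value for weight, value in weighted_values)
    # A working node must always report an integer greater than 0.
    return max(1, round(load))


if __name__ == "__main__":
    # Hypothetical check: a script that exits with a non-zero return code.
    passed = run_check([sys.executable, "-c", "raise SystemExit(3)"])
    print(load_metric([passed], [(10, 0.5), (1, 3)], constant=2))
```

Clamping the result to at least 1 keeps a healthy but idle node distinguishable from an unavailable one.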
We have produced an LBaaS (Load Balancing as a Service) interface with a self-service GUI to facilitate alias creation and management.
The next section describes how DNS load-balanced aliases are defined and how to use them.