Representing the side of “Everyone” is Gerald Broflovski, the lawyer from South Park who plans to make quite a commission. Representing the side of “Everyone Else” is Gerald Broflovski. So whatever the outcome, things look very bright for Kyle’s dad.
South Park, Episode 306, Panda Sexual Harassment
If you have already launched projects with multiple frontends, you probably know how to distribute traffic. Hosting providers like AWS or Google Cloud have plenty of tools for that. But what if you are still not happy with them? Or what if you prefer bare metal servers to clouds? At Getintent, we love bare metal equipment and can talk about its advantages for hours. Check out some highlights below:
Many dedicated server providers have implemented virtual networking, so expanding a server farm with them is no problem. A popular exception is Hetzner, where you still need to reserve units and allocate space. Let us hope it will soon offer virtual networking too.
Where it even makes sense to discuss balancing with the provider, they will most likely offer a modern hardware load balancer.
Such hardware is pretty rare, so it costs a fortune and requires specific expertise to configure.
And then it is time to think about …
Chef, what’s for lunch today?
Your farms must already be using some kind of automation, which means making changes at scale is cheap. You are most likely using nginx as the HTTP front end on the same servers that run the application. This makes it easy to build a balancing scheme where each server is both a balancer and a host for the app that responds to the request.
First, configure the server pool in gdnsd:
; Let’s check with a command whether the server
; with a specific IP-address is alive
service_types => {
    adserver => {
        plugin => extmon,
        cmd => ["/usr/local/bin/check-adserver-node-for-gdnsd", "%%IPADDR%%"]
        down_thresh => 4, ; Failed to respond to the test fourfold?
                          ; It’s dead.
        ok_thresh => 1,   ; Responded at least once, it’s alive.
        interval => 20,   ; Check every 20 seconds.
        timeout => 5,     ; Consider a failure to respond
                          ; within 5 seconds as no-response.
    } ; adserver
} ; service_types

plugins => {
    multifo => {
        ; If less than one third of the hosts are alive,
        ; DNS will return the whole list of the servers
        ; as there must be some problem, e.g. an application overload.
        ; Disabling the broken servers can only make things worse.
        up_thresh => 0.3,
        adserver-eu => {
            service_types => adserver,
            www1-de => 192.168.93.1,
            www2-de => 192.168.93.2,
            www3-de => 192.168.93.3,
            www4-de => 192.168.93.4,
            www5-de => 192.168.93.5,
        } ; adserver-eu
    } ; multifo
} ; plugins
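The article does not show the check command that extmon runs, so here is a minimal sketch of what `/usr/local/bin/check-adserver-node-for-gdnsd` could look like. It assumes the application answers plain HTTP on port 8080 (the port used in the nginx pool), that a 2xx response on `/` means "alive", and that the monitor treats exit status 0 as success and anything else as failure. The path, the status rule and the 4-second default timeout (kept just under the 5-second `timeout` in the config) are illustrative assumptions, not Getintent's actual script.

```python
import http.client
import sys


def is_alive(ip, port=8080, path="/", timeout=4.0):
    """Return True if http://ip:port/path answers with a 2xx status.

    The port, path and 2xx rule are assumptions made for this sketch.
    Only IPv4 literals are expected, as in the gdnsd pool.
    """
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
        return 200 <= status < 300
    except (OSError, http.client.HTTPException):
        # Refused connection, timeout or a malformed reply all count as "down".
        return False
    finally:
        conn.close()


def main(argv):
    """Map the check result to an exit status: 0 = alive, 1 = down, 2 = usage error."""
    if len(argv) != 2:
        print("usage: check-adserver-node-for-gdnsd IPADDR", file=sys.stderr)
        return 2
    return 0 if is_alive(argv[1]) else 1
```

To use it as a command, call `sys.exit(main(sys.argv))` under the usual `if __name__ == "__main__":` guard. The check talks to the node directly with `http.client`, so proxy settings in the environment do not affect it.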
If less than one third of the hosts are alive, DNS will return the whole list of the servers as there must be something wrong, like an application overload. Disabling the broken servers can only make things worse.
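The fallback rule can be sketched in a few lines of Python. This models only the behaviour described here (return the live hosts, or everyone if too few are alive); it is not gdnsd's implementation, and the function name is made up for the example.

```python
def addresses_to_return(states, up_thresh=0.3):
    """Pick the addresses a DNS answer would contain.

    states maps address -> True (alive) / False (down).
    If the alive fraction drops below up_thresh, return every address,
    on the theory that the pool is overloaded rather than truly dead.
    """
    everyone = list(states)
    if not everyone:
        return []
    alive = [addr for addr, up in states.items() if up]
    if len(alive) / len(everyone) < up_thresh:
        return everyone
    return alive


pool = {
    "192.168.93.1": True,
    "192.168.93.2": True,
    "192.168.93.3": False,
    "192.168.93.4": False,
    "192.168.93.5": False,
}
print(addresses_to_return(pool))  # 2 of 5 alive (0.4 >= 0.3): only the two live hosts
```

With five hosts and a threshold of 0.3, the whole list comes back only when one host or none is alive.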
Use this configuration in the DNS zone file:
eu.adserver.sample. 60 DYNA multifo!adserver-eu
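Once the zone is served, a quick way to see what the dynamic record returns is to resolve the name and list the addresses. The sketch below uses only the system resolver from the standard library; the helper name is made up, and `eu.adserver.sample` resolves only where this sample zone is actually deployed.

```python
import socket


def resolve_ipv4(name):
    """Return the sorted, de-duplicated IPv4 addresses the resolver gives for name."""
    infos = socket.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_ipv4("eu.adserver.sample")` against a resolver that sees the zone should list the hosts currently considered alive. With a TTL of 60, a change in host state can take about a minute to reach clients that cached the previous answer.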
Then create a pool of nginx servers and deploy the configuration across the cluster:
upstream adserver-eu {
    server localhost:8080 max_fails=5 fail_timeout=5s;
    server www1-de.adserver.sample:8080 max_fails=5 fail_timeout=5s;
    server www2-de.adserver.sample:8080 max_fails=5 fail_timeout=5s;
    server www3-de.adserver.sample:8080 max_fails=5 fail_timeout=5s;
    server www4-de.adserver.sample:8080 max_fails=5 fail_timeout=5s;
    server www5-de.adserver.sample:8080 max_fails=5 fail_timeout=5s;
    keepalive 1000;
}
If there are more than 25 servers in the rotation, split them into groups of 25, each group under one name, and have gdnsd select among those names dynamically (DYNC records) instead of returning A records directly.
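Splitting a long server list into groups of at most 25 is easy to automate. In this sketch the group names (`adserver-eu-1`, `adserver-eu-2`, ...) are hypothetical; the article does not say how the groups are named.

```python
def split_into_groups(servers, size=25):
    """Cut the server list into consecutive chunks of at most `size` entries."""
    return [servers[i:i + size] for i in range(0, len(servers), size)]


def name_groups(base, servers, size=25):
    """Give each chunk a numbered name such as 'adserver-eu-1' (naming is an assumption)."""
    groups = split_into_groups(servers, size)
    return {f"{base}-{n}": group for n, group in enumerate(groups, start=1)}


servers = [f"www{i}-de" for i in range(1, 61)]  # 60 hypothetical hosts
for name, group in name_groups("adserver-eu", servers).items():
    print(name, len(group))
# adserver-eu-1 25
# adserver-eu-2 25
# adserver-eu-3 10
```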
The list of servers can be filled in automatically based on data from gdnsd or from your automation system.
At Getintent, we use Puppet, and a special cron script keeps the lists of the several types of servers registered on the puppetmaster up to date. The localhost line keeps us safe in case the list is mistakenly empty. As long as each nginx serves at least itself, no crash should occur.
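A generator for the upstream block might look like the sketch below. It is not the Puppet and cron setup described above, just an illustration of the one property that matters: the localhost line is always emitted, even when the host list comes back empty. The function name and defaults are assumptions taken from the sample config.

```python
def render_upstream(name, hosts, port=8080, max_fails=5,
                    fail_timeout="5s", keepalive=1000):
    """Render an nginx upstream block; localhost always comes first."""
    # Drop a stray 'localhost' from the input so it is not listed twice.
    members = ["localhost"] + [h for h in hosts if h != "localhost"]
    lines = [f"upstream {name} {{"]
    for host in members:
        lines.append(
            f"    server {host}:{port} "
            f"max_fails={max_fails} fail_timeout={fail_timeout};"
        )
    lines.append(f"    keepalive {keepalive};")
    lines.append("}")
    return "\n".join(lines) + "\n"


# Even with an empty list the block still contains one usable server.
print(render_upstream("adserver-eu", []))
```

An upstream block with no `server` entries is rejected by nginx, which is why the fallback line is worth hard-coding.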
So, what we’ve got:
There is no such thing as a free lunch, and free balancing has its price too: sticky sessions. However, the free version of nginx offers IP hash balancing, and modern NoSQL databases like Aerospike can serve as reliable, fast storage for shared sessions. Finally, you may switch to HAProxy, which can route users consistently across multiple balancers with a single rule. So when we need sticky sessions, we will use these tools.
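To show why IP hash balancing gives a form of stickiness, here is a small sketch of the idea. nginx's `ip_hash` uses the first three octets of an IPv4 address (or the whole IPv6 address) as the hashing key; the SHA-256 hash and modulo step below are stand-ins for illustration and are not nginx's actual algorithm.

```python
import hashlib
import ipaddress


def pick_server(client_ip, servers):
    """Map a client address to one server, the same one on every call."""
    addr = ipaddress.ip_address(client_ip)
    # IPv4: hash only the first three octets, so a whole /24 sticks together.
    key = addr.packed[:3] if addr.version == 4 else addr.packed
    digest = hashlib.sha256(key).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


servers = ["www1-de", "www2-de", "www3-de", "www4-de", "www5-de"]
# Two clients from the same /24 land on the same server.
print(pick_server("203.0.113.7", servers) == pick_server("203.0.113.200", servers))  # True
```

In a setup where every server is also a balancer, the mapping stays sticky only if all balancers hold the same server list in the same order.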
Now each of our 20-core servers processes up to 15,000 requests per second, which is close to the upper limit for our Java application. Across all clusters, we balance and process more than half a million requests per second at peak, with 5 Gbps of incoming and 4 Gbps of outgoing traffic, excluding CDN. Yes, a peculiarity of a DSP is that incoming traffic exceeds outgoing traffic, as there are more bid requests than relevant responses to them.
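As a rough sanity check on these figures, a back-of-the-envelope calculation gives a lower bound on the number of servers and the average traffic per request. The inputs are the rounded numbers quoted above ("more than half a million" is taken as exactly 500,000), so the results are estimates only.

```python
import math


def min_servers(peak_rps, per_server_rps):
    """Fewest servers that could carry the peak if each ran at its limit."""
    return math.ceil(peak_rps / per_server_rps)


def bytes_per_request(bits_per_second, requests_per_second):
    """Average bytes on the wire per request at a given bit rate."""
    return bits_per_second / 8 / requests_per_second


print(min_servers(500_000, 15_000))          # 34
print(bytes_per_request(5e9, 500_000))       # 1250.0 bytes in per request
print(bytes_per_request(4e9, 500_000))       # 1000.0 bytes out per request
```

Real clusters need headroom above that lower bound, since no server should run at its limit all the time.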