High Availability Load Balancing Using HAProxy, Part 1

HAProxy is a free, very fast and reliable solution offering high availability, load balancing and proxying for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads while needing persistence or Layer 7 processing. Supporting tens of thousands of connections is clearly realistic with today's hardware. Its mode of operation makes integration into existing architectures easy and low-risk, while still making it possible to keep fragile web servers from being exposed directly to the Net, as shown below:

[Figure: haproxy-pmode]

In this post I will show you how to easily set up load balancing for your web application. Imagine you currently have your application on one web server called WebServer1.

[Figure 1]

But traffic has grown and you'd like to increase your site's capacity by adding more web servers (WebServer2 and WebServer3), as well as eliminate the single point of failure in your current setup (if WebServer1 has an outage, the site will be offline).

[Figure 2]

To spread traffic evenly over the three web servers, we can install an extra server that proxies all traffic and balances it across them. In this post we will use HAProxy, an open source TCP/HTTP load balancer (see: http://haproxy.1wt.eu/), to do that:

[Figure 3]

So our setup now is:

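  • Three web servers, WebServer1 (192.168.0.1), WebServer2 (192.168.0.2) and WebServer3 (192.168.0.3), each serving the application. They appear as web01, web02 and web03 in the configuration below.
  • A new server, LoadBalancer-1 (192.168.0.100), with Ubuntu installed. Its shell prompt is shown as loadb01 in the commands below.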
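To make "spreading traffic evenly" concrete, here is a minimal Python sketch of plain round-robin rotation over the three backend addresses listed above. It only illustrates the idea and is not HAProxy's implementation. The function name `distribute` is made up for this example.

```python
from itertools import cycle

# Backend addresses taken from the setup list above.
BACKENDS = ["192.168.0.1", "192.168.0.2", "192.168.0.3"]


def distribute(backends, request_count):
    """Count how many of `request_count` requests each backend would receive
    if requests were handed out in a fixed rotation (hypothetical helper)."""
    counts = {backend: 0 for backend in backends}
    rotation = cycle(backends)  # endless web01, web02, web03, web01, ...
    for _ in range(request_count):
        counts[next(rotation)] += 1
    return counts


# Nine requests over three servers: each server gets the same share.
print(distribute(BACKENDS, 9))
```

With a request count that is not a multiple of three, the servers at the start of the rotation receive one request more than the others.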

All right, now let's get to work.

Start by installing HAProxy on your load balancer machine:

loadb01$ sudo apt-get install haproxy

Now let's back up the original HAProxy configuration file and create a new one that tells HAProxy to listen for incoming HTTP requests on port 80 and balance them between the three web servers:

loadb01$ sudo mv /etc/haproxy/haproxy.cfg /etc/haproxy/backup_haproxy.cfg

loadb01$ sudo vi /etc/haproxy/haproxy.cfg

Paste the following configuration there:

global
        maxconn 4096
        user haproxy
        group haproxy
        daemon

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        retries 3
        option  redispatch
        maxconn 2000
        # Timeouts in milliseconds; these replace the deprecated
        # contimeout / clitimeout / srvtimeout keywords.
        timeout connect 5000
        timeout client  50000
        timeout server  50000

listen webcluster *:80
        mode    http
        stats   enable
        stats   auth us3r:passw0rd
        balance roundrobin
        option  httpchk HEAD / HTTP/1.0
        option  forwardfor
        cookie  LSW_WEB insert
        option  httpclose
        server  web01 192.168.0.1:80 cookie LSW_WEB01 check
        server  web02 192.168.0.2:80 cookie LSW_WEB02 check
        server  web03 192.168.0.3:80 cookie LSW_WEB03 check

Enable HAProxy by editing the /etc/default/haproxy file

loadb01$ sudo nano /etc/default/haproxy

and setting ENABLED to 1:

# Set ENABLED to 1 if you want the init script to start haproxy.
ENABLED=1
# Add extra flags here.
#EXTRAOPTS="-de -m 16"

Then, start HAProxy:

loadb01$ sudo /etc/init.d/haproxy start

Now open your web browser and browse to http://192.168.0.100/ (or whatever IP you have set for loadb01). You should be served a page from one of the web servers. The load balancing is now working, but let's take a closer look at some of the things we set in the HAProxy configuration:

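listen webcluster *:80

Listen for incoming connections on port 80 on all interfaces (the * can be replaced with a single IP address). Recent HAProxy releases no longer accept the address on the listen line; there it goes on a separate bind *:80 line inside the section.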

stats   enable
stats   auth us3r:passw0rd

This enables HAProxy's statistics interface, which you can access by browsing to http://192.168.0.100/haproxy?stats. Log in with the username and password given and you should see a statistics report like this:

[Figure 4]
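The same statistics are also available in CSV form by appending ;csv to the stats URL, which is handy for scripted checks. The Python sketch below parses such an export and reports the status of each backend server. The sample text is made up and trimmed to four columns (a real export has many more), and the function name `server_status` is an assumption of this example.

```python
import csv
import io


def server_status(csv_text):
    """Map "proxy/server" to the value of its status column.

    Expects the text of HAProxy's CSV stats export, whose header line
    starts with "# pxname,svname,...". Hypothetical helper for illustration.
    """
    # Drop the leading "# " so the first column is named "pxname".
    cleaned = csv_text.lstrip("# ")
    reader = csv.DictReader(io.StringIO(cleaned))
    return {
        f"{row['pxname']}/{row['svname']}": row["status"]
        for row in reader
        # FRONTEND and BACKEND rows are per-proxy totals, not servers.
        if row["svname"] not in ("FRONTEND", "BACKEND")
    }


# Made-up sample, reduced to the columns this sketch needs.
SAMPLE = (
    "# pxname,svname,scur,status\n"
    "webcluster,FRONTEND,1,OPEN\n"
    "webcluster,web01,0,UP\n"
    "webcluster,web02,1,UP\n"
    "webcluster,web03,0,DOWN\n"
    "webcluster,BACKEND,1,UP\n"
)

print(server_status(SAMPLE))
```

Fetching the export over HTTP needs the same username and password as the stats page. That part is left out so the sketch runs without a live load balancer.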

The cookie LSW_WEB insert line enables cookie-based persistence. When a user first reaches the webcluster group, HAProxy sets a cookie named LSW_WEB that holds the ID of the server that handled the request (LSW_WEB01, LSW_WEB02 or LSW_WEB03). For all subsequent requests in the same session, HAProxy reads the cookie and routes that user to the same web server (unless it is down).

The three server lines define the backend web servers HAProxy will use. You can easily add more lines here as the infrastructure grows.
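The routing decision described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not HAProxy's actual code: the class `StickyBalancer` and its `down` set are invented for the example, and server health is set by hand rather than by real checks.

```python
from itertools import cycle

# Cookie values and addresses copied from the three server lines.
SERVERS = {
    "LSW_WEB01": "192.168.0.1:80",
    "LSW_WEB02": "192.168.0.2:80",
    "LSW_WEB03": "192.168.0.3:80",
}


class StickyBalancer:
    """Toy model of cookie persistence with a round-robin fallback."""

    def __init__(self, servers):
        self.servers = servers
        self.rotation = cycle(servers)  # rotates over the cookie values
        self.down = set()               # cookie values of servers marked down

    def _next_healthy(self):
        # Try each server at most once per call.
        for _ in range(len(self.servers)):
            candidate = next(self.rotation)
            if candidate not in self.down:
                return candidate
        raise RuntimeError("no healthy server available")

    def route(self, cookie=None):
        """Return (cookie_value, address) for one request."""
        if cookie in self.servers and cookie not in self.down:
            chosen = cookie                 # known, healthy server: stick to it
        else:
            chosen = self._next_healthy()   # new session or server is down
        return chosen, self.servers[chosen]


lb = StickyBalancer(SERVERS)
cookie, address = lb.route()   # first request carries no cookie
print(cookie, address)
print(lb.route(cookie))        # follow-up request goes to the same server
```

Marking a server as down (for example `lb.down.add("LSW_WEB01")`) makes requests that carry its cookie fall back to the next healthy server in the rotation, which mirrors the "unless it is down" case.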

All right, the load balancing is working and we are almost there. Just one thing is left to do in this article: fixing the logs on the web01/web02/web03 servers. The request path has changed from:

user --> webserver

To:

user --> HAProxy --> webserver

You will now see the load balancer's IP in the access log on your web servers. To fix this when you are using Apache, open your /etc/apache2/apache2.conf file and replace this line:

LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined

with:

#LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

Then restart or reload Apache and the logging should be fixed. It will now record the IP address sent in the X-Forwarded-For header, which carries the client's IP address and which HAProxy adds to every request it passes to the backend web servers. We enabled that earlier by setting the

option forwardfor

option in the HAProxy configuration.
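One caveat when reading the new log field: a client can send its own X-Forwarded-For header, so the logged value may hold several comma-separated addresses. Assuming HAProxy is the only proxy in front of the web servers, the last entry is the one it added. The Python sketch below picks that entry. The function name `client_ip` and the sample addresses are made up for illustration.

```python
def client_ip(forwarded_for, fallback):
    """Return the last address in an X-Forwarded-For value.

    `forwarded_for` is the logged header value; Apache logs "-" when the
    header is missing. `fallback` is returned when no address is present.
    Hypothetical helper, assuming a single trusted proxy (HAProxy).
    """
    entries = [
        part.strip()
        for part in forwarded_for.split(",")
        if part.strip() and part.strip() != "-"
    ]
    # Earlier entries may have been supplied by the client itself;
    # only the last one was appended by the proxy closest to the server.
    return entries[-1] if entries else fallback


print(client_ip("203.0.113.7", "192.168.0.100"))
print(client_ip("10.0.0.1, 203.0.113.7", "192.168.0.100"))
```

The fallback covers requests that reach a web server directly, without passing through the load balancer.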

If there's anything else you'd like me to cover, or if you have any questions, please leave a comment!