Sunday, October 9, 2011

Tomcat: Clustering and Load Balancing with HAProxy under Ubuntu 10.04 - Part 1

Introduction


In this article we will explore how to set up a simple Tomcat cluster with load balancing using HAProxy. Our environment will consist of two Tomcat (latest version) instances running under Ubuntu Lucid (10.04 LTS). We will use sample applications from the built-in Tomcat package to demonstrate various scenarios. Later in the tutorial, we will study in depth how to configure HAProxy and how to set up logging.

What is HAProxy?
HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads while needing persistence or Layer7 processing - http://haproxy.1wt.eu/

What is Tomcat?
Apache Tomcat is an open source software implementation of the Java Servlet and JavaServer Pages technologies... Apache Tomcat powers numerous large-scale, mission-critical web applications across a diverse range of industries and organizations - http://tomcat.apache.org/

Table of Contents


  1. Setting-up the Environment
    • Download Tomcat
    • Configure Tomcat
    • Run Tomcat
    • Download HAProxy
    • Configure HAProxy
  2. Load Balancing
    • Default Setup
    • Sharing Sessions
    • Configure Tomcat to Share Sessions
    • Retest Session Sharing
    • Session Sharing Caveat
    • Sharing Sessions
  3. HAProxy Configuration
    • Configuration File
    • Logging

Frequently Asked Questions (FAQ)


Q: Why is this tutorial in Linux instead of Windows?
A: By default the source code and pre-compiled binaries for HAProxy are tailored for Linux/x86 and Solaris/Sparc.

Q: Why Ubuntu 10.04 instead of another Linux distribution?
A: My local machine is running Ubuntu 10.04 LTS.

Q: How do I install HAProxy in Windows?
A: You can install it via Cygwin. Check this link for more info.

An Overview


Before we start with actual configuration and development, let's get a visual overview of the whole setup. The diagram below depicts our simple load balancing architecture and the typical flow of data:

1. A client visits our website.
2. HAProxy receives the request and performs load balancing.
3. The request is forwarded to a Tomcat instance.
4. The response is returned to HAProxy and then to the client.


Notice that in the backend all servers share the same IP address (127.0.0.1), the localhost. This is useful for testing purposes, but in production we would normally place each server on its own machine with its own IP address.

Since we have three servers that share the same IP address, we have to assign different ports to each as follows:
Server      Port
HAProxy     80
Tomcat 1    8080
Tomcat 2    8180

The client should only have access to the IP address and port where HAProxy resides. If we let the client bypass HAProxy by directly connecting to any of the Tomcat instances, then we have defeated the purpose of load balancing.

Requirements


At the time of writing, I am using the following environment and tools:
Name      Version         Official Site
HAProxy   Stable 1.4.18   http://haproxy.1wt.eu/
Tomcat    7.0.22          http://tomcat.apache.org/
Ubuntu    10.04           http://www.ubuntu.com/

The HAProxy and Tomcat versions are the latest stable versions as of this writing (Oct 9 2011). This tutorial should work equally well on Tomcat 6. For the operating system, I recommend Ubuntu 10.04 because that's where I set up and tested this tutorial. In any case, you should be able to apply the same steps to your favorite Linux distro. For Windows users, use Cygwin instead (see FAQs).

Setting-up the Environment

To ensure we're on the same page, I'm providing a walkthrough for installing and configuring Tomcat and HAProxy. We will test a basic setup to verify that we have set up our environment correctly.

Download Tomcat

To download Tomcat visit its official page. Alternatively, you can visit this link directly: http://tomcat.apache.org/download-70.cgi

Select a core binary distribution. For my system, I opted for the zip version (the first option).

Extract the download to your file system. In my case, I extracted the zip file to /usr/local and renamed the extracted folder to tomcat-7.0.21-server1.

Copy this folder to the same location (/usr/local) and rename the copy to tomcat-7.0.21-server2. The final result should be similar to the following:

Your Tomcats are installed.

Configure Tomcat


If we examine the server.xml inside the Tomcat conf folder, we will discover that Tomcat uses the following ports by default:

Tomcat Default Ports
Element          Port
Shutdown         8005
HTTP Connector   8080
AJP Connector    8009

We have two Tomcat instances. If we run both, we'll encounter a port conflict since both instances are using the same port numbers. To resolve this conflict we will edit one of the server.xml files. In our case, we will choose Tomcat instance 2.

Go to tomcat-7.0.21-server2/conf and open server.xml. Find the following lines and replace them accordingly:
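
The same change can also be scripted. The following is a minimal sketch, assuming server.xml still contains the stock port="8005", port="8080", and port="8009" attributes; the function name shift_tomcat_ports is hypothetical and only used for illustration. It keeps a backup copy next to the original file.

```shell
#!/bin/sh
# Sketch only: shifts Tomcat's three default ports by 100 in a server.xml file.
# Assumes the stock port="8005", port="8080", and port="8009" attributes.
# shift_tomcat_ports is a hypothetical helper name, not part of Tomcat.
shift_tomcat_ports() {
    conf="$1"
    # Keep a backup of the original file, then rewrite it from the backup.
    cp "$conf" "$conf.bak" &&
    sed -e 's/port="8005"/port="8105"/' \
        -e 's/port="8080"/port="8180"/' \
        -e 's/port="8009"/port="8109"/' \
        "$conf.bak" > "$conf"
}

# Usage (run with sudo if /usr/local is not writable by your user):
# shift_tomcat_ports /usr/local/tomcat-7.0.21-server2/conf/server.xml
```

The redirectPort attributes are left alone, since they do not open a listening socket on their own.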



Save the changes. At the end, your Tomcat instances should be configured as follows:

Tomcat 1 & 2 Ports
Element          Tomcat 1 Ports   Tomcat 2 Ports
Shutdown         8005             8105
HTTP Connector   8080             8180
AJP Connector    8009             8109

Run Tomcat

After configuring our Tomcat installations, let's run them and verify that they're running according to the specified ports.

Tomcat 1
Since I'm using Ubuntu, I can run Tomcat 1 by issuing the following command in the terminal:
sudo /usr/local/tomcat-7.0.21-server1/bin/startup.sh

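If you get a permission error, make startup.sh executable first. The other scripts in Tomcat's bin folder, such as catalina.sh, need the same permission; running chmod +x on bin/*.sh covers them all. To verify that Tomcat 1 is running, open a browser and visit the following link: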
http://localhost:8080

Here's the resulting page:


Tomcat 2
To run Tomcat 2, follow the same steps as before. This time we'll execute the following command:
sudo /usr/local/tomcat-7.0.21-server2/bin/startup.sh

Open another browser and visit the following link:
http://localhost:8180

Here's the resulting page:


We've successfully set up two Tomcat instances. Next, we will download and set up HAProxy.
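
As a command-line complement to the browser check, the listening ports can also be read from netstat. This is a minimal sketch, assuming the usual Linux `netstat -tln` column layout (local address in the fourth column, state in the sixth); the helper name listening_ports is hypothetical.

```shell
#!/bin/sh
# Sketch only: reads `netstat -tln` style output on stdin and prints the
# listening port numbers, one per line, in ascending order.
# listening_ports is a hypothetical helper name.
listening_ports() {
    # The port is whatever follows the last ":" of the local address,
    # which handles both 0.0.0.0:8080 and :::8080 forms.
    awk '$6 == "LISTEN" { n = split($4, parts, ":"); print parts[n] }' | sort -n -u
}

# Usage:
# netstat -tln | listening_ports
```

With both instances running, the ports from the table above should appear in the list.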

Download HAProxy


To download HAProxy, visit its official page and download either the pre-compiled binaries or the source. Alternatively, you can install via apt-get (however if you want the latest version, you might need to tinker with sources.list to update your sources).

For this tutorial, we will build and compile from the source (which I believe is faster and simpler).

Open up a terminal and enter the following commands:
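
A minimal sketch of such a build for HAProxy 1.4.18 follows. The download path under http://haproxy.1wt.eu/ and the linux26 build target (meant for 2.6-series kernels) are assumptions, and the function names are hypothetical; adjust them to your system before use.

```shell
#!/bin/sh
# Sketch only: the download path and the linux26 target are assumptions.

# Print the assumed source tarball URL for a given 1.4.x version.
haproxy_src_url() {
    echo "http://haproxy.1wt.eu/download/1.4/src/haproxy-$1.tar.gz"
}

# Download, unpack, compile, and install HAProxy.
# The final install step usually needs root privileges.
build_haproxy() {
    version="$1"
    wget "$(haproxy_src_url "$version")" &&
    tar xzf "haproxy-$version.tar.gz" &&
    cd "haproxy-$version" &&
    make TARGET=linux26 &&
    make install
}

# Usage:
# build_haproxy 1.4.18
```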


This should download the latest HAProxy, extract it, and install it. If you get a permission error, prepend sudo to each command. If you have difficulty installing from the source, I suggest you do some Googling on this topic. There are plenty of resources on how to install HAProxy from the source (some are outdated but may still apply).

After HAProxy has been installed, verify that it's indeed installed! Open up a terminal and type the following command: haproxy.

You should see the following message:

Configure HAProxy


In order for HAProxy to act as a load balancer, we need to create a custom HAProxy configuration file where we will declare our Tomcat servers.

I'll first present a basic configuration to jump-start our exposure to HAProxy. In part 3, we'll study this configuration and explain what's happening line by line.

I created a configuration file and saved it at /etc/haproxy/haproxy.cfg:
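
A minimal sketch of such a configuration, written out by a small shell function, is shown here. The proxy name tomcat-cluster, the timeout values, and the roundrobin policy are assumptions; the stats lines and credentials follow those quoted in the comments below the article, and the server ports follow the Tomcat 1 & 2 ports table. The function name write_haproxy_cfg is hypothetical.

```shell
#!/bin/sh
# Sketch only: writes an assumed basic HAProxy 1.4 configuration to the given path.
# write_haproxy_cfg is a hypothetical helper name.
write_haproxy_cfg() {
    cat > "$1" <<'EOF'
global
    maxconn 4096
    daemon

defaults
    mode http
    retries 3
    option redispatch
    # Timeouts in milliseconds (assumed values)
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

# "tomcat-cluster" is an assumed proxy name
listen tomcat-cluster
    bind :80
    balance roundrobin
    stats enable
    stats uri /admin?stats
    stats realm Haproxy\ Statistics
    stats auth admin:pass
    server tomcat1 127.0.0.1:8080 check
    server tomcat2 127.0.0.1:8180 check
EOF
}

# Usage: write to a scratch file, let HAProxy check the syntax, then install it.
# write_haproxy_cfg /tmp/haproxy.cfg
# haproxy -c -f /tmp/haproxy.cfg
# sudo cp /tmp/haproxy.cfg /etc/haproxy/haproxy.cfg
```

Because of the stats auth line, the stats page asks for a login; with this sketch the user is admin and the password is pass, which should be changed for anything beyond local testing.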


Run HAProxy


After configuring HAProxy, let's verify that it's running and communicating properly with our Tomcat instances.

Open up a terminal and run the following command:
sudo haproxy -f /etc/haproxy/haproxy.cfg

Now open up a browser and visit the following link:
http://localhost/admin?stats

Your browser should show the following page:


Based on this page, tomcat1 and tomcat2 are both down. That's because they are not running yet. Let's run both Tomcat instances, and the stats page should automatically update.

To start tomcat1, run the following command:
sudo /usr/local/tomcat-7.0.21-server1/bin/startup.sh

To start tomcat2, run the following command:
sudo /usr/local/tomcat-7.0.21-server2/bin/startup.sh

Here's the result:


Notice both Tomcats are now ready.

Next Section


We've completed setting up our environment for Tomcat clustering and load balancing using HAProxy. In the next section, we will explore various load balancing scenarios to learn more about Tomcat and HAProxy. Click here to proceed.

14 comments:

  1. shame..unable to find haproxy for windows :(

  2. HEY CAN YOU PROVIDE US A NETBEANS VERSION FOR THE SAME?

  3. Netbeans version?? Ha ha wtf man?? krams I noticed you changed the AJP ports there in the configuration since you are not using it in the haproxy config is it safe to assume you just changed it to stop bind exceptions from occurring?

  4. @TJ, honestly, I forgot why. I don't see any reason why the ports needed to be changed. I remember when I was writing this guide, I was planning to write a guide also about Tomcat clustering with Apache httpd, and I remember I had to change the AJP ports. Unfortunately, I wasn't able to write them yet. Thank you for noticing that.

    Replies
    1. Yeah I've done it with httpd and mod_jk. Apache httpd server communicates with tomcat using AJP protocol so if you are running both servers you need to change the AJP ports. No problem. I was just wondering why did you change it. Anyway nice post. Very helpful.

  5. I had some trouble when using this configuration to load balance a production app in the QA environment. The following configuration however fixed the problem.

    global
        log 127.0.0.1 local0
        log 127.0.0.1 local1 notice
        maxconn 4096
        daemon

    defaults
        log global
        mode http
        option httplog
        option dontlognull
        retries 3
        option redispatch
        maxconn 2000
        contimeout 5000
        clitimeout 50000
        srvtimeout 50000

    frontend http-in
        bind :1015
        default_backend servers

    backend servers
        option httpclose
        option redispatch

        cookie JSESSIONID prefix
        server tomcat1 ct-erlqa01:7070 cookie JSESSIONID_SERVER_1 check inter 5000
        server tomcat2 ct-erlqa03:7070 cookie JSESSIONID_SERVER_2 check inter 5000

  6. Awesome article.Saved my day.

  7. haproxy
    localhost/admin?stats
    require user login , account is ?

    Replies
    1. sorry , i see this
      stats uri /admin?stats
      stats realm Haproxy\ Statistics
      stats auth admin:pass

  8. What if Tomcat Instances are configured for Authentication (form-based or basic), will the load balancing using Session Sharing work in this case?
