Debugging connection pool leak in Apache HTTP Client

Posted on June 17, 2011
Tags: java

I recently had an issue using the Apache HTTP Client pooling library where after a while threads would just block when trying to open connections. It didn’t take too much to figure out that the thread pool was being exhausted, a quick thread dump revealed all the threads were waiting for a connection from the pool. The issue was, which method was not releasing the connection?

In the Apache HC library a connection is released to the pool when the response.getEntity().getContent() is closed (provided it isn’t already null). This is what EntityUtils.consume does - it checks if getContent() is null and if not then it calls .close().

I had a couple of strategies for trying to identify the culprit. First I tried looking at a heap dump to see if I could identify the last URL that each response had accessed, but this proved problematic. I’m not sure if it was because the old responses had been garbage collected or something, but there was no viable attributes to inspect on the objects that were lying around. The second approach was to switch on some logging in my code, and the HC library. What I wanted to see was my code being called, a connection being obtained, and then being released.

This approach worked perfectly, and to illustrate have a look at the following code blocks. First on live 5 of the log4j.properties see the logging being turned on for the connection manager. Then, in the test client code I have 2 methods: one that properly closes the connection, and one that doesn’t. In this example I call the good method twice, then the bad, then the good again. Finally, see in the log output, see the Get connection then Release connection… repeated twice (the 2 good calls) then two Get connection messages, relating to the bad call followed by the good. It is clear in this example that the getBad call is responsible for leaking the connections.

In my more complex, multi-threaded code is was equally easy given that I could group the messages by thread name, then follow these groups through to identify the faulty method. However, this should probably be diagnosed by looking at the sequential unit test logs as they exercise every method.

And just to round off, my leak was in a method that called POST, processed the response and then called another POST. I was cleaning up the second POST response, but not the first one as I was reusing the response variable.

log4j.properties

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} [%t] %-5p %c.%M %x - %m%n
log4j.rootLogger=INFO, stdout
log4j.logger.org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager=DEBUG

ServiceClient.java

public class ServiceClient {
	
	public static void main(String[] args) {
		ServiceClient client = new ServiceClient();
		String url = "http://localhost/";
		if (args.length > 0) url = args[0];
		client.getGood(url);
		client.getGood(url);
		client.getBad(url);
		client.getGood(url);
	}
	
	private HttpClient client;
	
	public ServiceClient() {
		this(2);
	}
	
	public ServiceClient(final int poolSize) {
		// Set up the multithreaded http client.
		SchemeRegistry schemeRegistry = new SchemeRegistry();
		schemeRegistry.register(
		         new Scheme("http", 80, PlainSocketFactory.getSocketFactory()));
		schemeRegistry.register(
		         new Scheme("https", 443, SSLSocketFactory.getSocketFactory()));

		ThreadSafeClientConnManager cm = new ThreadSafeClientConnManager(schemeRegistry);
		cm.setDefaultMaxPerRoute(poolSize);
		cm.setMaxTotal(200);
		
		HttpParams params = new BasicHttpParams();
		params.setParameter(CoreConnectionPNames.SO_TIMEOUT, 60000);
		params.setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 60000);
		client = new DefaultHttpClient(cm, params);
	}
	
	public String getGood(String url) {
		HttpGet get = new HttpGet(url);
		HttpResponse response;
		try {
			response = client.execute(get);
			if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK) {
				EntityUtils.consume(response.getEntity());
				return null;
			}
			
			Reader is = new InputStreamReader(response.getEntity().getContent());
			StringBuffer buf = new StringBuffer();
			
			while (is.ready()) {
				char[] b = new char[1024];
				int c = is.read(b);
				buf.append(b);
			}
			is.close();
			
			return buf.toString();
		} catch (ClientProtocolException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
			get.abort();
			return null;
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
			get.abort();
			return null;
		}
	}

	
	public String getBad(String url) {
		HttpGet get = new HttpGet(url);
		HttpResponse response;
		try {
			response = client.execute(get);
			if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK) {
				EntityUtils.consume(response.getEntity());
				return null;
			}
			
			Reader is = new InputStreamReader(response.getEntity().getContent());
			StringBuffer buf = new StringBuffer();
			
			while (is.ready()) {
				char[] b = new char[1024];
				int c = is.read(b);
				buf.append(b);
			}
			
			return buf.toString();
		} catch (ClientProtocolException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
			get.abort();
			return null;
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
			get.abort();
			return null;
		}
	}

Log output:

2011-06-17 21:59:07 [main] INFO  org.nigelsim.examples.httpleak.ServiceClient.getGood  - Getting http://www.abc.net.au
2011-06-17 21:59:07 [main] DEBUG org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager.getConnection  - Get connection: HttpRoute[{}->http://www.abc.net.au], timeout = 60000
2011-06-17 21:59:07 [main] DEBUG org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager.releaseConnection  - Released connection is reusable.
2011-06-17 21:59:07 [main] INFO  org.nigelsim.examples.httpleak.ServiceClient.getGood  - Getting http://www.abc.net.au
2011-06-17 21:59:07 [main] DEBUG org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager.getConnection  - Get connection: HttpRoute[{}->http://www.abc.net.au], timeout = 60000
2011-06-17 21:59:08 [main] DEBUG org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager.releaseConnection  - Released connection is reusable.
2011-06-17 21:59:08 [main] INFO  org.nigelsim.examples.httpleak.ServiceClient.getBad  - Getting http://www.abc.net.au
2011-06-17 21:59:08 [main] DEBUG org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager.getConnection  - Get connection: HttpRoute[{}->http://www.abc.net.au], timeout = 60000
2011-06-17 21:59:08 [main] INFO  org.nigelsim.examples.httpleak.ServiceClient.getGood  - Getting http://www.abc.net.au
2011-06-17 21:59:08 [main] DEBUG org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager.getConnection  - Get connection: HttpRoute[{}->http://www.abc.net.au], timeout = 60000
2011-06-17 21:59:08 [main] DEBUG org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager.releaseConnection  - Released connection is reusable.