Timing Out Waiting For An HTTP Connections
HTTP Connection Timeout
As software engineers we always run into some interesting and frustrating issues. I ran into this one just a few days ago while load testing an application.
The issue doesn't occur during function tests. Right around 30 to 40 minutes into load tests we start seeing this error and from that point on all request to the target service fails.org.apache.http.conn.ConnectionPoolTimeoutException : Timeout waiting for connection from pool
Environment/setup
- Java 1.8
- Dropwizard 1.1.1
- Apache httpclient 4.5.3
- javax.ws.rs 2.0.1
We looked at the usual suspects:
- Determine the size of the connection pool and increase it.
- Determine the current timeout setting and increase it.
- Find out what state the current connections are in and see if they can be returned to the pool
Connection Pool size:
We don't set this value. It turns out that when the property
jersey.config.client.async.threadPoolSize
is not set in the ClientProperties according to the docs: We are using the CachedThreadPool, it will continue to create new connections for each request. Naturally this can't go on forever unless connections are closed, as there are limited resources on the client and on the server.If the property is absent then thread pool used for async requests will be initialized as default cached thread pool, which creates new thread for every new request, seeExecutors
. When a value > 0 is provided, the created cached thread pool limited to that number of threads will be utilized.A default value is not set.
Connection Timeout setting:
This was set to 50000 ms (50 seconds) which is very generous. Note that this is not the actual response time from the request. This timeout is just to establish the connection with the target server. There is no point in increasing this limit, we needed to locate the real cause of the problem. So this was ruled out as a possible solution.
State of the current connections
A quick check with "netstat" command below showed that there are 1023 connectionsnetstat -a | grep CLOSE_WAIT | wc -lThese connections are just awaiting close as CLOSE_WAIT is a terminal state and this connection can not be used to data transfer. Now I see the problem,