Grails, DBCP & Stale Connections

July 13, 2009

Does your application break after every few hours of inactivity even though you have enough database connections in the pool? It is a common problem with database connection pooling (and idle sockets connections in general).

I have been running a few Grails app with PostgreSQL database using Apache Commons DBCP connection pooling. Most of these apps are pretty busy, and are working quite well so far. But I have one critical app that doesn’t get used as much. Recently after a bug report, I was watching through the logs and I realized that this app was complaining about Socket Connection Exception after every hour of idle time. Try it again, and it would work. So why was it rejecting the DB connection the first time? Creepy!

I checked out the other apps I have, and all of them were suffering from the same problem – depending on how idle they were. I couldn’t ignore any longer.

I started off with the basics.

Basic configuration:

I have my datasource defined in the DataSource.groovy file under grails-app/conf. I have enabled connection pooling and I am using the appropriate PostgreSQL-JDBC driver. Grails comes with Apache DBCP connection pooling by default. So it just works.

environments {
     production {
          dataSource {
               pooled = true
               driverClassName = "org.postgresql.Driver"
               dbCreate = "update"
               url = "jdbc:postgresql://sacharya.com:5432/sacharya_prod_db"
               username = "myuser"
               password = "mypassword"
               logSql=true
          }
     }
}

This is what my production configuration looks like – running whatever the default behavior that Grails comes with. I printed the datasource object to see what DBCP configurations my app is using by default.

dataSource.properties.each { println it }
minIdle=0
initialSize=0
maxIdle=8
numIdle=0
maxActive=8
numActive=0

As the result shows, it is using the default values as defined in Apache DBCP

Problem:

Using netstat, I started watching if the application has any connections to the database open:

$ netstat -na | grep 5432
tcp4       0      0  24.25.26.27.50095     98.129.169.246.5432     ESTABLISHED

So DBCP says it has no connections open (active or idle) to the database yet in the pool. But netstat shows there is a TCP connection established to the database. Where does this connection come from – loading & initializing the Driver?

Now lets use the application for a while. Once you start asking the pool for connections, existing connections idle in the pool are at your service, or if there no connections in the pool, new connections are created and given to you, which get returned back to the pool after you complete the query. So at any given time, there will be minIdle to maxIdle ( 0 to 8 in our case) connections in the pool.

$ netstat -na | grep 5432
tcp4       0      0  24.25.26.27.51308     98.129.169.246.5432     ESTABLISHED
tcp4       0      0  24.25.26.27.50095     98.129.169.246.5432     ESTABLISHED

Now with some connections in the pool, I left the app idle for an hour and then I tried to access the app (netstat still shows 2 TCP connection). The first query got the following exception:

org.springframework.dao.DataAccessResourceFailureException: could not execute query;
nested exception is org.hibernate.exception.JDBCConnectionException: could not execute query

Caused by: org.hibernate.exception.JDBCConnectionException: could not execute query
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:97)

Caused by: org.postgresql.util.PSQLException: An I/O error occured while sending to the backend.
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:217)

Caused by: org.hibernate.exception.JDBCConnectionException: could not execute query
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:97)

Caused by: org.postgresql.util.PSQLException: An I/O error occured while sending to the backend.
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:217)

Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)

The subsequent query succeeds. There was one connection (ignore the other TCP connection used during Driver initialization) in the pool, but the database responded by saying that the connection was not valid.

Reasoning:

At this point it was clear that the TCP connection that was sitting idle was already broken, but our app still assumed it to be open. By idle connections, I mean connections in the pool that aren’t in active use at the moment by the application. After some search, I came to the conclusion that the network firewall between my app and the database is dropping the idle/stale connections after 1 hour. It seemed to be a common problem that many people have faced.

By default, DBCP holds the pooled connections open for infinite time. But a database connection is essentially a socket connection, and it doesn’t come for free. The host OS, database host, and firewall have to allocate a certain amount of memory and other resources for each socket connection. It makes sense to those devices not to hold onto idle connections for ever. So the idea is to make sure that you don’t have stale connections in your pool that would otherwise be silently dropped by OS or firewall. The system has no way of knowing if the connection is broken unless is sends a packet and waits for an acknowledgement. So even when the connection is timed out or closed by one side, the other side may still think the connection is open.

While there may not be a firewall between your server and database, even the OS has a timeout on TCP connections. You could probably increase the TCP keepalive of the OS itself, but that will affect the whole system, and yet you are only postponing the problem.

Solution:
Now lets try to modify some of the DBCP settings for the dataSource.

1. Validating Connections: DBCP allows you do define a validation query and do a sanity check of the connection before you actually use it in your application.
By default,

validationQuery=null
testOnBorrow=false
testOnReturn=false
testWhileIdle=false

The validation query must return at least one row, and using the query you can have DBCP test the connection for you while its idle, before you borrow and before you return it.
So lets change it to:

validationQuery="SELECT 1"
testOnBorrow=true
testOnReturn=true
testWhileIdle=true

If any of the connection object fails to validate, it will be dropped from the pool. There might be some performance implications of running these three SQLs (which I am not worried at the momet), and hence you might just want to try testOnBorrow.

2. Evicting Idle Connections: DBCP can run an idle object evictor at a regular interval and evict any connections that are older than some threshold. By default this behavior is turned off since timeBetweenEvictionRunsMillis is set to -1.

timeBetweenEvictionRunsMillis=-1
numTestsPerEvictionRun=3
minEvictableIdleTimeMillis=1000 * 60 * 30

Now lets run the evictor every 30 minutes and evict any connections older than 30 minutes.

timeBetweenEvictionRunsMillis=1000 * 60 * 30
numTestsPerEvictionRun=3
minEvictableIdleTimeMillis=1000 * 60 * 30

It turns out that you cannot change the DBCP settings from the DataSource.groovy file. The datasource object injected in the DataSource.groovy file is an instance of the javax.sql.DataSource. I can however do so by overriding the default DataSource from the BootStrap.groovy, which sets up the settings during start up.

import org.codehaus.groovy.grails.commons.ApplicationAttributes
class BootStrap {

     def init = { servletContext ->

          def ctx=servletContext.getAttribute(
                       ApplicationAttributes.APPLICATION_CONTEXT)
          def dataSource = ctx.dataSource

          dataSource.setMinEvictableIdleTimeMillis(1000 * 60 * 30)
          dataSource.setTimeBetweenEvictionRunsMillis(1000 * 60 * 30)
          dataSource.setNumTestsPerEvictionRun(3)

          dataSource.setTestOnBorrow(true)
          dataSource.setTestWhileIdle(false)
          dataSource.setTestOnReturn(false)
          dataSource.setValidationQuery("SELECT 1")

          dataSource.properties.each { println it }
     }
}

You can do the same from grails-app/conf/spring/Resource.groovy:

import org.apache.commons.dbcp.BasicDataSource
beans = {
     dataSource(BasicDataSource) {       
          minEvictableIdleTimeMillis=1800000
          timeBetweenEvictionRunsMillis=1800000
          numTestsPerEvictionRun=3

          testOnBorrow=true
          testWhileIdle=true
          testOnReturn=true
          validationQuery="SELECT 1"
     }
}

This seems to have solved the problem for me. Since my firewall was dropping the socket connections at 60 minutes, all I did was proactively run the idle object evictor every half 30 minutes, flush connections that are idle for more than 30 minutes and regenerate new connections in the pool. I also did sanity check over the connections in the pool.

37 Comments on Grails, DBCP & Stale Connections

Respond | Trackback

  1. Sridhar says:

    Thanks for this excellant article. I was working on a similar task from past couple of days, and this article could not have come at a better time for me.
    I am also dealing with resetting/cleaning the stale connections in the connection pool using grails. I add following lines in BootStrap.groovy, it did not quite work for me. The stale connection were not being validated.

    import org.codehaus.groovy.grails.commons.ApplicationAttributes;
    
    class BootStrap {   
    	def init = { servletContext ->
    		def ctx = servletContext.getAttribute(ApplicationAttributes.APPLICATION_CONTEXT)
    		def dataSource = ctx.dataSource
    		dataSource.setTestOnBorrow(true);
    		dataSource.setValidationQuery("select 1 from dual")
    	} 
    	
    	def destroy = {
    	}  
    }
    
    

    Alternatively I added following lines in “grails-app/conf/spring/resources.groovy”, and it works.

        dataSource(org.apache.commons.dbcp.BasicDataSource) { 
    	    driverClassName=''
                        url = ''
    	    username = 'user'
    	    password = 'user123'
        	testOnBorrow=true 
        	validationQuery="SELECT 1 FROM DUAL"
        } 
    

    Tried with only the properties testOnBorrow and validationProperty, it did not work. It was trying to create a new dataSource Object and complained about driver /url missing.

    The above dataSource configuration is in addition to the DataSource.groovy, please advice me if you see any issue with this. Would this still inject in the original DataSource object or create a new DataSource object.

    Thanks

  2. admin says:

    Looks good to me. You can print the dataSource object in your controller, and see how the values change after each call.

  3. Chris says:

    Knocked this one out of the park. I really enjoyed it.

  4. sebastian (pornstorm) says:

    thanks for taking the time to write it up…

  5. Eric says:

    Nice post. Well written and explained, thanks for sharing your solution.

  6. Jet says:

    Just ran into this and found this article. Will give it a try.

    Many thanks

  7. [...] Этот мужчина говорит что для postgresql похожая ситуация. [...]

  8. Pail says:

    Hi

    I am creating the database config values (url, userid, password, driver) during build time into a separate file and referencing this file using grailsApplication. This is necessary as i dont know the production passwords. This is supplied during the deployment by the admins. In order to access the params i am using groovy.sql.SQL to create the datasource in one of the controllers.

    If we are using ctx.dataSource in Bootstrap, how does one pass the values from this prop file to this datasource?

    Thanks

  9. Stuart says:

    Hi,

    great information, and very well explained. Solved the problem I was having.

    Thanks for writing up.

  10. Common Issues…

    Issues \ Description \ Status \ Action Item\ Owner\ Platform code stops executing threads under performance test \ A possible issue of stale connections in the DB pool \ Explanation…

  11. Werner says:

    thanks a lot for writing that up (so well)!

    that nasty problem has given me quite some headaches so far
    and now I’ve just declared you my personal hero for today!

    stay well & propsper!

  12. Tim Yates says:

    With Grails 1.3 RC2, and the move to Spring 3, you need to add targetDataSource in to the bootstrap code, so;

    dataSource.setMinEvictableIdleTimeMillis(1000 * 60 * 30)

    becomes

    dataSource.targetDataSource.setMinEvictableIdleTimeMillis(1000 * 60 * 30)

  13. Mihai Cazacu says:

    @Time: you can use “ctx.dataSourceUnproxied”

    Cheers!

  14. Mihai Cazacu says:

    @Tim: you can use “ctx.dataSourceUnproxied”

    Cheers!

  15. amit says:

    Thanks a lot for sharing this!!!

  16. Thank you, solved the issue in no-time!! All tests-passed. :-)

  17. John Rellis says:

    Hey,

    Maybe you should update the post to include the dataSourceUnproxied issue in Grails 1.3 >

    It is a little hidden in the comments.

    Also here’s the nabble link

    http://grails.1312388.n4.nabble.com/Odd-upgrade-issue-1-3-related-td2195660.html#a2195708

    Great post!

    John

  18. Luis says:

    Great article! It saved me a lot of work.

  19. Oleh says:

    You’re awesome!

    Thanks a lot, great post!

  20. Markus says:

    Hi,
    for me, it was easily possible to pout those properties into DataSource.groovy so it looks like

    environments {
        development {
            dataSource {
                //configure DBCP 
                properties {
                    maxActive = 50
                    maxIdle = 25
                    minIdle = 1
                    initialSize = 1
                    minEvictableIdleTimeMillis = 60000
                    timeBetweenEvictionRunsMillis = 60000
                    numTestsPerEvictionRun = 3
                    maxWait = 10000
    
                    testOnBorrow = true
                    testWhileIdle = true
                    testOnReturn = false
    
                    validationQuery = "SELECT 1"
                }
            }
        }
    }
    

    got that from http://www.grails.org/doc/latest/guide/3.%20Configuration.html#3.3%20The%20DataSource

    also interesting and related to this article are
    http://grails.1312388.n4.nabble.com/Odd-upgrade-issue-1-3-related-td2195660.html#a2195708 (mentioned above) and http://www.sylvioazevedo.com.br/?p=56

    thank you for the detailed explanations!
    great post!

  21. Thomas says:

    Very helpful.
    However in grails 1.3 I had to modify the way to get the dataSource (in BootStrap):

    def ctx=servletContext.getAttribute(ApplicationAttributes.APPLICATION_CONTEXT)
    def dataSource = ctx?.dataSource?.targetDataSource
    dataSource?.setMinEvictableIdleTimeMillis(1000 * 4)

  22. [OTT-664] Connector/J closes connection after 8hours…

    Connector/J sends a time out to the database connection after 8 hours (default value). The first person to access the tron ui login page will get a System Exception telling that the Connection is closed. This is because Hibernate doesn’t recognize the…

  23. [...] are several ways to solve this, including changing to a different connection pool, but Sudarshan Acharya has a blog post from 2009 that explains a fairly easy way to configure DBCP to res…. Of course, there’s a “gotcha” here, or I wouldn’t be blogging about [...]

  24. [...] Grails默认使用的是Apache DBCP的连接池。这个连接池存在的问题是,当它闲置一段时间后重新被访问时,会产生一个错误。通常在经过一个晚上的空闲后,早上第一个用户访问程序时会看到一个HTTP 500的内部程序错误。后续的用户访问时,看到的结果都会是正常的。 解决这个问题的方法之一是使用其他连接池。如果还是希望使用Apache DBCP。你可以采用下面的方法(参考Sacharya的文章): [...]

  25. LJQ says:

    you are a legend

  26. Satish says:

    Very nice and well written article.

  27. Jeff says:

    I’m using the Datasources plugin. Will anything need to be added to BootStrap.groovy to account for the additional datasources?

  28. Jan says:

    Thanks a lot for sharing this. You just solved our problem that was becoming very serious issue for us.

    Thanks one more time.

  29. Gunjan says:

    Concise and well-explained, great post! Solved the problem that I was having.
    Thanks!

  30. Grails Tipps…

    QuickLinks Groovy Tipps IntelliJ GORM Gotchas http…

  31. Grails Tipps…

    QuickLinks Groovy Tipps IntelliJ GORM Gotchas http…

  32. Hugo says:

    Excellent post. Thanks

  33. Grails Tips…

    Table of Contents QuickLinks IntelliJ GORM Gotchas…

  34. Prachi says:

    Excellent post, thank you. Saved me a lot of time.

  35. [...] Grails默认使用的是Apache DBCP的连接池。这个连接池存在的问题是,当它闲置一段时间后重新被访问时,会产生一个错误。通常在经过一个晚上的空闲后,早上第一个用户访问程序时会看到一个HTTP 500的内部程序错误。后续的用户访问时,看到的结果都会是正常的。 解决这个问题的方法之一是使用其他连接池。如果还是希望使用Apache DBCP。你可以采用下面的方法(参考Sacharya的文章): [...]

  36. Arpit says:

    I tried to use to above code in the resources.groovy, but it gives the following exception.

    Code:
    dataSource(BasicDataSource)
    {
    minEvictableIdleTimeMillis=1800000
    timeBetweenEvictionRunsMillis=1800000
    numTestsPerEvictionRun=3

    testOnBorrow=true
    testWhileIdle=true
    testOnReturn=true
    validationQuery=”SELECT 1″
    }

    Exception:
    Exception sending context initialized event to listener instance of class org.codehaus.groovy.grails.web.context.GrailsContextLoaderListener
    Message: Error creating bean with name ‘transactionManagerPostProcessor’: Initialization of bean failed; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name ‘transactionManager’: Cannot resolve reference to bean ‘sessionFactory’ while setting bean property ‘sessionFactory’; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name ‘sessionFactory’: Cannot resolve reference to bean ‘lobHandlerDetector’ while setting bean property ‘lobHandler’; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name ‘lobHandlerDetector’: Invocation of init method failed; nested exception is org.springframework.jdbc.support.MetaDataAccessException: Could not get Connection for extracting meta data; nested exception is org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is org.apache.commons.dbcp.SQLNestedException: Cannot create JDBC driver of class ” for connect URL ‘null’

  37. Webservice Email…

    TheEmail webservice is deployed as a war-file in o…

Respond

Comments

Comments