Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

javax.sql.ConnectionPoolDataSource support #54

Closed
dimzon opened this issue Mar 31, 2014 · 20 comments
Closed

javax.sql.ConnectionPoolDataSource support #54

dimzon opened this issue Mar 31, 2014 · 20 comments
Labels

Comments

@dimzon
Copy link

dimzon commented Mar 31, 2014

Current HikariCP solution contains of 2 parts

  1. Codegen/Proxy related stuff to wrap java.sql.Connection to something like javax.sql.PooledConnection
  2. Pool management code itself (inc thread syncronization, lifetime management etc)

Some JDBC drivers provide javax.sql.ConnectionPoolDataSource out-of the box. So bytecode generation isn't requred. I propose to split HikariCP for 2 lose-coupled parts:

  1. Codegen/Proxy related stuff to generate javax.sql.DataSource to exact javax.sql.ConnectionPoolDataSource adapter
  2. Pool management code to deal directly with javax.sql.ConnectionPoolDataSource api.

In this case you can use both parts if JDBC driver doesn't support javax.sql.ConnectionPoolDataSource and only second part otherwise...

@brettwooldridge
Copy link
Owner

The HikariCP generated proxies do quite a bit more than simply making a java.sql.Connection look like a javax.sql.PooledConnection. The PooledConnection API was a well-intended interface that ultimately lacks the richness necessary for advanced pooling. Basically with a PooledConnection you register a ConnectionEventListener which gets you two things ... it tells you when a connection is closed, connectionClosed(ConnectionEvent event), so you can clean things up and return it to the pool, or it tells you when an error occurred, connectionErrorOccurred(ConnectionEvent event).

The HikariCP generated ConnectionProxy intercepts not only close(), but also about half a dozen other methods, like setTransactionIsolationLevel(), setCatalog(), and setAutoCommit(). By intercepting and tracking these states, when a connection is returned to the pool and needs to be "reset" before recycling for another user, HikariCP can make informed decisions to questions such as "Does the transaction isolation level need to be reset, or was it never modified by the user?".

For example, if you blindly reset the transaction isolation level, even if it was not modified some drivers will still send traffic over the wire to the database -- and is an expensive operation for many databases. Resetting 5 or 6 such attributes can generate a lot of chatter to the database which decreases the pool throughput (sometimes substantially).

If HikariCP forgoes generating proxies, and simply relies on the two provided listener methods, essentially all of its value as a high-performance pool and its entire reason for existence is lost.

@dimzon
Copy link
Author

dimzon commented Apr 2, 2014

The HikariCP generated ConnectionProxy intercepts not only close(), but also about half a dozen other methods, like setTransactionIsolationLevel(), setCatalog(), and setAutoCommit(). By intercepting and tracking these states, when a connection is returned to the pool and needs to be "reset" before recycling for another user, HikariCP can make informed decisions to questions such as "Does the transaction isolation level need to be reset, or was it never modified by the user?".

are you 100% shure this "reset" is not a part of PooledConnection implementation?

    public static void main(String[] args)  throws Throwable{
        DB2ConnectionPoolDataSource ds = new DB2ConnectionPoolDataSource();
        ds.setServerName("db2linux");
        ds.setPortNumber(50001);
        ds.setDatabaseName("SAMPLE");
        ds.setUser("db2inst1");
        ds.setPassword("zaq1xsw2");
        ds.setDriverType(4);

        PooledConnection pooledConnection = ds.getPooledConnection();
        Connection cn = pooledConnection.getConnection();
        boolean initialAutoCommit = cn.getAutoCommit();
        System.out.printf("initialAutoCommit=%s%n", initialAutoCommit);
        cn.setAutoCommit(!initialAutoCommit);
        System.out.printf("now cn.getAutoCommit() is: %s%n", cn.getAutoCommit());
        cn.close();
        System.out.println("connection `released`");
        cn = pooledConnection.getConnection();
        System.out.println("connection `reobtained`");
        System.out.printf("cn.getAutoCommit()==initialAutoCommit : %s%n", cn.getAutoCommit()==initialAutoCommit);
        cn.close();
    }

will produce:

initialAutoCommit=true
now cn.getAutoCommit() is: false
connection `released`
connection `reobtained`
cn.getAutoCommit()==initialAutoCommit : true

Seems like every connection.close() performs "reset". I really believe such code must be placed in PooledConnection rather than in PoolImplementation.

According SoC (Separation of Concerns) PoolImplementation must deal with threading (locks/semaphors etc) and size/lifetime management.

@dimzon
Copy link
Author

dimzon commented Apr 2, 2014

btw. "reset" correctness may be broken just by using SQL instead Connection methods
for example you can change current schema using

Connection.setSchema("foo")

method or by executing

SET CURRENT_SCHEMA=foo

So your attempts to perform 100% correct reset just by intercepting Connection methods is obsolete... The only one who have a chance to deal with such cause is "native" PooledConnection implementation since it can (in theory) recieve some additional notify messages from database server as a part of SQL execution result reply.

@brettwooldridge
Copy link
Owner

Ok, I'll investigate it. BTW, HikariCP documentation states that using SQL to modify connection state rather than the JDBC-defined methods will result in unpredictable connection state management.

@dimzon
Copy link
Author

dimzon commented Apr 3, 2014

HikariCP documentation states that using SQL to modify connection state rather than the JDBC-defined methods will result in unpredictable connection state management

unfortunately this is not 100% controllable when you using third-party components...

@dimzon
Copy link
Author

dimzon commented Apr 3, 2014

generic object pooling benchmarks:
https://github.com/chrisvest/object-pool-benchmarks
like to see your ConcurrentBag vs BlazePool :)

@dimzon
Copy link
Author

dimzon commented Apr 3, 2014

StormpotBlazePool is 20x faster!

# VM invoker: /usr/lib/jvm/java-7-oracle/jre/bin/java
# VM options: -Dpool.size=4
# Threads: 32 threads, will synchronize iterations

Benchmark                                    Mode   Samples         Mean   Mean error    Units
o.s.ClaimRelease.ConcurrentBag.cycle               thrpt        20        0.925        0.016   ops/us
o.s.ClaimRelease.StormpotBlazePool.cycle    thrpt        20       19.798        0.219   ops/us

@brettwooldridge
Copy link
Owner

Re: connection state management, we don't particularly care if HikariCP solves every use case for every user, there are many pools for users to choose from if HikariCP does not work in their environment. Solutions that try to be all things to all people will by their nature offer mediocre performance. If someone is using a third-party component that, while being JDBC-based, is explicitly not using JDBC APIs to modify connection state, I'd say either stop using that component, or choose a pool that will blindly reset all connection state.

@dimzon
Copy link
Author

dimzon commented Apr 3, 2014

or choose a pool that will blindly reset all connection state.

You reject people to use your perfect solution just bcz this one feature. Isn't better to make it configurable?

@dimzon
Copy link
Author

dimzon commented Apr 3, 2014

btw I'm playing now with different object pools, this one https://gist.github.com/dimzon/9958603#file-easypool-java is very easy to read/understand and outperforms ConcurrentBug ten times (32 concurrent threads, pool size=4)

@brettwooldridge
Copy link
Owner

Please post a public fork of https://github.com/chrisvest/object-pool-benchmarks which contains your code addition of the ConcurrentBag.

@brettwooldridge
Copy link
Owner

Just a note on ConnectionPoolDataSource, I looked at the PostgreSQL driver's implementation of ConnectionPoolDataSource, and the returned PooledConnections do not perform any state reset at all other than rollback(). This means in no way could we rely on drivers doing the right thing, which without proxies would leave us performing a full state reset when recycling a connection. Doesn't seem like a win at all to me.

@brettwooldridge
Copy link
Owner

BTW, running JMH with more threads than CPU cores will generally generate less reliable results, unless you actually have 32-cores:

# Threads: 32 threads, will synchronize iterations

@dimzon
Copy link
Author

dimzon commented Apr 3, 2014

Please post a public fork of https://github.com/chrisvest/object-pool-benchmarks which contains your code addition of the ConcurrentBag.

https://github.com/dimzon/object-pool-benchmarks

Benchmark                                     Mode   Samples         Mean   Mean error    Units
o.s.ClaimRelease.CommonsPool2.cycle          thrpt        20        0.172        0.007   ops/us
o.s.ClaimRelease.CommonsPoolGeneric.cycle    thrpt        20        0.016        0.003   ops/us
o.s.ClaimRelease.CommonsPoolStack.cycle      thrpt        20        0.442        0.039   ops/us
o.s.ClaimRelease.ConcurrentBag1.cycle        thrpt        20        0.748        0.182   ops/us
o.s.ClaimRelease.EasyPoolTest.cycle          thrpt        20        7.271        0.793   ops/us
o.s.ClaimRelease.Furious.cycle               thrpt        20        3.181        0.046   ops/us
o.s.ClaimRelease.StormpotBlazePool.cycle     thrpt        20       16.563        2.414   ops/us
o.s.ClaimRelease.StormpotQueuePool.cycle     thrpt        20        0.101        0.070   ops/us
o.s.ClaimRelease.ViburObjectPool.cycle       thrpt        20        2.407        0.124   ops/us

Just a note on ConnectionPoolDataSource, I looked at the PostgreSQL driver's implementation of ConnectionPoolDataSource, and the returned PooledConnections do not perform any state reset at all other than rollback(). This means in no way could we rely on drivers doing the right thing, which without proxies would leave us performing a full state reset when recycling a connection. Doesn't seem like a win at all to me.

this really doesn't matter since your generated PooledConnection will do proper reset.
this means developers CAN CHOOSE appropriate variant:

  1. to use your generated Connection->PooledConnection (with proper reset etc) adapter OR
  2. to rely on original PooledConnection implementation from JDBC driver
    at other side developers CAN CHOOSE from
    A. to use your "ConcurrentBag" pool management OR
    B. use some other object pool management (since PooledConnection is easy to use api to maintain ConnectionPooling)

this allow multiple combinations like 1+A, 2+A, 1+B, 2+B - the freedom of choice
this is how SoC (Separation of Concerns) works

@dimzon
Copy link
Author

dimzon commented Apr 3, 2014

So I propose to split by 2 parts

  1. HikariConnectionPoolDataSource - DataSorce to ConnectionPoolDataSource adapter. All bytecode goes here. Reset code goes here

  2. HikariPoolingDataSource - implements plain DataSource, consumes any ConnectionPoolDataSource and realize connection pooling on it

developers can use any part separatly or together - at their choice

I looked at the PostgreSQL driver's implementation of ConnectionPoolDataSource, and the returned PooledConnections do not perform any state reset at all other than rollback().

this means you need to create a ticket|issue for PostgreSQL driver's developers.
JDBC drivers has bugs. Period. Even JDBC interfaces feels different. I'm cross-platform ORM developer (DB2|Oracle|MSSQL|MySQL|H2|PostgreSQL) wrote my own wrappers around JDBC interfaces to make them feel same way... Different wrappers for different drivers...

@brettwooldridge
Copy link
Owner

I'm sorry, I don't like the idea of splitting the implementation. Ordinary users cannot make an informed choice between the implementations. As I did, user's would need to understand whether their JDBC driver ConnectionPoolDataSource implementation properly resets connection state or not in order to choose it. While this is possible with open source drivers, it is not possible with closed source drivers. And in the end, what is the "win"? The "win" is simply the removal of generated proxies. Frankly, I don't care. It's not worth it to me either in terms of code complexity, or dealing with user issues because they misconfigured their pool. Javassist is practically ubiquitous in server stack dependencies these days, and we haven't had any users clamoring for its removal.

@brettwooldridge
Copy link
Owner

Re: Stormpot. While it is quite impressive, it is also unsuitable for HikariCP. The BlazePool uses ThreadLocals like HikariCP's ConcurrentBag, but it does so with a hard reference. Because application servers and containers like Tomcat need to be able to unload the entire application, and expect the ClassLoader and all classes to be garbage collected, you cannot use hard references in ThreadLocals. So, rather than a TheadLocal<Foo> (which is what BlazePool would have), you need ThreadLocal<WeakReference<Foo>>. See our issue #39 for the gory details.

That being said, it is possible that we can borrow concepts from the BlazePool implementation and bring them into HikariCP. Those will need to be investigated.

@dimzon
Copy link
Author

dimzon commented Apr 4, 2014

What about EasyPool?
04.04.2014 5:30 ÐÏÌØÚÏ×ÁÔÅÌØ "Brett Wooldridge" [email protected]
ÎÁÐÉÓÁÌ:

Re: Stormpot. While it is quite impressive, it is also unsuitable for
HikariCP. The BlazePool uses ThreadLocals like HikariCP's ConcurrentBag,
but it does so with a hard reference. Because application servers and
containers like Tomcat need to be able to unload the entire application,
and expect the ClassLoader and all classes to be garbage collected, you
cannot use hard references in ThreadLocals. So, rather than a
TheadLocal (which is what BlazePool would have), you need
ThreadLocal<WeakReference>. See our issue #39https://github.com/brettwooldridge/HikariCP/issues/39for the gory details.

That being said, it is possible that we can borrow concepts from the
BlazePool implementation and bring them into HikariCP. Those will need to
be investigated.

Reply to this email directly or view it on GitHubhttps://github.com//issues/54#issuecomment-39523919
.

@dimzon
Copy link
Author

dimzon commented Apr 4, 2014

You still misunderstood me. Keep generated proxy just make them implement
PooledConnection instead of your own api.
04.04.2014 5:22 ÐÏÌØÚÏ×ÁÔÅÌØ "Brett Wooldridge" [email protected]
ÎÁÐÉÓÁÌ:

I'm sorry, I don't like the idea of splitting the implementation. Ordinary
users cannot make an informed choice between the implementation. As I did,
user's would need to understand whether their JDBC driver
ConnectionPoolDataSource implementation properly resets connection state or
not in order to choose it. While this is possible with open source drivers,
it is not possible with closed source drivers. And in the end, what is the
"win"? The "win" is simply the removal of generated proxies. Frankly, I
don't care. It's not worth it to me either in terms of code complexity, or
dealing with user issues because they misconfigured their pool.

Reply to this email directly or view it on GitHubhttps://github.com//issues/54#issuecomment-39523603
.

@ViliusS
Copy link

ViliusS commented Nov 14, 2024

I'm sorry, I don't like the idea of splitting the implementation. Ordinary users cannot make an informed choice between the implementations. As I did, user's would need to understand whether their JDBC driver ConnectionPoolDataSource implementation properly resets connection state or not in order to choose it. While this is possible with open source drivers, it is not possible with closed source drivers. And in the end, what is the "win"? The "win" is simply the removal of generated proxies. Frankly, I don't care. It's not worth it to me either in terms of code complexity, or dealing with user issues because they misconfigured their pool. Javassist is practically ubiquitous in server stack dependencies these days, and we haven't had any users clamoring for its removal.

I would like to weight in on that and share recent experience of my own. One of our Java applications use HikariCP and MS SQL JDBC driver. Application just calls stored procedures on MS SQL and returns the result to the user. As mentioned in related mssql-jdbc link above, the driver itself doesn't "reset" the connection before every call. And as mentioned in this ticket, HikariCP doesn't implement ConnectionPoolDataSource API. So if there is orphaned transaction on SQL side (for whatever reason), connection gets reused and all hell breaks loose. Temporary tables are not cleared, SELECT queries use exclusive locking, etc. etc. In short, the experience is terrible and debugging such issues takes forever.

You could say that driver itself should implement connection "reset". Probably, but that would be very inefficient because without Pooling API returned from pooling implementation they don't know when the connection is reused. They would need then to send "reset" after every SQL call. Some older JDBC drivers did exactly that, but I see the trend that now more and more drivers start to rely on ConnectionPoolDataSource API instead, and this is understandable.

That's why I kindly ask to reconsider this issue. ConnectionPoolDataSource API implementation is needed, there is no question about that. The question is how to correctly implement it in HikariCP, so its benefits won't be lost.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants