-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix performance regression of writes in JDBC connectors #8559
Conversation
@@ -161,6 +174,7 @@ else if (javaType == Slice.class) { | |||
PreparedStatement statement = this.statement) { | |||
if (batchSize > 0) { | |||
statement.executeBatch(); | |||
connection.commit(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@findepi Do we assume that any connection pool or the driver will ensure that the connection's autoCommit
state it reset before giving the connection to a new client?
Should I explicitly save (in constructor) and restore (here) the autocommit state?
I think this will cause failures with Oracle UCP (it doesn't reset things AFAIK) - let's see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ask because I remember that for DELETEs Grzegorz is using autocommit (which makes sense because it's always a single statement there) and if the pool gives a stale connection DELETEs will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it failed as expected.
And it absolutely has to do with pool returning non-reset connections because if I remove anything that uses the PageSink by applying the below diff then the test passes.
I'm fixing this by changing the OraclePoolConnectionFactory to always return connections with autoCommit enabled.
Click here for diff
diff --git a/plugin/trino-oracle/src/test/java/io/trino/plugin/oracle/TestOraclePoolConnectorSmokeTest.java b/plugin/trino-oracle/src/test/java/io/trino/plugin/oracle/TestOraclePoolConnectorSmokeTest.java
index 30d6108283..e8a38c8c75 100644
--- a/plugin/trino-oracle/src/test/java/io/trino/plugin/oracle/TestOraclePoolConnectorSmokeTest.java
+++ b/plugin/trino-oracle/src/test/java/io/trino/plugin/oracle/TestOraclePoolConnectorSmokeTest.java
@@ -13,15 +13,20 @@
*/
package io.trino.plugin.oracle;
+import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import io.airlift.testing.Closeables;
import io.trino.testing.QueryRunner;
+import io.trino.testing.sql.TestTable;
import org.testng.annotations.AfterClass;
+import org.testng.annotations.Test;
import java.io.IOException;
import static io.trino.plugin.oracle.TestingOracleServer.TEST_PASS;
import static io.trino.plugin.oracle.TestingOracleServer.TEST_USER;
+import static io.trino.testing.TestingConnectorBehavior.SUPPORTS_CREATE_TABLE;
+import static io.trino.testing.TestingConnectorBehavior.SUPPORTS_DELETE;
public class TestOraclePoolConnectorSmokeTest
extends BaseOracleConnectorSmokeTest
@@ -44,7 +49,7 @@ public class TestOraclePoolConnectorSmokeTest
.put("oracle.connection-pool.enabled", "true")
.put("oracle.remarks-reporting.enabled", "false")
.build(),
- REQUIRED_TPCH_TABLES);
+ ImmutableList.of());
}
@AfterClass(alwaysRun = true)
@@ -54,4 +59,16 @@ public class TestOraclePoolConnectorSmokeTest
Closeables.closeAll(oracleServer);
oracleServer = null;
}
+
+ @Test
+ public void testDeleteAllDataFromTable()
+ {
+ skipTestUnless(hasBehavior(SUPPORTS_CREATE_TABLE) && hasBehavior(SUPPORTS_DELETE));
+ try (TestTable table = new TestTable(oracleServer::execute, "test_delete", "AS SELECT 'good' foo FROM DUAL")) {
+ // not using assertUpdate as some connectors provide update count and some do not
+ assertQuery("SELECT count(*) FROM " + table.getName(), "VALUES 1");
+ getQueryRunner().execute("DELETE FROM " + table.getName());
+ assertQuery("SELECT count(*) FROM " + table.getName(), "VALUES 0");
+ }
+ }
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When was playing with Oracle, I noticed that changing a state of the connection survives in a connection in the pool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we assume that any connection pool or the driver will ensure that the connection's
autoCommit
state it reset before giving the connection to a new client?
That should be guaranteed by the pool.
Anything else is very error-prone.
When was playing with Oracle, I noticed that changing a state of the connection survives in a connection in the pool.
Was this about autocommit state, or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I observed it for autocommit state. Hence the 2nd fixup commit.
Hikari does guarantee it - Oracle UCP seems not to.
Can you please share more about |
Sure. In 541edbe we changed the PageSink to use autocommit instead of explicit transaction management. From JDBC javadocs "If a connection is in auto-commit mode, then all its SQL statements will be executed and committed as individual transactions". MySQL and SQL Server follow this and hence those 1000 Using explicit transaction ensures that we only do 1 COMMIT for the entire batch instead of commit per statement. For comparision I added a test that does a CTAS and INSERT of On MySQL:
On SQL Server:
|
@@ -161,6 +174,7 @@ else if (javaType == Slice.class) { | |||
PreparedStatement statement = this.statement) { | |||
if (batchSize > 0) { | |||
statement.executeBatch(); | |||
connection.commit(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we assume that any connection pool or the driver will ensure that the connection's
autoCommit
state it reset before giving the connection to a new client?
That should be guaranteed by the pool.
Anything else is very error-prone.
When was playing with Oracle, I noticed that changing a state of the connection survives in a connection in the pool.
Was this about autocommit state, or something else?
// executed and committed as individual transactions." Notably MySQL and SQL Server respect this which | ||
// leads to multiple commits when we close the connection leading to slow performance. Explicit commits | ||
// where needed ensure that all of the submitted statements are committed as a single transaction and | ||
// performs better. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move to separate commit, so that revert is a pure revert.
However, please add this to revert's commit message -- revert commit message should tell the "why revert".
plugin/trino-oracle/src/main/java/io/trino/plugin/oracle/OraclePoolConnectionFactory.java
Show resolved
Hide resolved
// Oracle's pool doesn't reset autocommit of connections when reusing them so we explicitly enable autocommit | ||
// by default to match the JDBC spec. | ||
connection.setAutoCommit(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the oracle pool be configured to do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, by passing the connection property OracleConnection.CONNECTION_PROPERTY_AUTOCOMMIT
to true
to the PoolDataSource.
I'll do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite - turns out that it only affects the initial connection. UCP isn't touching the connections once created.
I'll need to keep how I've done it now.
The JDBC specification says that connections are expected to be in autocommit mode by default. Oracle's UCP connection pool doesn't reset the autocommit state when reusing an already created connection. So we explicitly enable autocommit for any connection returned from the pool to match the JDBC specification.
This reverts commit 541edbe. In 541edbe JdbcPageSink was changed to use autocommit instead of explicit transactions. They are equivalent ONLY when a single statement is being executed. But in JdbcPageSink we add multiple statements (1 per `addBatch` call) and then execute all of them using `executeBatch`. From JDBC javadocs "If a connection is in auto-commit mode, then all its SQL statements will be executed and committed as individual transactions". MySQL and SQL Server follow this and hence each statement within an `executeBatch` call is executed in it's own transaction which hurts performance because instead of a single commit we get `batchSize` (default 1000) commits for each `executeBatch` call. Using explicit transaction ensures that we only perform a single commit for the entire batch instead of a commit per statement.
1aca943
to
dd38bae
Compare
The failed Oracle test passes locally - pushed an empty commit to make sure. |
All green. |
This reverts commit 541edbe.