Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document JDBC reWriteBatchedInserts as best practice #3578

Closed
robert-s-lee opened this issue Aug 24, 2018 · 8 comments
Closed

Document JDBC reWriteBatchedInserts as best practice #3578

robert-s-lee opened this issue Aug 24, 2018 · 8 comments
Assignees
Labels
O-sales-eng Internal source: Sales Engineering P-2 Normal priority; secondary task T-missing-info
Milestone

Comments

@robert-s-lee
Copy link
Contributor

@drewdeally found this option which improved customer's performance by over 3x.

https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters

reWriteBatchedInserts = boolean

This will change batch inserts from insert into foo (col1, col2, col3) values (1,2,3) into insert into foo (col1, col2, col3) values (1,2,3), (4,5,6) this provides 2-3x performance improvement
@robert-s-lee
Copy link
Contributor Author

In order for this to kick in:

  • use prepared statement
  • works with "INSERT INTO accounts (id, balance) VALUES (?, 1000)"
  • works with "INSERT INTO accounts (id, balance) VALUES (?, 1000) in conflict (id) do update set balance=excluded.balance"
  • does NOT work with "INSERT INTO accounts (id, balance) VALUES (?, 1000) returning id,balance;"

The return code int[] count = stmt.executeBatch(); is also different:

  • 0 is returned with regular batch
  • -2 aka SUCCESS_NO_INFO is returned when reWriteBatchedInserts is enabled

attached is a sample java code:

javac BasicSample.java
java -cp ~/Downloads/postgresql-42.2.4.jar:./ BasicSample "jdbc:postgresql://127.0.0.1:26257/bank?sslmode=disable&user=maxroach&ApplicationName=BasicSample&reWriteBatchedInserts=true"
  • query fingerprint show batch of 1,000 being converted into 7 that has 100 rows, and other sizes.
root@:26257/defaultdb> select key, count from crdb_internal.node_statement_statistics where application_name='BasicSample';
+--------------------------------------------------------------------------+-------+
|                                   key                                    | count |
+--------------------------------------------------------------------------+-------+
| CREATE TABLE IF NOT EXISTS accounts (id INT PRIMARY KEY, balance INT)    |     1 |
| TRUNCATE TABLE accounts                                                  |     1 |
| INSERT INTO accounts(id, balance) VALUES ($1, _), (__more100__) ON       |     7 |
| CONFLICT (id) DO UPDATE SET balance = excluded.balance                   |       |
| INSERT INTO accounts(id, balance) VALUES ($1, _), (__more60__) ON        |     1 |
| CONFLICT (id) DO UPDATE SET balance = excluded.balance                   |       |
| INSERT INTO accounts(id, balance) VALUES ($1, _), (__more30__) ON        |     1 |
| CONFLICT (id) DO UPDATE SET balance = excluded.balance                   |       |
| INSERT INTO accounts(id, balance) VALUES ($1, _), (__more7__) ON         |     1 |
| CONFLICT (id) DO UPDATE SET balance = excluded.balance                   |       |
+--------------------------------------------------------------------------+-------+
(6 rows)

Time: 2.844919ms

on OSX, we can watch the packet to confirm

tshark -i lo -f "tcp port 26257" -d tcp.port==26257,pgsql -O pgsql -Y pgsql

BasicSample.java.txt

@jseldess
Copy link
Contributor

jseldess commented Aug 30, 2018

@awoods187, in our discussion on third-party tools, we might want to consider more detailed guidance/tips for JDBC/Hibernate.

@awoods187
Copy link
Contributor

yeah i think that makes sense--we will probably need docs pages for each tool that show tips/tricks etc that are needed to make things work well

@jseldess jseldess added this to the 2.1 milestone Aug 30, 2018
@robert-s-lee
Copy link
Contributor Author

@jseldess jseldess added the A-sql label Oct 5, 2018
@rmloveland
Copy link
Contributor

@jseldess I'm reassigning this to me - believe it will be part of accomplishing #2464

(shout if you disagree or are already working on this!)

@rmloveland rmloveland assigned rmloveland and unassigned jseldess Oct 18, 2018
@robert-s-lee
Copy link
Contributor Author

reWriteBatchedInserts technique was added to opensource YCSB brianfrankcooper/YCSB#1220 there are results of single, batch, batchrewrite performance comparison that may be useful.

@jseldess jseldess modified the milestones: 2.1, 2.2 Oct 29, 2018
@jseldess jseldess added the O-sales-eng Internal source: Sales Engineering label Nov 12, 2018
@rmloveland rmloveland added the P-1 High priority; must be done this release label Jan 16, 2019
@jseldess jseldess added P-2 Normal priority; secondary task and removed P-1 High priority; must be done this release labels Jan 30, 2019
@jseldess jseldess modified the milestones: 2.2, Later Jan 30, 2019
rmloveland added a commit that referenced this issue May 30, 2019
Fixes #4621, #3578, #4399.

Summary of changes:

- Update 'Build a Java App with CockroachDB' to use the Java DAO pattern

- Add a 'Recommended Practices' section that includes information about
  IMPORT, recommended batch size, and the JDBC INSERT rewriting flag

TODO:

- Update "secure" code sample to use certs.  Not worth doing yet until
  this code is reviewed by someone with Java expertise and updated to
  match good style, etc.  Once that is done, the secure code sample in
  this PR will be updated with the necessary cert munging.
rmloveland added a commit that referenced this issue Jun 6, 2019
Fixes #4621, #3578, #4399.

Summary of changes:

- Update 'Build a Java App with CockroachDB' to use:

  - the Java DAO pattern

  - JDBC

  - code for automatically retrying in case of txn retry errors

  - an example of fast bulk insertion using JDBC batching

- Add a 'Recommended Practices' section that includes information about
  IMPORT, recommended batch size, and the JDBC INSERT rewriting flag
rmloveland added a commit that referenced this issue Jun 17, 2019
Fixes #4621, #3578, #4399.

Summary of changes:

- Update 'Build a Java App with CockroachDB' to use:

  - the Java DAO pattern

  - JDBC

  - code for automatically retrying in case of txn retry errors

  - an example of fast bulk insertion using JDBC batching

- Add a 'Recommended Practices' section that includes information about
  IMPORT, recommended batch size, and the JDBC INSERT rewriting flag
rmloveland added a commit that referenced this issue Jun 20, 2019
Fixes #4621, #3578, #4399.

Summary of changes:

- Update 'Build a Java App with CockroachDB' to use:

  - the Java DAO pattern

  - JDBC

  - code for automatically retrying in case of txn retry errors

  - an example of fast bulk insertion using JDBC batching

- Add a 'Recommended Practices' section that includes information about
  IMPORT, recommended batch size, and the JDBC INSERT rewriting flag
rmloveland added a commit that referenced this issue Jun 24, 2019
Summary of changes:

- Update 'Build a Java App with CockroachDB' to use:
  - the Java DAO pattern
  - JDBC
  - code for automatically retrying in case of txn retry errors
  - an example of fast bulk insertion using JDBC batching

- Add a 'Recommended Practices' section that includes information about
  IMPORT, recommended batch size, and the JDBC INSERT rewriting flag

Does all of the above for versions 19.1 and 19.2.

Fixes #4621, #3578, #4399.
rmloveland added a commit that referenced this issue Jun 25, 2019
Summary of changes:

- Update 'Build a Java App with CockroachDB' to use:
  - the Java DAO pattern
  - JDBC
  - code for automatically retrying in case of txn retry errors
  - an example of fast bulk insertion using JDBC batching

- Add a 'Recommended Practices' section that includes information about
  IMPORT, recommended batch size, and the JDBC INSERT rewriting flag

Does all of the above for versions 19.1 and 19.2.

Fixes #4621, #3578, #4399.
@rmloveland
Copy link
Contributor

Fixed by #4855

@nirao-dd
Copy link

nirao-dd commented Oct 23, 2023

Does the reWriteBatchedInserts kick in for upsert into <table> <columns> <values>? @rmloveland @robert-s-lee @jseldess

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-sales-eng Internal source: Sales Engineering P-2 Normal priority; secondary task T-missing-info
Projects
None yet
Development

No branches or pull requests

5 participants