Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Populate TestTable with multi-row INSERT #18693

Merged
merged 10 commits into from
Aug 30, 2023
Merged

Populate TestTable with multi-row INSERT #18693

merged 10 commits into from
Aug 30, 2023

Conversation

findepi
Copy link
Member

@findepi findepi commented Aug 16, 2023

Use single INSERT instead of multiple single-row inserts. Should be
faster, especially e.g. for Trino.

@cla-bot cla-bot bot added the cla-signed label Aug 16, 2023
@findepi findepi force-pushed the findepi/multi-insert branch from df9472c to 2430d92 Compare August 16, 2023 08:32
@github-actions github-actions bot added bigquery BigQuery connector mongodb MongoDB connector labels Aug 16, 2023
@findepi findepi marked this pull request as draft August 16, 2023 08:32
@findepi findepi force-pushed the findepi/multi-insert branch from 2430d92 to b6ab7ea Compare August 18, 2023 10:39
@findepi findepi changed the title Create TestTable with data faster Populate TestTable with multi-row INSERT Aug 18, 2023
The interface expects the method to return `TemporaryRelation` so let's
mark implementation as returning this, no the concrete `TestTable`.
MongoTestTable was extending TestTable and then disabling most of the
functionality (namely the table creation). The only remaining thing was
name and close, which corresponds to `TemporaryRelation` interface.
BigQueryTestView was extending TestTable and then disabling/overwriting all of the
inherited functionality.
@findepi findepi force-pushed the findepi/multi-insert branch 2 times, most recently from 9280095 to 8df7ce6 Compare August 29, 2023 13:09
@github-actions github-actions bot added iceberg Iceberg connector delta-lake Delta Lake connector labels Aug 29, 2023
@findepi findepi added test no-release-notes This pull request does not require release notes entry labels Aug 29, 2023
@findepi findepi marked this pull request as ready for review August 29, 2023 13:11
@findepi findepi force-pushed the findepi/multi-insert branch from 8df7ce6 to 679c93e Compare August 29, 2023 14:33
Copy link
Contributor

@findinpath findinpath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add in the description of the PR, for posterity, if you spot more dramatic test runtime improvements.

This follows `TestTable` pattern (and many other classes).

This also makes `BigQueryViewCreateAndInsertDataSetup` more robust in
case VIEW creation fails.
The test expects the three rows are inserted as individual operations,
so let's do this explicitly.
Use single INSERT instead of multiple single-row inserts. Should be
faster, especially e.g. for Trino.
@findepi findepi force-pushed the findepi/multi-insert branch from 679c93e to f12bf10 Compare August 30, 2023 06:06
@findepi findepi merged commit 1e7bc05 into master Aug 30, 2023
@findepi findepi deleted the findepi/multi-insert branch August 30, 2023 06:44
@hashhar
Copy link
Member

hashhar commented Aug 30, 2023

thanks a lot Piotr. I already see minimum of 3m savings per JDBC connector test. It'll be more for slow connectors like Redshift (but we don't run those on CI today).

@github-actions github-actions bot added this to the 426 milestone Aug 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bigquery BigQuery connector cla-signed delta-lake Delta Lake connector iceberg Iceberg connector mongodb MongoDB connector no-release-notes This pull request does not require release notes entry test
Development

Successfully merging this pull request may close these issues.

4 participants