Test Iceberg cost-based plans on TPC-H and TPC-DS #14489

Merged
findepi merged 4 commits into trinodb:master from findepi/iceberg-plans on Oct 7, 2022

Conversation

findepi
Member

@findepi findepi commented Oct 6, 2022

No description provided.

@cla-bot cla-bot bot added the cla-signed label Oct 6, 2022
@findepi
Member Author

findepi commented Oct 6, 2022

currently based on #14468 and #14375

cc @alexjo2144 @ebyhr @sopel39 @raunaqmorarka @przemekak

@findepi findepi added test performance no-release-notes This pull request does not require release notes entry labels Oct 6, 2022
@findepi findepi force-pushed the findepi/iceberg-plans branch 4 times, most recently from 763aed1 to 6949624 Compare October 6, 2022 10:57
@findepi
Member Author

findepi commented Oct 6, 2022

> currently based on #14468 and #14375

rebased after #14468 merged

prep part of this PR, that doesn't depend on #14375, is extracted as #14497

@findepi
Member Author

findepi commented Oct 6, 2022

rebased after #14497 merged

Migrate convenience methods from HiveMinioDataLake to Minio itself, so
that it's viable to use MinIO alone, without a Hadoop container.
@findepi findepi marked this pull request as ready for review October 6, 2022 19:45
@findepi
Member Author

findepi commented Oct 6, 2022

Merging #14503 back here, which removes the dependency on #14375.
The bigger benefit is that I no longer need a Hadoop container with HMS and a temporary DistributedQueryRunner just to register the tables. This greatly improves the startup time.

@findepi findepi force-pushed the findepi/iceberg-plans branch 2 times, most recently from cf24555 to 0d81a3c Compare October 6, 2022 19:59
@findepi findepi force-pushed the findepi/iceberg-plans branch from 0d81a3c to 53bbeac Compare October 7, 2022 07:32
@findepi findepi changed the title Test Iceberg cost-based plans on TPC-DS Test Iceberg cost-based plans on TPC-H and TPC-DS Oct 7, 2022
@findepi
Member Author

findepi commented Oct 7, 2022

For completeness, I've pulled TPC-H dataset metadata as well.

@@ -142,13 +151,20 @@ protected void generate()
         getQueryPlanResourcePath(queryResourcePath));
     createParentDirs(queryPlanWritePath.toFile());
     write(generateQueryPlan(readQuery(queryResourcePath)).getBytes(UTF_8), queryPlanWritePath.toFile());
-    System.out.println("Generated expected plan for query: " + queryResourcePath);
+    log.info("Generated expected plan for query: %s", queryResourcePath);
Member

nit: could be separate commit :P

Member Author

done
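
For readers unfamiliar with why the hunk swaps System.out.println for log.info, here is a minimal sketch of what a logger adds over a bare print. It assumes java.util.logging as a stand-in for the project's own logger, and the class name PlanLogDemo, both method names, and the file name q01.sql are hypothetical, not taken from the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class PlanLogDemo
{
    // Hypothetical helper: builds the same message text the diff logs.
    public static String message(String queryResourcePath)
    {
        return String.format("Generated expected plan for query: %s", queryResourcePath);
    }

    // Logs through java.util.logging (an assumption; the PR uses its own logger)
    // and returns what a handler receives: level, logger name, and message.
    // The level and logger name are the context a bare System.out.println lacks.
    public static List<String> logAndCapture(String queryResourcePath)
    {
        List<String> captured = new ArrayList<>();
        Logger log = Logger.getLogger("PlanLogDemo");
        log.setUseParentHandlers(false); // keep the demo output out of the console
        Handler handler = new Handler()
        {
            @Override
            public void publish(LogRecord record)
            {
                captured.add(record.getLevel() + " " + record.getLoggerName() + " " + record.getMessage());
            }

            @Override
            public void flush() {}

            @Override
            public void close() {}
        };
        log.addHandler(handler);
        try {
            log.log(Level.INFO, message(queryResourcePath));
        }
        finally {
            log.removeHandler(handler);
        }
        return captured;
    }

    public static void main(String[] args)
    {
        System.out.println(logAndCapture("q01.sql"));
    }
}
```

The captured record carries the level and the logger name next to the message, which is the kind of context the commit message says is stripped when output goes through System.out.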

@@ -16,8 +16,8 @@
import java.util.stream.Stream;

/**
* This class tests cost-based optimization rules related to joins. It contains unmodified TPCH queries.
Member

nit: comment improvements in existing code could be separate commit

Member Author

done

System.out goes to logs anyway, just the context is stripped.
This uses TPC-H sf1000 and TPC-DS sf1000 Iceberg ORC data sets' metadata
files generated on Starburst's benchmark infrastructure.

The tables have no history and were created using a single CTAS.
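
The generate() hunk earlier in this conversation writes one expected-plan file per query. A rough sketch of that generate-then-compare flow follows, using only java.nio. Everything here is an assumption for illustration: the class ExpectedPlanSketch, its methods, the placeholder plan text, and the file names are hypothetical and are not Trino's test classes, and the real tests obtain plans from the engine rather than from a string stub.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExpectedPlanSketch
{
    // Hypothetical stand-in for the planner; a real test would ask the engine for the plan.
    public static String generateQueryPlan(String sql)
    {
        return "plan for: " + sql.trim() + "\n";
    }

    // Generate mode: create parent directories, then write the plan as the expected file.
    public static void generate(Path planFile, String sql)
            throws IOException
    {
        Files.createDirectories(planFile.getParent());
        Files.write(planFile, generateQueryPlan(sql).getBytes(StandardCharsets.UTF_8));
    }

    // Test mode: the freshly generated plan must match the stored expected plan exactly.
    public static boolean matchesExpected(Path planFile, String sql)
            throws IOException
    {
        String expected = new String(Files.readAllBytes(planFile), StandardCharsets.UTF_8);
        return expected.equals(generateQueryPlan(sql));
    }

    public static void main(String[] args)
            throws IOException
    {
        Path dir = Files.createTempDirectory("plans");
        Path planFile = dir.resolve("tpch").resolve("q01.plan.txt");
        generate(planFile, "SELECT 1");
        System.out.println(matchesExpected(planFile, "SELECT 1"));
        System.out.println(matchesExpected(planFile, "SELECT 2"));
    }
}
```

With stored plans like these, any change in table statistics or optimizer rules that alters a plan shows up as a mismatch against the expected file, which is why stable data set metadata matters for this kind of test.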
@findepi
Member Author

findepi commented Oct 7, 2022

CI #14519

@findepi findepi force-pushed the findepi/iceberg-plans branch from 53bbeac to ede2e27 Compare October 7, 2022 12:43
@findepi findepi merged commit 1e89fb2 into trinodb:master Oct 7, 2022
@findepi findepi deleted the findepi/iceberg-plans branch October 7, 2022 12:44
@github-actions github-actions bot added this to the 400 milestone Oct 7, 2022
Labels
cla-signed, no-release-notes, performance, test
3 participants