Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement complex join pushdown in JDBC connectors #19996

Merged
merged 6 commits into from
Jan 10, 2024
Merged

Conversation

findepi
Copy link
Member

@findepi findepi commented Dec 1, 2023

Implement non-deprecated ConnectorMetadata.applyJoin overload in
DefaultJdbcMetadata. Thew old implementation is retained as a safety
valve. The new implementation is not limited to the
List<JdbcJoinCondition> model, so allows pushdown of joins involving
more complex expressions, such as arithmetics.

@findepi findepi requested review from wendigo, ebyhr and hashhar December 1, 2023 16:18
@cla-bot cla-bot bot added the cla-signed label Dec 1, 2023
@findepi
Copy link
Member Author

findepi commented Dec 1, 2023

cc @kokosing @SemionPar

@sajjoseph
Copy link
Contributor

Will unnest operation be covered in this PR?

@findepi
Copy link
Member Author

findepi commented Dec 8, 2023

Will unnest operation be covered in this PR?

no, no plan for this.

@findepi findepi force-pushed the findepi/apply-join branch from da7a572 to bfbc2b2 Compare December 8, 2023 14:30
@findepi
Copy link
Member Author

findepi commented Dec 8, 2023

(just rebased after #19998 merged. no other changes yet)

@findepi
Copy link
Member Author

findepi commented Dec 19, 2023

(just rebased)

@findepi findepi force-pushed the findepi/apply-join branch 5 times, most recently from 7b863c2 to 75c7b08 Compare December 22, 2023 13:21
@findepi
Copy link
Member Author

findepi commented Dec 22, 2023

PTAL

@findepi findepi requested review from wendigo, losipiuk, ebyhr, kokosing and hashhar and removed request for wendigo, ebyhr and hashhar January 3, 2024 15:27
Copy link
Contributor

@wendigo wendigo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % some nits.

Great change Piotr!

@findepi findepi force-pushed the findepi/apply-join branch from f0b0a45 to 5683eee Compare January 10, 2024 09:59
@findepi
Copy link
Member Author

findepi commented Jan 10, 2024

(just rebased, no other changes)

Implement non-deprecated `ConnectorMetadata.applyJoin` overload in
`DefaultJdbcMetadata`. Thew old implementation is retained as a safety
valve. The new implementation is not limited to the
`List<JdbcJoinCondition>` model, so allows pushdown of joins involving
more complex expressions, such as arithmetics.

The `BaseJdbcClient.implementJoin` and
`QueryBuilder.prepareJoinQuery` methods logically changed, but the old
implementation is left as the fallback. These methods were extension
points, so the old implementations are renamed to ensure implementors
are updated. For example, if an implementation was overriding
`BaseJdbcClient.implementJoin` it most likely wants to override the new
`implementJoin` method as well, and this is reminded about by rename of
the old method.
Restore older JDBC join pushdown implementation not based on
`ConnectorExpression` as a fallback.

This comes as a separate commit so that the introduction of
`ConnectorExpression`-based join pushdown can be seen (e.g. reviewed) as
a _change_, not as an _addition_.
@findepi findepi force-pushed the findepi/apply-join branch from 5683eee to 4450dde Compare January 10, 2024 10:07
@findepi
Copy link
Member Author

findepi commented Jan 10, 2024

CI #20324

@findepi findepi merged commit d0f188e into master Jan 10, 2024
96 of 97 checks passed
@findepi findepi deleted the findepi/apply-join branch January 10, 2024 12:14
@github-actions github-actions bot added this to the 436 milestone Jan 10, 2024
@@ -1196,6 +1199,8 @@ public void testJoinPushdown()
"nation_lowercase",
"AS SELECT nationkey, lower(name) name, regionkey FROM nation")) {
for (JoinOperator joinOperator : JoinOperator.values()) {
log.info("Testing joinOperator=%s", joinOperator);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pollutes the logs and makes it much harder to spot where the actual errors are happening. We're actually trying to reduce logging in tests. Please revert this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there is a failure on CI, how do we know what failed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add a .describedAs(...) to the assertion containing the details of which join operator is being evaluated. That will be included in the failure message.

@colebow
Copy link
Member

colebow commented Jan 10, 2024

Does this need a release note?

@findepi findepi added the no-release-notes This pull request does not require release notes entry label Jan 10, 2024
@sajjoseph
Copy link
Contributor

Thanks @findepi for this PR. I see tremendous improvements with JOIN pushdown in our custom connector (Trino-to-Trino, similar to Stargate from Starburst). We had multiple custom rules in our connector that are now working in tandem and many complex joins are now getting pushed down into the remote cluster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed no-release-notes This pull request does not require release notes entry
Development

Successfully merging this pull request may close these issues.

5 participants