Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement substrait join filter support #6868

Merged
merged 4 commits into from
Jul 9, 2023

Conversation

nseekhao
Copy link
Contributor

@nseekhao nseekhao commented Jul 6, 2023

Which issue does this PR close?

Closes #6866 .

Rationale for this change

Explained in issue.

What changes are included in this PR?

  • Parse DF join filter into Substrait's post_join_filter in producer
  • Parse post_join_filter into join filter in consumer
  • Test cases:
    • non-empty join expression, non-empty filter
    • empty join expression, non-empty filter
    • query with exists clause

Are these changes tested?

Yes.

Please note that if a test case includes subquery aliases, the results will not be correct due to issue #6867 .

Are there any user-facing changes?

No

@nseekhao
Copy link
Contributor Author

nseekhao commented Jul 6, 2023

Does anyone know what's the difference between the tests/roundtrip_logical_plan.rs file and the tests/cases/roundtrip_logical_plan.rs file?

@nseekhao nseekhao changed the title Substrait join filter support feat: implement substrait join filter support Jul 6, 2023
@andygrove
Copy link
Member

@waynexia fyi

Copy link
Member

@waynexia waynexia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great, thanks 👍

@waynexia
Copy link
Member

waynexia commented Jul 9, 2023

Does anyone know what's the difference between the tests/roundtrip_logical_plan.rs file and the tests/cases/roundtrip_logical_plan.rs file?

I just realized we have two different case files 🤣 It looks like this is a mistake from #6604. I'll reorganize those test cases later when I split our consumer/producer implementation into functionality-grouped style. I believe there are many duplications...

@alamb alamb merged commit ce3a0bb into apache:main Jul 9, 2023
@alamb
Copy link
Contributor

alamb commented Jul 9, 2023

I just realized we have two different case files 🤣 It looks like this is a mistake from #6604. I'll reorganize those test cases later when I split our consumer/producer implementation into functionality-grouped style. I believe there are many duplications...

I took a quick look at it looks to me like the tests are entirely duplicated. Proposed removing them here: #6894

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Substrait: Add support for joins with filter
4 participants