Incorrect results with parquet filtering pushdown enabled #4005
Comments
I wonder if this relates to pushing down an empty projection. It is possible there is a bug there; I don't remember seeing a test in arrow-rs for this. Edit: Sadly not 😢 Edit Edit: I think I have found it
Ok so the bug is that You can see this, by reordering the predicates in the query
This is likely also the issue behind #4006
Thank you for debugging / fixing this @tustvold
…edicate (apache#4005) (apache#4006) (apache#4021) * Project columns within DatafusionArrowPredicate (apache#4005) (apache#4006) * Add test * Format * Fix merge blunder Co-authored-by: Andrew Lamb <[email protected]>
Describe the bug
DataFusion produces different answers when parquet filter pushdown is enabled.
Note that pushdown filtering is not enabled by default (we are still working on it), so this issue is unlikely to affect users:
To Reproduce
repro.zip
The query run is
Expected behavior
The same answer should be produced with and without page index filtering enabled. However, the answers are different.
Without filter pushdown, 39982 rows are produced.
With it enabled:
Additional context
Found by the test here #3976
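For readers who want the shape of such a check, here is a minimal sketch of a differential test that compares a filter-after-scan path with a pushdown-style path. It is a toy model only: `Row`, `Predicate`, `scan_then_filter`, `scan_with_pushdown` and `results_agree` are hypothetical names invented for illustration and are not DataFusion or arrow-rs APIs, and the code does not reproduce the actual bug. It only shows the invariant that was violated here: both paths must return the same rows, whatever the order of the predicates.

```rust
// Toy model (assumption): a "file" is a slice of rows and a predicate is a
// plain function pointer. None of these names exist in DataFusion.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    a: i64,
    b: i64,
}

type Predicate = fn(&Row) -> bool;

/// Reference path: read every row, then apply all predicates afterwards.
fn scan_then_filter(rows: &[Row], predicates: &[Predicate]) -> Vec<Row> {
    rows.iter()
        .filter(|&r| predicates.iter().all(|&p| p(r)))
        .cloned()
        .collect()
}

/// Pushdown-style path: apply predicates one at a time inside the scan,
/// narrowing the set of selected row indices after each predicate.
fn scan_with_pushdown(rows: &[Row], predicates: &[Predicate]) -> Vec<Row> {
    let mut selected: Vec<usize> = (0..rows.len()).collect();
    for &p in predicates {
        selected.retain(|&i| p(&rows[i]));
    }
    selected.into_iter().map(|i| rows[i].clone()).collect()
}

/// The invariant under test: both paths return identical rows.
fn results_agree(rows: &[Row], predicates: &[Predicate]) -> bool {
    scan_then_filter(rows, predicates) == scan_with_pushdown(rows, predicates)
}

/// Deterministic sample data: 100 rows with a = 0..100 and b = a % 7.
fn sample_rows() -> Vec<Row> {
    (0..100).map(|i| Row { a: i, b: i % 7 }).collect()
}

fn a_at_least_10(r: &Row) -> bool {
    r.a >= 10
}

fn b_is_3(r: &Row) -> bool {
    r.b == 3
}

fn main() {
    let rows = sample_rows();
    let preds: [Predicate; 2] = [a_at_least_10, b_is_3];
    let reversed: [Predicate; 2] = [b_is_3, a_at_least_10];

    // Check both predicate orders, since ordering is what exposed the bug here.
    assert!(results_agree(&rows, &preds));
    assert!(results_agree(&rows, &reversed));
    println!("{} rows match", scan_with_pushdown(&rows, &preds).len());
}
```

Running the same comparison with the predicates in both orders is deliberate, since reordering the predicates in the query is what made the discrepancy visible in this issue.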