Add offset pushdown to parquet #3848
LGTM
selectors.push(RowSelector::skip(skipped_count + offset));
selectors.push(RowSelector::select(selected_count - offset));
Can skip and select not be interleaved? I.e., are all skip selections in front of the select selections?
This operator skips the first offset rows, so the result will always start with a skip; the remaining selectors afterwards can be interleaved.
Hmm, I meant the way you calculate skipped_count and selected_count. If the original selectors are a mix of skip and select, is the logic still correct?
E.g., the 1st selector skips 5 rows, the 2nd selector selects 1 row, the 3rd selector skips 2 rows, and the 4th selector selects 5 rows.
With offset 2, this produces a 1st selector that skips 7 + 2 rows and a 2nd selector that selects 6 - 2 rows. But the original 2nd selector's 1 row is now skipped.
Yes, that is the correct behaviour, no? The offset is an offset into the selected rows. So in this case you would expect the last 4 rows, skipping the first 2 that were originally selected?
Oh, it's correct, as skipped rows are only counted until the selected count is larger than the offset.
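To make the example from this thread concrete, here is a minimal, self-contained sketch of the counting logic being discussed. It is not the code from this PR: `Selector` and `apply_offset` are hypothetical stand-ins written for illustration, and only the two `push` lines mirror the snippet under review. The sketch assumes the running totals stop at the first selector where the selected count exceeds the offset, as described above.

```rust
/// Hypothetical stand-in for a row selector: a run of rows that is
/// either skipped or selected.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Selector {
    row_count: usize,
    skip: bool,
}

impl Selector {
    fn skip(row_count: usize) -> Self {
        Self { row_count, skip: true }
    }

    fn select(row_count: usize) -> Self {
        Self { row_count, skip: false }
    }
}

/// Drop the first `offset` *selected* rows from `selectors`.
/// Illustrative sketch only, not the PR's implementation.
fn apply_offset(selectors: &[Selector], offset: usize) -> Vec<Selector> {
    if offset == 0 {
        return selectors.to_vec();
    }

    let mut selected_count = 0;
    let mut skipped_count = 0;

    // Accumulate totals until the selected total exceeds `offset`.
    // Selectors after that point are not counted.
    let split_idx = selectors.iter().position(|s| {
        if s.skip {
            skipped_count += s.row_count;
            false
        } else {
            selected_count += s.row_count;
            selected_count > offset
        }
    });

    let idx = match split_idx {
        Some(idx) => idx,
        // Fewer than `offset + 1` rows are selected: nothing remains.
        None => return Vec::new(),
    };

    let mut out = Vec::with_capacity(selectors.len() - idx + 1);
    // Everything up to the split collapses into one skip and one select.
    out.push(Selector::skip(skipped_count + offset));
    out.push(Selector::select(selected_count - offset));
    // Later selectors are kept unchanged, so they may still interleave.
    out.extend_from_slice(&selectors[idx + 1..]);
    out
}
```

With the input skip 5, select 1, skip 2, select 5 and an offset of 2, this sketch yields skip 9 followed by select 4, i.e. the last 4 rows, which matches the behaviour agreed on in the thread.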
Benchmark runs are scheduled for baseline = c156715 and contender = dfb8c76. dfb8c76 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
Closes #.
Rationale for this change
A follow up to #3633 that also allows pushing down an offset
What changes are included in this PR?
Are there any user-facing changes?