Add column indexes to Parquet writer #11302
The race is on to see who can merge first!
LOL
I think you're going to win...there's so much to do still. >.<
you haven't seen the state of my PRs yet!
So this kernel gets invoked as a single thread, but it seems to be doing per-page work. Is the number of pages ever large enough that it might make sense to parallelize this work a bit? Like maybe:
It doesn't look like there's too much work being done per page, but maybe if we have a zillion pages there are some wins.
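For readers following the thread, here is a rough sketch of what splitting per-page work across a thread block could look like. It is not the code from this PR or from the original comment: it is a CPU emulation in plain C++ of a block-stride loop, and `PageStats`, `process_page`, `thread_body`, and `encode_index_emulated` are hypothetical names made up for illustration. In a real CUDA kernel, `t` would be `threadIdx.x`, `block_size` would be `blockDim.x`, and a block-wide sync would sit between the parallel loop and the serial tail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-page record; the fields are illustrative, not the PR's struct.
struct PageStats {
  int min_len;      // bytes in the page's encoded min value
  int max_len;      // bytes in the page's encoded max value
  int index_bytes;  // output: bytes this page contributes to the column index
};

// Work for a single page. It touches no other page, so any thread can run it.
inline void process_page(PageStats& p) { p.index_bytes = p.min_len + p.max_len; }

// Body one "thread" would execute: a block-stride loop over the pages.
// Thread t handles pages t, t + block_size, t + 2 * block_size, ...
inline void thread_body(std::vector<PageStats>& pages, std::size_t t, std::size_t block_size)
{
  for (std::size_t i = t; i < pages.size(); i += block_size) {
    process_page(pages[i]);
  }
}

// CPU emulation of one thread block: run every thread's body in turn, then
// do the serial tail that thread 0 would do after the block-wide sync.
inline int encode_index_emulated(std::vector<PageStats>& pages, std::size_t block_size)
{
  assert(block_size > 0);
  for (std::size_t t = 0; t < block_size; ++t) {
    thread_body(pages, t, block_size);
  }
  int total = 0;
  for (auto const& p : pages) {
    total += p.index_bytes;
  }
  return total;
}
```

With `block_size == 1` this degenerates to the single-thread behaviour described above, so the two variants can be compared directly on the same input.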
This is a bit like gpuEncodePageHeaders, where t0 does all the work and the other threads just wait for the sync up at the end. In the profiling I've done, the cost of this step is very small (1 ms vs. 37 ms for encode pages and 400 ms for snap_kernel). I don't know if the juice would be worth the squeeze. Can definitely revisit later if need be.
thumbs up for the phrase juice worth the squeeze...