Use nvcomp's snappy compressor in ORC writer #9242
Conversation
CMake changes (excluding changes needed in nvcomp's CMake). Replace cuIO's Snappy compressor with nvcomp.
…or rather than a hardcoded value
When writing statistics, there's not enough space allocated in the chunk's compressed buffer. This results in the compressed buffer being written into another chunk's memory.
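For context, a rough, self-contained sketch of the batched nvcomp Snappy flow the writer switches to. The parameter list follows recent nvcomp 2.x headers (nvcompBatchedSnappyCompressGetTempSize, nvcompBatchedSnappyCompressAsync, nvcompBatchedSnappyDefaultOpts); exact signatures differ between nvcomp releases, so treat this as an illustration of the API shape rather than the code in this PR. Error checking is omitted for brevity.

#include <nvcomp/snappy.h>
#include <rmm/cuda_stream_view.hpp>
#include <rmm/device_buffer.hpp>

// Sketch: every pointer/size array below lives in device memory, and a single
// asynchronous call compresses all chunks of the batch.
void compress_chunks(void const* const* d_uncomp_ptrs,  // device array: input chunk pointers
                     size_t const* d_uncomp_sizes,      // device array: input chunk sizes
                     size_t batch_size,
                     size_t max_chunk_bytes,
                     void* const* d_comp_ptrs,          // device array: output chunk pointers
                     size_t* d_comp_sizes,              // device array: compressed sizes (output)
                     rmm::cuda_stream_view stream)
{
  // Scratch space the compressor needs for this batch.
  size_t temp_bytes = 0;
  nvcompBatchedSnappyCompressGetTempSize(
    batch_size, max_chunk_bytes, nvcompBatchedSnappyDefaultOpts, &temp_bytes);
  rmm::device_buffer scratch(temp_bytes, stream);

  nvcompBatchedSnappyCompressAsync(d_uncomp_ptrs,
                                   d_uncomp_sizes,
                                   max_chunk_bytes,
                                   batch_size,
                                   scratch.data(),
                                   scratch.size(),
                                   d_comp_ptrs,
                                   d_comp_sizes,
                                   nvcompBatchedSnappyDefaultOpts,
                                   stream.value());
}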
Codecov Report
@@             Coverage Diff              @@
##           branch-21.10    #9242      +/-   ##
================================================
- Coverage        10.85%    10.84%    -0.01%
================================================
  Files              115       116        +1
  Lines            19158     19171       +13
================================================
  Hits              2080      2080
- Misses           17078     17091       +13
Continue to review full report at Codecov.
quick review, looks good so far
device_span<gpu_inflate_input_s> comp_in,
device_span<gpu_inflate_status_s> comp_out,
Isn't it mildly worse to pass a span here if you don't ever use the size? In the sense of having to hypothetically push another argument on the stack.
As I see it, the trade-off is that with span it's clear that the parameter is an array, not a pointer to a single object. Now, whether this is a good trade-off is up for debate :D
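To illustrate the trade-off being discussed, a minimal sketch with a trimmed-down stand-in for cudf::device_span and a hypothetical kernel (neither the kernel name nor the stub types are from this PR):

#include <cstddef>

// Trimmed-down stand-in for cudf::device_span: a pointer plus an element count.
template <typename T>
struct device_span {
  T* data_;
  std::size_t size_;
  __device__ T& operator[](std::size_t i) const { return data_[i]; }
  __device__ std::size_t size() const { return size_; }
};

struct status_stub { std::size_t bytes_written; };

// Raw-pointer form: a single 8-byte kernel argument, but the signature alone
// does not say whether it points at one object or at an array.
__global__ void init_statuses(status_stub* comp_out, std::size_t n)
{
  auto i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) comp_out[i].bytes_written = 0;
}

// Span form: pointer and size travel together (16 bytes of kernel arguments),
// and the type makes it explicit that the parameter is an array.
__global__ void init_statuses(device_span<status_stub> comp_out)
{
  auto i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < comp_out.size()) comp_out[i].bytes_written = 0;
}

Kernel arguments are small and passed through dedicated parameter space, so the extra size member is unlikely to be measurable in practice; the case for the span is mainly that the signature documents the array-ness, which is the trade-off described above.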
LGTM. Just one question.
thrust::transform(rmm::exec_policy(stream),
                  compressed_bytes_written.begin(),
                  compressed_bytes_written.end(),
                  comp_out.begin(),
                  [] __device__(size_t size) {
                    // Wrap each compressed size reported by nvcomp into a
                    // gpu_inflate_status_s so downstream code sees the usual layout.
                    gpu_inflate_status_s status{};
                    status.bytes_written = size;
                    return status;
                  });
Could this be replaced with a thrust::transform_output_iterator in the call to nvcompBatchedSnappyCompressAsync, basically to save allocating and materializing compressed_bytes_written? [1]
[1] https://thrust.github.io/doc/classthrust_1_1transform__output__iterator.html
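In isolation, the proposed iterator would look roughly like the minimal sketch below (status_stub and size_to_status are illustrative stand-ins, not names from this PR): the conversion is fused into the write, so no intermediate buffer of sizes is materialized.

#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/transform_output_iterator.h>

// Stand-in for gpu_inflate_status_s, reduced to the field the snippet above fills in.
struct status_stub {
  size_t bytes_written;
};

// Converts a compressed size into a status record at the moment it is written.
struct size_to_status {
  __host__ __device__ status_stub operator()(size_t size) const { return status_stub{size}; }
};

int main()
{
  thrust::device_vector<status_stub> statuses(4);

  // Every size_t assigned through `out` is wrapped into a status_stub on the fly.
  auto out = thrust::make_transform_output_iterator(statuses.begin(), size_to_status{});
  thrust::copy_n(thrust::counting_iterator<size_t>(0), statuses.size(), out);

  return 0;
}

As the reply below notes, this only helps when the producer accepts an output iterator, which the nvcomp C API does not.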
nvcompBatchedSnappyCompressAsync is a C API and doesn't take iterators. 😞
Oh, I see. Too bad! Thanks for clarifying 👍
Looks good to me, just a couple copyrights to update.
@gpucibot merge
This reverts commit 08cbbcd.
Issue #9205
depends on #9235