Use nvcomp's snappy compressor in parquet writer #8229
CMake changes (excluding changes needed in nvcomp's CMake)
Replace cuIO's snappy compressor with nvcomp
Couple of C++ suggestions. Otherwise 🔥
Did not review the CMake stuff, not proficient enough for it to be useful.
Updates the Java bindings to nvcomp to statically link libnvcomp. This will help avoid libnvcomp v1.x and v2.x conflicts when libcudf starts pulling in libnvcomp 2.x as part of #8229.
Switching to a statically-linked libnvcomp requires a small patch to the nvcomp source, as it is hard-coded to only produce a shared library. The patch changes the target to a static library compiled with position-independent code so it can be linked into a shared object like libcudfjni.so.
Authors: Jason Lowe (https://github.com/jlowe)
Approvers: Robert (Bobby) Evans (https://github.com/revans2)
URL: #8334
When writing statistics, there is not enough space allocated in the chunk's compressed buffer. This results in the compressed buffer's contents being written into another chunk's memory.
rerun tests
Codecov Report
@@ Coverage Diff @@
## branch-21.10 #8229 +/- ##
===============================================
Coverage    ?   10.81%
===============================================
Files       ?   115
Lines       ?   18775
Branches    ?   0
===============================================
Hits        ?   2030
Misses      ?   16745
Partials    ?   0
Few more minor suggestions. My main gripe is the previously posted comment on error handling.
rerun tests
nvcomp CMake addition LGTM
@gpucibot merge
Adds the nvcomp dependency and uses nvcomp's batched snappy compression functions in the parquet writer.
Adds an environment variable, `LIBCUDF_USE_NVCOMP`, to switch between cuIO's internal snappy compressor and nvcomp's compressor. Using nvcomp is disabled by default.
Use `export LIBCUDF_USE_NVCOMP=1` to switch to nvcomp's compressor.
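For readers unfamiliar with this kind of runtime toggle, here is a minimal sketch of how an environment variable such as `LIBCUDF_USE_NVCOMP` could select between the two compressors. It is not the code from this PR: `flag_enabled`, `snappy_backend`, and `select_snappy_backend` are hypothetical names, and treating only the exact value `1` as "enabled" is an assumption based on the usage shown above.

```cpp
#include <cstdlib>
#include <string_view>

// Hypothetical helper (not part of libcudf): interprets the raw value of the
// environment variable. Assumption: only the exact string "1" enables nvcomp;
// an unset variable or any other value keeps the default.
inline bool flag_enabled(char const* value)
{
  return value != nullptr && std::string_view{value} == "1";
}

// Hypothetical enum naming the two snappy implementations discussed in this PR.
enum class snappy_backend { CUIO, NVCOMP };

// Reads LIBCUDF_USE_NVCOMP and picks a backend. cuIO's internal compressor is
// the default, matching "Using nvcomp is disabled by default".
inline snappy_backend select_snappy_backend()
{
  return flag_enabled(std::getenv("LIBCUDF_USE_NVCOMP")) ? snappy_backend::NVCOMP
                                                         : snappy_backend::CUIO;
}
```

Reading the variable at the point where the writer chooses a compressor, rather than at build time, allows the two implementations to be compared using the same binary.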