Enable ZSTD compression in ORC and Parquet writers (#11551)
Closes #9058, #9056

Expands the nvCOMP adapter to include ZSTD compression.

- Adds a centralized nvCOMP policy, `is_compression_enabled`.
- Adds a centralized nvCOMP alignment utility, `compress_input_alignment_bits`.
- Adds a centralized nvCOMP utility to get the maximum supported compression chunk size, `batched_compress_max_allowed_chunk_size`.
- Encoded ORC row groups are aligned based on compression requirements.
- Encoded Parquet pages are aligned based on compression requirements.
- Parquet fragment size now scales with the page size to better fit the default page size with ZSTD compression.
- Small refactoring around `decompress_status` for improved type safety and, hopefully, better naming.
- Replaced `snappy_compress` in the Parquet writer with the nvCOMP adapter call.
- Vectors of `compression_result`s are initialized before compression, so that uninitialized memory can no longer cause chunks to be skipped at random.

Authors:
- Vukasin Milovanovic (https://github.com/vuule)

Approvers:
- Jason Lowe (https://github.com/jlowe)
- Jim Brennan (https://github.com/jbrennan333)
- Mike Wilson (https://github.com/hyperbolic2346)
- Tobias Ribizel (https://github.com/upsj)
- Matthew Roeschke (https://github.com/mroeschke)

URL: #11551
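
The commit message says that encoded ORC row groups and Parquet pages are aligned based on compression requirements, with the requirement expressed as a number of alignment bits. The following is a minimal sketch of what such alignment could look like, not the code from this commit: `align_up` and `aligned_offsets` are hypothetical helpers, and the alignment value passed in is assumed to come from something like `compress_input_alignment_bits`, whose actual signature is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the commit): round `size` up to the next
// multiple of 2^alignment_bits. With alignment_bits == 0 the size is unchanged.
constexpr std::size_t align_up(std::size_t size, std::uint32_t alignment_bits)
{
  std::size_t const alignment = std::size_t{1} << alignment_bits;
  return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper (not from the commit): given the encoded size of each
// chunk (a Parquet page or an ORC row group), compute start offsets so that
// every chunk begins on an aligned boundary within one contiguous buffer.
std::vector<std::size_t> aligned_offsets(std::vector<std::size_t> const& chunk_sizes,
                                         std::uint32_t alignment_bits)
{
  std::vector<std::size_t> offsets;
  offsets.reserve(chunk_sizes.size());
  std::size_t cursor = 0;
  for (auto const size : chunk_sizes) {
    offsets.push_back(cursor);                          // chunk starts at an aligned offset
    cursor = align_up(cursor + size, alignment_bits);   // pad up to the next boundary
  }
  return offsets;
}
```

For example, with three chunks of 10, 20, and 5 bytes and 3 alignment bits (8-byte alignment), the chunks would start at offsets 0, 16, and 40. The padding between chunks is the cost of giving the compressor aligned inputs.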
Showing 31 changed files with 686 additions and 405 deletions.