Improve parquet dictionary encoding (#10635)
This PR includes several changes to improve parquet dictionary encoding:

- API cleanups: get rid of unused arguments
- Remove min block limit in `__launch_bounds__`
- Simplify the grid-stride loop logic by using `while`
- All threads calculate start/end indices instead of one doing the calculation and broadcasting the result (no more shared memory or block-wide sync).

Other ideas tested but not eventually included in this PR due to zero or negative performance impact:

- Tuning hash map occupancy
- `cg::shfl` instead of shared memory + sync
- CG based `insert`/`find`
- Relaxed atomic for `num_dict_entries` and `uniq_data_size`
- `cg::reduce` instead of `cub::BlockReduce`

Before:
```
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                          Time      CPU  Iterations  UserCounters...
------------------------------------------------------------------------------------------------------------------------------------------------------------------
ParquetWrite/integral_void_output/29/0/1/1/2/manual_time         734 ms   734 ms           1  bytes_per_second=697.128M/s encoded_file_size=530.706M peak_memory_usage=1.7804G
ParquetWrite/integral_void_output/29/1000/1/1/2/manual_time      303 ms   303 ms           2  bytes_per_second=1.65131G/s encoded_file_size=397.998M peak_memory_usage=1.49675G
ParquetWrite/integral_void_output/29/0/32/1/2/manual_time        734 ms   734 ms           1  bytes_per_second=697.713M/s encoded_file_size=530.706M peak_memory_usage=1.7804G
ParquetWrite/integral_void_output/29/1000/32/1/2/manual_time    61.9 ms  61.9 ms          11  bytes_per_second=8.07721G/s encoded_file_size=159.574M peak_memory_usage=1.49675G
ParquetWrite/integral_void_output/29/0/1/0/2/manual_time         690 ms   690 ms           1  bytes_per_second=742.205M/s encoded_file_size=531.066M peak_memory_usage=1.3148G
ParquetWrite/integral_void_output/29/1000/1/0/2/manual_time      282 ms   282 ms           2  bytes_per_second=1.76991G/s encoded_file_size=398.712M peak_memory_usage=1.49675G
ParquetWrite/integral_void_output/29/0/32/0/2/manual_time        690 ms   690 ms           1  bytes_per_second=742.268M/s encoded_file_size=531.066M peak_memory_usage=1.3148G
ParquetWrite/integral_void_output/29/1000/32/0/2/manual_time    59.5 ms  59.5 ms          12  bytes_per_second=8.40878G/s encoded_file_size=199.926M peak_memory_usage=1.49675G
```

Now:
```
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                          Time      CPU  Iterations  UserCounters...
------------------------------------------------------------------------------------------------------------------------------------------------------------------
ParquetWrite/integral_void_output/29/0/1/1/2/manual_time         733 ms   733 ms           1  bytes_per_second=698.24M/s encoded_file_size=530.706M peak_memory_usage=1.7804G
ParquetWrite/integral_void_output/29/1000/1/1/2/manual_time      302 ms   302 ms           2  bytes_per_second=1.65496G/s encoded_file_size=397.998M peak_memory_usage=1.49675G
ParquetWrite/integral_void_output/29/0/32/1/2/manual_time        733 ms   733 ms           1  bytes_per_second=698.701M/s encoded_file_size=530.706M peak_memory_usage=1.7804G
ParquetWrite/integral_void_output/29/1000/32/1/2/manual_time    61.3 ms  61.3 ms          11  bytes_per_second=8.1533G/s encoded_file_size=159.572M peak_memory_usage=1.49675G
ParquetWrite/integral_void_output/29/0/1/0/2/manual_time         688 ms   688 ms           1  bytes_per_second=743.71M/s encoded_file_size=531.066M peak_memory_usage=1.3148G
ParquetWrite/integral_void_output/29/1000/1/0/2/manual_time      282 ms   282 ms           2  bytes_per_second=1.7712G/s encoded_file_size=398.712M peak_memory_usage=1.49675G
ParquetWrite/integral_void_output/29/0/32/0/2/manual_time        688 ms   688 ms           1  bytes_per_second=743.658M/s encoded_file_size=531.066M peak_memory_usage=1.3148G
ParquetWrite/integral_void_output/29/1000/32/0/2/manual_time    58.9 ms  58.9 ms          12  bytes_per_second=8.49093G/s encoded_file_size=199.926M peak_memory_usage=1.49675G
```

Authors:
- Yunsong Wang (https://github.com/PointKernel)

Approvers:
- Jake Hemstad (https://github.com/jrhemstad)
- Mike Wilson (https://github.com/hyperbolic2346)

URL: #10635
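
The two kernel-side changes listed above (per-thread start/end computation and a `while`-based strided loop) can be illustrated with a host-side C++ emulation. This is only a sketch of the indexing pattern, not the code from this PR: `Fragment`, `process_fragment`, and `run_block` are hypothetical names, and the "threads" are emulated with an ordinary loop rather than launched on a GPU.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a range of rows assigned to one thread block.
struct Fragment {
  std::size_t start_row;
  std::size_t num_rows;
};

// Work done by one emulated thread. In a CUDA kernel, thread_id and stride
// would be derived from threadIdx.x and blockDim.x (assumption for this sketch).
void process_fragment(Fragment frag,
                      std::size_t thread_id,
                      std::size_t stride,
                      std::vector<int>& visits)
{
  // Every thread derives the same bounds locally, so no thread has to compute
  // them once and broadcast through shared memory followed by a barrier.
  std::size_t const end = std::min(frag.start_row + frag.num_rows, visits.size());

  // Strided loop written with `while`: each thread starts at its own offset
  // and advances by the number of cooperating threads.
  std::size_t row = frag.start_row + thread_id;
  while (row < end) {
    ++visits[row];  // placeholder for the per-row work (e.g. a hash map insert)
    row += stride;
  }
}

// Emulates one block of `block_size` threads and returns how many times each
// row was visited.
std::vector<int> run_block(Fragment frag, std::size_t block_size, std::size_t total_rows)
{
  std::vector<int> visits(total_rows, 0);
  for (std::size_t t = 0; t < block_size; ++t) {
    process_fragment(frag, t, block_size, visits);
  }
  return visits;
}
```

Calling `run_block(Fragment{3, 10}, 4, 16)` visits rows 3 through 12 exactly once each and leaves the remaining rows untouched. The trade-off in this pattern is a few redundant arithmetic operations per thread in exchange for dropping the shared-memory write and the block-wide synchronization.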
1 parent 65b1cbd, commit 017d52a
Showing 3 changed files with 88 additions and 115 deletions.