Variable fragment sizes for Parquet writer #12685

Merged (55 commits) on Feb 22, 2023

etseidl
Contributor

@etseidl etseidl commented Feb 2, 2023

Description

Fixes #12613

This PR adds the ability for columns to have different fragment sizes. Narrow columns can keep a large fragment size, while very wide columns get finer-grained fragments. This change should make wide columns fit (approximately) within page size constraints, and should help with compressors that rely on pages being under a certain size threshold (e.g. Zstandard).
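As a rough illustration of the idea (not the code in this PR), a per-column fragment size could be derived from the column's average row width. The helper name `pick_fragment_rows`, its parameters, and the byte budget are all assumptions made for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not the libcudf implementation): choose how many rows
// go into one fragment of a column so that a fragment holds roughly
// `target_fragment_bytes` of data, never exceeding `default_fragment_rows`.
inline std::int32_t pick_fragment_rows(std::size_t column_bytes,
                                       std::int32_t num_rows,
                                       std::int32_t default_fragment_rows,
                                       std::size_t target_fragment_bytes)
{
  // Nothing to measure: fall back to the default fragment size.
  if (num_rows <= 0 || column_bytes == 0) { return default_fragment_rows; }

  auto const rows = static_cast<std::size_t>(num_rows);
  // Average row width in bytes, rounded up (at least 1 since column_bytes > 0).
  std::size_t const avg_row_bytes = (column_bytes + rows - 1) / rows;
  // Rows that fit in the byte budget; always keep at least one row per fragment.
  std::size_t const rows_that_fit =
    std::max<std::size_t>(target_fragment_bytes / avg_row_bytes, 1);
  // Narrow columns are capped at the default; wide columns get fewer rows.
  std::size_t const capped =
    std::min(rows_that_fit, static_cast<std::size_t>(default_fragment_rows));
  return static_cast<std::int32_t>(capped);
}
```

With a 5000-row default and a 128 KiB budget (both picked only for the example), a 4-byte integer column keeps 5000 rows per fragment, while a column of 1 MiB strings drops to one row per fragment.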

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@rapids-bot

rapids-bot bot commented Feb 2, 2023

Pull requests from external contributors require approval from a rapidsai organization member with write or admin permissions before CI can begin.

@github-actions bot added the libcudf label (Affects libcudf (C++/CUDA) code.) Feb 2, 2023
@etseidl
Contributor Author

etseidl commented Feb 2, 2023

This PR retains the performance gains seen in #12627 for LIST benchmarks.

23.04:

## parquet_write_encode

### [0] NVIDIA RTX A6000

| data_type | cardinality | run_length | Samples |  CPU Time  | Noise |  GPU Time  | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|-----------|-------------|------------|---------|------------|-------|------------|-------|------------------|-------------------|-------------------|
|  INTEGRAL |           0 |          1 |      5x | 111.447 ms | 0.07% | 111.442 ms | 0.07% |       4817500960 |         2.146 GiB |       498.123 MiB |
|  INTEGRAL |        1000 |          1 |    332x |  45.153 ms | 1.95% |  45.148 ms | 1.95% |      11891471139 |         2.770 GiB |       161.438 MiB |
|  INTEGRAL |           0 |         32 |    154x |  35.018 ms | 0.50% |  35.013 ms | 0.50% |      15333351555 |         2.770 GiB |        27.720 MiB |
|  INTEGRAL |        1000 |         32 |     17x |  30.016 ms | 0.12% |  30.011 ms | 0.12% |      17889329183 |         2.770 GiB |        14.403 MiB |
|    STRING |           0 |          1 |      5x | 130.701 ms | 0.19% | 130.696 ms | 0.19% |       4107797930 |         1.342 GiB |       597.486 MiB |
|    STRING |        1000 |          1 |    528x |  26.473 ms | 0.98% |  26.467 ms | 0.98% |      20284442095 |       677.964 MiB |        46.473 MiB |
|    STRING |           0 |         32 |      5x | 130.522 ms | 0.10% | 130.517 ms | 0.10% |       4113429866 |         1.342 GiB |       597.486 MiB |
|    STRING |        1000 |         32 |     35x |  14.425 ms | 0.42% |  14.420 ms | 0.42% |      37232061752 |       677.964 MiB |         8.504 MiB |
|      LIST |           0 |          1 |      5x | 534.714 ms | 0.03% | 534.707 ms | 0.03% |       1004046830 |         1.602 GiB |       498.003 MiB |
|      LIST |        1000 |          1 |      5x | 355.770 ms | 0.06% | 355.763 ms | 0.06% |       1509067048 |         2.752 GiB |       166.640 MiB |
|      LIST |           0 |         32 |      5x | 270.287 ms | 0.04% | 270.281 ms | 0.04% |       1986343787 |         2.752 GiB |        37.257 MiB |
|      LIST |        1000 |         32 |      5x | 274.208 ms | 0.07% | 274.202 ms | 0.07% |       1957941480 |         2.752 GiB |        24.422 MiB |
|    STRUCT |           0 |          1 |      5x | 135.583 ms | 0.21% | 135.578 ms | 0.21% |       3959881290 |         1.283 GiB |       569.525 MiB |
|    STRUCT |        1000 |          1 |     56x |  40.450 ms | 0.50% |  40.444 ms | 0.50% |      13274383477 |         1.324 GiB |        90.699 MiB |
|    STRUCT |           0 |         32 |      5x | 108.386 ms | 0.44% | 108.380 ms | 0.44% |       4953599367 |         1.473 GiB |       409.317 MiB |
|    STRUCT |        1000 |         32 |     19x |  27.252 ms | 0.48% |  27.247 ms | 0.48% |      19703957784 |         1.324 GiB |        15.400 MiB |

This PR:

| data_type | cardinality | run_length | Samples |  CPU Time  | Noise |  GPU Time  | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|-----------|-------------|------------|---------|------------|-------|------------|-------|------------------|-------------------|-------------------|
|  INTEGRAL |           0 |          1 |      5x | 111.031 ms | 0.09% | 111.027 ms | 0.09% |       4835508962 |         2.143 GiB |       498.123 MiB |
|  INTEGRAL |        1000 |          1 |     12x |  44.471 ms | 0.25% |  44.466 ms | 0.25% |      12073838869 |         2.768 GiB |       161.438 MiB |
|  INTEGRAL |           0 |         32 |     15x |  34.655 ms | 0.36% |  34.650 ms | 0.36% |      15494046559 |         2.768 GiB |        27.720 MiB |
|  INTEGRAL |        1000 |         32 |     17x |  29.759 ms | 0.23% |  29.754 ms | 0.22% |      18043576900 |         2.768 GiB |        14.403 MiB |
|    STRING |           0 |          1 |      5x | 129.735 ms | 0.15% | 129.731 ms | 0.15% |       4138348486 |         1.342 GiB |       597.486 MiB |
|    STRING |        1000 |          1 |    528x |  26.598 ms | 0.85% |  26.592 ms | 0.85% |      20188830643 |       677.452 MiB |        46.473 MiB |
|    STRING |           0 |         32 |      5x | 129.458 ms | 0.19% | 129.453 ms | 0.19% |       4147236475 |         1.342 GiB |       597.486 MiB |
|    STRING |        1000 |         32 |     34x |  14.780 ms | 0.37% |  14.775 ms | 0.37% |      36337166491 |       677.452 MiB |         8.505 MiB |
|      LIST |           0 |          1 |      5x | 262.710 ms | 0.20% | 262.704 ms | 0.20% |       2043631541 |         1.602 GiB |       498.620 MiB |
|      LIST |        1000 |          1 |      5x | 137.624 ms | 0.35% | 137.619 ms | 0.35% |       3901142053 |         2.752 GiB |       167.154 MiB |
|      LIST |           0 |         32 |      5x | 113.366 ms | 0.20% | 113.361 ms | 0.20% |       4735919435 |         2.752 GiB |        37.597 MiB |
|      LIST |        1000 |         32 |      5x | 109.960 ms | 0.21% | 109.953 ms | 0.21% |       4882723938 |         2.752 GiB |        24.811 MiB |
|    STRUCT |           0 |          1 |      5x | 135.770 ms | 0.25% | 135.765 ms | 0.25% |       3954415646 |         1.283 GiB |       569.525 MiB |
|    STRUCT |        1000 |          1 |     13x |  40.804 ms | 0.37% |  40.799 ms | 0.37% |      13158931041 |         1.323 GiB |        90.699 MiB |
|    STRUCT |           0 |         32 |      5x | 108.203 ms | 0.19% | 108.198 ms | 0.19% |       4961947254 |         1.473 GiB |       409.317 MiB |
|    STRUCT |        1000 |         32 |     34x |  27.772 ms | 0.50% |  27.767 ms | 0.50% |      19334866180 |         1.323 GiB |        15.399 MiB |

@etseidl
Contributor Author

etseidl commented Feb 2, 2023

For parquet_write_io_compression
23.04:

|    io    | compression | cardinality | run_length | Samples | CPU Time | Noise | GPU Time | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|----------|-------------|-------------|------------|---------|----------|-------|----------|-------|------------------|-------------------|-------------------|
| FILEPATH |      SNAPPY |           0 |          1 |      5x |  3.468 s | 0.42% |  3.468 s | 0.42% |        154828664 |         1.556 GiB |       493.950 MiB |
| FILEPATH |      SNAPPY |        1000 |          1 |      5x |  2.000 s | 0.44% |  2.000 s | 0.44% |        268476241 |         2.536 GiB |       161.238 MiB |
| FILEPATH |      SNAPPY |           0 |         32 |      5x |  1.603 s | 0.21% |  1.603 s | 0.21% |        334929961 |         2.532 GiB |        49.703 MiB |
| FILEPATH |      SNAPPY |        1000 |         32 |      5x |  1.600 s | 0.20% |  1.600 s | 0.20% |        335544238 |         2.536 GiB |        23.415 MiB |

This PR:

|    io    | compression | cardinality | run_length | Samples | CPU Time | Noise | GPU Time | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|----------|-------------|-------------|------------|---------|----------|-------|----------|-------|------------------|-------------------|-------------------|
| FILEPATH |      SNAPPY |           0 |          1 |      5x |  3.460 s | 0.55% |  3.460 s | 0.55% |        155186150 |         1.556 GiB |       494.475 MiB |
| FILEPATH |      SNAPPY |        1000 |          1 |      5x |  2.023 s | 0.43% |  2.023 s | 0.43% |        265387308 |         2.536 GiB |       161.668 MiB |
| FILEPATH |      SNAPPY |           0 |         32 |      5x |  1.620 s | 0.13% |  1.620 s | 0.13% |        331384852 |         2.532 GiB |        50.055 MiB |
| FILEPATH |      SNAPPY |        1000 |         32 |      5x |  1.613 s | 0.13% |  1.613 s | 0.13% |        332883310 |         2.536 GiB |        23.734 MiB |

@etseidl force-pushed the feature/frag_sizev4 branch from fa06424 to b8bb1dd on February 3, 2023 17:56
@etseidl
Contributor Author

etseidl commented Feb 3, 2023

The LIST write benchmark got even better with the 4 fragments/page change, but that is only because adding more, smaller pages allows more parallelism in page encoding and compression. Kind of a cheat 😅

| data_type | cardinality | run_length | Samples |  CPU Time  | Noise |  GPU Time  | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|-----------|-------------|------------|---------|------------|-------|------------|-------|------------------|-------------------|-------------------|
|      LIST |           0 |          1 |      5x | 255.131 ms | 0.35% | 255.126 ms | 0.35% |       2104338618 |         1.602 GiB |       498.625 MiB |
|      LIST |        1000 |          1 |      5x | 124.339 ms | 0.15% | 124.334 ms | 0.15% |       4317982003 |         2.752 GiB |       167.056 MiB |
|      LIST |           0 |         32 |      5x | 101.464 ms | 0.44% | 101.459 ms | 0.44% |       5291510044 |         2.752 GiB |        37.520 MiB |
|      LIST |        1000 |         32 |    150x | 100.153 ms | 1.63% | 100.148 ms | 1.63% |       5360782799 |         2.752 GiB |        24.722 MiB |

@etseidl force-pushed the feature/frag_sizev4 branch from c674c23 to 138d5f4 on February 4, 2023 00:09
@vuule
Contributor

vuule commented Feb 15, 2023

/ok to test

Contributor

@vuule vuule left a comment

Looking good, thank you for implementing this!
It's complicated, but not as much as it's beneficial :)

@etseidl
Contributor Author

etseidl commented Feb 15, 2023

Thanks for all the help! Going to release it into the wild now :D

@etseidl etseidl changed the title [WIP] Variable fragment sizes for Parquet writer Variable fragment sizes for Parquet writer Feb 15, 2023
@etseidl etseidl marked this pull request as ready for review February 15, 2023 23:08
@etseidl etseidl requested a review from a team as a code owner February 15, 2023 23:08
Contributor

@hyperbolic2346 hyperbolic2346 left a comment

I really like the use of std::optional in here over a magic default value. Some comments, but mostly nits.
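For readers unfamiliar with the pattern being praised, here is a minimal sketch of an optional setting replacing a sentinel default. The `writer_options` struct and `effective_fragment_rows` function are hypothetical names for illustration, not the types touched in this PR.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical option holder: an empty optional means "let the writer decide",
// so no sentinel value such as 0 or -1 has to be reserved for "unset".
struct writer_options {
  std::optional<std::int32_t> max_fragment_rows;  // empty unless the caller sets it
};

// Use the caller's value when present, otherwise the size the writer computed.
inline std::int32_t effective_fragment_rows(writer_options const& opts,
                                            std::int32_t computed_rows)
{
  return opts.max_fragment_rows.value_or(computed_rows);
}
```

The benefit is that every `std::int32_t` value stays available as a legitimate user setting, and the "unset" case is visible in the type.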

Resolved review threads:
cpp/src/io/parquet/page_enc.cu
cpp/src/io/parquet/writer_impl.cu (three threads)
@vuule
Contributor

vuule commented Feb 16, 2023

/ok to test

@hyperbolic2346
Contributor

/ok to test

@vuule
Contributor

vuule commented Feb 22, 2023

/merge

@rapids-bot rapids-bot bot merged commit d077c9b into rapidsai:branch-23.04 Feb 22, 2023
@etseidl etseidl deleted the feature/frag_sizev4 branch February 22, 2023 17:47
rapids-bot bot pushed a commit that referenced this pull request Mar 1, 2023
Fixes #12867.

Bug introduced in #12685. A calculation of total bytes in a column was returned as a 32-bit `size_type` rather than a 64-bit `size_t`, leading to overflow for tables with many millions of rows.
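The arithmetic behind this kind of overflow can be sketched as follows. The `size_type` alias below is a local stand-in for a 32-bit signed type, and both helper functions are hypothetical; this is not the code changed by the fix.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

using size_type = std::int32_t;  // local stand-in for a 32-bit signed row-index type

// Hypothetical helper: total bytes in a fixed-width column, computed in
// std::size_t so the product is not truncated to 32 bits.
inline std::size_t column_total_bytes(size_type num_rows, std::size_t bytes_per_row)
{
  return static_cast<std::size_t>(num_rows) * bytes_per_row;
}

// True when a byte count can still be represented in the 32-bit size_type.
inline bool fits_in_size_type(std::size_t n)
{
  return n <= static_cast<std::size_t>(std::numeric_limits<size_type>::max());
}
```

For example, 600 million rows of 4-byte values total 2.4 billion bytes, which is above the 32-bit signed maximum of 2,147,483,647, so returning that total in a 32-bit type cannot work.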

Authors:
  - Ed Seidl (https://github.com/etseidl)
  - Vukasin Milovanovic (https://github.com/vuule)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Karthikeyan (https://github.com/karthikeyann)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #12870
rapids-bot bot pushed a commit that referenced this pull request Aug 3, 2023
…ragment size (#13806)

#12685 introduced a bug in page calculation. If the `max_page_size_rows` parameter is set smaller than the page fragment size, the writer will produce a spurious empty page. This PR fixes this by only checking the fragment size if there are already rows in the page, and then falling back to the old check for whether the number of rows exceeds the page limit.

Interestingly, libcudf can read these files with empty pages just fine, but parquet-mr cannot.
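The empty-page condition described above can be sketched with two small predicates. Both function names and the exact comparison are assumptions made for illustration; the real writer also retains a separate row-limit check that is not reproduced here.

```cpp
#include <cstdint>

// Illustrative buggy shape: the incoming fragment is checked even when the
// page is still empty, so a fragment larger than the row limit closes a page
// that holds no rows.
inline bool should_close_page_buggy(std::int32_t rows_in_page,
                                    std::int32_t fragment_rows,
                                    std::int32_t max_page_rows)
{
  return rows_in_page + fragment_rows > max_page_rows;
}

// Illustrative fixed shape: only consider the incoming fragment when the page
// already contains rows, so an empty page is never closed.
inline bool should_close_page(std::int32_t rows_in_page,
                              std::int32_t fragment_rows,
                              std::int32_t max_page_rows)
{
  return rows_in_page > 0 && rows_in_page + fragment_rows > max_page_rows;
}
```

With a 1000-row page limit and a 5000-row fragment, the first predicate asks for a page break before any rows have been added, while the second lets the oversized fragment start the page.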

Authors:
  - Ed Seidl (https://github.com/etseidl)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Vukasin Milovanovic (https://github.com/vuule)

URL: #13806
Labels
  • cuIO: cuIO issue
  • feature request: New feature or request
  • libcudf: Affects libcudf (C++/CUDA) code.
  • non-breaking: Non-breaking change
  • Performance: Performance related issue
Development

Successfully merging this pull request may close these issues.

[BUG] Compressing a table with larger strings with ZSTD fails to compress
5 participants