Mark nvcomp zstd compression stable #12059

Merged Nov 4, 2022 (2 commits)


jbrennan333
Contributor

Description

nvCOMP zstd compression was added in 22.10 but marked experimental, so enabling it required setting the environment variable LIBCUDF_NVCOMP_POLICY=ALWAYS. Validation testing with the spark-rapids plugin, documented in NVIDIA/spark-rapids#3037, is now complete, and we believe the zstd compression status can be changed to stable. Because LIBCUDF_NVCOMP_POLICY=STABLE is the default value, this change enables zstd compression in cudf by default.
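
For readers unfamiliar with the policy variable, the gating described above can be sketched roughly as follows. This is a minimal illustrative model in Python, not libcudf's implementation: the function names `nvcomp_policy` and `is_feature_enabled`, the status constants, and the handling of values other than ALWAYS and STABLE are all assumptions made for the example.

```python
import os

# Hypothetical model of the gating described in this PR; it is not libcudf code.
# Feature statuses as the description uses them.
EXPERIMENTAL = "experimental"
STABLE = "stable"


def nvcomp_policy(env=None):
    """Read LIBCUDF_NVCOMP_POLICY, falling back to STABLE when it is unset."""
    env = os.environ if env is None else env
    return env.get("LIBCUDF_NVCOMP_POLICY", "STABLE")


def is_feature_enabled(status, env=None):
    """Return True if a feature with the given status is usable under the policy."""
    policy = nvcomp_policy(env)
    if policy == "ALWAYS":
        return True  # experimental and stable features are both enabled
    if policy == "STABLE":
        return status == STABLE  # only stable features are enabled
    return False  # assumption of this sketch: any other value disables the feature


# Before this PR: zstd is experimental, so the default policy leaves it off.
print(is_feature_enabled(EXPERIMENTAL, env={}))
# After this PR: zstd is stable, so the default policy turns it on.
print(is_feature_enabled(STABLE, env={}))
```

Under this model, flipping the zstd status from experimental to stable changes the result for users who never set the variable, while users who already set LIBCUDF_NVCOMP_POLICY=ALWAYS see no difference.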

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
    I made local modifications to parquet/orc tests/benchmarks to test with ZSTD. Currently we don't test with ZSTD by default because it's possible to build with a version of nvcomp that does not support it.
  • The documentation is up to date with these changes.

@jbrennan333 added the "feature request", "cuIO", and "4 - Needs cuIO Reviewer" labels Nov 3, 2022
@jbrennan333 self-assigned this Nov 3, 2022
@jbrennan333 requested a review from a team as a code owner November 3, 2022 16:45
@jbrennan333 added the "non-breaking" label Nov 3, 2022
@github-actions (bot) added the "libcudf" label Nov 3, 2022
@davidwendt requested a review from vuule November 3, 2022 18:30
@davidwendt
Contributor

Does this require a documentation change somewhere else?
Seems that if we are changing a default that this should be a breaking change?

@vuule
Contributor

vuule commented Nov 3, 2022

Does this require a documentation change somewhere else? Seems that if we are changing a default that this should be a breaking change?

The docs at

compression : {{'snappy', 'ZSTD', None}}, default 'snappy'

compression : {{ 'snappy', 'ZSTD', None }}, default 'snappy'

already include ZSTD. I don't see any other places where we list supported compression types.

Seems that if we are changing a default that this should be a breaking change?

We're not really changing the default here; this PR basically enables a feature.

@vuule
Contributor

vuule commented Nov 3, 2022

rerun tests

@ttnghia
Contributor

ttnghia commented Nov 4, 2022

Need to merge upstream to fix CI.

Need to wait for #12067.

@jbrennan333
Contributor Author

rerun tests

@codecov

codecov bot commented Nov 4, 2022

Codecov Report

Base: 87.47% // Head: 88.08% // Increases project coverage by +0.61% 🎉

Coverage data is based on head (f2c055e) compared to base (f817d96).
Patch has no changes to coverable lines.

❗ Current head f2c055e differs from pull request most recent head 715edb0. Consider uploading reports for the commit 715edb0 to get more accurate results

Additional details and impacted files
@@               Coverage Diff                @@
##           branch-22.12   #12059      +/-   ##
================================================
+ Coverage         87.47%   88.08%   +0.61%     
================================================
  Files               133      135       +2     
  Lines             21826    21997     +171     
================================================
+ Hits              19093    19377     +284     
+ Misses             2733     2620     -113     
Impacted Files                             Coverage Δ
python/cudf/cudf/io/text.py                91.66% <0.00%> (-8.34%) ⬇️
python/cudf/cudf/core/_base_index.py       81.28% <0.00%> (-4.27%) ⬇️
python/cudf/cudf/io/json.py                92.06% <0.00%> (-2.68%) ⬇️
python/cudf/cudf/utils/utils.py            89.91% <0.00%> (-0.69%) ⬇️
python/dask_cudf/dask_cudf/core.py         73.72% <0.00%> (-0.41%) ⬇️
python/cudf/cudf/io/parquet.py             90.45% <0.00%> (-0.39%) ⬇️
python/dask_cudf/dask_cudf/backends.py     84.90% <0.00%> (-0.37%) ⬇️
python/cudf/cudf/core/column/numerical.py  95.18% <0.00%> (-0.31%) ⬇️
python/cudf/cudf/core/dataframe.py         93.63% <0.00%> (-0.10%) ⬇️
python/cudf/cudf/core/column/datetime.py   89.62% <0.00%> (-0.09%) ⬇️
... and 33 more


@jbrennan333
Contributor Author

@vuule can you please merge if this looks good?

@vuule
Contributor

vuule commented Nov 4, 2022

@gpucibot merge

@rapids-bot (bot) merged commit a3e9c1c into rapidsai:branch-22.12 Nov 4, 2022
@vyasr added the "4 - Needs Review" label and removed the "4 - Needs cuIO Reviewer" label Feb 23, 2024