Prepare dask_cudf test_parquet.py for upcoming API changes #10709
Merge pull request rapidsai#5690 from ajschmidt8/phase2 [skip ci] Update master references for main branch
[RELEASE] Re-release v0.15 cudf [skip-ci]
[RELEASE] cudf v0.17
[RELEASE] cudf v0.18
[RELEASE] Release v0.18.1 cudf
[RELEASE] v0.18.2 `cudf` release [skip-ci]
[RELEASE] v0.19.1 cudf
[RELEASE] v0.19.2 cudf [skip-ci]
LGTM. I'll also ping @randerzander directly to get his comments on the changes.
Looks like auto-generated changelog modifications need to be reverted.
Oops - Thanks for pointing this out @galipremsagar!
@randerzander - It looks like this PR is now blocking cudf CI (the upstream changes have begun). So, let me know if the current changes are "ok" for now.
@gpucibot merge
This is a relatively simple PR to clean up `dask_cudf`'s `to/read_parquet` tests. These changes are mostly meant to avoid future test failures that will arise after impending changes are implemented in upstream Dask. These changes include:

- `write_metadata_file` will become `False` for `to_parquet` (because writing the `_metadata` file scales very poorly).
- `split_row_groups` will become `False` (because this setting is typically optimal when the files are not too large). Users with larger-than-memory files will need to specify `split_row_groups=True/int` explicitly.
- The `gather_statistics` argument will be removed in favor of a more descriptive `calculate_divisions` argument.

This PR also removes the long-deprecated `row_groups_per_part` argument from `dask_cudf.read_parquet` (the established replacement is `split_row_groups`).
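
For anyone updating their own calls, the renames described above can be sketched as a small keyword-translation helper. This is only an illustration and not code from this PR: `modernize_read_parquet_kwargs` is a hypothetical name, and mapping `gather_statistics` onto `calculate_divisions` via `bool()` (while dropping an unset `None`) is an assumption about how the two arguments correspond.

```python
def modernize_read_parquet_kwargs(kwargs):
    """Hypothetical helper: rename old read_parquet keywords to the new ones.

    Assumed mapping (taken from the PR description, not from library code):
      row_groups_per_part -> split_row_groups
      gather_statistics   -> calculate_divisions
    """
    updated = dict(kwargs)  # work on a copy; never mutate the caller's dict

    if "row_groups_per_part" in updated:
        if "split_row_groups" in updated:
            raise ValueError(
                "Pass only split_row_groups; row_groups_per_part is its old name."
            )
        updated["split_row_groups"] = updated.pop("row_groups_per_part")

    if "gather_statistics" in updated:
        if "calculate_divisions" in updated:
            raise ValueError(
                "Pass only calculate_divisions; gather_statistics is its old name."
            )
        old_value = updated.pop("gather_statistics")
        # Assumption: None meant "not specified", so it is simply dropped.
        if old_value is not None:
            updated["calculate_divisions"] = bool(old_value)

    return updated


# Settings a test can pin explicitly so that it does not depend on defaults.
EXPLICIT_WRITE_KWARGS = {"write_metadata_file": True}
EXPLICIT_READ_KWARGS = modernize_read_parquet_kwargs(
    {"row_groups_per_part": 2, "gather_statistics": True}
)
```

The resulting dictionaries would then be passed through as `ddf.to_parquet(path, **EXPLICIT_WRITE_KWARGS)` and `dask_cudf.read_parquet(path, **EXPLICIT_READ_KWARGS)`. Pinning the values this way keeps a test's behavior stable regardless of which defaults the installed Dask version uses.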