Pre-emptive fix for upstream dask.dataframe.read_parquet changes (#12983)
Once dask/dask#10007 is merged, users will be able to pass a dictionary of hive-partitioning options to `dd.read_parquet` (using the `dataset=` kwarg). This new feature provides a workaround for the fact that `pyarrow.dataset.Partitioning` objects **cannot** be serialized in Python. In order for this feature to be supported in `dask_cudf`, the `CudfEngine.read_partition` method must account for the case where `partitioning` is a `dict`.

**NOTE**: It is not possible to add test coverage for this change until dask#10007 is merged. However, I don't see any good reason not to merge this PR **before** dask#10007.

Authors:
  - Richard (Rick) Zamora (https://github.com/rjzamora)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)

URL: #12983
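For illustration only, here is a minimal sketch of how dict-based partitioning options could be expanded back into a `pyarrow.dataset.Partitioning` before building a dataset. The helper name is hypothetical and is not part of the actual change; it only shows the general idea of accepting either a dict or a ready-made partitioning object.

```python
import pyarrow.dataset as pa_ds


def _normalize_partitioning(partitioning):
    """Hypothetical helper: expand dict-based partitioning options.

    If the caller passed a plain dict of hive-partitioning options (the
    workaround for ``pyarrow.dataset.Partitioning`` objects not being
    serializable), turn it into a partitioning object/factory via
    ``pyarrow.dataset.partitioning``; otherwise pass it through unchanged.
    """
    if isinstance(partitioning, dict):
        return pa_ds.partitioning(**partitioning)
    return partitioning


# Example usage (assumes a hive-partitioned parquet directory at "data/"):
# ds = pa_ds.dataset(
#     "data/",
#     format="parquet",
#     partitioning=_normalize_partitioning({"flavor": "hive"}),
# )
```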