Commit
Enable automatic column projection in groupby().agg (#12124)
This PR corresponds to the Dask-cudf version of dask/dask#9442, which was found to improve the performance of many groupby-based workflows.

After this PR:

```python
import dask_cudf

path = "/criteo-dataset/day_0.parquet"
ddf = dask_cudf.read_parquet(path, split_row_groups=10)

# The following takes <2s with this PR, and fails with
# an OOM error on main (using a 32GB GPU):
ddf.groupby("C1").agg({"C2": "mean"}).compute()
```

Authors:
- Richard (Rick) Zamora (https://github.com/rjzamora)

Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #12124
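For readers unfamiliar with column projection, the idea can be sketched in plain Python: a `groupby(by).agg(spec)` call with a dict-style spec only needs the key columns plus the columns named in the spec, so everything else can be dropped before the aggregation. The helper `required_columns` below is hypothetical and written only for illustration; it is not the code added by this PR.

```python
def required_columns(by, agg_spec):
    """Return the columns a groupby(by).agg(agg_spec) call would read.

    Hypothetical helper for illustration only; not part of dask_cudf.
    """
    # Accept a single key ("C1") or several keys (["C1", "C3"]).
    keys = [by] if isinstance(by, str) else list(by)
    # Only dict-style specs name their value columns explicitly.
    values = list(agg_spec) if isinstance(agg_spec, dict) else []
    # De-duplicate while preserving order.
    seen = []
    for col in keys + values:
        if col not in seen:
            seen.append(col)
    return seen


cols = required_columns("C1", {"C2": "mean"})
print(cols)  # ['C1', 'C2']
```

Under this assumption, the manual equivalent of the example above would likely be `ddf[cols].groupby("C1").agg({"C2": "mean"})`, which selects only `C1` and `C2` before grouping rather than carrying every column of the dataset through the aggregation.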