Grouping by an external Series doesn't always work in `dask_cudf`:
    In [34]: df = dask_cudf.from_cudf(cudf.DataFrame({'a': [1, 2, 3, 4, 5]}), npartitions=1)

    In [35]: s = dask_cudf.from_cudf(cudf.Series([1, 1, 1, 2, 2], name='id'), npartitions=1)

    In [36]: df.groupby([s]).agg(["sum"]).compute()  # error
    ...
    ValueError: Metadata inference failed in `eq`.

    Original error is below:
    ------------------------
    TypeError("cannot broadcast <class 'str'>")
For very simple aggregations it does work, though. Note that `"sum"` is not wrapped in a list here:
    In [37]: df.groupby([s]).agg("sum").compute()
    Out[37]:
        a
    id
    2   9
    1   6
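For comparison, here is a minimal sketch of the same two calls in plain pandas, on the assumption that `dask_cudf` is meant to match pandas semantics here. It is not the `dask_cudf` code path, and it only illustrates the result shape each call would be expected to produce. pandas sorts the group keys by default, so the row order differs from `Out[37]` above.

```python
import pandas as pd

# Same data as in the report, but as plain pandas objects (no GPU or dask needed).
df = pd.DataFrame({"a": [1, 2, 3, 4, 5]})
s = pd.Series([1, 1, 1, 2, 2], name="id")

# String aggregation: the result keeps flat column labels.
simple = df.groupby([s]).agg("sum")

# List aggregation: the result has MultiIndex columns, here ("a", "sum").
listed = df.groupby([s]).agg(["sum"])

print(simple)
print(listed)
```

In both cases the index is named `id`, taken from the external Series. The list form differs only in adding a second column level for the aggregation name.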
Fetch correct grouping keys agg of dask groupby (#9022)
2e980b8
Fixes: #9020

This PR enables fallback to upstream `dask` when the groupby operation is performed by a list of `Series` objects.

Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
- Ashwin Srinath (https://github.com/shwina)

URL: #9022