[FEA] libcudf Series correlation (Pearson) #1267
Comments
Updating this to explicitly refer to a libcuDF implementation now that #2719 has merged (providing Series.corr).
Updating this to also refer to DataFrame-level correlation.
@beckernick Can we now apply
Thanks.
Assuming you mean once #4140 merges, we cannot. Groupby correlation via the standard API will require a libcuDF implementation.
@jrhemstad discussed this today in the context of groupbys. Supporting correlation (and implicitly covariance) in the groupby machinery would require additional design, as the aggregation takes more than one input. I'm going to file a new issue to summarize and consolidate further discussion for the groupby aggregation.
Is your feature request related to a problem? Please describe.
As a cuDF user, I want to calculate the correlation of two series. Pearson correlation is likely the most commonly used as it is the default in Pandas (API docs).
Describe the solution you'd like
I'd like to be able to do this with
series1.corr(series2)
and also on DataFrame and Groupby objects.
Describe alternatives you've considered
The alternative is to actually calculate the correlation manually, which is cumbersome.
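For reference, the manual route might look roughly like the sketch below. It is plain Python over lists rather than cuDF code, and the helper name `pearson_corr` is hypothetical, chosen only for illustration. It just spells out the textbook Pearson formula (covariance divided by the product of the standard deviations) to show how much the requested `corr` method would save.

```python
import math


def pearson_corr(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Hypothetical helper for illustration only; not part of cuDF or libcudf.
    """
    if len(xs) != len(ys):
        raise ValueError("inputs must have the same length")
    n = len(xs)
    if n < 2:
        return math.nan  # correlation is undefined for fewer than two points

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Deviations from the mean for each input.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # Sums of products; the 1/n (or 1/(n-1)) factors cancel in the ratio.
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)

    if var_x == 0 or var_y == 0:
        return math.nan  # a constant input has no defined correlation
    return cov / math.sqrt(var_x * var_y)
```

Doing the same on GPU columns means chaining several reductions and elementwise operations by hand, which is the cumbersome part this request aims to remove.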