[FEA] Standardize applymap support with pandas to enable Dask applymap #10169
Comments
I can look into this, @beckernick.
This issue has been labeled |
Part of #10169
Authors:
- https://github.com/brandon-b-miller
Approvers:
- Bradley Dice (https://github.com/bdice)
URL: #10497
Naive implementation of `DataFrame.applymap` that just calls `apply` in a loop over columns. This requires at worst `N` compilations and `M` kernel launches, where `N` is the number of distinct dtypes in the data and `M` is the total number of columns. This could theoretically be made much faster within our framework: as an improvement, we could launch just one kernel that populates the entire output. That would still suffer from the compilation bottleneck, however, since the function must be compiled in order to determine the output dtype, and this has to be done once for each distinct dtype in the data.
Part of #10169
Authors:
- https://github.com/brandon-b-miller
- Bradley Dice (https://github.com/bdice)
Approvers:
- Bradley Dice (https://github.com/bdice)
- GALI PREM SAGAR (https://github.com/galipremsagar)
URL: #10542
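To make the `N` compilations / `M` launches count above concrete, here is a toy pure-Python model of the column loop. It is only a sketch and not cuDF's implementation: `applymap_naive`, the dict-of-lists "frame", and the per-dtype cache standing in for JIT compilation are all hypothetical names invented for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple


def applymap_naive(
    columns: Dict[str, List[Any]], func: Callable[[Any], Any]
) -> Tuple[Dict[str, List[Any]], Dict[str, int]]:
    """Apply `func` elementwise, one column at a time (toy model)."""
    compiled: Dict[str, Callable[[Any], Any]] = {}  # stand-in for a per-dtype JIT cache
    stats = {"compilations": 0, "launches": 0}
    result: Dict[str, List[Any]] = {}

    for name, values in columns.items():
        # Assumption: a column's "dtype" is the Python type of its first element.
        dtype = type(values[0]).__name__ if values else "empty"

        # "Compile" once per distinct dtype; later columns of that dtype reuse it.
        if dtype not in compiled:
            compiled[dtype] = func
            stats["compilations"] += 1

        # One "kernel launch" per column.
        kernel = compiled[dtype]
        result[name] = [kernel(v) for v in values]
        stats["launches"] += 1

    return result, stats


# Three columns, two distinct dtypes (int, float).
data = {"a": [1, 2, 3], "b": [10, 20, 30], "c": [0.5, 1.5, 2.5]}
out, stats = applymap_naive(data, lambda x: x + 1)
print(out)
print(stats)
```

With three columns and two distinct dtypes, the model records two compilations and three launches. A single-kernel version would reduce the launch count but not the number of compilations.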
This is in progress; `applymap` is deprecated and we're moving to supersede it with |
With #11031 merged, I think this is done :)
Today, we support the `applymap` interface on Series but not DataFrames. Pandas supports `applymap` on DataFrames but not Series. In pandas, the interface applies a scalar function/UDF to every element in the dataframe (elementwise UDF).

The reason for our `Series.applymap` implementation may be that, until recently, we did not support an elementwise `apply` interface directly. Now that we do, it's possible `Series.applymap` is redundant, as both interfaces explicitly provide users access to elementwise UDFs run via our UDF pipeline. (Please feel free to correct me if I'm off base here.)

For compatibility with Dask, we should explore aligning our interfaces with pandas. Right now, it's not possible to use `applymap` with Dask-cuDF, as the Dask.Series object does not have our Series interface and we don't implement the DataFrame interface.

We might consider: