[FEA] Square Root via __array_ufunc__ protocol #1055

Closed
mrocklin opened this issue Feb 26, 2019 · 6 comments
Labels
dask (Dask issue), feature request (New feature or request), Python (Affects Python cuDF API)

Comments

@mrocklin
Collaborator

mrocklin commented Feb 26, 2019

As a special case of #256, it would be useful to have a sqrt computation accessible from cudf on the Python side. This would enable the computation of groupby-std, among other things.

Looking at instances of sqrt in the codebase today:

MROCKLIN-MLT:cudf mrocklin$ git grep sqrt
cpp/include/cudf/functions.h:gdf_error gdf_sqrt_generic(gdf_column *input, gdf_column *output);
cpp/include/cudf/functions.h:gdf_error gdf_sqrt_f32(gdf_column *input, gdf_column *output);
cpp/include/cudf/functions.h:gdf_error gdf_sqrt_f64(gdf_column *input, gdf_column *output);
cpp/python/libgdf_cffi/tests/test_unaryops.py:def test_sqrt(dtype, ulp):
cpp/python/libgdf_cffi/tests/test_unaryops.py:    math_op_test(dtype, ulp, np.sqrt, libgdf.gdf_sqrt_generic,
cpp/src/unary/unary_ops.cu:        return std::sqrt(data);
cpp/src/unary/unary_ops.cu:DEF_UNARY_OP_REAL(gdf_sqrt)
cpp/src/unary/unary_ops.cu:gdf_error gdf_sqrt_f32(gdf_column *input, gdf_column *output) {
cpp/src/unary/unary_ops.cu:gdf_error gdf_sqrt_f64(gdf_column *input, gdf_column *output) {
python/cudf/bindings/cudf_cpp.pxd:    cdef gdf_error gdf_sqrt_generic(gdf_column *input, gdf_column *output)
python/cudf/bindings/cudf_cpp.pxd:    cdef gdf_error gdf_sqrt_f32(gdf_column *input, gdf_column *output)
python/cudf/bindings/cudf_cpp.pxd:    cdef gdf_error gdf_sqrt_f64(gdf_column *input, gdf_column *output)
python/cudf/dataframe/series.py:        return np.sqrt(self.var(ddof=ddof))

We see references to it on the C++ side and a few bindings on the Cython side, but no use of it on the Python side as far as I can tell. Oddly, we do see np.sqrt used in the Series.std method, which I suspect converts its input to a NumPy array.
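To illustrate the suspicion about conversion: when an object offers only the `__array__` hook and no `__array_ufunc__`, NumPy ufuncs coerce it to a host `ndarray` before computing. The sketch below shows that general NumPy behavior with a hypothetical `HostBackedColumn` class. It is not cuDF's Series, and it makes no claim about what `Series.var` actually returns.

```python
import numpy as np


class HostBackedColumn:
    """Hypothetical stand-in for a column type; not cuDF's Series."""

    def __init__(self, values):
        self._values = list(values)
        self.conversions = 0  # counts how often NumPy asked for an ndarray

    def __array__(self, dtype=None, copy=None):
        # NumPy calls this hook when it needs to coerce the object
        # into an ndarray, for example as the input of a ufunc.
        self.conversions += 1
        return np.asarray(self._values, dtype=dtype)


col = HostBackedColumn([1.0, 4.0, 9.0])
result = np.sqrt(col)

# The result is a plain ndarray, so the original container type is lost.
print(type(result).__name__, col.conversions)
```

For a GPU-backed container, this kind of coercion would presumably mean a device-to-host copy, which is the cost a native sqrt would avoid.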

mrocklin added the feature request, Needs Triage, and dask labels on Feb 26, 2019
kkraus14 added the Python label and removed the Needs Triage label on Mar 4, 2019
@harrism
Member

harrism commented Mar 7, 2019

This appears to be true of all unary ops. They are all implemented in C++ and have Cython cdefs; they are just not exposed via the Series interface in Python.

@thomcom
Contributor

thomcom commented Mar 8, 2019

Is the syntax you require for this

import cudf as cd
gdf = cd.DataFrame({'x': [1, 2, 3]})
cd.sqrt(gdf)

?

@mrocklin
Collaborator Author

mrocklin commented Mar 8, 2019

Currently we use np.sqrt(gdf) or np.sqrt(gdf.x)

We could drop that down to gdf ** 0.5 in dask-dataframe if desired.

Or, at some point, we should implement the __array_ufunc__ protocol so that np.sqrt(gdf) calls cudf.sqrt(gdf) appropriately. See #256
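As a rough sketch of what the `__array_ufunc__` route looks like: NumPy checks its inputs for this method and, if present, hands the ufunc call over instead of coercing to an `ndarray`. `ToyColumn` and its `sqrt` method below are hypothetical and only stand in for a GPU-backed Series with a native sqrt; this is not cuDF's implementation.

```python
import numpy as np


class ToyColumn:
    """Hypothetical container standing in for a GPU-backed Series."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype="float64")

    def sqrt(self):
        # Placeholder for a native implementation (assumed, not cuDF's).
        return ToyColumn(self.values ** 0.5)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Handle only a plain call to np.sqrt; decline everything else.
        if method == "__call__" and ufunc is np.sqrt and not kwargs:
            return inputs[0].sqrt()
        return NotImplemented


out = np.sqrt(ToyColumn([1.0, 4.0, 9.0]))
print(type(out).__name__, out.values)
```

Returning `NotImplemented` for unsupported ufuncs makes NumPy raise a `TypeError` rather than silently converting the container, so unsupported operations fail loudly.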

@thomcom
Contributor

thomcom commented Mar 8, 2019

For today, I can give you gd.sqrt

@mrocklin
Collaborator Author

mrocklin commented Mar 9, 2019

If it's ok, I'd like to leave this open until either np.sqrt(s) or s ** 0.5 works. One of those is necessary to get operations like groupby-std to work.
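To make the dependency concrete, here is a minimal sketch of why groupby-std needs a square root: a per-group standard deviation can be assembled from per-group sums, sums of squares, and counts, with the square root as the last step. The helper name `group_std` and the use of plain NumPy arrays are assumptions for illustration, and this is not necessarily how dask-dataframe computes it.

```python
import numpy as np


def group_std(sum_x, sum_x2, count, ddof=1):
    """Per-group standard deviation from per-group aggregates (hypothetical helper)."""
    sum_x = np.asarray(sum_x, dtype="float64")
    sum_x2 = np.asarray(sum_x2, dtype="float64")
    count = np.asarray(count, dtype="float64")
    # Variance from aggregates: (sum(x^2) - sum(x)^2 / n) / (n - ddof)
    var = (sum_x2 - sum_x ** 2 / count) / (count - ddof)
    # The final step is the square root, written here as ** 0.5.
    return var ** 0.5


# Two groups with known values, reduced to their aggregates.
groups = [np.array([1.0, 2.0, 3.0, 4.0]), np.array([2.0, 4.0, 4.0, 6.0])]
std = group_std(
    [g.sum() for g in groups],
    [(g ** 2).sum() for g in groups],
    [len(g) for g in groups],
)
print(std)
```

Either spelling of the last step, `var ** 0.5` or `np.sqrt(var)`, has to work on the container that holds the per-group variances.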

mrocklin reopened this on Mar 9, 2019
kkraus14 changed the title from "[FEA] Square Root" to "[FEA] Square Root via __array_ufunc__ protocol" on Mar 9, 2019
@thomcom
Contributor

thomcom commented Mar 9, 2019

Oh sorry, I thought cudf.sqrt would be adequate. I'll look at __array_ufunc__ now.

4 participants