Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable compiled binary ops in libcudf, python and java #8741

Merged

Conversation

karthikeyann
Copy link
Contributor

@karthikeyann karthikeyann commented Jul 14, 2021

closes #7801

cudf::binary_operation calls compiled binary ops.
cudf::jit::binary_operation calls jit binary ops
So, compiled binary ops is called in libcudf (groupby, rescale), python (binary ops) and java (binary ops)

Breaking change:
New: Logical and Comparison operators can have output type to be only bool type.
Old: Logical operators can have integer or any other output type that can be constructed from bool type. Comparison operators required bool type only.

In this release (21.10), experimental namespace is dropped, and compiled binary ops replaces jit binary ops in libcudf, except for user defined op.

@karthikeyann karthikeyann requested review from a team as code owners July 14, 2021 15:50
@github-actions github-actions bot added Java Affects Java cuDF API. Python Affects Python cuDF API. libcudf Affects libcudf (C++/CUDA) code. labels Jul 14, 2021
@karthikeyann karthikeyann added breaking Breaking change feature request New feature or request labels Jul 14, 2021
Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran through all of the Spark tests with this and they all passed. I deleted the JIT cache before running them and it was not recreated. The effectively removes all JIT from the Spark plugin.

Comment on lines 86 to 89
* @copydoc cudf::experimental::binary_operation(scalar const&, column_view const&, binary_operator,
* data_type, rmm::mr::device_memory_resource *)
*
* @param stream CUDA stream used for device memory operations and kernel launches.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameter order here is incorrect. Should be stream, mr

Same applies for the other 3 declarations in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copydoc is used at many places in our cpp documentation to avoid duplication of doc text, most often used in detail APIs. There is no cleaner way so far.

Note: Alternatively, we could contribute to doxygen to reorder the @param as listed in the function signature. Timeline: Uncertain.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copydoc is used at many places in our cpp documentation to avoid duplication of doc text, most often used in detail APIs. There is no cleaner way so far.

Right, but isn't this going to result in incorrect documentation? My expectation is that you will end up getting something like:

 * @param lhs         The left operand scalar
 * @param rhs         The right operand column
 * @param op          The binary operator
 * @param output_type The desired data type of the output column
 * @param mr          Device memory resource used to allocate the returned column's device memory
 * @param stream    stream CUDA stream used for device memory operations and kernel launches.
 * @return            Output column of `output_type` type containing the result of
 *                    the binary operation
 * @throw cudf::logic_error if @p output_type dtype isn't fixed-width

...which is the incorrect order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how it's rendered now.
image

Copy link
Contributor Author

@karthikeyann karthikeyann Jul 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not better too.
(I generated this temporarily to show. it's not in released documentation)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

detail APIs are not in released libcudf documentation. They are excluded by doxygen.

@karthikeyann
Copy link
Contributor Author

@revans2 Great.

@karthikeyann karthikeyann added 4 - Needs cuDF (Python) Reviewer 4 - Needs Review Waiting for reviewer to review or respond 3 - Ready for Review Ready for review by team labels Jul 14, 2021
@codecov
Copy link

codecov bot commented Jul 14, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.10@6cd0167). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head bc29430 differs from pull request most recent head e7e81ce. Consider uploading reports for the commit e7e81ce to get more accurate results
Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.10    #8741   +/-   ##
===============================================
  Coverage                ?   10.73%           
===============================================
  Files                   ?      114           
  Lines                   ?    19058           
  Branches                ?        0           
===============================================
  Hits                    ?     2046           
  Misses                  ?    17012           
  Partials                ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6cd0167...e7e81ce. Read the comment docs.

@harrism
Copy link
Member

harrism commented Jul 14, 2021

In next release (21.10), experimental namespace will be dropped, and compiled binary ops will replace jit binary ops in libcudf, except for user defined op.

Do we need to formally deprecate anything? Will dropping the experimental namespace be a breaking change?

@karthikeyann
Copy link
Contributor Author

In next release (21.10), experimental namespace will be dropped, and compiled binary ops will replace jit binary ops in libcudf, except for user defined op.

Do we need to formally deprecate anything? Will dropping the experimental namespace be a breaking change?

No deprecation needed. Dropping experimental means replacing jit with compiled. Few operator + output type combinations will not work. It will break for these few combinations. There are no changes in API interface.

@galipremsagar
Copy link
Contributor

galipremsagar commented Jul 15, 2021

Looks like there is a repartition error specific to this PR

@karthikeyann
Copy link
Contributor Author

rerun tests

@harrism
Copy link
Member

harrism commented Jul 27, 2021

In next release (21.10), experimental namespace will be dropped, and compiled binary ops will replace jit binary ops in libcudf, except for user defined op.

@karthikeyann since this got delayed to 21.10, should we go ahead and do this as part of this PR? Or are you concerned that we need to test it in the field as "experimental" first?

CC @jrhemstad for another opinion.

@karthikeyann
Copy link
Contributor Author

will do this as part of this PR.

@github-actions github-actions bot added the CMake CMake build issue label Aug 16, 2021
@revans2
Copy link
Contributor

revans2 commented Aug 16, 2021

You forgot to delete the experimental namespace from the JNI code. After that it looks good to me.

diff --git a/java/src/main/native/src/ColumnViewJni.cpp b/java/src/main/native/src/ColumnViewJni.cpp
index 4bd1ac06f3..e9d427cb54 100644
--- a/java/src/main/native/src/ColumnViewJni.cpp
+++ b/java/src/main/native/src/ColumnViewJni.cpp
@@ -1110,7 +1110,7 @@ JNIEXPORT jlong JNICALL Java_ai_rapids_cudf_ColumnView_binaryOpVV(JNIEnv *env, j
     cudf::data_type n_data_type = cudf::jni::make_data_type(out_dtype, scale);
     cudf::binary_operator op = static_cast<cudf::binary_operator>(int_op);
     std::unique_ptr<cudf::column> result =
-        cudf::experimental::binary_operation(*lhs, *rhs, op, n_data_type);
+        cudf::binary_operation(*lhs, *rhs, op, n_data_type);
     return reinterpret_cast<jlong>(result.release());
   }
   CATCH_STD(env, 0);
@@ -1142,7 +1142,7 @@ JNIEXPORT jlong JNICALL Java_ai_rapids_cudf_ColumnView_binaryOpVS(JNIEnv *env, j
 
     cudf::binary_operator op = static_cast<cudf::binary_operator>(int_op);
     std::unique_ptr<cudf::column> result =
-        cudf::experimental::binary_operation(*lhs, *rhs, op, n_data_type);
+        cudf::binary_operation(*lhs, *rhs, op, n_data_type);
     return reinterpret_cast<jlong>(result.release());
   }
   CATCH_STD(env, 0);
diff --git a/java/src/main/native/src/ScalarJni.cpp b/java/src/main/native/src/ScalarJni.cpp
index b43b32e6be..cd3f23beac 100644
--- a/java/src/main/native/src/ScalarJni.cpp
+++ b/java/src/main/native/src/ScalarJni.cpp
@@ -468,7 +468,7 @@ JNIEXPORT jlong JNICALL Java_ai_rapids_cudf_Scalar_binaryOpSV(JNIEnv *env, jclas
 
     cudf::binary_operator op = static_cast<cudf::binary_operator>(int_op);
     std::unique_ptr<cudf::column> result =
-        cudf::experimental::binary_operation(*lhs, *rhs, op, n_data_type);
+        cudf::binary_operation(*lhs, *rhs, op, n_data_type);
     return reinterpret_cast<jlong>(result.release());
   }
   CATCH_STD(env, 0);

@github-actions github-actions bot removed the Java Affects Java cuDF API. label Aug 16, 2021
@karthikeyann karthikeyann added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team 4 - Needs Review Waiting for reviewer to review or respond 4 - Needs cuDF (Java) Reviewer labels Aug 17, 2021
@github-actions github-actions bot removed the CMake CMake build issue label Aug 17, 2021
@karthikeyann
Copy link
Contributor Author

rerun tests

4 similar comments
@karthikeyann
Copy link
Contributor Author

rerun tests

@galipremsagar
Copy link
Contributor

rerun tests

@galipremsagar
Copy link
Contributor

rerun tests

@karthikeyann
Copy link
Contributor Author

rerun tests

@karthikeyann
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 332dedf into rapidsai:branch-21.10 Aug 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge breaking Breaking change feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Convert Binary Operations to use device-side type dispatch rather than JIT
10 participants