-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable compiled binary ops in libcudf, python and java #8741
Enable compiled binary ops in libcudf, python and java #8741
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran through all of the Spark tests with this and they all passed. I deleted the JIT cache before running them and it was not recreated. The effectively removes all JIT from the Spark plugin.
cpp/include/cudf/detail/binaryop.hpp
Outdated
* @copydoc cudf::experimental::binary_operation(scalar const&, column_view const&, binary_operator, | ||
* data_type, rmm::mr::device_memory_resource *) | ||
* | ||
* @param stream CUDA stream used for device memory operations and kernel launches. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parameter order here is incorrect. Should be stream, mr
Same applies for the other 3 declarations in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copydoc
is used at many places in our cpp documentation to avoid duplication of doc text, most often used in detail APIs. There is no cleaner way so far.
Note: Alternatively, we could contribute to doxygen to reorder the @param
as listed in the function signature. Timeline: Uncertain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copydoc
is used at many places in our cpp documentation to avoid duplication of doc text, most often used in detail APIs. There is no cleaner way so far.
Right, but isn't this going to result in incorrect documentation? My expectation is that you will end up getting something like:
* @param lhs The left operand scalar
* @param rhs The right operand column
* @param op The binary operator
* @param output_type The desired data type of the output column
* @param mr Device memory resource used to allocate the returned column's device memory
* @param stream stream CUDA stream used for device memory operations and kernel launches.
* @return Output column of `output_type` type containing the result of
* the binary operation
* @throw cudf::logic_error if @p output_type dtype isn't fixed-width
...which is the incorrect order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not better too.
(I generated this temporarily to show. it's not in released documentation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
detail APIs are not in released libcudf documentation. They are excluded by doxygen.
@revans2 Great. |
Codecov Report
@@ Coverage Diff @@
## branch-21.10 #8741 +/- ##
===============================================
Coverage ? 10.73%
===============================================
Files ? 114
Lines ? 19058
Branches ? 0
===============================================
Hits ? 2046
Misses ? 17012
Partials ? 0 Continue to review full report at Codecov.
|
Do we need to formally deprecate anything? Will dropping the experimental namespace be a breaking change? |
No deprecation needed. Dropping experimental means replacing jit with compiled. Few operator + output type combinations will not work. It will break for these few combinations. There are no changes in API interface. |
Looks like there is a repartition error specific to this PR |
rerun tests |
@karthikeyann since this got delayed to 21.10, should we go ahead and do this as part of this PR? Or are you concerned that we need to test it in the field as "experimental" first? CC @jrhemstad for another opinion. |
will do this as part of this PR. |
You forgot to delete the experimental namespace from the JNI code. After that it looks good to me. diff --git a/java/src/main/native/src/ColumnViewJni.cpp b/java/src/main/native/src/ColumnViewJni.cpp
index 4bd1ac06f3..e9d427cb54 100644
--- a/java/src/main/native/src/ColumnViewJni.cpp
+++ b/java/src/main/native/src/ColumnViewJni.cpp
@@ -1110,7 +1110,7 @@ JNIEXPORT jlong JNICALL Java_ai_rapids_cudf_ColumnView_binaryOpVV(JNIEnv *env, j
cudf::data_type n_data_type = cudf::jni::make_data_type(out_dtype, scale);
cudf::binary_operator op = static_cast<cudf::binary_operator>(int_op);
std::unique_ptr<cudf::column> result =
- cudf::experimental::binary_operation(*lhs, *rhs, op, n_data_type);
+ cudf::binary_operation(*lhs, *rhs, op, n_data_type);
return reinterpret_cast<jlong>(result.release());
}
CATCH_STD(env, 0);
@@ -1142,7 +1142,7 @@ JNIEXPORT jlong JNICALL Java_ai_rapids_cudf_ColumnView_binaryOpVS(JNIEnv *env, j
cudf::binary_operator op = static_cast<cudf::binary_operator>(int_op);
std::unique_ptr<cudf::column> result =
- cudf::experimental::binary_operation(*lhs, *rhs, op, n_data_type);
+ cudf::binary_operation(*lhs, *rhs, op, n_data_type);
return reinterpret_cast<jlong>(result.release());
}
CATCH_STD(env, 0);
diff --git a/java/src/main/native/src/ScalarJni.cpp b/java/src/main/native/src/ScalarJni.cpp
index b43b32e6be..cd3f23beac 100644
--- a/java/src/main/native/src/ScalarJni.cpp
+++ b/java/src/main/native/src/ScalarJni.cpp
@@ -468,7 +468,7 @@ JNIEXPORT jlong JNICALL Java_ai_rapids_cudf_Scalar_binaryOpSV(JNIEnv *env, jclas
cudf::binary_operator op = static_cast<cudf::binary_operator>(int_op);
std::unique_ptr<cudf::column> result =
- cudf::experimental::binary_operation(*lhs, *rhs, op, n_data_type);
+ cudf::binary_operation(*lhs, *rhs, op, n_data_type);
return reinterpret_cast<jlong>(result.release());
}
CATCH_STD(env, 0); |
rerun tests |
4 similar comments
rerun tests |
rerun tests |
rerun tests |
rerun tests |
@gpucibot merge |
closes #7801
cudf::binary_operation
calls compiled binary ops.cudf::jit::binary_operation
calls jit binary opsSo, compiled binary ops is called in libcudf (groupby, rescale), python (binary ops) and java (binary ops)
Breaking change:
New: Logical and Comparison operators can have output type to be only bool type.
Old: Logical operators can have integer or any other output type that can be constructed from bool type. Comparison operators required bool type only.
In this release (21.10),
experimental
namespace is dropped, and compiled binary ops replaces jit binary ops in libcudf, except for user defined op.