-
Notifications
You must be signed in to change notification settings - Fork 745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Support lambda functions passed to reduction #2190
[SYCL] Support lambda functions passed to reduction #2190
Conversation
Signed-off-by: Vyacheslav N Klochkov <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM
@@ -0,0 +1,71 @@ | |||
// UNSUPPORTED: cuda | |||
// OpenCL C 2.x alike work-group functions not yet supported by CUDA. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see a bit more clarification here. Are block reductions missing from CUDA itself or CUDA plugin does not expose them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Alex, I updated the comment by copying it from here: https://github.com/intel/llvm/blame/sycl/sycl/test/reduction/reduction_nd_s0_dw.cpp#L2 and adding a note about intel::reduce() that is not supported yet by CUDA.
I don't have any other details regarding CUDA support for that feature.
Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any other details regarding CUDA support for that feature.
It's implementable, but not in the plugin yet. CUDA doesn't have native support for equivalents to everything in the GroupAlgorithms extension, but it did recently add support for reductions as part of its cooperative group functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the link. The updated comment is explanatory enough, I think:
// Reductions use work-group builtins (e.g. intel::reduce()) not yet supported
// by CUDA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely agree. I was just providing some more background for @alexbatashev.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from the failing test, LGTM.
…g lambdas for reduction Signed-off-by: Vyacheslav N Klochkov <[email protected]>
Signed-off-by: Vyacheslav N Klochkov <[email protected]>
John, thank you for the quick code-review! I fixed the test. |
…rogram * upstream/sycl: (609 commits) [SYCL] Fix fail in the post commit testing (intel#2210) [SYCL] Materialize shadow local variables for byval arguments before use (intel#2200) [SYCL] Support lambda functions passed to reduction (intel#2190) [SYCL][USM] Improve USM Allocator. (intel#2026) [SYCL] Disallow mutable lambdas (intel#1785) [SYCL][ESIMD] Setup compilation pipeline for ESIMD (intel#2134) [SYCL] Fix not found kernel due to empty kernel name when using set_arg(s) (intel#2181) [SYCL] Fixed check for set_arg (intel#2203) Refactor indirect access calls to minimize invocations. (intel#2185) [SYCL][NFC] Fix potential null-pointer access (intel#2197) [SYCL] Propagate attributes from transitive calls to kernel (intel#1878) [SYCL] Fix warnings from static analysis tool (intel#2193) [SYCL][NFC] Fix ac_float test for compilation with FE optimizations (intel#2184) [GitHub Actions] Uplift clang-format version to 10 (intel#2194) [SYCL][ESIMD] Pass to replace simd* parameters with native llvm vectors. (intel#2097) [SYCL][NFC] Fixed SYCL_PI_TRACE output while selecting a device. (intel#2192) [SYCL][FPGA] New spec for controlling load-store units in FPGAs (intel#2158) [SYCL][Doc] Clarify reqd_sub_group_size (intel#2103) [SYCL] Remove noreturn function attribute (intel#2165) [SYCL] Aligned set_arg behaviour with SYCL specification (intel#2159) ...
Signed-off-by: Vyacheslav N Klochkov [email protected]