Use KernelAbstractions.jl for gather/scatter kernels #487
Conversation
It may be that LLVM on Julia 1.6 is too old and we can't support all of the ops that we currently support for CUDA. As an alternative, we can retain (for now) NNlibCUDA scatter.
I've retained scatter kernels for CUDA in NNlibCUDA so that this PR is not blocked.
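For context, the appeal of the KernelAbstractions approach is that a single kernel definition covers every backend. A minimal sketch of that pattern, not NNlib's actual implementation (`gather_kernel!` and `gather_demo!` are illustrative names):

```julia
using KernelAbstractions

# One kernel source runs on CPU(), CUDABackend(), ROCBackend(), etc.
@kernel function gather_kernel!(dst, src, idx)
    i = @index(Global, Linear)      # one work-item per output element
    @inbounds dst[i] = src[idx[i]]  # fetch the indexed source element
end

function gather_demo!(dst, src, idx)
    backend = get_backend(dst)      # infer the backend from the output array
    gather_kernel!(backend)(dst, src, idx; ndrange = length(dst))
    KernelAbstractions.synchronize(backend)
    return dst
end
```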
FWIW the copy of NNlibCUDA in this repo doesn't do anything at present, so we need to make sure the overloads in the standalone NNlibCUDA repo still work.
IIUC, if NNlibCUDA in this repo is not registered, then tests at …
Not sure what's with CUDA on 1.8... |
Probably it was some kind of hiccup; the CI passes now.
Anything left before merging?
I don't think so; this should be good to go.
Depends on #486.
Successfully tested on:
- CPU
- CUDABackend
- ROCBackend

Since LLVM does not support certain atomic operations (fmin, fmax, fmul, fdiv), tests for them are disabled when using such backends, which is currently only ROCBackend:
- https://llvm.org/docs/LangRef.html#atomicrmw-instruction
- fmin/fmax requires LLVM 15+: https://reviews.llvm.org/D127041
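As a hedged illustration of why those ops matter: in a scatter reduction, several source elements can target the same destination slot concurrently, so each update must be an atomic read-modify-write, which lowers to LLVM's `atomicrmw` instruction. A sketch using Atomix.jl for the atomic update (`scatter_add_kernel!` and `scatter_add_demo!` are illustrative names, not NNlib's actual API):

```julia
using KernelAbstractions, Atomix

# Several source elements may map to the same destination index, so the
# accumulation must be atomic rather than a plain read-add-write.
@kernel function scatter_add_kernel!(dst, src, idx)
    i = @index(Global, Linear)
    Atomix.@atomic dst[idx[i]] += src[i]   # atomic accumulate into dst
end

# Launch on whatever backend the destination array lives on.
function scatter_add_demo!(dst, src, idx)
    backend = get_backend(dst)
    scatter_add_kernel!(backend)(dst, src, idx; ndrange = length(src))
    KernelAbstractions.synchronize(backend)
    return dst
end
```

Atomic `+` is widely supported, but the floating-point min/max/mul/div variants are the ones LLVM lacks on older versions, which is why those reduction tests are skipped on the affected backends.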
TODO:
- `@inbounds`

PR Checklist