
Use KernelAbstractions.jl for gather/scatter kernels #487

Merged · 10 commits · Apr 16, 2023
Conversation

@pxl-th (Member) commented Apr 9, 2023

Depends on #486.
Successfully tested on CPU, CUDABackend, and ROCBackend.

Since LLVM's atomicrmw instruction does not support certain floating-point atomic operations (fmin, fmax, fmul, fdiv) on all targets, tests for them are disabled on affected backends, which is currently only ROCBackend:
https://llvm.org/docs/LangRef.html#atomicrmw-instruction

fmin/fmax require LLVM 15+: https://reviews.llvm.org/D127041
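For context, scatter kernels need atomics because several source elements may target the same destination slot. Below is a minimal, illustrative KernelAbstractions.jl sketch of a scatter-add (names and data are made up; this is not the PR's actual kernel). Addition lowers to LLVM's well-supported `atomicrmw (f)add`; it is the fmin/fmax/fmul/fdiv variants that are missing on older LLVM or some backends.

```julia
using KernelAbstractions

# Illustrative scatter-add: updates must be atomic because multiple
# indices can point at the same destination element.
@kernel function scatter_add!(dst, src, idx)
    i = @index(Global)
    @atomic dst[idx[i]] += src[i]
end

dst = zeros(Float32, 2)
src = Float32[1, 2, 3, 4]
idx = [1, 2, 1, 2]
scatter_add!(CPU(), 64)(dst, src, idx; ndrange=length(src))
KernelAbstractions.synchronize(CPU())
# dst is now Float32[4, 6]: dst[1] = 1 + 3, dst[2] = 2 + 4
```

Swapping `CPU()` for `CUDABackend()` or `ROCBackend()` is what makes the same kernel portable; an analogous kernel using `min`/`max`/`*`// is where the backend-specific atomic limitations show up.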

TODO

  • Add @inbounds.

PR Checklist

  • Tests are added
  • Documentation, if applicable

@pxl-th pxl-th marked this pull request as draft April 9, 2023 18:18
@pxl-th (Member Author) commented Apr 11, 2023

It may be that the LLVM version shipped with Julia 1.6 is too old, so we can't support all of the ops that we currently support for CUDA.
On 1.9-rc2 it fails for the - op, but all other ops work...

As an alternative, we can retain (for now) the NNlibCUDA scatter kernels.

@pxl-th pxl-th marked this pull request as ready for review April 11, 2023 13:20
@pxl-th (Member Author) commented Apr 11, 2023

I've retained the scatter kernels for CUDA in NNlibCUDA so that this PR is not blocked.
More importantly, other backends that rely on KernelAbstractions.jl will now be able to use embedding layers (forward + backward pass), which is probably the most common use case for gather/scatter.
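To illustrate the embedding use case, here is a small sketch using NNlib's public gather/scatter API (data and sizes are made up): the forward pass is a gather of columns from the weight table, and the backward pass scatter-adds the upstream gradient back into it.

```julia
using NNlib

W = reshape(Float32.(1:8), 2, 4)   # 2-dim embeddings for 4 "tokens"
tokens = [3, 1, 3]                 # token ids in a batch

# Forward pass: an embedding lookup is a gather of columns of W.
emb = NNlib.gather(W, tokens)      # 2×3 matrix

# Backward pass: scatter-add the upstream gradient into the table.
demb = ones(Float32, 2, 3)
dW = NNlib.scatter(+, demb, tokens; dstsize=size(W))
# Column 3 of dW accumulates two contributions (token 3 appears twice),
# which is exactly why the scatter kernel needs atomic addition.
```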

@ToucheSir (Member) commented
FWIW, the copy of NNlibCUDA in this repo doesn't do anything at present, so we need to make sure the overloads in the standalone NNlibCUDA repo still work.

@pxl-th (Member Author) commented Apr 11, 2023

> FWIW, the copy of NNlibCUDA in this repo doesn't do anything at present, so we need to make sure the overloads in the standalone NNlibCUDA repo still work.

If I understand correctly, since the NNlibCUDA in this repo is not registered, the `using NNlibCUDA` line in the tests will import the standalone repo.
I've also disabled the gather and scatter tests in the CUDA test suite, because we call them from the NNlibCUDA tests.

@pxl-th (Member Author) commented Apr 11, 2023

Not sure what's going on with CUDA on 1.8...
Will investigate in a bit.

@pxl-th (Member Author) commented Apr 11, 2023

It was probably some kind of hiccup; CI passes now.

@ToucheSir (Member) left a comment

Anything left before merging?

@pxl-th (Member Author) commented Apr 16, 2023

> Anything left before merging?

I think not; this should be good to go.
