This draft PR adds snapshot tests. I am incrementally rewriting operations as KernelAbstractions.jl kernels, and I use these tests to ensure that outputs stay consistent at temperature 0.
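For readers unfamiliar with the idea, a snapshot test can be sketched roughly as below. This is only an illustration, not the code in this PR: `matches_snapshot` and `fake_generate` are hypothetical names, and `fake_generate` is a stand-in for deterministic (temperature 0) generation.

```julia
using Test

# Compare `actual` against the snapshot stored at `path`.
# If the file does not exist yet (or `update = true`), record `actual` instead.
# Assumes the parent directory of `path` already exists.
function matches_snapshot(path::AbstractString, actual::AbstractString; update::Bool = false)
    if update || !isfile(path)
        write(path, actual)
        return true
    end
    return read(path, String) == actual
end

# Hypothetical stand-in for greedy generation; a real test would call the model.
fake_generate(prompt) = string(prompt, " there was a model.")

mktempdir() do dir
    path = joinpath(dir, "story.txt")
    @test matches_snapshot(path, fake_generate("Once upon a time"))  # first run records
    @test matches_snapshot(path, fake_generate("Once upon a time"))  # second run compares
    @test !matches_snapshot(path, "some other output")               # a mismatch is detected
end
```

Because sampling at temperature 0 is deterministic, any change in the recorded text after a kernel rewrite points to a numerical or logic difference.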
The code does not run on GPUs yet; for that, I think a method that moves the weights to a chosen backend would be needed.
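Such a method might look roughly like the sketch below. `to_backend` is a hypothetical helper, not something in this PR; it relies only on `KernelAbstractions.allocate` and `copyto!`, and is exercised here on the `CPU()` backend. A GPU backend object (for example from CUDA.jl) is assumed to work the same way but is not tested here.

```julia
using KernelAbstractions

# Hypothetical helper: copy an array of weights to the given backend.
function to_backend(backend, w::AbstractArray{T}) where {T}
    dst = KernelAbstractions.allocate(backend, T, size(w)...)  # uninitialized array on `backend`
    copyto!(dst, w)                                            # host-to-backend copy
    return dst
end

w = rand(Float32, 4, 3)
w_dev = to_backend(CPU(), w)  # on the CPU backend this is a plain copy
```

A full version would apply this to every weight array in the model, which is where something like Adapt.jl might be a better fit.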
Right now, I just want to write correct kernels that run on the CPU and produce outputs consistent with the original version.
Here is a kernel for `rmsnorm`: DhruvDh@1d49d42. It makes the whole thing some 2-5% slower but produces correct outputs. This way, we can add kernels for operations incrementally, then look at the results and decide whether GPU acceleration is needed, how to maintain readability, etc. I can also keep a draft PR open where we do this incremental addition of kernels.
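For context, an RMS-norm kernel in KernelAbstractions.jl could be sketched as follows. This is not the code from the commit above; the names `rmsnorm_kernel!` and `rmsnorm!` and the `eps = 1f-5` default are assumptions, and the reduction is done with a plain `sum` outside the kernel to keep the sketch short.

```julia
using KernelAbstractions

# Element-wise part of RMS norm: o[i] = weight[i] * (scale * x[i]).
@kernel function rmsnorm_kernel!(o, @Const(x), @Const(weight), scale)
    i = @index(Global)
    @inbounds o[i] = weight[i] * (scale * x[i])
end

# Hypothetical wrapper: compute the scale, then launch the kernel on o's backend.
function rmsnorm!(o, x, weight; eps = 1f-5)
    T = eltype(x)
    # scale = 1 / sqrt(mean(x .^ 2) + eps)
    scale = T(1 / sqrt(sum(abs2, x) / length(x) + eps))
    backend = KernelAbstractions.get_backend(o)
    rmsnorm_kernel!(backend)(o, x, weight, scale; ndrange = length(o))
    KernelAbstractions.synchronize(backend)
    return o
end

x = Float32[1, 2, 3, 4]
weight = ones(Float32, 4)
o = similar(x)
rmsnorm!(o, x, weight)
```

Since `get_backend` picks the backend from the output array, the same function should run unchanged once the arrays live on a GPU backend.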
I am also having trouble with formatting. If a `.JuliaFormatter.toml` could be added to the project, that would help a lot.