
Need to rethink how we use CUDA backend in Spatter #51

Closed
gshipman opened this issue Sep 8, 2023 · 0 comments · Fixed by #74

gshipman commented Sep 8, 2023

Currently the Spatter CUDA backend assumes that the pattern array can be partitioned all the way down to a single thread block (a size of 1K elements; with 8-byte elements, that is 8 KB), so that each thread block can keep its portion of the pattern array in shared memory. The target/source buffer remains in main memory. This doesn't match the Flag and xRAGE use cases: we need an option for the pattern array to be main-memory resident and shared by all thread blocks. We also need to ensure that scatters use atomics to avoid races on writes.
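
A minimal sketch of what that option might look like, assuming a standalone scatter kernel rather than Spatter's actual backend code: the pattern array stays in global (main) memory and is read directly by every thread block, with no per-block shared-memory staging, and the scatter writes go through atomicExch so concurrent writes to the same target element don't race. The kernel name and parameters (scatter_global_pattern, pattern_len, delta, count) are illustrative, not part of Spatter's API.

```cuda
#include <cstddef>

// Hypothetical sketch: the pattern buffer is main-memory resident and shared
// by all thread blocks; scatter writes use a 64-bit atomicExch to avoid races.
__global__ void scatter_global_pattern(double *target,
                                       const double *source,
                                       const size_t *pattern,
                                       size_t pattern_len,
                                       size_t delta,
                                       size_t count)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    size_t total = pattern_len * count;
    if (i >= total)
        return;

    size_t iter = i / pattern_len; // which repetition of the pattern
    size_t j    = i % pattern_len; // index within the pattern

    // Pattern is read directly from global memory, so every thread block
    // shares the same pattern buffer instead of keeping a private copy in
    // shared memory.
    size_t idx = pattern[j] + iter * delta;

    // atomicExch on the 64-bit payload prevents races when multiple threads
    // scatter to the same target element.
    atomicExch(reinterpret_cast<unsigned long long *>(&target[idx]),
               (unsigned long long)__double_as_longlong(source[i]));
}
```

The existing shared-memory path could presumably remain the default, with this main-memory-resident pattern mode exposed as an opt-in setting for the Flag/xRAGE-style cases.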
