
Need to rethink how we use CUDA backend in Spatter #51

Closed
gshipman opened this issue Sep 8, 2023 · 0 comments · Fixed by #74

gshipman commented Sep 8, 2023

Currently the Spatter CUDA backend assumes that the pattern array can be partitioned all the way down to a single thread block (a size of 1K elements; with 8-byte elements, that is 8 KB), so that each thread block can keep its portion of the pattern array in shared memory. The target/source buffer remains in main memory. This doesn't match the Flag and xRAGE use cases: we need an option for the pattern array to be main-memory resident and shared by all thread blocks. We also need to ensure that scatters use atomics to avoid races on writes.
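
A minimal sketch of what that option might look like, assuming a standalone scatter kernel rather than Spatter's actual backend code: the pattern array stays in global (main) memory and is read directly by every thread block, with no per-block shared-memory staging, and the scatter writes go through atomicExch so concurrent writes to the same target element don't race. The kernel name and parameters (scatter_global_pattern, pattern_len, delta, count) are illustrative, not part of Spatter's API.

```cuda
#include <cstddef>

// Hypothetical sketch: the pattern buffer is main-memory resident and shared
// by all thread blocks; scatter writes use a 64-bit atomicExch to avoid races.
__global__ void scatter_global_pattern(double *target,
                                       const double *source,
                                       const size_t *pattern,
                                       size_t pattern_len,
                                       size_t delta,
                                       size_t count)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    size_t total = pattern_len * count;
    if (i >= total)
        return;

    size_t iter = i / pattern_len; // which repetition of the pattern
    size_t j    = i % pattern_len; // index within the pattern

    // Pattern is read directly from global memory, so every thread block
    // shares the same pattern buffer instead of keeping a private copy in
    // shared memory.
    size_t idx = pattern[j] + iter * delta;

    // atomicExch on the 64-bit payload prevents races when multiple threads
    // scatter to the same target element.
    atomicExch(reinterpret_cast<unsigned long long *>(&target[idx]),
               (unsigned long long)__double_as_longlong(source[i]));
}
```

The existing shared-memory path could presumably remain the default, with this main-memory-resident pattern mode exposed as an opt-in setting for the Flag/xRAGE-style cases.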
