Thank you for your contribution. I wanted to ask whether you have any methods to simulate concurrent execution on a GPU. NVIDIA provides three mechanisms for running applications concurrently: priority streams, time-slicing, and the Multi-Process Service (MPS). Is there a way to emulate this concurrency behavior in the simulator? If not, do you have any suggestions for approaching the problem?
priority streams: We support multiple streams. You can enable this by adding -gpgpu_concurrent_kernel_sm 1 to your config file. Note that kernels will only run concurrently if they are small enough: if one kernel is big enough to fill all SMs on its own, only that kernel will run. You can change this behavior by modifying select_kernel.
time-slicing: We don't support this. You would have to model context switching, which writes all register state back to memory.
MPS: We don't support this, but it would be easy to add: change the select_kernel function so that each kernel is issued to only a subset of SMs.
Thank you for getting back to me quickly! I have a question regarding simulating DNN inference and training using Accel-Sim. I would like to exclude the first few initial iterations, commonly referred to as "warm-up" iterations, from the final stats report. Is there a way to do this?