When optimizing CUDA kernels, changing the data layout is a common way to enable coalesced memory access. That often boils down to transposing a multidimensional array along a pair of indices.
It would be handy to have a helper for this on the mdarray type to reduce boilerplate code.
Inspired by #926 (comment)
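To make the request concrete, here is a minimal host-side sketch of what such a helper might do. It assumes a dense row-major buffer stored in a `std::vector`; the name `transpose_dims` and its signature are hypothetical and not part of any existing mdarray API. It copies the data into a new row-major buffer with two chosen dimensions exchanged.

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (illustration only, not an existing mdarray method).
// Copies a dense row-major N-dimensional array into a new dense row-major
// buffer in which dimensions `a` and `b` are exchanged.
// Assumes src.size() equals the product of `extents`.
template <typename T, std::size_t N>
std::vector<T> transpose_dims(const std::vector<T>& src,
                              const std::array<std::size_t, N>& extents,
                              std::size_t a, std::size_t b,
                              std::array<std::size_t, N>& out_extents)
{
    out_extents = extents;
    std::swap(out_extents[a], out_extents[b]);

    // Row-major strides of the destination (last dimension is fastest).
    std::array<std::size_t, N> out_strides{};
    std::size_t stride = 1;
    for (std::size_t d = N; d-- > 0;) {
        out_strides[d] = stride;
        stride *= out_extents[d];
    }

    std::vector<T> dst(src.size());
    std::array<std::size_t, N> idx{};  // multi-index into src, starts at zero
    for (std::size_t flat = 0; flat < src.size(); ++flat) {
        // The destination index is the source index with a and b swapped.
        std::array<std::size_t, N> out_idx = idx;
        std::swap(out_idx[a], out_idx[b]);

        std::size_t out_flat = 0;
        for (std::size_t d = 0; d < N; ++d) {
            out_flat += out_idx[d] * out_strides[d];
        }
        dst[out_flat] = src[flat];

        // Advance the source multi-index in row-major order.
        for (std::size_t d = N; d-- > 0;) {
            if (++idx[d] < extents[d]) break;
            idx[d] = 0;
        }
    }
    return dst;
}
```

For example, calling `transpose_dims(data, {2, 3, 4}, 0, 2, new_extents)` would yield a buffer with extents `{4, 3, 2}`, so the index that was slowest-varying becomes the fastest-varying one. A device-side version would presumably do the same index permutation inside a kernel; this sketch only shows the index arithmetic.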