You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
This is an early version of the integration, it doesn't have padding support, the file is placed under test directory. Opening a PR so that we can continue the design discussion with actual code we can reference.
For `mdspan`, it's currently using a fork of mine, but in general, the required changes are trivial so it's directly integrated as a cmake module. Padding will add some more complexity, but I think we can implement it as custom layouts.
For `mdarray`, due to the `operator()`, copy constructor, and CUDA stream, I copied the reference implementation with some substantial changes. (need to add license).
- For the `operator()`, I have created a `device_reference` that's a simplified version of `thrust::device_reference` but supports the CUDA stream.
- For supporting CUDA stream, I have removed most of the constructors to make sure there's a valid stream being passed into mdarray. Also, a new container policy is required to store the stream.
- For the `extent` template parameter, I have changed the signature to match `mdspan`.
- To integrate rmm allocator, I made a thin wrapper over `rmm::device_uvector` with a custom policy.
- For the container policy, the ref implementation reuses `container_policy` for `mdspan::accessor_policy`. This is done by having multiple overloads of `access` methods, including `reference access(container const& c)` and `reference access(pointer ptr)`, and the latter is a host device function. I have split it up into 2 policies as the container policy might have unwanted states for kernels like CUDA stream.
Lastly, I have added host and device varients of both mdarray and mdspan.
@cjnolet@divyegala@achirkin
Related:
- #408
- #423
Todos:
- [ ] Finalize the design doc.
- [x] Add host/device accessor policy.
- [x] Add view type.
- [x] Change default accessor.
- [ ] Add padding support.
- [ ] Upstream changes to mdspan.
- [x] Add more tests.
Authors:
- Jiaming Yuan (https://github.com/trivialfis)
- Corey J. Nolet (https://github.com/cjnolet)
Approvers:
- AJ Schmidt (https://github.com/ajschmidt8)
- Corey J. Nolet (https://github.com/cjnolet)
URL: #437
No description provided.
The text was updated successfully, but these errors were encountered: