
Add cuda::mr::cuda_memory_resource #1513

Closed
wants to merge 15 commits

Conversation

@miscco (Collaborator) commented Mar 7, 2024

This introduces a new memory resource that allocates device memory through cudaMalloc / cudaFree

Fixes #1512
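
For readers unfamiliar with the cuda::mr resource concepts, a stand-alone sketch of what such a cudaMalloc / cudaFree wrapper might look like is shown below. The class name, error handling, and default alignment are illustrative assumptions and do not reproduce the exact implementation in this PR.

#include <cuda_runtime_api.h>
#include <cstddef>
#include <new>

// Hypothetical minimal resource: every allocation goes straight to cudaMalloc,
// every deallocation straight to cudaFree.
struct cuda_malloc_resource_sketch
{
  void* allocate(std::size_t bytes, std::size_t alignment = 256) const
  {
    void* ptr = nullptr;
    if (::cudaMalloc(&ptr, bytes) != cudaSuccess)
    {
      throw std::bad_alloc{}; // assumption: report allocation failure via bad_alloc
    }
    return ptr;
  }

  void deallocate(void* ptr, std::size_t, std::size_t = 256) const noexcept
  {
    ::cudaFree(ptr); // errors from cudaFree are ignored in this sketch
  }

  // A stateless resource: any two instances compare equal.
  bool operator==(const cuda_malloc_resource_sketch&) const noexcept { return true; }
  bool operator!=(const cuda_malloc_resource_sketch& other) const noexcept { return !(*this == other); }
};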

@miscco miscco requested review from a team as code owners March 7, 2024 19:13
@miscco (Collaborator, Author) commented Mar 7, 2024

@harrism you might be interested in this

@miscco miscco force-pushed the cuda_memory_resource branch from aae814f to 21ce157 Compare March 7, 2024 19:26
@miscco miscco force-pushed the cuda_memory_resource branch from 21ce157 to 98485fb Compare March 7, 2024 19:28
@miscco miscco force-pushed the cuda_memory_resource branch 4 times, most recently from a76e390 to 842943e Compare March 11, 2024 11:39
@harrism (Contributor) left a comment

Nice addition. I didn't review the refactoring, only the new cuda_memory_resource. I think the refactoring is just splitting into multiple files, and that should be clarified in the PR description. It would have been nice to separate these into separate PRs.

Most of my suggestions are doc clarifications and fixes.

*/
void* allocate(const size_t __bytes, const size_t __alignment) const
{
  _LIBCUDACXX_ASSERT(__alignment <= 256 && (256 % __alignment == 0),
Contributor:

suggestion: I think allocation should throw, not assert. Also note that this assertion/error means that the alignment parameter is NOT ignored.

Contributor:

Question: Just because cudaMalloc aligns to 256 (or possibly 512, even though it is documented at 256) doesn't mean an MR can't support larger alignments. I suppose users who really want that can write a wrapping resource that aligns, but is there another reason we don't support it here? It seems a bit strange that cuda::mr has this parameter, yet our most basic implementation semi-ignores it and adds another method without it that is not part of the concept.

Collaborator (author):

I added an exception / trap when we encounter an invalid alignment. I also merged the two allocate functions via a default argument.
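
Roughly, the merged signature described here might look like the following sketch, assuming 256 is the guaranteed cudaMalloc alignment; the constant name and exception type are illustrative, not the exact code in this PR.

#include <cuda_runtime_api.h>
#include <cstddef>
#include <new>
#include <stdexcept>

inline constexpr std::size_t default_cuda_malloc_alignment = 256; // assumed constant name

void* allocate(std::size_t bytes, std::size_t alignment = default_cuda_malloc_alignment)
{
  // cudaMalloc only guarantees 256-byte alignment, so reject any request it cannot
  // satisfy (larger than 256, or not a divisor of 256).
  if (alignment > default_cuda_malloc_alignment || default_cuda_malloc_alignment % alignment != 0)
  {
    throw std::invalid_argument{"Invalid alignment passed to cuda_memory_resource::allocate."};
  }

  void* ptr = nullptr;
  if (::cudaMalloc(&ptr, bytes) != cudaSuccess)
  {
    throw std::bad_alloc{};
  }
  return ptr;
}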

@jrhemstad (Collaborator), Mar 12, 2024:

> Seems a bit strange that cuda::mr has this parameter and our most basic implementation semi-ignores it and adds another method without it that is not part of the concept.

The existence of the alignment parameter does not mean all possible alignment values are valid. Each resource implementation will have its own limitations and this includes alignment.

cuda_memory_resource is intended to be the simplest possible wrapper on top of cudaMalloc, and so it will inherit all the limitations of cudaMalloc, including alignment. So a user requesting an alignment up to 256 is fine, but beyond that, we throw.

We could add an additional resource that allows for larger alignments (or an adapter!), but that would no longer be a trivial wrapper on top of cudaMalloc.
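
For illustration, the kind of aligning adapter alluded to here could over-allocate from an upstream resource and round the returned pointer up to the requested alignment. The bookkeeping below (a host-side map from aligned pointers to original pointers) is purely a sketch, not how a real CCCL adapter would be implemented.

#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical adapter: satisfies alignments larger than the upstream resource
// supports by over-allocating and aligning the returned pointer.
template <class Upstream>
class aligning_adapter_sketch
{
  Upstream* upstream_;
  std::unordered_map<void*, void*> original_; // aligned pointer -> original pointer

public:
  explicit aligning_adapter_sketch(Upstream& upstream) : upstream_(&upstream) {}

  void* allocate(std::size_t bytes, std::size_t alignment)
  {
    // Over-allocate so that some address inside the block has the requested alignment.
    void* raw = upstream_->allocate(bytes + alignment);
    const auto addr    = reinterpret_cast<std::uintptr_t>(raw);
    const auto aligned = (addr + alignment - 1) / alignment * alignment;
    void* result       = reinterpret_cast<void*>(aligned);
    original_[result]  = raw; // not thread-safe; a real adapter would need synchronization
    return result;
  }

  void deallocate(void* ptr, std::size_t bytes, std::size_t alignment)
  {
    const auto it = original_.find(ptr);
    upstream_->deallocate(it->second, bytes + alignment);
    original_.erase(it);
  }
};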

@miscco miscco force-pushed the cuda_memory_resource branch from 8cfde90 to 64088dd Compare March 12, 2024 16:38
@miscco miscco requested a review from a team as a code owner March 12, 2024 16:38
@copy-pr-bot copy-pr-bot bot force-pushed the pull-request/1532 branch from bbd9e91 to 16f9082 Compare March 12, 2024 16:38
Comment on lines 85 to 88:
// We need to ensure that the provided alignment matches the minimal provided alignment
_LIBCUDACXX_ASSERT(
  __default_cuda_malloc_alignment <= __alignment && (__alignment % __default_cuda_malloc_alignment == 0),
  "Invalid alignment passed to cuda_memory_resource::deallocate.");
Collaborator:

suggestion: I think this check is unnecessary. It's illegal to ever call deallocate(p, N, M) for p that was not returned by p = allocate(N, M). Therefore, we would have already checked the alignment during the allocate call.

Collaborator (author):

There is no guarantee that a user matches allocate and deallocate correctly.

This is just a tiny bit safer than having no check, and it does not hurt.
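
For context, the contract being discussed is that deallocate must receive exactly the arguments that produced the pointer. A small usage sketch follows; the header and resource name are taken from this PR, the rest is illustrative.

#include <cuda/memory_resource> // header assumed from this PR

void usage_example()
{
  cuda::mr::cuda_memory_resource resource{};

  void* ptr = resource.allocate(1024, 256); // 1024 bytes with 256-byte alignment
  // ... use the device memory ...
  resource.deallocate(ptr, 1024, 256);      // must match the size and alignment passed to allocate

  // Calling deallocate with mismatched arguments, e.g. resource.deallocate(ptr, 512, 128),
  // is a precondition violation; the assertion above merely catches some such mistakes.
}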

*
* @param __lhs The cuda_memory_resource
* @param __rhs The resource to compare to
* @return Comparison of both resources converted to a resource_ref<>
Contributor:

Suggestion: The semantics are pretty simple, can we just spell it out something like the following (or whatever the incantation would be)?

Suggested change:
- * @return Comparison of both resources converted to a resource_ref<>
+ * @return true if both arguments are convertible to resource_ref<>(cuda_memory_resource&), false otherwise

Collaborator (author):

I went with: "Result of equality comparison of both resources converted to a resource_ref<>"

Contributor:

OK, but then the reader has to go dig up docs on the semantics of comparison of refs. Doesn't that just compare the referenced object types? Is there a section in the docs on equality comparison that you could link to?

std::pmr::memory_resource and RMM (current) memory resources have very different comparison semantics, so I think this should be clearly documented.
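
To make the semantics concrete, here is a small sketch of what the comparison is expected to do, based on the discussion above rather than on final documentation: since cuda_memory_resource holds no state, any two instances compare equal, and the documented operator converts both sides to resource_ref<> before comparing.

#include <cuda/memory_resource> // header assumed from this PR

void comparison_example()
{
  cuda::mr::cuda_memory_resource a{};
  cuda::mr::cuda_memory_resource b{};

  // Expected to be true: both resources are stateless wrappers around cudaMalloc / cudaFree.
  const bool equal_directly = (a == b);

  // Comparing through resource_ref<> should yield the same result as the direct comparison.
  cuda::mr::resource_ref<> ref_a{a};
  cuda::mr::resource_ref<> ref_b{b};
  const bool equal_via_ref = (ref_a == ref_b);

  (void) equal_directly;
  (void) equal_via_ref;
}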

@harrism (Contributor) commented Mar 13, 2024

Typo in the PR title (the hyphen should be an underscore).

@miscco miscco changed the title Add cuda::mr::cuda_memory-resource Add cuda::mr::cuda_memory_resource Mar 13, 2024
@copy-pr-bot copy-pr-bot bot force-pushed the pull-request/1532 branch from 7674e55 to b65158b Compare April 2, 2024 10:02
@miscco miscco force-pushed the cuda_memory_resource branch 2 times, most recently from eee001b to 318af5c Compare April 2, 2024 12:46
@miscco miscco force-pushed the cuda_memory_resource branch from 318af5c to 14ddff1 Compare April 2, 2024 13:08
@copy-pr-bot copy-pr-bot bot deleted the branch NVIDIA:pull-request/1532 April 2, 2024 18:31
@copy-pr-bot copy-pr-bot bot closed this Apr 2, 2024
Successfully merging this pull request may close these issues: [FEA]: Implement a memory_resource using cudaMalloc and cudaFree