[SYCL][USM] Improve USM Allocator. #2026
Signed-off-by: James Brodman <[email protected]>
…locations. Signed-off-by: James Brodman <[email protected]>
Other than a few comments, the changes LGTM. Please, take a look at the failing allocator_shared test.
@jbrodman Please, take a look at the unexpected pass of allocator_equal with CUDA.
  return !((AllocKind == AllocKindU) && (One.MContext == Two.MContext) &&
           (One.MDevice == Two.MDevice));
}

private:
  constexpr size_t getAlignment() const {
    /*
How could an implementation do the right thing? The value_type isn't passed to aligned_alloc, and therefore the implementation doesn't know about the required alignment. Maybe the best solution would be to change line 29 to size_t Alignment = alignof(T).
0 is treated as "default - do something legal"
Yes, the template argument is fine as-is. What's not OK is that you only pass getAlignment to aligned_alloc, and it might be zero. If Alignment is 0 and T is over-aligned, there is no way for aligned_alloc to know.
You can either
- Change the template argument
- Uncomment the next few lines
- Or pass T to aligned_alloc
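To make the over-alignment concern concrete, here is a minimal sketch of one possible fallback: treat an Alignment of 0 as "use alignof(T)". This is plain host C++, not the PR's code. The names effective_alignment, Wide, and toy_aligned_alloc are hypothetical, and std::aligned_alloc only stands in for the USM allocation routine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not from the PR): choose the alignment that is
// actually handed to the byte-level allocation routine.
template <typename T, std::size_t Alignment = 0>
constexpr std::size_t effective_alignment() {
  // 0 means "no explicit request", so fall back to what T itself requires.
  return Alignment != 0 ? Alignment : alignof(T);
}

// An over-aligned type: a plain malloc-style allocation may not satisfy it.
struct alignas(64) Wide {
  float data[16];
};

// Assumed stand-in for an allocation entry point that only sees bytes.
inline void *toy_aligned_alloc(std::size_t alignment, std::size_t bytes) {
  // std::aligned_alloc expects the size to be a multiple of the alignment.
  std::size_t rounded = (bytes + alignment - 1) / alignment * alignment;
  return std::aligned_alloc(alignment, rounded);
}

// Allocates one Wide with the fallback alignment and checks the address.
inline bool wide_allocation_is_aligned() {
  void *p = toy_aligned_alloc(effective_alignment<Wide>(), sizeof(Wide));
  assert(p != nullptr);
  bool ok = reinterpret_cast<std::uintptr_t>(p) % alignof(Wide) == 0;
  std::free(p);
  return ok;
}
```

With this fallback, the type's own requirement reaches the allocation call even when no explicit alignment is given, while an explicit non-zero Alignment is passed through unchanged.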
…lications. Update tests Signed-off-by: James Brodman <[email protected]>
0 is valid for alignment: "just do the default thing". At the end of the day these get turned into byte-amount mallocs, and all type info is discarded. aligned_alloc(align = 0) is equivalent to just calling malloc.
My point is we shouldn't do that. By default it should have the same alignment as
So what's the change?
Options:
Not sure which option is better.
@jbrodman
There are too many problems with device allocations. Too many C++ allocator-isms just don't work, so there's a static_assert that fires at compile time if you try to use them. We have to disallow them with the allocator interface.
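A rough sketch of what such a compile-time guard could look like, in plain C++ without SYCL. The enum alloc_kind and the class toy_usm_allocator are hypothetical stand-ins, not the PR's actual types, and malloc is only a host-side placeholder for a USM allocation call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the USM allocation kinds; names are assumed.
enum class alloc_kind { host, device, shared };

template <typename T, alloc_kind Kind>
class toy_usm_allocator {
  // Allocator-aware containers construct and destroy elements on the host,
  // which does not work for memory that only the device can access.
  static_assert(Kind != alloc_kind::device,
                "toy_usm_allocator does not support device allocations");

public:
  using value_type = T;

  T *allocate(std::size_t n) {
    // Host-side placeholder; a real implementation would call a USM routine.
    void *p = std::malloc(n * sizeof(T));
    if (p == nullptr)
      throw std::bad_alloc();
    return static_cast<T *>(p);
  }
  void deallocate(T *p, std::size_t) { std::free(p); }
};

// Host allocations compile and work as usual.
inline int use_host_allocator() {
  toy_usm_allocator<int, alloc_kind::host> alloc;
  int *p = alloc.allocate(4);
  p[3] = 7;
  int v = p[3];
  alloc.deallocate(p, 4);
  return v;
}

// Uncommenting the next line would trigger the static_assert:
// toy_usm_allocator<int, alloc_kind::device> rejected;
```

The point of the static_assert is that misuse is reported when the allocator type is instantiated, rather than surfacing as a runtime fault when a container touches the memory.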
I see.
@jbrodman Could you please provide a final commit message? The text in the first comment looks outdated.
I hope to be able to come back to work on the CUDA support during the summer - i.e. this or next week...
@romanovvlad what's the best Github way to do that?
Can I use current PR description as a commit message?
Commit title: [SYCL][USM] Improve USM Allocator.
Sure?
…rogram
* upstream/sycl: (609 commits)
  [SYCL] Fix fail in the post commit testing (intel#2210)
  [SYCL] Materialize shadow local variables for byval arguments before use (intel#2200)
  [SYCL] Support lambda functions passed to reduction (intel#2190)
  [SYCL][USM] Improve USM Allocator. (intel#2026)
  [SYCL] Disallow mutable lambdas (intel#1785)
  [SYCL][ESIMD] Setup compilation pipeline for ESIMD (intel#2134)
  [SYCL] Fix not found kernel due to empty kernel name when using set_arg(s) (intel#2181)
  [SYCL] Fixed check for set_arg (intel#2203)
  Refactor indirect access calls to minimize invocations. (intel#2185)
  [SYCL][NFC] Fix potential null-pointer access (intel#2197)
  [SYCL] Propagate attributes from transitive calls to kernel (intel#1878)
  [SYCL] Fix warnings from static analysis tool (intel#2193)
  [SYCL][NFC] Fix ac_float test for compilation with FE optimizations (intel#2184)
  [GitHub Actions] Uplift clang-format version to 10 (intel#2194)
  [SYCL][ESIMD] Pass to replace simd* parameters with native llvm vectors. (intel#2097)
  [SYCL][NFC] Fixed SYCL_PI_TRACE output while selecting a device. (intel#2192)
  [SYCL][FPGA] New spec for controlling load-store units in FPGAs (intel#2158)
  [SYCL][Doc] Clarify reqd_sub_group_size (intel#2103)
  [SYCL] Remove noreturn function attribute (intel#2165)
  [SYCL] Aligned set_arg behaviour with SYCL specification (intel#2159)
  ...
Add ability to use std::allocate_shared.
Add equality operators for allocators.
Add tests.
Disallow device allocations in usm_allocator as there are too many incompatibilities with how C++ allocators are used.
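
For readers unfamiliar with what std::allocate_shared needs from an allocator, the following is a minimal sketch in plain C++, not the PR's implementation. toy_context and toy_allocator are hypothetical names, malloc stands in for a USM allocation, and equality here compares only a context pointer, whereas the diff in this PR also compares the allocation kind and the device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical handle standing in for the state a USM allocator carries.
struct toy_context {
  int id;
};

template <typename T>
class toy_allocator {
public:
  using value_type = T;

  explicit toy_allocator(const toy_context &ctx) : ctx_(&ctx) {}

  // Converting constructor: std::allocate_shared rebinds the allocator to an
  // internal control-block type, so this conversion has to exist.
  template <typename U>
  toy_allocator(const toy_allocator<U> &other) : ctx_(other.context()) {}

  T *allocate(std::size_t n) {
    void *p = std::malloc(n * sizeof(T));
    if (p == nullptr)
      throw std::bad_alloc();
    return static_cast<T *>(p);
  }
  void deallocate(T *p, std::size_t) { std::free(p); }

  const toy_context *context() const { return ctx_; }

private:
  const toy_context *ctx_;
};

// Allocators compare equal when memory from one can be freed by the other;
// in this sketch that means "same context".
template <typename T, typename U>
bool operator==(const toy_allocator<T> &a, const toy_allocator<U> &b) {
  return a.context() == b.context();
}
template <typename T, typename U>
bool operator!=(const toy_allocator<T> &a, const toy_allocator<U> &b) {
  return !(a == b);
}

// Object and control block are both obtained through the allocator.
inline int shared_value(const toy_context &ctx) {
  toy_allocator<int> alloc(ctx);
  std::shared_ptr<int> p = std::allocate_shared<int>(alloc, 42);
  return *p;
}
```

The rebinding constructor and the equality operators are the two pieces the standard library relies on here: the first lets it allocate its own bookkeeping type, the second lets it decide whether two allocator instances are interchangeable.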