Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][USM] Improve USM Allocator. #2026

Merged
merged 16 commits into from
Jul 29, 2020
Merged

[SYCL][USM] Improve USM Allocator. #2026

merged 16 commits into from
Jul 29, 2020

Conversation

jbrodman
Copy link
Contributor

@jbrodman jbrodman commented Jul 1, 2020

Add ability to use std::allocate_shared.
Add equality operators for allocators.
Add tests.

Disallow device allocations in usm_allocator as there are too many incompatibilities with how C++ allocators are used.

@jbrodman jbrodman requested a review from a team as a code owner July 1, 2020 15:54
@jbrodman jbrodman requested a review from sergey-semenov July 1, 2020 15:54
jbrodman added 2 commits July 1, 2020 12:00
Signed-off-by: James Brodman <[email protected]>
Copy link
Contributor

@sergey-semenov sergey-semenov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than a few comments, the changes LGTM. Please, take a look at the failing allocator_shared test.

sycl/test/usm/allocator_equal.cpp Show resolved Hide resolved
sycl/include/CL/sycl/usm/usm_allocator.hpp Show resolved Hide resolved
sycl/test/usm/allocator_equal.cpp Outdated Show resolved Hide resolved
sycl/test/usm/allocator_shared.cpp Outdated Show resolved Hide resolved
sycl/test/usm/allocator_shared.cpp Show resolved Hide resolved
sergey-semenov
sergey-semenov previously approved these changes Jul 13, 2020
@sergey-semenov
Copy link
Contributor

@jbrodman Please, take a look at the unexpected pass of allocator_equal with CUDA

sycl/include/CL/sycl/usm/usm_allocator.hpp Outdated Show resolved Hide resolved
return !((AllocKind == AllocKindU) && (One.MContext == Two.MContext) &&
(One.MDevice == Two.MDevice));
}

private:
constexpr size_t getAlignment() const {
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could an implementation do the right thing? The value_type isn't passed to aligned_alloc and therefore the implementation doesn't know about the required alignment. Maybe the best solution would be if line 29 would be changed to size_t Alignment = alignof(T).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 is treated as "default - do something legal"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the template argument is fine as-is. What's not OK is that you only pass getAlignment to aligned_alloc and it might be zero. Alignment is 0 and T is over-aligned there is no way for aligned_alloc to know.

You can either

  • Change the template argument
  • Uncomment the next few lines
  • Or pass T to aligned_alloc

sycl/include/CL/sycl/usm/usm_allocator.hpp Outdated Show resolved Hide resolved
sycl/include/CL/sycl/usm/usm_allocator.hpp Outdated Show resolved Hide resolved
jbrodman added 2 commits July 15, 2020 16:15
Signed-off-by: James Brodman <[email protected]>
@jbrodman jbrodman requested a review from sergey-semenov July 15, 2020 20:27
sycl/include/CL/sycl/usm/usm_allocator.hpp Outdated Show resolved Hide resolved
sycl/include/CL/sycl/usm/usm_allocator.hpp Show resolved Hide resolved
return !((AllocKind == AllocKindU) && (One.MContext == Two.MContext) &&
(One.MDevice == Two.MDevice));
}

private:
constexpr size_t getAlignment() const {
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the template argument is fine as-is. What's not OK is that you only pass getAlignment to aligned_alloc and it might be zero. Alignment is 0 and T is over-aligned there is no way for aligned_alloc to know.

You can either

  • Change the template argument
  • Uncomment the next few lines
  • Or pass T to aligned_alloc

@jbrodman
Copy link
Contributor Author

0 is valid for alignment - "just do the default thing" - at the end of the day these get turned into byte amt mallocs - all type info is discarded.

Aligned_alloc( align = 0) is equivalent to just calling malloc.

Signed-off-by: James Brodman <[email protected]>
sergey-semenov
sergey-semenov previously approved these changes Jul 17, 2020
@rolandschulz
Copy link
Contributor

0 is valid for alignment - "just do the default thing" - at the end of the day these get turned into byte amt mallocs - all type info is discarded.

Aligned_alloc( align = 0) is equivalent to just calling malloc.

My point is we shouldn't do that. By default it should have the same alignment as std::alloc or new. They both use the alignment required by the type as the default alignment. That way a user can use this allocator for an over-aligned type and the default correctly works.

@jbrodman
Copy link
Contributor Author

0 is valid for alignment - "just do the default thing" - at the end of the day these get turned into byte amt mallocs - all type info is discarded.
Aligned_alloc( align = 0) is equivalent to just calling malloc.

My point is we shouldn't do that. By default it should have the same alignment as std::alloc or new. They both use the alignment required by the type as the default alignment. That way a user can use this allocator for an over-aligned type and the default correctly works.

So what's the change?
Change Alignment = 0 to Alignment = alignof(T)?

@rolandschulz
Copy link
Contributor

So what's the change?
Change Alignment = 0 to Alignment = alignof(T)?

Options:

  1. Alignment = 0 to Alignment = alignof(T) for default template argument. I think this requires also a change to rebind: typedef usm_allocator<U, AllocKind, max(Alignment,alignof(U))> other;
  2. Change getAlignment to return ((Alignment == 0)? alignof(value_type) : Alignment);

Not sure which option is better.

@romanovvlad
Copy link
Contributor

@jbrodman
With this patch, what happens if a user tries to use usm_allocator which allocates device type of memory for containers like std::vector? Could you please check that the following works or gives some diagnostic:

  cl::sycl::queue Q{};
  using USMAllocator= cl::sycl::usm_allocator<int, cl::sycl::usm::alloc::device>;
  USMAllocator Allocator(Q.get_context(), Q.get_device());
  std::vector<int, USMAllocator> Vec1(/*Size=*/1, Allocator);
  std::vector<int, USMAllocator> Vec2(/*Size=*/1, /*InitVal=*/42, Allocator);
  std::vector<int, USMAllocator> Vec3 = Vec1;
  Vec.resize(43);

@jbrodman
Copy link
Contributor Author

There are too many problems with device allocations. Too many C++ allocator-isms just don't work, so there's a static_assert that fires at compile time if you try to use them. We have to disallow them with the allocator interface.

@romanovvlad
Copy link
Contributor

There are too many problems with device allocations. Too many C++ allocator-isms just don't work, so there's a static_assert that fires at compile time if you try to use them. We have to disallow them with the allocator interface.

I see.

@jbrodman jbrodman requested a review from sergey-semenov July 28, 2020 15:02
jbrodman added 3 commits July 28, 2020 11:09
Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
@romanovvlad
Copy link
Contributor

@jbrodman Could you please provide a final commit message? The text in the first comment looks outdated.

@bader
Copy link
Contributor

bader commented Jul 29, 2020

One more question: what is the plan for PR #1577 once this PR is merged?
Tagging @fwyzard.

@fwyzard
Copy link
Contributor

fwyzard commented Jul 29, 2020

I hope to be able to come back to work on the CUDA support during the summer - i.e. this or next week...

@jbrodman
Copy link
Contributor Author

@jbrodman Could you please provide a final commit message? The text in the first comment looks outdated.

@romanovvlad what's the best Github way to do that?

@bader
Copy link
Contributor

bader commented Jul 29, 2020

@jbrodman Could you please provide a final commit message? The text in the first comment looks outdated.

@romanovvlad what's the best Github way to do that?

Can I use current PR description as a commit message?

Add ability to use std::allocate_shared.
Add equality operators for allocators.
Add tests.

Disallow device allocations in usm_allocator as there are too many incompatibilities with how C++ allocators are used.

Commit title: [SYCL][USM] Improve USM Allocator.

@jbrodman
Copy link
Contributor Author

@jbrodman Could you please provide a final commit message? The text in the first comment looks outdated.

@romanovvlad what's the best Github way to do that?

Can I use current PR description as a commit message?

Add ability to use std::allocate_shared.
Add equality operators for allocators.
Add tests.
Disallow device allocations in usm_allocator as there are too many incompatibilities with how C++ allocators are used.

Commit title: [SYCL][USM] Improve USM Allocator.

Sure?

@bader bader merged commit ce915ef into intel:sycl Jul 29, 2020
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Jul 30, 2020
…rogram

* upstream/sycl: (609 commits)
  [SYCL] Fix fail in the post commit testing (intel#2210)
  [SYCL] Materialize shadow local variables for byval arguments before use (intel#2200)
  [SYCL] Support lambda functions passed to reduction (intel#2190)
  [SYCL][USM] Improve USM Allocator. (intel#2026)
  [SYCL] Disallow mutable lambdas (intel#1785)
  [SYCL][ESIMD] Setup compilation pipeline for ESIMD (intel#2134)
  [SYCL] Fix not found kernel due to empty kernel name when using set_arg(s) (intel#2181)
  [SYCL] Fixed check for set_arg (intel#2203)
  Refactor indirect access calls to minimize invocations. (intel#2185)
  [SYCL][NFC] Fix potential null-pointer access (intel#2197)
  [SYCL] Propagate attributes from transitive calls to kernel (intel#1878)
  [SYCL] Fix warnings from static analysis tool (intel#2193)
  [SYCL][NFC] Fix ac_float test for compilation with FE optimizations (intel#2184)
  [GitHub Actions] Uplift clang-format version to 10 (intel#2194)
  [SYCL][ESIMD] Pass to replace simd* parameters with native llvm vectors. (intel#2097)
  [SYCL][NFC] Fixed SYCL_PI_TRACE output while selecting a device. (intel#2192)
  [SYCL][FPGA] New spec for controlling load-store units in FPGAs (intel#2158)
  [SYCL][Doc] Clarify reqd_sub_group_size (intel#2103)
  [SYCL] Remove noreturn function attribute (intel#2165)
  [SYCL] Aligned set_arg behaviour with SYCL specification (intel#2159)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants