-
Notifications
You must be signed in to change notification settings - Fork 424
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GPU Attribute Improvements #25826
GPU Attribute Improvements #25826
Conversation
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good generally. I have mostly nits. I have more testing ideas that I'd like to see go in with this PR, though:
- Error message for
blockSize(1,2)
if we don't have it. (I know it is not about this PR, but the PR touches relevant areas) - More importantly, what happens if I use both flavors of assertions together in flat and gpu locale models? Some of those cases will error, some should just work. It'd be good to see that they can be used without an embarrassing issue.
reason = "is marked with @gpu.assertEligible"; | ||
} else { | ||
CHPL_ASSERT(assertion->isPrimitive(PRIM_ASSERT_ON_GPU)); | ||
reason = "contains assertOnGpu()"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't look like we need this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do; the legacy assertOnGpu
is still present in the code. I deprecated and can remove it in a separate PR, but would rather not do it in this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. Would you be up for doing that before the release?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, for sure.
body->insertAtTail(new CallExpr(PRIM_ASSERT_ON_GPU, | ||
new SymExpr(gTrue))); | ||
return true; | ||
body->insertAtTail(new CallExpr("chpl__assertOnGpuAttr")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know I said this wouldn't bother me, but could you explain to me (or point me to an issue) why this is better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of my reasons:
- I can use the overload resolution process to select candidate functions. For instance, I may use a where clause to avoid inserting GPU primitives when CHPL_LOCALE_MODEL=flat
- I can ensure that the arguments to the primitives are correctly typed (since calls are just “regular functions”).
- I don’t have to change compiler code to adjust the behavior of attributes, to some extent
- Using an experimental annotation (which I plan to work on next after this PR) I can get
chpldoc
to use these function definitions (likechpl__assertOnGpuAttr
) to generate documentation for attributes, which addresses Should we provide a specific way to document attributes? #23149.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using an experimental annotation (which I plan to work on next after this PR) I can get chpldoc to use these function definitions (like chpl__assertOnGpuAttr) to generate documentation for attributes, which addresses
Nice.
Thanks for explanation. You are right on all counts.
if (assertOnGpuAttr) { | ||
body->insertAtTail(new CallExpr(PRIM_ASSERT_ON_GPU, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you removed this primitive? I don't think you did.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't, and don't intend to in this PR, though my goal in the long term is to remove it.
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
runtime/include/chpl-gpu.h
Outdated
// Provide a fallback for the chpl_assert_on_gpu function for non-GPU locales. | ||
// This works exactly the same as the standard one. | ||
|
||
static inline void chpl_assert_on_gpu(int32_t lineno, int32_t filenameIdx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider ln
and fn
for formal names to have consistency in the runtime. (I don't love those names at all, I love consistency)
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Signed-off-by: Danila Fedorin <[email protected]>
Closes #23149. This PR implements my initial vision for documenting attributes based on procedures that implement them. In PR #25826, I made GPU attributes be backed by Chapel functions. One of my goals with this approach was to be able to document these attributes using the existing Chapel mechanism (doc comments attached to procedures). This PR does so, by adding a new chpldoc attribute (`@chpldoc.attributeSignature`) that marks a function as defining the signature (formals etc.) of an attribute. Since the functions in the GPU module that back attributes are internal (not intended to be called directly), they would normally be skipped by chpldoc's output. This PR adjusts the behavior of chpldoc to not skip the functions in this case. It also adds documentation to the various attributes using this new mechanism. This PR depends on chapel-lang/sphinxcontrib-chapeldomain#97, which implements the RST-to-HTML conversion of the new 'annotation' directive chpldoc will emit. <img width="725" alt="Screen Shot 2024-09-05 at 10 16 43 AM" src="https://github.com/user-attachments/assets/bcc3fbe8-301f-4727-ae92-be2968d4e0f4"> Reviewed by @e-kayrakli, @mppf, and @lydia-duncan -- thanks! ## Testing - [x] paratest (just in case)
This documents and makes use of the various changes I made in #25826, chapel-lang/sphinxcontrib-chapeldomain#97, and #25884. Reviewed by @e-kayrakli -- thanks!
Closes #22822.
This PR seeks to improve the experience of using various GPU attributes in several cases. These changes are motivated by some of the difficulties I've observed @Iainmon to have had while working on his GPU-enabled machine learning code.
Case 1: Writing CPU and GPU code using the GPU locale model
The way to ensure that a loop is GPU-eligible in user code (and to fail compilation if the loop is not GPU eligible), is to use
@assertOnGpu
. However, one cannot do this when writing code that is expected to support both GPUs and CPUs. I've observed Iain's code to have something like the following:This way, the code could be used on both the GPU and the CPU, and the compiler will ensure that the GPU version is eligible. However, this introduces a maintenance burden, and makes the code rather verbose. To work around this problem, I introduce a new GPU primitive + attribute:
@gpu.assertEligible
. This attribute has the same behavior as@assertOnGpu
at compile-time, but it does not have a runtime effect. Thus, the code above can be flattened and continue to support both CPU and GPU runs:In my opinion, we should phase out the use of
@assertOnGpu
in favor of@gpu.assertEligible
. It's unclear to me that having a runtime assertion using this attribute is worth keeping it around two similar attributes. Personally, I think that the compile-time assertion can be handled by@gpu.assertEligible,
and various utilities fromGpuDiagnostics
for tracking kernel launches etc. can be used to ensure that GPU execution occurs at runtime. This PR doesn't make this (potentially more controversial change).Case 2: Disabling GPU support and compiling with
CHPL_LOCALE_MODEL=flat
When I told Iain to run his performance experiments in the flat locale model (to get started with initial performance results via the CPU), he immediately ran into internal errors. This is an instance of #22822.
My chosen solution to this problem is to make
@assertOnGpu
a compile-time error underCHPL_LOCALE_MODEL=flat
. This is because of the semantics of@assertOnGpu
: this attribute has a runtime check; without a GPU, the check is guaranteed to fail, and cause a "certain" failure. This error is now user-facing, and tells the user to switch to@gpu.assertEligible
if all they want is a compile-time check.On the other hand, the
@gpu.assertEligible
attribute, which does not have any runtime semantics, does not cause a compilation error with theflat
locale model. Instead, the attribute is simply ignored (we don't perform any GPU logic with the flat locale model, and it doesn't seem worth it to actually perform GPU transformations / analysis for the sole purpose of validating GPU eligibility). The same is true for@gpu.blockSize
, and the non-user-facing "GPU primitive block" primitive which is used to group GPU primitives created via attributes. Thus, the following code compiles and runs just fine in theflat
locale model:Reviewed by @e-kayrakli -- thanks!
Testing
flat
tests for GPU primitives, including a new user-facing error.@gpu.assertEligible