[libc] Support 'assert.h' on the GPU #16

Closed
wants to merge 1 commit into from

smeenai
Owner

@smeenai smeenai commented Sep 1, 2023

This patch adds the necessary support to provide assert functionality
through the GPU libc implementation. It creates a special-case GPU
implementation rather than relying on the common version, because the
GPU has special considerations for printing. The assertion is printed
out in chunks with write_to_stderr; combined with the GPU execution
model, this causes 32+ threads to all execute in lockstep, so we would
get a horribly fragmented message. Furthermore, potentially thousands of
threads could hit the assertion at once and try to print, even if we had
it all in one printf.

This is solved by having a one-time lock that each thread group / wave /
warp will attempt to claim. We only let one thread group pass through
while the others simply stop executing. Finally, only the first thread in
that group does the printing before we abort execution.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D159296
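
The one-time lock described above can be sketched on the host as follows. This is only an illustration under stated assumptions, not the code from the patch: each `std::thread` stands in for one thread group (warp / wavefront), and `AssertState`, `try_claim`, `group_assert_fail`, and `simulate` are hypothetical names invented for this example. On a real GPU the lanes of a group run in lockstep and the winning group would abort after printing.

```cpp
#include <atomic>
#include <cstdint>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical shared state for the sketch (not LLVM libc's actual layout).
struct AssertState {
  std::atomic<uint32_t> lock{0}; // one-time lock: set once, never released
  std::atomic<int> messages{0};  // number of messages "printed"
  std::string output;            // written only by the group that won the lock
};

// Returns true only for the first caller; every later caller sees 1.
bool try_claim(AssertState &s) {
  return s.lock.exchange(1, std::memory_order_relaxed) == 0;
}

// Models one thread group reaching a failed assertion.
void group_assert_fail(AssertState &s, int group_id, int lanes,
                       const char *expr) {
  if (!try_claim(s))
    return; // losing groups simply stop executing

  for (int lane = 0; lane < lanes; ++lane) {
    if (lane != 0)
      continue; // only the first lane of the winning group prints
    s.output = "Assertion failed: '" + std::string(expr) + "' (group " +
               std::to_string(group_id) + ")\n";
    s.messages.fetch_add(1, std::memory_order_relaxed);
  }
  // A real implementation would abort execution here.
}

// Launches `groups` simulated thread groups that all fail the same assertion
// and returns how many messages were produced.
int simulate(int groups, int lanes) {
  AssertState s;
  std::vector<std::thread> threads;
  for (int g = 0; g < groups; ++g)
    threads.emplace_back(group_assert_fail, std::ref(s), g, lanes, "x > 0");
  for (auto &t : threads)
    t.join();
  return s.messages.load();
}
```

Because the lock is never released, the number of messages is one no matter how many groups hit the assertion, which is the property the description relies on.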


Stack:

⚠️ Part of a stack created by spr. Do not merge manually using the UI - doing so may have unexpected results.

commit-id:a8e5550c
@smeenai smeenai closed this Aug 19, 2024
smeenai pushed a commit that referenced this pull request Aug 20, 2024
An example of this transformation: https://gcc.godbolt.org/z/nhYcWq1WE
Fold
  mov     w8, #56952
  movk    w8, #15, lsl #16
  ldrb    w0, [x0, x8]
into
  add     x0, x0, 1036288
  ldrb    w0, [x0, 3704]

Only LDRBBroX is supported in this initial version.
Fixes llvm#71917

Note: This PR relands commit 32878c2 with a fix for the crash from PR79756.
The crash is exposed when a MOVKWi instruction appears at the head of a block
without a preceding MOVZWi.
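
The arithmetic behind the fold shown above can be checked with a small sketch. It assumes the split is simply "upper bits into the `add`, low 12 bits into the load offset", which matches the numbers in the example; `SplitOffset`, `split_offset`, and `fits_add_imm` are hypothetical names for illustration and are not taken from the actual pass.

```cpp
#include <cstdint>

// Hypothetical helper type: the two pieces of a large constant offset.
struct SplitOffset {
  uint32_t high; // multiple of 4096, materialized by the `add`
  uint32_t low;  // 0..4095, used as the unsigned immediate of `ldrb`
};

// Splits an offset into its upper bits and its low 12 bits.
constexpr SplitOffset split_offset(uint32_t offset) {
  return {offset & ~0xFFFu, offset & 0xFFFu};
}

// An AArch64 `add` immediate is 12 bits, optionally shifted left by 12,
// so the high part must fit in 12 bits once shifted back down.
constexpr bool fits_add_imm(uint32_t high) {
  return (high & 0xFFFu) == 0 && (high >> 12) <= 0xFFFu;
}

// mov w8, #56952 ; movk w8, #15, lsl #16  builds (15 << 16) | 56952.
static_assert(((15u << 16) | 56952u) == 1039992u, "mov/movk constant");
static_assert(split_offset(1039992u).high == 1036288u, "add immediate");
static_assert(split_offset(1039992u).low == 3704u, "ldrb immediate");
static_assert(fits_add_imm(split_offset(1039992u).high), "encodable add");
```

The two halves sum back to the original offset (1036288 + 3704 = 1039992), so the folded `add` plus `ldrb` pair addresses the same byte as the original three-instruction sequence.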