Add cuda::ptx::mbarrier_{try/test}_wait{_parity}
#674
Docs and source should be done and in reviewable shape. The test causes ptxas to segfault on CTK 12.2 (but the generated PTX seems to be okay). Still trying to find a way to prevent this.
#if __cccl_ptx_isa >= 700
NV_IF_TARGET(NV_PROVIDES_SM_80, (
  if (threadIdx.x > thread_filter++) {
These odd-looking if statements have kept ptxas from segfaulting in the past,
but they no longer do. I am still working on a fix.
Would it help to offload the statement into a separate function?
cuda::ptx::mbarrier_{try/test}_wait{_parity}
Also some fixes to linking and formatting.
33c4d86 to ab75fa0
Successfully created backport PR for |
Description
Add mbarrier.test_wait and mbarrier.try_wait exposure, as well as the .parity variants.

Closes #673
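For readers unfamiliar with the .parity variants: instead of a state token, they take the parity of the phase being waited on (0 for even phases, 1 for odd ones). The sketch below is a host-side C++ model of that idea only. It is not the cuda::ptx API and not code from this PR; `ModelBarrier`, `arrive`, and `test_wait_parity` are hypothetical names chosen for illustration, and the model assumes a single-threaded caller.

```cpp
#include <cstdint>

// Host-side model only; NOT the cuda::ptx API. All names are hypothetical.
struct ModelBarrier {
  std::uint32_t expected;  // arrivals required to complete one phase
  std::uint32_t pending;   // arrivals still outstanding in the current phase
  std::uint64_t phase;     // index of the current (incomplete) phase

  explicit ModelBarrier(std::uint32_t count)
      : expected(count), pending(count), phase(0) {}

  // Record one arrival. The last arrival of a phase completes it and
  // starts the next phase with a fresh arrival count.
  void arrive() {
    if (--pending == 0) {
      ++phase;
      pending = expected;
    }
  }

  // Non-blocking check modelled on the .parity form: returns true once the
  // phase with the given parity has completed, i.e. the barrier has moved
  // on to a phase of the opposite parity.
  bool test_wait_parity(std::uint32_t phase_parity) const {
    return (phase & 1u) != phase_parity;
  }
};
```

In this model, a barrier constructed with a count of 2 reports parity 0 as incomplete until the second `arrive()` call. Because only one bit is tracked, the caller has to flip the parity it waits on after every completed phase.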
Checklist