-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ECAL skip GPU unpacking of the rest of the block if a bad block is detected #42301
ECAL skip GPU unpacking of the rest of the block if a bad block is detected #42301
Conversation
type ecal |
enable gpu |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-42301/36323
|
A new Pull Request was created by @thomreis (Thomas Reis) for master. It involves the following packages:
@cmsbuild, @mandrenguyen, @clacaputo can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test |
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-df519e/33762/summary.html Comparison SummarySummary:
GPU Comparison SummarySummary:
|
This doesn't change reco, and nothing alarming jumps out from a visual inspection. |
I can't really judge the implementation. I think it would make sense to wait for the review of @cms-sw/heterogeneous-l2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking into the break
on line 320, please speak up if you have any comments!
break; | ||
if (i_to_access == 0) { | ||
ch_with_bad_block = 999999; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in principle there should be a __syncthreads()
here, to guarantee that all threads see the initial assignment of ch_with_bad_block
before the atomicMin(&ch_with_bad_block, i_to_access);
at line 314.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could put a __syncthreads()
just before the if (bad_block) {
in line 313. I think there is no need for it to be in the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake. It had to be just after the initialisation.
|
||
// threads outside of the range or bad block detected in this thread or one working on a lower block -> leave the loop | ||
if (i_to_access >= nchannels || i_to_access >= ch_with_bad_block) { | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am somewhat concerned about what happens when i_to_access >= ch_with_bad_block
happens:
- the threads for which
i_to_access >= ch_with_bad_block
break from the loop - during the next iteration of the for loop, some of the threads execute again the loop until here, where they should also bail out
- however, the behaviour of the
__syncthreads()
in the first part of the loop in undefined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was probably not an issue during testing because with the current configuration the loop is only executed once.
I could replace the break
with a continue
to make sure all loop iterations are done at least until the __syncthreads
.
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-df519e/33902/summary.html Comparison SummarySummary:
GPU Comparison SummarySummary:
|
assign heterogeneous |
I'll try to have a second look during the afternoon |
+heterogeneous LGTM |
+1 |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
Do we need a backport to 131X and 130X or do we not care anymore? |
I would still make a backport to 13.0.x, if you and the release managers agree. |
By the way, we definitely need a backport to 13.2.x! |
+1 |
PR description:
Skip the GPU unpacking of the rest of the block if a bad block is detected in one thread. This behaviour matches the one of the CPU unpacker.
Addresses #42090.
PR validation:
No crashes with integrity errors observed in runs 367771, 368547, and 368724 in #39568. Passes WF 12434.512