-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clamp the number of pixel tracks to the capacity of the SoA #38883
Clamp the number of pixel tracks to the capacity of the SoA #38883
Conversation
type bugfix |
enable gpu |
please test |
uint32_t first = blockIdx.x * blockDim.x + threadIdx.x; | ||
// clamp the number of tracks to the capacity of the SoA | ||
auto ntracks = std::min<uint32_t>(apc->get().m, tracks.stride() - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VinInn does this look like a good fix to you ?
An alternative could be to adapt all client code to check the size vs the stride (or look for the nhits == 0
marker), but this seemed easier to implement and safer to use.
On the other this will completely silence the error.
Maybe we could check if nTracks() == stride() - 1
and assume it to mean that an "overflow" occurred ?
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38883/31314
|
A new Pull Request was created by @fwyzard (Andrea Bocci) for master. It involves the following packages:
@jpata, @clacaputo can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
if (0 == first) | ||
tracks.setNTracks(ntracks); | ||
for (int idx = first, nt = ntracks; idx < nt; idx += gridDim.x * blockDim.x) { | ||
for (uint32_t idx = first; idx < ntracks; idx += gridDim.x * blockDim.x) { | ||
auto nHits = tracks.nHits(idx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
int was intentional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which change ?
nt = ntracks
: I don't see how it can make a difference (ntracks
is also a local variable) but I can revert itint
vsuint32_t
: becuasentracks
is nowuint32_t
and the compiler warns about signed-vs-unsigned comparison
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-197dc3/26510/summary.html GPU Comparison SummarySummary:
Comparison SummarySummary:
|
+reconstruction |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
hold I'm waiting for further feedback from @VinInn and mulling on the possibility of adding a LogWarning. |
Pull request has been put on hold by @fwyzard |
f3c7494
to
d04920f
Compare
please test |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38883/31340
|
Pull request #38883 was updated. @jpata, @clacaputo can you please check and sign again. |
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-197dc3/26543/summary.html GPU Comparison SummarySummary:
Comparison SummarySummary:
|
+reconstruction |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
urgent |
+1 |
PR description:
The data structure used to store the pixel tracks currently has a fixed size of 32768, which was found to be adequate based on MC studies.
While running over collisions data a small fraction of events has been found to have a larger number of tracks, leading to assertion failures or memory access errors.
This PR clamps the number of pixel tracks to the capacity of the SoA, and adds a warning like
PR validation:
Running over an event with > 33k pixel tracks no longer causes assertion failures.
If this PR will be backported please specify to which release cycle the backport is meant for:
To be backported to CMSSW 12.4.x for data taking.