[UBSAN] Undefined behavior in DataFormats/* simulation packages #35034
Comments
assign simulation |
New categories assigned: simulation @civanch,@mdhildreth you have been requested to review this Pull request/Issue and eventually sign? Thanks |
A new Issue was created by @mrodozov Mircho Rodozov. @Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
@mrodozov what are the exact steps to reproduce the issues? Is it enough to run the specified workflows inside a UBSAN IB? Here I am primarily interested in the issue with PixelDigi.h |
Yes, just run step 2 of 11603.0 with the UBSAN IB and set SCRAM_ARCH=slc7_amd64_gcc10; we use gcc10 for UBSAN. |
@ferencek, do you have any idea what is happening? |
@civanch, I plan to have a look this week. |
@mrodozov, @civanch, I had a look at this but I am afraid this issue exceeds my level of expertise with C++. About two weeks ago I just had a quick look at the code and was quite confused because the problematic variable in https://github.com/cms-sw/cmssw/blob/CMSSW_12_1_0_pre1/DataFormats/SiPixelDigi/interface/PixelDigi.h#L75,
and ran step2 with
I was able to reproduce the problem and, for some reason, it occurs for the first time in the 4th event. So for the subsequent tests I modified the cfg file to skip three events and process only one. The error was still there. I added a printout of the column width:
static int pixelToChannel(int row, int col) {
  std::cout << ">> column width: " << PixelChannelIdentifier::thePacking.column_width << std::endl;
  return (row << PixelChannelIdentifier::thePacking.column_width) | col;
}
but the values printed out were always 10, as they should be. Here is an interesting portion of the printout
I even modified the data members in https://github.com/cms-sw/cmssw/blob/CMSSW_12_1_0_pre1/DataFormats/SiPixelDetId/interface/PixelChannelIdentifier.h#L33-L35 from
Interestingly enough, when I set the cfg file to skip the first two events, I noticed the following error, which otherwise does not appear
Any suggestions on what to check next would be very welcome. |
Just a quick follow-up on the issue with
and here
So there seems to be some randomness in the loaded value. |
@ferencek , I think problem is |
@smuzaffar, good point, thanks. I was interpreting the error message incorrectly. OK, this will be helpful for further debugging. |
OK, so here is the culprit https://github.com/cms-sw/cmssw/blob/CMSSW_12_1_0_pre1/SimTracker/SiPixelDigitizer/plugins/SiPixelChargeReweightingAlgorithm.cc#L234 Now I need to figure out why the row coordinate gets evaluated to -1. |
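To make the diagnosis concrete: the UBSAN message concerns the left operand of the shift (the row), not the shift count (the column width, which printed as 10). Left-shifting a negative signed value is undefined behavior before C++20. The following is only a minimal sketch of a guarded packing function, not the code in PixelDigi.h; `kColumnWidth`, `pixelToChannelAsserted`, and `pixelToChannelChecked` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <limits>
#include <optional>

// Hypothetical stand-in for PixelChannelIdentifier::thePacking.column_width;
// the thread reports that the printed value was always 10.
constexpr int kColumnWidth = 10;

// Debug-build guard: same packing expression as in the thread, but with the
// precondition stated explicitly so a negative row fails loudly.
inline int pixelToChannelAsserted(int row, int col) {
  assert(row >= 0 && col >= 0);
  return (row << kColumnWidth) | col;
}

// Checked variant: returns an empty optional instead of shifting a value
// that is negative or too large to be packed into an int.
inline std::optional<int> pixelToChannelChecked(int row, int col) {
  const int maxRow = std::numeric_limits<int>::max() >> kColumnWidth;
  if (row < 0 || row > maxRow) {
    return std::nullopt;  // shifting this row would be undefined or overflow
  }
  if (col < 0 || col >= (1 << kColumnWidth)) {
    return std::nullopt;  // column does not fit in kColumnWidth bits
  }
  return (row << kColumnWidth) | col;
}
```

With such a check, a caller that computes row == -1 gets an empty result rather than triggering the sanitizer, which also makes it easier to locate where the bad coordinate originates.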
The problem with the PixelCPEBase has already been reported in #35036 |
Restricted the loop over the row and column coordinates to prevent it from going out of physical bounds. This addresses the following issue reported in cms-sw#35034: DataFormats/SiPixelDigi/interface/PixelDigi.h:75:61: runtime error: left shift of negative value -1
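The idea of restricting a loop to physical bounds can be sketched as follows. This is not the actual change to SiPixelChargeReweightingAlgorithm.cc; the module dimensions `kNumRows` and `kNumCols`, the function `clampedWindow`, and the window half-width parameter are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical module dimensions, used only for this illustration.
constexpr int kNumRows = 160;
constexpr int kNumCols = 416;

// Returns the (row, col) pairs in a square window of half-width `halfWidth`
// around (centerRow, centerCol), clamped to the valid pixel range so that
// no negative or out-of-range coordinate is ever produced.
std::vector<std::pair<int, int>> clampedWindow(int centerRow, int centerCol, int halfWidth) {
  assert(halfWidth >= 0);
  const int rowLo = std::max(centerRow - halfWidth, 0);
  const int rowHi = std::min(centerRow + halfWidth, kNumRows - 1);
  const int colLo = std::max(centerCol - halfWidth, 0);
  const int colHi = std::min(centerCol + halfWidth, kNumCols - 1);

  std::vector<std::pair<int, int>> pixels;
  for (int row = rowLo; row <= rowHi; ++row) {
    for (int col = colLo; col <= colHi; ++col) {
      pixels.emplace_back(row, col);
    }
  }
  return pixels;
}
```

For a window centered on a pixel at the module edge, for example (0, 0) with half-width 1, the loop visits only the four in-bounds pixels instead of starting at row -1.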
+1 the issue was fixed in #35337 |
closing this, some of these are fixed and we have newer issues open to track the rest |
cms-bot internal usage |
This issue is fully signed and ready to be closed. |
The UBSAN IB reports undefined behavior in 5 files, listed with an example relval and the step in which each appears:
Check the relval logs here for the examples:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/ubsan_logs/relvals/