[UBSAN] Undefined behavior in DataFormats/* reco packages #35033
Comments
assign reconstruction |
A new Issue was created by @mrodozov Mircho Rodozov. @Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
checklist
enum Status { trackSelected = 0, trackUsedForVertexFit, trackAssociatedToVertex };
inline bool associatedToVertex(unsigned int index) const {
  return (int)svStatus == (int)trackAssociatedToVertex + (int)index;  // <== HERE
}
Status svStatus;
The warning points to a use of svStatus; I guess that the store may have happened here: cmssw/RecoBTag/SecondaryVertex/plugins/TemplatedSecondaryVertexProducer.cc, lines 950 to 951 in f0b288d,
and this conflicts with the 3-value enum type. @emilbols
the lineno is not particularly helpful
this points to @cms-sw/ctpps-dpg-l2
this is apparently for
case MuonSubdetId::GEM: {
  GEMDetId gemid(id.rawId());
  layer = ((gemid.station() - 1) << 2);  // <== HERE
  layer |= abs(gemid.layer() - 1);
the workflow is D86
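A minimal sketch of how the flagged shift could be avoided, assuming the sanitizer complains because station() can return 0 (ME0), which makes the left operand -1; left-shifting a negative value is undefined before C++20. The helper name encodeGemLayer and the clamping policy are hypothetical and not taken from CMSSW; clamping is only one possible reading of "keep it as 0" for ME0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Hypothetical helper, not the CMSSW code: packs (station, layer) into an
// unsigned value without ever shifting a negative number.
// Assumption: station 0 denotes ME0 and should contribute 0 to the upper bits.
inline unsigned int encodeGemLayer(int station, int layer) {
  assert(station >= 0);  // negative station numbers are not expected
  // Clamp before converting, so station 0 yields 0 instead of -1.
  const unsigned int stationBits = static_cast<unsigned int>(std::max(station - 1, 0));
  unsigned int encoded = stationBits << 2;  // shift on an unsigned operand is well defined
  encoded |= static_cast<unsigned int>(std::abs(layer - 1));
  return encoded;
}
```

With this policy station 0 and station 1 produce the same upper bits, so it only makes sense if ME0 is distinguished elsewhere; whether that holds is for the detector experts to decide.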
this points to
|
@slava77 for ME0 station we would like to keep it as 0. |
@slava77 until saturday/sunday (don't remember in which day we publish the next week)
Here is one case of insisting on lazy init to keep a struct a POD |
@abdoulline Do you have an idea where this happens? Should we initialize this in the CaloTower constructor? |
All this looks weird to me. As to CTPPSPixelRecHits, I've never seen this error before. Any reason why these errors are happening only in this IB? |
This IB turns on the undefined behavior sanitizer (the -fsanitize=undefined flag), which searches for such errors:
could you share what command you are using to run the relval and on what machine ? |
lxplus. Just (naively) installing |
Could you try from lxplus: log onto a cmsdev* machine and retry?
|
It took ages but I could run 136.885, confirming the problem for CTPPSPixelRecHits. With @jan-kaspar we worked out a possible solution: @slava77 Shall I submit a PR? |
Did you run it multiple times until it presented itself? I have trouble reproducing some of the reported failures |
That's expected for most cases in this and other reco-related UBSAN issues: the symptoms are always for uninitialized values (which are also apparently not used at least in a few cases, beyond a load). The uninit values are often random (and not repeatable). |
No. I ran it once before changing the code and the error was there, I ran once after the changes and the error disappeared. |
@mseidel42 NB: there seems to be yet another issue in CaloTower (along with the aforementioned 'HcalSubdetector' one): -> CaloTower.h:26:7: runtime error: load of value 32, which is not a valid value for type 'bool' |
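Reports of the form "load of value 32, which is not a valid value for type 'bool'" typically mean the member was read before anything was stored in it. A minimal sketch of one remedy, default member initializers, follows; the struct TowerFlags, its members, and the enum Subdetector are illustrative stand-ins and not CaloTower's actual layout.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a small data-format struct (not CaloTower itself).
enum class Subdetector : int { Empty = 0, Barrel = 1, Endcap = 2 };

struct TowerFlags {
  bool inHO = false;                        // always a valid bool, even if never set
  Subdetector subdet = Subdetector::Empty;  // always a valid enumerator
};

// The initializers make the default constructor non-trivial, so the struct is
// no longer a POD in the strict sense, but it stays trivially copyable.
static_assert(std::is_trivially_copyable<TowerFlags>::value, "still cheap to copy");
static_assert(std::is_standard_layout<TowerFlags>::value, "layout is unchanged");

inline bool isUntouched(const TowerFlags& f) {
  return !f.inHO && f.subdet == Subdetector::Empty;
}

// Usage: a default-constructed object can be read immediately.
inline void exampleUse() {
  TowerFlags flags;
  assert(isUntouched(flags));
}
```

The trade-off against the lazy-init approach mentioned earlier in the thread is that the type gives up strict POD status in exchange for never exposing indeterminate values to a load.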
Yes, these issues have existed for a long time; that's why cleaning them out will narrow down subsequent bugs that pop up for no obvious reason. Although for all the bugs we know of we have some idea, none of which is related to these reports (afaik) |
this is a kind ping on the pending issues as detailed in #35033 (comment) |
cmssw/DataFormats/TrackReco/src/HitPattern.cc Lines 100 to 104 in 6a5dc2e
|
@slava77 @mrodozov So if I understand correctly, the error in 1302.17 appears because svStatus is set to a value beyond the enumeration range? Obviously this code was written a long time ago, but as far as I can see, this is the intended behavior. An svStatus=2 indicates a match to the vertex with index 0, svStatus=3 is a match to the vertex with index 1, etc. If we don't want to use enum objects like this, I guess the solution could be to change this object to an integer. I tried to run this on cmsdev25 following the recipe above with workflow 1302.17 and switching to CMSSW_12_1_UBSAN_X_2021-09-03-2300, but I don't get the error. If I understand the error correctly, it most likely appears only when the track is associated to an event with more than one SV, so there is probably some randomness in whether you get it |
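The integer-based alternative suggested above could look roughly like the sketch below. The struct TrackStatusSketch and the setter associateToVertex are hypothetical names introduced for illustration; only the Status enumerators and the comparison in associatedToVertex(index) follow the snippet quoted earlier in the thread.

```cpp
#include <cassert>

// Enumerators as quoted in the thread; they now act as named constants only.
enum Status { trackSelected = 0, trackUsedForVertexFit, trackAssociatedToVertex };

// Hypothetical sketch, not the CMSSW class: the status is held in a plain int,
// so "trackAssociatedToVertex + index" is an ordinary integer value and never
// an out-of-range enum.
struct TrackStatusSketch {
  int svStatus = trackSelected;

  // Hypothetical setter: records a match to the vertex with the given index.
  void associateToVertex(unsigned int index) {
    svStatus = static_cast<int>(trackAssociatedToVertex) + static_cast<int>(index);
  }

  // Same comparison as the original accessor, without the enum-typed member.
  bool associatedToVertex(unsigned int index) const {
    return svStatus == static_cast<int>(trackAssociatedToVertex) + static_cast<int>(index);
  }

  // True if the track is matched to any vertex.
  bool associatedToVertex() const {
    return svStatus >= static_cast<int>(trackAssociatedToVertex);
  }
};
```

This keeps the existing encoding (2 means vertex 0, 3 means vertex 1, and so on); whether a change of member type is acceptable for the persistent data format is a separate question that the sketch does not address.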
@emilbols As for reproducibility, you may want to enable more threads (the reference job used 4 threads) and add more events. |
+reconstruction the summary was updated at |
all of these ubsan runtime errors are fixed |
cms-bot internal usage |
This issue is fully signed and ready to be closed. |
The UBSAN IB reports undefined behavior in 5 files, with example relval and step they appear in:
check the relval logs in here for the examples:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/ubsan_logs/relvals/