New Rivet version fails ASAN relvals with SIGSEGV #31474

Closed
mrodozov opened this issue Sep 15, 2020 · 7 comments · Fixed by #31476

@mrodozov

After Rivet was moved to version 3.1.2 in cms-sw/cmsdist#6209, a number of ASAN workflows started failing:
https://cmssdt.cern.ch/SDT/html/cmssdt-ib/#/relVal/CMSSW_11_2/2020-09-14-2300?selectedArchs=slc7_amd64_gcc820&selectedFlavors=ASAN_X&selectedStatus=failed
The failing relvals show up in Kibana right after the PR was merged:
https://tinyurl.com/y4ag6e7f

I ran it under gdb. So far it looks like it fails somewhere in PhysicsTools/NanoAOD, with this output:


%MSG
%MSG-w LogicError:  GenWeightsTableProducer:genWeightsTable@beginRun  15-Sep-2020 21:22:51 CEST Run: 1
::getByLabel: An attempt was made to read a Run product before endRun() was called.
The product is of type 'LHERunInfoProduct'.
The specified ModuleLabel was 'source'.
The specified productInstanceName was ''.

%MSG
Begin processing the 1st record. Run 1, Event 5002, LumiSection 101 on stream 0 at 15-Sep-2020 21:22:51.992 CEST

Thread 1 "cmsRun" received signal SIGSEGV, Segmentation fault.
0x00007fffab3082bc in Rivet::ProjectionApplier::getProjHandler (this=0xbebebebebebebebe)
    at /build/mrodozov/rivet_fails/build_rivet/slc7_amd64_gcc820/external/rivet/3.1.2/include/Rivet/ProjectionApplier.hh:142
142	      return _projhandler;

@intrepid42

@cmsbuild

A new Issue was created by @mrodozov Mircho Rodozov.

@Dr15Jones, @dpiparo, @silviodonato, @smuzaffar, @makortel, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel

assign generators

@cmsbuild

New categories assigned: generators

@alberto-sanchez,@SiewYan,@GurpreetSinghChahal,@mkirsano,@agrohsje you have been requested to review this Pull request/Issue and eventually sign? Thanks

@mrodozov

More output; the crash is in GeneratorInterface:

Thread 1 "cmsRun" hit Breakpoint 4, edm::HepMCProduct::GetEvent (this=0x604001192520)
    at /cvmfs/cms-ib.cern.ch/week1/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_ASAN_X_2020-09-11-2300/src/SimDataFormats/GeneratorProducts/interface/HepMCProduct.h:34
34	    const HepMC::GenEvent *GetEvent() const { return evt_; }
(gdb) s
HTXSRivetProducer::produce (this=0x61a00012c080, iEvent=...) at /build/mrodozov/rivet_fails/CMSSW_11_2_ASAN_X_2020-09-11-2300/src/GeneratorInterface/RivetInterface/plugins/HTXSRivetProducer.cc:71
71	    if (_prodMode == "AUTO") {
(gdb) s
73	      if (m_HiggsProdMode != HTXS::GGF && m_HiggsProdMode != HTXS::VBF && m_HiggsProdMode != HTXS::GG2ZH) {
(gdb) s
74	        unsigned nWs = 0;
(gdb) s
75	        unsigned nZs = 0;
(gdb) 
76	        unsigned nTs = 0;
(gdb) 
77	        unsigned nBs = 0;
(gdb) 
78	        unsigned nHs = 0;
(gdb) 
80	        HepMC::GenVertex* HSvtx = myGenEvent->signal_process_vertex();
(gdb) b hasProjection
Breakpoint 5 at 0x7fff9a59d270 (4 locations)
(gdb) n
82	        if (HSvtx) {
(gdb) 
83	          for (auto ptcl : HepMCUtils::particles(HSvtx, HepMC::children)) {
(gdb) 
97	        if (nZs == 1 && nHs == 1 && (nWs + nTs) == 0) {
(gdb) 
99	        } else if (nWs == 1 && nHs == 1 && (nZs + nTs) == 0) {
(gdb) 
101	        } else if (nTs == 2 && nHs == 1 && nZs == 0) {
(gdb) 
103	        } else if (nTs == 1 && nHs == 1 && nZs == 0) {
(gdb) 
105	        } else if (nBs == 2 && nHs == 1 && nZs == 0) {
(gdb) 
111	    if (!_HTXS || !_HTXS->hasProjection("FS")) {
(gdb) 

Thread 1 "cmsRun" hit Breakpoint 5, Rivet::ProjectionApplier::hasProjection (this=0xbebebebebebebebe, name=...)
    at /build/mrodozov/rivet_fails/build_rivet/slc7_amd64_gcc820/external/rivet/3.1.2/include/Rivet/ProjectionApplier.hh:50
50	      return getProjHandler().hasProjection(*this, name);
(gdb) 

Thread 1 "cmsRun" hit Breakpoint 2, Rivet::ProjectionApplier::getProjHandler (this=0xbebebebebebebebe)
    at /build/mrodozov/rivet_fails/build_rivet/slc7_amd64_gcc820/external/rivet/3.1.2/include/Rivet/ProjectionApplier.hh:142
142	      return _projhandler;
(gdb) s

Thread 1 "cmsRun" received signal SIGSEGV, Segmentation fault.
0x00007fffab2b82bc in Rivet::ProjectionApplier::getProjHandler (this=0xbebebebebebebebe)
    at /build/mrodozov/rivet_fails/build_rivet/slc7_amd64_gcc820/external/rivet/3.1.2/include/Rivet/ProjectionApplier.hh:142
142	      return _projhandler;
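
A note on the `this=0xbebebebebebebebe` value in the traces above: as far as I know, AddressSanitizer fills freshly allocated heap memory with the byte `0xbe` by default (the `malloc_fill_byte` runtime option), so a pointer made entirely of `0xbe` bytes usually means a member was read before anything was written to it. A minimal sketch of a helper that recognizes the pattern follows; the names `kAsanMallocFillByte` and `looksLikeAsanFill` are hypothetical and not part of ASan or CMSSW.

```cpp
#include <cstdint>
#include <cstring>

// Assumption: ASan's default malloc fill byte is 0xbe (malloc_fill_byte).
constexpr unsigned char kAsanMallocFillByte = 0xbe;

// Hypothetical helper: returns true if every byte of the pointer value
// equals the fill byte, i.e. the pointer was probably never initialized.
// Only the pointer value is inspected; the pointer is never dereferenced.
inline bool looksLikeAsanFill(const void* p) {
  unsigned char bytes[sizeof p];
  std::memcpy(bytes, &p, sizeof p);
  for (unsigned char b : bytes) {
    if (b != kAsanMallocFillByte) {
      return false;
    }
  }
  return true;
}
```

This is only a debugging aid for reading backtraces; a match is a hint, not proof, that the object holding the pointer was allocated but the member was never set.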

@Dr15Jones

Dr15Jones commented Sep 15, 2020

The problem is the member variable _HTXS is never set to nullptr in the constructor.
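
To illustrate the point, here is a minimal sketch of the pattern, assuming a raw-pointer member guarded by the same kind of check seen at line 111 of the gdb session (`!_HTXS || !_HTXS->hasProjection("FS")`). `FakeAnalysis` and `ProducerSketch` are hypothetical stand-ins, not the real Rivet or CMSSW classes. The guard is only safe if the pointer is either null or valid, which a default member initializer guarantees.

```cpp
#include <string>

// Hypothetical stand-in for the Rivet analysis object (not the Rivet API).
struct FakeAnalysis {
  bool hasProjection(const std::string& name) const { return name == "FS"; }
};

// Hypothetical stand-in for the producer, reduced to the relevant member.
class ProducerSketch {
public:
  ProducerSketch() = default;

  // Same shape as the guard in the gdb session. Without the initializer
  // below, _HTXS would hold an indeterminate value, the null check would
  // pass, and the call through the pointer would crash.
  bool needsInit() const { return !_HTXS || !_HTXS->hasProjection("FS"); }

  // The producer does not own the analysis in this sketch.
  void setAnalysis(FakeAnalysis* analysis) { _HTXS = analysis; }

private:
  FakeAnalysis* _HTXS = nullptr;  // default member initializer: always null at construction
};
```

With the initializer in place, a freshly constructed `ProducerSketch` reports `needsInit() == true` instead of dereferencing garbage, and it reports `false` once a valid analysis has been attached.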

@mrodozov

That was it.

mrodozov added a commit to mrodozov/cmssw that referenced this issue Sep 15, 2020
@mseidel42

Ah, sorry, nullptrs were there but the one for _HTXS got lost during the attempt to change to std::unique_ptr :(

Thanks a lot for fixing!
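
For reference, a sketch of what the `std::unique_ptr` variant mentioned above could look like, assuming the producer owns the analysis object. The class names are hypothetical and this is not the code that was merged. A default-constructed `std::unique_ptr` is always empty, so the null check works even when no explicit initializer is written.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for the Rivet analysis object (not the Rivet API).
struct FakeOwnedAnalysis {
  bool hasProjection(const std::string& name) const { return name == "FS"; }
};

// Hypothetical producer that owns its analysis through std::unique_ptr.
class OwningProducerSketch {
public:
  // No initializer needed: std::unique_ptr default-constructs to empty.
  bool needsInit() const { return !_HTXS || !_HTXS->hasProjection("FS"); }

  // Create the analysis on first use.
  void init() { _HTXS = std::make_unique<FakeOwnedAnalysis>(); }

private:
  std::unique_ptr<FakeOwnedAnalysis> _HTXS;
};
```

The trade-off is ownership: this only fits if the producer is responsible for deleting the object. If another component takes ownership of the pointer, a raw pointer explicitly initialized to `nullptr` is the simpler fix.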

cmsbuild added a commit that referenced this issue Sep 16, 2020
Initialize vars to nullptr in HTXSRivetProducer, fix #31474