Errors from SiPixelPhase1TrackClusters? #30911

Closed
davidlange6 opened this issue Jul 25, 2020 · 8 comments · Fixed by #30912

Comments

@davidlange6

While looking at something else, I noticed that this error shows up in the IBs. Is it something we can safely ignore?

%MSG-w SiPixelPhase1TrackClusters: SiPixelPhase1TrackClusters:hltSiPixelPhase1TrackClustersAnalyzer 25-Jul-2020 03:35:07 CEST Run: 1 Event: 1208
PixelClusterShapeCache collection is not valid
%MSG

(eg https://cmssdt.cern.ch/SDT/cgi-bin/logreader/slc7_amd64_gcc820/CMSSW_11_2_X_2020-07-24-2300/pyRelValMatrixLogs/run/11634.0_TTbar_14TeV+TTbar_14TeV_TuneCP5_2021_GenSimFullINPUT+DigiFull_2021+RecoFull_2021+HARVESTFull_2021+ALCAFull_2021/step3_TTbar_14TeV+TTbar_14TeV_TuneCP5_2021_GenSimFullINPUT+DigiFull_2021+RecoFull_2021+HARVESTFull_2021+ALCAFull_2021.log#/)

@cmsbuild

A new Issue was created by @davidlange6 David Lange.

@Dr15Jones, @dpiparo, @silviodonato, @smuzaffar, @makortel, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@Dr15Jones

assign reconstruction

@cmsbuild

New categories assigned: reconstruction

@slava77,@perrotta,@jpata you have been requested to review this Pull request/Issue and eventually sign? Thanks

@mmusich

mmusich commented Jul 25, 2020

We are aware of this issue; it happens only in the HLT offline monitoring case.
It was discussed long ago, e.g. in #23575 (comment), and happens because the hltSiPixelClusterShapeCache collection (created in step2) is not persisted in the event and therefore cannot be consumed by the offline DQM (in step3).
The trivial fix, which would be:

diff --git a/DQMOffline/Trigger/python/SiPixel_OfflineMonitoring_cff.py b/DQMOffline/Trigger/python/SiPixel_OfflineMonitoring_cff.py
index a25d683f129..a73fb35bca2 100644
--- a/DQMOffline/Trigger/python/SiPixel_OfflineMonitoring_cff.py
+++ b/DQMOffline/Trigger/python/SiPixel_OfflineMonitoring_cff.py
@@ -2,8 +2,12 @@ import FWCore.ParameterSet.Config as cms

from DQMOffline.Trigger.SiPixel_OfflineMonitoring_Cluster_cff import *
from DQMOffline.Trigger.SiPixel_OfflineMonitoring_TrackCluster_cff import *
+from RecoPixelVertexing.PixelLowPtUtilities.siPixelClusterShapeCache_cfi import *
+
+hltSiPixelClusterShapeCache = siPixelClusterShapeCache.clone(src = 'hltSiPixelClusters')

sipixelMonitorHLTsequence = cms.Sequence(
- hltSiPixelPhase1ClustersAnalyzer
- + hltSiPixelPhase1TrackClustersAnalyzer
+ hltSiPixelClusterShapeCache +
+ hltSiPixelPhase1ClustersAnalyzer +
+ hltSiPixelPhase1TrackClustersAnalyzer
)

solves this particular issue, but leads to a segmentation fault later in the code [1].
@arossi83 has explored a way of getting rid of the crash, but since this module is also run for the offline reconstruction monitoring, the fix needs to be validated first.
At any rate, this module currently produces no meaningful plots, since it returns early when it does not find the SiPixelClusterShapeCache:

edm::Handle<SiPixelClusterShapeCache> pixelClusterShapeCacheH;
iEvent.getByToken(pixelClusterShapeCacheToken_, pixelClusterShapeCacheH);
if (!pixelClusterShapeCacheH.isValid()) {
  edm::LogWarning("SiPixelPhase1TrackClusters") << "PixelClusterShapeCache collection is not valid";
  return;
}

I would propose simply not running it for the time being, until a solution to the other issue (the segmentation fault) is found.
@mtosi FYI
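
To make the data flow described above concrete, here is a toy model in plain Python. It is not CMSSW code: the functions run_step2, run_step3, track_clusters_analyzer and clusters_analyzer are hypothetical stand-ins, and the event is just a dictionary keyed by collection label. It only sketches why the warning appears when the shape cache is not persisted, and why dropping the analyzer from the sequence changes nothing in the output.

```python
# Toy model only: plain Python, not CMSSW. All function names are hypothetical.
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("SiPixelPhase1TrackClusters")


def run_step2(keep):
    """Produce two collections, then persist only those named in `keep`."""
    produced = {
        "hltSiPixelClusters": ["cluster0", "cluster1"],
        "hltSiPixelClusterShapeCache": ["shape0", "shape1"],
    }
    return {label: data for label, data in produced.items() if label in keep}


def track_clusters_analyzer(event, histograms):
    """Mimic the guard quoted above: warn and return if the input is missing."""
    cache = event.get("hltSiPixelClusterShapeCache")
    if cache is None:
        log.warning("PixelClusterShapeCache collection is not valid")
        return
    histograms["track_clusters"] = len(cache)


def clusters_analyzer(event, histograms):
    """A second monitor that only needs the persisted clusters."""
    histograms["clusters"] = len(event["hltSiPixelClusters"])


def run_step3(event, sequence):
    """Run each module of the sequence on the event and collect its plots."""
    histograms = {}
    for module in sequence:
        module(event, histograms)
    return histograms


# step2 output keeps the clusters but drops the shape cache.
event = run_step2(keep={"hltSiPixelClusters"})

# Current sequence: the track-cluster analyzer warns and fills nothing.
current = run_step3(event, [clusters_analyzer, track_clusters_analyzer])

# Proposed stopgap: drop that analyzer from the sequence; same plots, no warning.
reduced = run_step3(event, [clusters_analyzer])
```

In this sketch the two runs yield identical histograms, which is the sense in which the module "is not giving any meaningful plot" today. The model says nothing about the segmentation fault, which only arises once the cache is actually available.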

[1]

A fatal system signal has occurred: segmentation violation
The following is the call stack containing the origin of the signal.

Sat Jul 25 15:01:07 CEST 2020
Thread 5 (Thread 0x7fdc847ff700 (LWP 24160)):
#0  0x00007fdd05230a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fdd0581362c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_11_1_0_pre7-slc7_amd64_gcc820/build/CMSSW_11_1_0_pre7-build/BUILD/slc7_amd64_gcc820/external/gcc/8.2.0-bcolbf/gcc-8.2.0/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu/bits/gthr-default.h:864
#2  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#3  0x00007fdca844ca3c in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) () from /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_X_2020-07-24-2300/external/slc7_amd64_gcc820/lib/libtensorflow_framework.so.2
#4  0x00007fdca8449963 in std::_Function_handler<void (), tensorflow::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_X_2020-07-24-2300/external/slc7_amd64_gcc820/lib/libtensorflow_framework.so.2
#5  0x00007fdd05818d2f in execute_native_thread_routine () at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#6  0x00007fdd0522cea5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fdd04f548dd in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7fdc855ff700 (LWP 24159)):
#0  0x00007fdd05230a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fdd0581362c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_11_1_0_pre7-slc7_amd64_gcc820/build/CMSSW_11_1_0_pre7-build/BUILD/slc7_amd64_gcc820/external/gcc/8.2.0-bcolbf/gcc-8.2.0/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu/bits/gthr-default.h:864
#2  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#3  0x00007fdca844ca3c in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) () from /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_X_2020-07-24-2300/external/slc7_amd64_gcc820/lib/libtensorflow_framework.so.2
#4  0x00007fdca8449963 in std::_Function_handler<void (), tensorflow::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_X_2020-07-24-2300/external/slc7_amd64_gcc820/lib/libtensorflow_framework.so.2
#5  0x00007fdd05818d2f in execute_native_thread_routine () at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#6  0x00007fdd0522cea5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fdd04f548dd in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7fdc86363700 (LWP 24158)):
#0  0x00007fdd05230a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fdd0581362c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /data/cmsbld/jenkins/workspace/auto-builds/CMSSW_11_1_0_pre7-slc7_amd64_gcc820/build/CMSSW_11_1_0_pre7-build/BUILD/slc7_amd64_gcc820/external/gcc/8.2.0-bcolbf/gcc-8.2.0/obj/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu/bits/gthr-default.h:864
#2  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#3  0x00007fdca844ca3c in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) () from /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_X_2020-07-24-2300/external/slc7_amd64_gcc820/lib/libtensorflow_framework.so.2
#4  0x00007fdca8449963 in std::_Function_handler<void (), tensorflow::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_X_2020-07-24-2300/external/slc7_amd64_gcc820/lib/libtensorflow_framework.so.2
#5  0x00007fdd05818d2f in execute_native_thread_routine () at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#6  0x00007fdd0522cea5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fdd04f548dd in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7fdcd9008700 (LWP 24058)):
#0  0x00007fdd052341d9 in waitpid () from /lib64/libpthread.so.0
#1  0x00007fdcf9e146c7 in edm::service::cmssw_stacktrace_fork() () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/pluginFWCoreServicesPlugins.so
#2  0x00007fdcf9e1518a in edm::service::InitRootHandlers::stacktraceHelperThread() () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/pluginFWCoreServicesPlugins.so
#3  0x00007fdd05818d2f in execute_native_thread_routine () at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#4  0x00007fdd0522cea5 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fdd04f548dd in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7fdd0335a4c0 (LWP 23964)):
#0  0x00007fdd04f49c3d in poll () from /lib64/libc.so.6
#1  0x00007fdcf9e14b2f in full_read.constprop () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/pluginFWCoreServicesPlugins.so
#2  0x00007fdcf9e1526c in edm::service::InitRootHandlers::stacktraceFromThread() () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/pluginFWCoreServicesPlugins.so
#3  0x00007fdcf9e16149 in sig_dostack_then_abort () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x00007fdcfa09ef60 in PixelGeomDetUnit::specificTopology() const () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/libGeometryCommonTopologies.so
#6  0x00007fdc91411306 in (anonymous namespace)::SiPixelPhase1TrackClusters::analyze(edm::Event const&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_X_2020-07-24-2300/lib/slc7_amd64_gcc820/pluginDQMSiPixelPhase1TrackAuto.so
#7  0x00007fdd07b5e766 in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventPrincipal const&, edm::EventSetupImpl const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/libFWCoreFramework.so
#8  0x00007fdd07b38883 in edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventPrincipal const&, edm::EventSetupImpl const&, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/libFWCoreFramework.so
#9  0x00007fdd07a9ec9a in decltype ({parm#1}()) edm::convertException::wrap<bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetupImpl const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}>(bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetupImpl const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*)::{lambda()#1}) () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/libFWCoreFramework.so
#10 0x00007fdd07a9ee65 in bool edm::Worker::runModule<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetupImpl const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/libFWCoreFramework.so
#11 0x00007fdd07a9f16b in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr const*, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::MyPrincipal const&, edm::EventSetupImpl const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/libFWCoreFramework.so
#12 0x00007fdd07aa080b in edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute() () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/libFWCoreFramework.so
#13 0x00007fdd06221bfd in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::process_bypass_loop (this=this@entry=0x7fdd01db3200, context_guard=..., t=t@entry=0x7fdbb42f7a40, isolation=isolation@entry=0) at ../../src/tbb/custom_scheduler.h:393
#14 0x00007fdd06221ef5 in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all (this=0x7fdd01db3200, parent=..., child=<optimized out>) at ../../include/tbb/task.h:1003
#15 0x00007fdd07a1fcf5 in edm::EventProcessor::processLumis(std::shared_ptr<void> const&) () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/libFWCoreFramework.so
#16 0x00007fdd07a2802e in edm::EventProcessor::runToCompletion() () from /cvmfs/cms-ib.cern.ch/nweek-02638/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-24-1400/lib/slc7_amd64_gcc820/libFWCoreFramework.so
#17 0x0000000000412d9b in main::{lambda()#1}::operator()() const ()
#18 0x0000000000411362 in main ()

Current Modules:

Module: SiPixelPhase1TrackClusters:hltSiPixelPhase1TrackClustersAnalyzer (crashed)

@mmusich

mmusich commented Jul 25, 2020

@Dr15Jones this issue should rather be assigned to DQM.

@Dr15Jones

unassign reconstruction

@Dr15Jones

assign dqm

@cmsbuild

New categories assigned: dqm

@jfernan2,@andrius-k,@schneiml,@fioriNTU,@kmaeshima you have been requested to review this Pull request/Issue and eventually sign? Thanks
