Segmentation violation in SectorProcessorShower::process #42185

Closed
iarspider opened this issue Jul 4, 2023 · 25 comments

Comments

@iarspider
Contributor

We observe multiple RelVal failures in the CMSSW_13_2_X_2023-07-04 IBs (all platforms). Example crash log:

A fatal system signal has occurred: segmentation violation
The following is the call stack containing the origin of the signal.

Tue Jul  4 14:52:22 CEST 2023
Thread 13 (Thread 0x1487469ff700 (LWP 984324) "cmsRun"):
#0  0x00001487e732b45c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00001487e79acd20 in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /data/cmsbld/jenkins/workspace/jenkins-test-bootstrap/toolconf/BUILD/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/gcc-11.4.1/obj/x86_64-redhat-linux-gnu/libstdc++-v3/include/x86_64-redhat-linux-gnu/bits/gthr-default.h:865
#2  std::__condvar::wait (__m=..., this=<optimized out>) at /data/cmsbld/jenkins/workspace/jenkins-test-bootstrap/toolconf/BUILD/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/gcc-11.4.1/obj/x86_64-redhat-linux-gnu/libstdc++-v3/include/bits/std_mutex.h:155
#3  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:41
#4  0x000014879edab60b in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libtensorflow_cc.so.2
#5  0x000014879edabc2a in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libtensorflow_cc.so.2
#6  0x000014879eda8658 in std::_Function_handler<void (), tensorflow::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libtensorflow_cc.so.2
#7  0x000014879b46def0 in tensorflow::(anonymous namespace)::PThread::ThreadFn(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libtensorflow_framework.so.2
#8  0x00001487e73251ca in start_thread () from /lib64/libpthread.so.0
#9  0x00001487e6f91e73 in clone () from /lib64/libc.so.6
Thread 12 (Thread 0x1487471ff700 (LWP 984323) "cmsRun"):
#0  0x00001487e732b45c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00001487e79acd20 in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /data/cmsbld/jenkins/workspace/jenkins-test-bootstrap/toolconf/BUILD/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/gcc-11.4.1/obj/x86_64-redhat-linux-gnu/libstdc++-v3/include/x86_64-redhat-linux-gnu/bits/gthr-default.h:865
#2  std::__condvar::wait (__m=..., this=<optimized out>) at /data/cmsbld/jenkins/workspace/jenkins-test-bootstrap/toolconf/BUILD/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/gcc-11.4.1/obj/x86_64-redhat-linux-gnu/libstdc++-v3/include/bits/std_mutex.h:155
#3  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:41
#4  0x000014879edab60b in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libtensorflow_cc.so.2
#5  0x000014879edabc2a in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libtensorflow_cc.so.2
#6  0x000014879eda8658 in std::_Function_handler<void (), tensorflow::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libtensorflow_cc.so.2
#7  0x000014879b46def0 in tensorflow::(anonymous namespace)::PThread::ThreadFn(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libtensorflow_framework.so.2
#8  0x00001487e73251ca in start_thread () from /lib64/libpthread.so.0
#9  0x00001487e6f91e73 in clone () from /lib64/libc.so.6
Thread 11 (Thread 0x1487479ff700 (LWP 984322) "cmsRun"):
#0  0x00001487e732b45c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00001487e79acd20 in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /data/cmsbld/jenkins/workspace/jenkins-test-bootstrap/toolconf/BUILD/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/gcc-11.4.1/obj/x86_64-redhat-linux-gnu/libstdc++-v3/include/x86_64-redhat-linux-gnu/bits/gthr-default.h:865
#2  std::__condvar::wait (__m=..., this=<optimized out>) at /data/cmsbld/jenkins/workspace/jenkins-test-bootstrap/toolconf/BUILD/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/gcc-11.4.1/obj/x86_64-redhat-linux-gnu/libstdc++-v3/include/bits/std_mutex.h:155
#3  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:41
#4  0x000014879edab60b in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libtensorflow_cc.so.2
#5  0x000014879edabc2a in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libtensorflow_cc.so.2
#6  0x000014879eda8658 in std::_Function_handler<void (), tensorflow::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libtensorflow_cc.so.2
#7  0x000014879b46def0 in tensorflow::(anonymous namespace)::PThread::ThreadFn(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libtensorflow_framework.so.2
#8  0x00001487e73251ca in start_thread () from /lib64/libpthread.so.0
#9  0x00001487e6f91e73 in clone () from /lib64/libc.so.6
Thread 10 (Thread 0x14878f297700 (LWP 984134) "cmsRun"):
#0  0x00001487e732dda6 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1  0x00001487e732de98 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00001487e08e53f6 in XrdCl::JobManager::RunJobs() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libXrdCl.so.3
#3  0x00001487e08e54a9 in RunRunnerThread () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libXrdCl.so.3
#4  0x00001487e73251ca in start_thread () from /lib64/libpthread.so.0
#5  0x00001487e6f91e73 in clone () from /lib64/libc.so.6
Thread 9 (Thread 0x14878f498700 (LWP 984133) "cmsRun"):
#0  0x00001487e732dda6 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1  0x00001487e732de98 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00001487e08e53f6 in XrdCl::JobManager::RunJobs() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libXrdCl.so.3
#3  0x00001487e08e54a9 in RunRunnerThread () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libXrdCl.so.3
#4  0x00001487e73251ca in start_thread () from /lib64/libpthread.so.0
#5  0x00001487e6f91e73 in clone () from /lib64/libc.so.6
Thread 8 (Thread 0x14878f699700 (LWP 984132) "cmsRun"):
#0  0x00001487e732dda6 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1  0x00001487e732de98 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00001487e08e53f6 in XrdCl::JobManager::RunJobs() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libXrdCl.so.3
#3  0x00001487e08e54a9 in RunRunnerThread () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libXrdCl.so.3
#4  0x00001487e73251ca in start_thread () from /lib64/libpthread.so.0
#5  0x00001487e6f91e73 in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x14878faab700 (LWP 984131) "cmsRun"):
#0  0x00001487e732f180 in nanosleep () from /lib64/libpthread.so.0
#1  0x00001487e09d2328 in XrdSysTimer::Wait(int) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libXrdUtils.so.3
#2  0x00001487e0852509 in XrdCl::TaskManager::RunTasks() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libXrdCl.so.3
#3  0x00001487e0852689 in RunRunnerThread () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libXrdCl.so.3
#4  0x00001487e73251ca in start_thread () from /lib64/libpthread.so.0
#5  0x00001487e6f91e73 in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x14878fe34700 (LWP 984128) "cmsRun"):
#0  0x00001487e7086e87 in epoll_wait () from /lib64/libc.so.6
#1  0x00001487e09cc6d2 in XrdSys::IOEvents::PollE::Begin(XrdSysSemaphore*, int&, char const**) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libXrdUtils.so.3
#2  0x00001487e09c8e7d in XrdSys::IOEvents::BootStrap::Start(void*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libXrdUtils.so.3
#3  0x00001487e09d19d8 in XrdSysThread_Xeq () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libXrdUtils.so.3
#4  0x00001487e73251ca in start_thread () from /lib64/libpthread.so.0
#5  0x00001487e6f91e73 in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x1487909ff700 (LWP 984127) "cmsRun"):
#0  0x00001487e707bf41 in poll () from /lib64/libc.so.6
#1  0x00001487e011ed2f in full_read.constprop () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#2  0x00001487e00e675c in edm::service::InitRootHandlers::stacktraceFromThread() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#3  0x00001487e00e71bb in sig_dostack_then_abort () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x000014878ba9f68a in SectorProcessorShower::process(MuonDigiCollection<CSCDetId, CSCShowerDigi> const&, BXVector<l1t::RegionalMuonShower>&) const () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/libL1TriggerL1TMuonEndCap.so
#6  0x0000148775ad5aec in L1TMuonEndCapShowerProducer::produce(edm::Event&, edm::EventSetup const&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/pluginL1TriggerL1TMuonEndCapPlugins.so
#7  0x00001487e9a6617d in edm::stream::EDProducerAdaptorBase::doEvent(edm::EventTransitionInfo const&, edm::ActivityRegistry*, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/libFWCoreFramework.so
#8  0x00001487e9a4c8e2 in edm::WorkerT<edm::stream::EDProducerAdaptorBase>::implDo(edm::EventTransitionInfo const&, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/libFWCoreFramework.so
#9  0x00001487e99d732a in std::__exception_ptr::exception_ptr edm::Worker::runModuleAfterAsyncPrefetch<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >(std::__exception_ptr::exception_ptr, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::TransitionInfoType const&, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1>::Context const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/libFWCoreFramework.so
#10 0x00001487e99d77d8 in edm::Worker::RunModuleTask<edm::OccurrenceTraits<edm::EventPrincipal, (edm::BranchActionType)1> >::execute() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/libFWCoreFramework.so
#11 0x00001487e9b8af79 in tbb::detail::d1::function_task<edm::WaitingTaskList::announce()::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_X_2023-07-02-0000/lib/el8_amd64_gcc11/libFWCoreConcurrency.so
#12 0x00001487e81ce2e4 in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=0x1487e58c2500, waiter=..., this=0x1487e5979480) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/task_dispatcher.h:322
#13 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (t=0x0, waiter=..., this=0x1487e5979480) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/task_dispatcher.h:458
#14 tbb::detail::r1::arena::process (tls=..., this=<optimized out>) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/arena.cpp:137
#15 tbb::detail::r1::market::process (this=<optimized out>, j=...) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/market.cpp:599
#16 0x00001487e81d04a6 in tbb::detail::r1::rml::private_worker::run (this=0x1487e1397000) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/private_server.cpp:271
#17 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x1487e1397000) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/private_server.cpp:221
#18 0x00001487e73251ca in start_thread () from /lib64/libpthread.so.0
#19 0x00001487e6f91e73 in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x1487918be700 (LWP 984126) "cmsRun"):
#0  0x00001487e7051868 in nanosleep () from /lib64/libc.so.6
#1  0x00001487e705176e in sleep () from /lib64/libc.so.6
#2  0x00001487e00e3980 in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00001487e819d70f in ?? () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/liblzma.so.5
#5  0x00001487e819fe0d in ?? () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/liblzma.so.5
#6  0x00001487e8196802 in ?? () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/liblzma.so.5
#7  0x00001487e8190870 in ?? () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/liblzma.so.5
#8  0x00001487e8192a26 in ?? () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/liblzma.so.5
#9  0x00001487e8188bf8 in lzma_code () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/liblzma.so.5
#10 0x00001487e8b6dff7 in R__unzipLZMA () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libCore.so
#11 0x00001487e8b74f52 in R__unzip () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libCore.so
#12 0x00001487e95ffa24 in TBasket::ReadBasketBuffers(long long, int, TFile*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libTree.so
#13 0x00001487e960a6bc in TBranch::GetBasketImpl(int, TBuffer*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libTree.so
#14 0x00001487e960bd9d in TBranch::GetBasketAndFirst(TBasket*&, long long&, TBuffer*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libTree.so
#15 0x00001487e960bf97 in TBranch::GetEntry(long long, int) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libTree.so
#16 0x00001487e96265e5 in TBranchElement::GetEntry(long long, int) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libTree.so
#17 0x00001487e9626598 in TBranchElement::GetEntry(long long, int) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/external/el8_amd64_gcc11/lib/libTree.so
#18 0x0000148790e90794 in edm::RootTree::getEntry(TBranch*, long long) const () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/pluginIOPoolInput.so
#19 0x0000148790e72edc in edm::RootDelayedReader::getProduct_(edm::BranchID const&, edm::EDProductGetter const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/pluginIOPoolInput.so
#20 0x00001487e994a3c5 in edm::DelayedReader::getProduct(edm::BranchID const&, edm::EDProductGetter const*, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/libFWCoreFramework.so
#21 0x00001487e99f1e4b in edm::DelayedReaderInputProductResolver::prefetchAsync_(edm::WaitingTaskHolder, edm::Principal const&, bool, edm::ServiceToken const&, edm::SharedResourcesAcquirer*, edm::ModuleCallingContext const*) const::{lambda()#1}::operator()() const::{lambda()#1}::operator()() const () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/libFWCoreFramework.so
#22 0x00001487e99f201c in edm::DelayedReaderInputProductResolver::prefetchAsync_(edm::WaitingTaskHolder, edm::Principal const&, bool, edm::ServiceToken const&, edm::SharedResourcesAcquirer*, edm::ModuleCallingContext const*) const::{lambda()#1}::operator()() const [clone .lto_priv.0] () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/libFWCoreFramework.so
#23 0x00001487e99f3408 in edm::SerialTaskQueue::QueuedTask<edm::SerialTaskQueueChain::push<edm::DelayedReaderInputProductResolver::prefetchAsync_(edm::WaitingTaskHolder, edm::Principal const&, bool, edm::ServiceToken const&, edm::SharedResourcesAcquirer*, edm::ModuleCallingContext const*) const::{lambda()#1}&>(tbb::detail::d1::task_group&, edm::DelayedReaderInputProductResolver::prefetchAsync_(edm::WaitingTaskHolder, edm::Principal const&, bool, edm::ServiceToken const&, edm::SharedResourcesAcquirer*, edm::ModuleCallingContext const*) const::{lambda()#1}&)::{lambda()#1}>::execute() [clone .lto_priv.0] () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/libFWCoreFramework.so
#24 0x00001487e9b8c099 in tbb::detail::d1::function_task<edm::SerialTaskQueue::spawn(edm::SerialTaskQueue::TaskBase&)::{lambda()#1}>::execute(tbb::detail::d1::execution_data&) () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw/CMSSW_13_2_X_2023-07-02-0000/lib/el8_amd64_gcc11/libFWCoreConcurrency.so
#25 0x00001487e81ce2e4 in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter> (t=0x1487e58b2b00, waiter=..., this=0x1487e5979500) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/task_dispatcher.h:322
#26 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter> (t=0x0, waiter=..., this=0x1487e5979500) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/task_dispatcher.h:458
#27 tbb::detail::r1::arena::process (tls=..., this=<optimized out>) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/arena.cpp:137
#28 tbb::detail::r1::market::process (this=<optimized out>, j=...) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/market.cpp:599
#29 0x00001487e81d04a6 in tbb::detail::r1::rml::private_worker::run (this=0x1487e1397100) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/private_server.cpp:271
#30 tbb::detail::r1::rml::private_worker::thread_routine (arg=0x1487e1397100) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/private_server.cpp:221
#31 0x00001487e73251ca in start_thread () from /lib64/libpthread.so.0
#32 0x00001487e6f91e73 in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x1487922bf700 (LWP 984125) "cmsRun"):
#0  0x00001487e7051868 in nanosleep () from /lib64/libc.so.6
#1  0x00001487e705176e in sleep () from /lib64/libc.so.6
#2  0x00001487e00e3980 in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00001487e6f919bd in syscall () from /lib64/libc.so.6
#5  0x00001487e81d074f in tbb::detail::r1::futex_wait (comparand=2, futex=0x1487e13970a4) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/semaphore.h:103
#6  tbb::detail::r1::binary_semaphore::P (this=0x1487e13970a4) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/semaphore.h:290
#7  tbb::detail::r1::rml::internal::thread_monitor::wait (this=0x1487e13970a0) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/rml_thread_monitor.h:217
#8  tbb::detail::r1::rml::private_worker::run (this=0x1487e1397080) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/private_server.cpp:273
#9  tbb::detail::r1::rml::private_worker::thread_routine (arg=0x1487e1397080) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/private_server.cpp:221
#10 0x00001487e73251ca in start_thread () from /lib64/libpthread.so.0
#11 0x00001487e6f91e73 in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x1487c263e700 (LWP 984027) "cmsRun"):
#0  0x00001487e732f672 in waitpid () from /lib64/libpthread.so.0
#1  0x00001487e00e3cf7 in edm::service::cmssw_stacktrace_fork() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#2  0x00001487e00e668a in edm::service::InitRootHandlers::stacktraceHelperThread() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#3  0x00001487e79b2894 in std::execute_native_thread_routine (__p=0x1487ce0201b0) at ../../../../../libstdc++-v3/src/c++11/thread.cc:82
#4  0x00001487e73251ca in start_thread () from /lib64/libpthread.so.0
#5  0x00001487e6f91e73 in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x1487e64b5640 (LWP 983969) "cmsRun"):
#0  0x00001487e7051868 in nanosleep () from /lib64/libc.so.6
#1  0x00001487e705176e in sleep () from /lib64/libc.so.6
#2  0x00001487e00e3980 in sig_pause_for_stacktrace () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/pluginFWCoreServicesPlugins.so
#3  <signal handler called>
#4  0x00001487e6f9112b in sched_yield () from /lib64/libc.so.6
#5  0x00001487e81d43fe in __gthread_yield () at /data/cmsbld/jenkins/workspace/build-any-ib/w/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/include/c++/11.4.1/x86_64-redhat-linux-gnu/bits/gthr-default.h:693
#6  std::this_thread::yield () at /data/cmsbld/jenkins/workspace/build-any-ib/w/el8_amd64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/include/c++/11.4.1/bits/std_thread.h:329
#7  tbb::detail::r1::stealing_loop_backoff::pause (this=0x7fff96800468) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/scheduler_common.h:266
#8  tbb::detail::r1::waiter_base::pause (this=0x7fff96800460) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/waiters.h:35
#9  tbb::detail::r1::external_waiter::pause (this=0x7fff96800460) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/waiters.h:138
#10 tbb::detail::r1::task_dispatcher::receive_or_steal_task<true, tbb::detail::r1::external_waiter> (this=<optimized out>, tls=..., ed=..., waiter=..., isolation=<optimized out>, fifo_allowed=<optimized out>, critical_allowed=<optimized out>) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/task_dispatcher.h:231
#11 0x00001487e81d5c88 in tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::external_waiter> (waiter=..., t=0x0, this=0x1487e5979380) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/task_dispatcher.h:350
#12 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::external_waiter> (waiter=..., t=<optimized out>, this=0x1487e5979380) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/task_dispatcher.h:458
#13 tbb::detail::r1::task_dispatcher::execute_and_wait (t=<optimized out>, wait_ctx=..., w_ctx=...) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/task_dispatcher.cpp:168
#14 0x00001487e995d38d in edm::FinalWaitingTask::wait() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/libFWCoreFramework.so
#15 0x00001487e996ae11 in edm::EventProcessor::processRuns() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/libFWCoreFramework.so
#16 0x00001487e996b3a1 in edm::EventProcessor::runToCompletion() () from /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02792/el8_amd64_gcc11/cms/cmssw-patch/CMSSW_13_2_X_2023-07-04-1100/lib/el8_amd64_gcc11/libFWCoreFramework.so
#17 0x000000000040c17d in tbb::detail::d1::task_arena_function<main::{lambda()#1}::operator()() const::{lambda()#1}, void>::operator()() const ()
#18 0x00001487e81c3847 in tbb::detail::r1::task_arena_impl::execute (ta=..., d=...) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/el8_amd64_gcc11/external/tbb/v2021.8.0-7e31093a7b4a477d01bc3946dd0bf612/tbb-v2021.8.0/src/tbb/arena.cpp:694
#19 0x00000000004101e0 in main::{lambda()#1}::operator()() const ()
#20 0x0000000000407d6c in main ()

Current Modules:

Module: L1TMuonEndCapShowerProducer:simEmtfShowers (crashed)
Module: none
Module: none
Module: none

A fatal system signal has occurred: segmentation violation

Full log: link

Looks like #42176 is the culprit

@cmsbuild
Contributor

cmsbuild commented Jul 4, 2023

A new Issue was created by @iarspider .

@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@iarspider
Contributor Author

assign l1

@cmsbuild
Contributor

cmsbuild commented Jul 4, 2023

New categories assigned: l1

@epalencia,@aloeliger you have been requested to review this Pull request/Issue and eventually sign? Thanks

@iarspider
Contributor Author

Pinging @eyigitba, who is the author of #42176

@perrotta
Contributor

perrotta commented Jul 5, 2023

urgent

@cmsbuild cmsbuild added the urgent label Jul 5, 2023
@eyigitba
Contributor

eyigitba commented Jul 5, 2023

Hi, I'm looking at this, but I don't know the reason for crash for now. Would you be able to give me more information? Maybe about which RelVals are crashing? Thanks!

@perrotta
Contributor

perrotta commented Jul 5, 2023

Hi, I'm looking at this, but I don't know the reason for crash for now. Would you be able to give me more information? Maybe about which RelVals are crashing? Thanks!

For the RelVals that crash and their logs, please have a look at the most recent Integration Builds in https://cmssdt.cern.ch/SDT/html/cmssdt-ib/#/ib/CMSSW_13_2_X

@eyigitba
Contributor

eyigitba commented Jul 6, 2023

I tried to reproduce the crash locally, using the same file from one of the crashes and running the modified simEmtfShowers, but it ran normally.

How can I run the workflow in an identical way to what crashed?

@aandvalenzuela
Contributor

Hello @eyigitba, you can run it from the same cmssw environment used in the latest IBs by following these steps:

  • From lxplus in order to access cvmfs: /cvmfs/cms.cern.ch/common/cmssw-el8
  • Set the correct SCRAM architecture, for example: export SCRAM_ARCH=el8_amd64_gcc11
  • Source the default env: source /cvmfs/cms.cern.ch/cmsset_default.sh
  • Create the cmssw developer area from latest IBs: scram -a $SCRAM_ARCH project CMSSW_13_2_X_2023-07-05-2300
  • Initialize the area: cd CMSSW_13_2_X_2023-07-05-2300/src and "eval `scram run -sh`"
  • Run the failing workflow. For example, for wf 136.897: runTheMatrix.py -i all -l 136.897 -t 4 --ibeos

Let me know if that helps!

@Dr15Jones
Contributor

I ran one of the failing workflows in the debugger. The crash happens here

data_.insert(data_.begin() + itrs_[indexFromBX(bx)] + size(bx), object);

The value of bx is -3, and the call to indexFromBX(bx) returns 4294967295. This is because

template <class T>
unsigned BXVector<T>::indexFromBX(int bx) const {
  return bx - bxFirst_;
}

and bxFirst_ is -2, so bx - bxFirst_ evaluates to -1, which wraps around to 4294967295 when converted to unsigned.
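
The wraparound can be reproduced outside CMSSW. The following is a minimal sketch, not the BXVector code itself: indexFromBXUnchecked only mirrors the arithmetic of the quoted indexFromBX, and tryIndexFromBX is a hypothetical checked variant introduced purely for illustration. It assumes a 32-bit unsigned, as on the platforms discussed here.

```cpp
// Mirrors the arithmetic of the quoted BXVector<T>::indexFromBX:
// the int difference is implicitly converted to unsigned on return,
// so a negative difference wraps around to a very large value.
unsigned indexFromBXUnchecked(int bx, int bxFirst) {
  return bx - bxFirst;
}

// Hypothetical checked variant (not part of BXVector): writes the index
// and returns true only when bx lies within [bxFirst, bxLast].
bool tryIndexFromBX(int bx, int bxFirst, int bxLast, unsigned& index) {
  if (bx < bxFirst || bx > bxLast) {
    return false;  // out of range: leave index untouched
  }
  index = static_cast<unsigned>(bx - bxFirst);
  return true;
}
```

With bx = -3 and bxFirst = -2, the unchecked version returns the largest unsigned value, while the checked version reports failure, assuming a range of [-2, 2].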

@Dr15Jones
Contributor

Tracing further back, bxFirst_ is set in the constructor

: bxFirst_(std::min(0, bxFirst)), bxLast_(std::max(0, bxLast)), data_(std::vector<T>(size * numBX())) {

which is happening here

The hard-coded values are clearly wrong for handling this case.

@perrotta perrotta closed this as completed Jul 7, 2023
@perrotta perrotta reopened this Jul 7, 2023
@perrotta
Contributor

perrotta commented Jul 7, 2023

Thank you @Dr15Jones for the detailed investigation. I think the origin of the problem is clear now.

The bug was already present in the code of BXVector, but it only surfaced after merging #42176 because this line now pushes into the exact bx of the digi, while before that PR it was always pushed at bx=0.

@eyigitba I think a possible quick fix would be to add a guard in BXVector::push_back so that the object is inserted into data_ only if bx lies between bxFirst_ and bxLast_.

Still, I don't understand the role of L135-L139 in BXVector.icc:

for (unsigned k = 0; k < itrs_.size(); k++) {
  if (k > indexFromBX(bx)) {
    itrs_[k]++;
  }
}

itrs_ is already set up in the constructor, so what is the purpose of incrementing its entries for the following bx's when you push into data_ for an earlier bx?
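
One possible reading of that loop, assuming itrs_ holds the start offset of each bx inside the flat data_ vector: inserting an element for one bx shifts the start of every later bx by one. The sketch below illustrates this together with the range guard discussed above. FlatBXStore is a simplified, hypothetical stand-in that stores plain ints and starts empty; it is not the CMSSW class and not necessarily the form the eventual fix takes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified, hypothetical stand-in for BXVector<int>.
class FlatBXStore {
public:
  // Same clamping as the quoted constructor: the range always contains bx = 0.
  FlatBXStore(int bxFirst, int bxLast)
      : bxFirst_(std::min(0, bxFirst)),
        bxLast_(std::max(0, bxLast)),
        itrs_(static_cast<std::size_t>(bxLast_ - bxFirst_ + 1), 0) {}

  // Returns false and stores nothing if bx is outside [bxFirst_, bxLast_].
  bool push_back(int bx, int object) {
    if (bx < bxFirst_ || bx > bxLast_) {
      return false;
    }
    const std::size_t idx = static_cast<std::size_t>(bx - bxFirst_);
    // Insert at the end of this bx's slice of the flat vector.
    const auto pos = static_cast<std::ptrdiff_t>(itrs_[idx] + size(bx));
    data_.insert(data_.begin() + pos, object);
    // Every later bx now starts one element further along.
    for (std::size_t k = idx + 1; k < itrs_.size(); ++k) {
      ++itrs_[k];
    }
    return true;
  }

  // Number of objects stored for bx (0 if bx is out of range).
  std::size_t size(int bx) const {
    if (bx < bxFirst_ || bx > bxLast_) {
      return 0;
    }
    const std::size_t idx = static_cast<std::size_t>(bx - bxFirst_);
    const std::size_t end =
        (idx + 1 < itrs_.size()) ? itrs_[idx + 1] : data_.size();
    return end - itrs_[idx];
  }

  // i-th object of an in-range bx; bounds-checked within data_.
  int at(int bx, std::size_t i) const {
    assert(bx >= bxFirst_ && bx <= bxLast_);
    return data_.at(itrs_[static_cast<std::size_t>(bx - bxFirst_)] + i);
  }

  std::size_t total() const { return data_.size(); }

private:
  int bxFirst_;
  int bxLast_;
  std::vector<std::size_t> itrs_;  // start offset of each bx in data_
  std::vector<int> data_;          // all objects, ordered by bx
};
```

In this sketch a push at bx = -3 into a store built for [-2, 2] is rejected instead of computing a wrapped index, and the objects already stored for other bx values stay addressable because the later offsets are kept in step with data_.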

@eyigitba
Contributor

eyigitba commented Jul 7, 2023

Thanks @Dr15Jones and @perrotta for the further information on this. I now see where the problem is. I didn't realize that there was data with BX values outside the [-2,2] range.

I can add the protection to BXVector::push_back and/or to L1TMuonEndCapShowerProducer to guard against inserting a BX outside the [bxFirst_, bxLast_] range.

I think this shouldn't cause any issues with other workflows since we didn't see this crash before.

Thanks @aandvalenzuela for the instructions. I'll test the code with this workflow.

@mmusich
Contributor

mmusich commented Jul 7, 2023

I didn't realize that there was data with BX values outside the [-2,2] range.

I think the underlying issue of #41645 is correlated with this. @aloeliger FYI

@aloeliger
Contributor

@mmusich Am I reading this right that WF 136.897 is a 2021 cosmics run and we're seeing unpacker failures here?

@mmusich
Contributor

mmusich commented Jul 7, 2023

@aloeliger

Am I reading this right that WF 136.897 is a 2021 cosmics run and we're seeing unpacker failures here?

I think so (or at least we're getting events with seemingly corrupt L1T data, with BX values outside the expected range).

@aloeliger
Contributor

That's curious. The corrupt data only started showing up online in 2023. I think this is evidence that it's not a uGT firmware issue, but a direct unpacker failure for muons/muon showers.

@aloeliger
Contributor

@eyigitba I think we need to inspect the muon/muon shower unpackers and how the BX is assigned in them.

@eyigitba
Contributor

eyigitba commented Jul 7, 2023

@aloeliger, this is for sure something that needs to happen for muon showers. I don't think the problem is in muons, since we didn't touch anything there. However, for the muon showers, #38941 changed how the unpackers/emulators work on the CSC side, which apparently causes these issues to appear on the L1T side.

We can also ping @dinyar here in case he has any insight on possible problems with muon unpackers.

@perrotta
Contributor

perrotta commented Jul 7, 2023

@eyigitba would it be possible to provide a quick and possibly not so dirty solution for the problem at hand? We must close 13_2_0_pre3 (the last open pre), and if the problem persists we will be forced to revert #42176.

I have the impression that adding a guard in BXVector::push_back, so that the object is inserted into data_ only if bx lies between bxFirst_ and bxLast_, may do the job (and maybe also fix the related issue #41645), but I had no opportunity to check.

@aloeliger
Contributor

@perrotta I just discussed a quick solution with him. Either he or I should have it available quickly. Let me ask and then I can give an ETA.

@aloeliger
Contributor

Okay. I'll push a quick bx boundary check. Should be available within 30 minutes at a quick guess? I'll update when I have it.

@eyigitba
Contributor

eyigitba commented Jul 7, 2023

Thanks @aloeliger . My connection is not great for now and staying connected to lxplus is not possible for some reason.

@aloeliger
Contributor

@perrotta I included the kind of check you and @eyigitba discussed in #42214

@iarspider
Contributor Author

please close
