Results for Run-1 SIM step are sensitive to order of module execution #34448

civanch · 2021-07-12T21:52:58Z

After OscarMTProducer was migrated to the new scheme of access to EventSetup (#34338), the Run-1 workflow results show different simulation histories, even for the first event.

cmsbuild · 2021-07-12T21:53:20Z

A new Issue was created by @civanch Vladimir Ivantchenko.

@Dr15Jones, @perrotta, @dpiparo, @silviodonato, @smuzaffar, @makortel, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

civanch · 2021-07-12T21:57:48Z

assign simulation, core

cmsbuild · 2021-07-12T21:58:10Z

New categories assigned: core,simulation

@Dr15Jones,@smuzaffar,@civanch,@mdhildreth,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

makortel · 2021-07-12T21:59:52Z

The title is misleading: the cause lies in the execution order of VolumeBasedMagneticFieldESProducerFromDB and XMLIdealGeometryESProducer.

makortel · 2021-07-12T23:35:11Z

Let me copy here #34338 (comment)

I did some experiments in CMSSW_12_0_X_2021-07-07-2300 with this PR.

First I confirmed Vladimir's earlier observation that leaving only MagneticField to be read in the "old way" reproduces the reference results.

Then I added only the esConsumes() for the MagneticField, but still read it from EventSetup in the old way. That results in differences.

Logs from the Tracer Service show that the relevant change in the order of ESProducers is that VolumeBasedMagneticFieldESProducerFromDB (and PoolDBESSource for RunInfoRcd, MagFieldConfigRcd, and MFGeometryFileRcd) moved to run before XMLIdealGeometryESProducer and MuonGeometryConstantsESModule once the esConsumes() call was added.

To check whether the order of VolumeBasedMagneticFieldESProducerFromDB and XMLIdealGeometryESProducer matters, I made VolumeBasedMagneticFieldESProducerFromDB depend on the DDCompactView produced by XMLIdealGeometryESProducer. That test reproduced the reference results!
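The ordering trick described above can be illustrated with a toy model. The sketch below is not the CMSSW EventSetup API: `ToyEventSetup`, `make_setup`, and `run` are hypothetical names, and the only assumption is that producers run lazily on first request, with anything they request themselves running first.

```python
# Toy model only: this is NOT the CMSSW EventSetup API.
from typing import Callable, Dict, List


class ToyEventSetup:
    """Runs each producer at most once, on first request."""

    def __init__(self) -> None:
        self._producers: Dict[str, Callable[["ToyEventSetup"], object]] = {}
        self._cache: Dict[str, object] = {}
        # Product names in the order in which their producers finished.
        self.run_order: List[str] = []

    def register(self, product: str,
                 producer: Callable[["ToyEventSetup"], object]) -> None:
        self._producers[product] = producer

    def get(self, product: str) -> object:
        if product not in self._cache:
            # The producer may call get() itself, pulling in dependencies.
            value = self._producers[product](self)
            self._cache[product] = value
            self.run_order.append(product)
        return self._cache[product]


def make_setup(field_depends_on_geometry: bool) -> ToyEventSetup:
    setup = ToyEventSetup()
    setup.register("DDCompactView", lambda es: "geometry")

    def magnetic_field(es: ToyEventSetup) -> object:
        if field_depends_on_geometry:
            # Artificial dependency: forces the geometry producer to run first.
            es.get("DDCompactView")
        return "field"

    setup.register("MagneticField", magnetic_field)
    return setup


def run(field_depends_on_geometry: bool) -> List[str]:
    setup = make_setup(field_depends_on_geometry)
    # The consumer asks for the field first, then for the geometry.
    setup.get("MagneticField")
    setup.get("DDCompactView")
    return setup.run_order
```

In this toy, `run(False)` finishes the field producer before the geometry producer, while `run(True)` reverses that, which mirrors how adding a dependency pins the relative order of two otherwise independent producers.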

These symptoms indicate that something in what VolumeBasedMagneticFieldESProducerFromDB and XMLIdealGeometryESProducer do depends on their order (likely independently of #34338). @cms-sw/geometry-l2

makortel · 2021-07-12T23:44:31Z

I've done some further experimentation.

I dumped the values of MagneticField by setting process.VolumeBasedMagneticFieldESProducer.debugBuilder = True, but the text dump was identical with both ESProducer orderings.

I also dumped the DDCompactView with GeometryInfoDump in OscarMTMasterThread, but the text dump was identical with both ESProducer orderings (also after changing the dump number format to %0.10e). (I tried to use PerfectGeometryAnalyzer first, but with that the ESProducers were always run in the same one of the two orderings.)

@civanch How did you produce the geometry dumps for your #34338 (comment) ?

The only difference found in one box from tracker barrel rotation matrix: in case of this PR

VinInn · 2021-07-13T08:56:56Z

valgrind?

civanch · 2021-07-13T09:17:00Z

I use the following options:
process.g4SimHits.FileNameGDML = 'cms2010_8July.gdml'
process.g4SimHits.FileNameField = 'cms2010_8JulyMF.txt'

The observation of a different rotation matrix mentioned in #34338 is not conclusive, because it comes from a printout, and the result may depend on the precision of cout. Apart from this rotation matrix, everything else was identical. It is possible that some modules use random numbers at initialisation (I do not know of such cases, but it is a plausible theory), so that changing the order of modules changes the histories.
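The point about printout precision works in both directions: a text dump can also hide a real difference. The following is a minimal sketch in plain Python, not tied to any CMSSW dump tool; the helper names `same_at_precision` and `bitwise_equal` are hypothetical. It shows two doubles that differ in the last bit but look identical in a `%0.10e` printout.

```python
# Two doubles that differ only in the last bit of the mantissa.
a = 0.1 + 0.2
b = 0.3


def same_at_precision(x: float, y: float, fmt: str = "%0.10e") -> bool:
    """True if x and y produce the same text with the given printf-style format."""
    return (fmt % x) == (fmt % y)


def bitwise_equal(x: float, y: float) -> bool:
    """Compare the exact hexadecimal representation, which loses no bits."""
    return x.hex() == y.hex()


print(same_at_precision(a, b))             # identical at 10 decimal digits
print(same_at_precision(a, b, "%0.17e"))   # differ at 17 decimal digits
print(bitwise_equal(a, b))                 # differ bit for bit
```

If text dumps are used to rule out a difference, a lossless format such as `float.hex()` in Python (or hexfloat output in C++) avoids this ambiguity.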

makortel · 2021-07-13T13:31:41Z

@civanch Thanks. I used both options and found zero differences between the two orderings of the ESProducers.

I'll try valgrind next (thanks @VinInn).

makortel · 2021-07-22T00:36:14Z

Here is a test result that shows differences across the board in workflow 9.0 for a PR that should not have any physics impact: #34577 (comment)

makortel · 2021-07-22T00:41:52Z

I had run valgrind on one of the ordering options, but it didn't reveal anything interesting.

I also collected stack traces with gdb by breaking into the random number generation. I had the patience to collect logs for 3.4M calls (not enough to complete one event), but comparing those showed no difference between the two ESProducer orders.
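A comparison of two such call logs amounts to finding the first record where they disagree. The sketch below assumes each log has already been reduced to a sequence of text records, one per random-number call; the function name `first_divergence` and the record format are hypothetical, not what was actually used here.

```python
from itertools import zip_longest
from typing import Iterable, Optional, Tuple


def first_divergence(
    log_a: Iterable[str], log_b: Iterable[str]
) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (index, record_a, record_b) for the first mismatch, or None.

    A missing record (one log shorter than the other) is reported as None.
    """
    for index, (rec_a, rec_b) in enumerate(zip_longest(log_a, log_b)):
        if rec_a != rec_b:
            return index, rec_a, rec_b
    return None


# Usage with made-up records standing in for per-call stack traces.
order_1 = ["flat@ModuleA", "gauss@ModuleB", "flat@ModuleC"]
order_2 = ["flat@ModuleA", "flat@ModuleC", "gauss@ModuleB"]
print(first_divergence(order_1, order_2))
```

Because the inputs are consumed lazily, the same function works on file objects opened on multi-million-line logs without loading them into memory.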

civanch · 2021-08-27T22:50:24Z

There is some instability in the DD4Hep workflow (#34995) that is not yet understood.

makortel · 2022-04-29T14:47:13Z

#37592 (comment) (and two earlier tests) shows lots of differences in the 9.0 workflow, which is a Run 1 MC workflow (the PR itself is technical, but it causes many packages to be built). I'm reporting this mostly for the record, although it would be nice to understand the cause (but it is also incredibly laborious and difficult to investigate).

makortel · 2023-11-27T22:47:44Z

Workflow 9.0 started to show spurious comparison differences again, so I opened a separate issue: #43415.

makortel · 2024-02-29T22:35:32Z

#44271 (comment) showed lots of differences in 9.0 (the PR itself is technical)

cmsbuild added the pending-assignment label Jul 12, 2021

civanch mentioned this issue Jul 12, 2021

Use of ESGetToken in Oscar producer #34338

Merged

cmsbuild added core-pending pending-signatures simulation-pending and removed pending-assignment labels Jul 12, 2021

civanch changed the title ~~Results for Run-1 SIM step are sensitive to the EventSetup interface~~ Results for Run-1 SIM step are sensitive to order of module execution Jul 12, 2021

makortel mentioned this issue Jul 22, 2021

Add warning that lists legacy modules if any are configured. #34577

Merged

makortel mentioned this issue Aug 27, 2021

Migrate some modules in SimCalorimetry to esConsumes #34995

Merged

makortel mentioned this issue Dec 2, 2021

Instabilities in 11634.911 (DD4Hep) workflow comparisons #35109

Open

makortel mentioned this issue Apr 29, 2022

Mark legacy EDModule constructors deprecated #37592

Merged

makortel mentioned this issue May 6, 2022

[master] Drop references to legacy generators #37813

Merged

makortel mentioned this issue Apr 10, 2023

redefine IT digitizer ToF window + add customize fcn for activating IT signal shape with RelVals #41273

Merged

makortel mentioned this issue Nov 27, 2023

Spurious comparison differences in workflows 9.0 and 25.0 #43415

Open

makortel mentioned this issue Feb 29, 2024

Remove code related to legacy modules and SharedResources #44271

Merged
