
Replay to test new FPIX LA PCL #4793

Closed
francescobrivio wants to merge 2 commits from the add_FPIX_LA_PCL branch

Conversation

francescobrivio
Contributor

@francescobrivio commented Feb 16, 2023

Replay Request

Requestor
AlCaDB

Describe the configuration

  • Release: CMSSW_12_6_4
  • Run: 359688 (2022 cosmic run with tracker HV ON)
  • GTs:
    • expressGlobalTag: 126X_dataRun3_Express_Candidate_2023_02_16_13_39_00
    • promptrecoGlobalTag: 126X_dataRun3_Prompt_Candidate_2023_02_16_13_40_28
    • alcap0GlobalTag: 126X_dataRun3_Prompt_Candidate_2023_02_16_13_40_28
  • Additional changes:
    • Added the SiPixelCalCosmics ALCARECO to the ExpressCosmics configuration
    • Added the PromptCalibProdSiPixelLAMCS PCL workflow to the ExpressCosmics configuration (both additions are sketched below)
    • Ignored all the other streams for this run (RPCMON, Calibration, Physics, NanoDST)
    • Config changes to run CMSSW_12_6_4 are copied from Replay testing CMSSW_12_6_4 and RucioCatalog for MWGR2 #4792
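
For reference, a minimal sketch of how the two additions typically look in the replay configuration, assuming the usual T0.RunConfig.Tier0Config helpers used by ReplayOfflineConfiguration.py; apart from the two new producers, the argument values shown are illustrative:

```python
# Minimal sketch, assuming the T0.RunConfig.Tier0Config API used by
# ReplayOfflineConfiguration.py; only the two new alca_producers entries
# come from this PR, the other arguments are illustrative.
from T0.RunConfig.Tier0Config import createTier0Config, addExpressConfig

tier0Config = createTier0Config()

addExpressConfig(tier0Config, "ExpressCosmics",
                 scenario="cosmicsEra_Run3",
                 data_tiers=["FEVT"],
                 global_tag="126X_dataRun3_Express_Candidate_2023_02_16_13_39_00",
                 alca_producers=["TkAlCosmics0T",
                                 "SiPixelCalCosmics",              # new ALCARECO
                                 "PromptCalibProdSiPixelLAMCS"])   # new PCL workflow
```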

Purpose of the test
The purpose of this replay is to test a new PCL workflow (FPIX LorentzAngle) which was introduced in CMSSW_12_6_X
in PR #40734.
The GT candidates have been updated to include a new DropboxMetadata tag containing the destination tags for
the new PCL workflow.
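
For context, a DropboxMetadata entry maps a PCL workflow to its upload destination; a hedged sketch of its shape is below (the destination tag name is a hypothetical placeholder, the real ones live in the updated GT candidates):

```python
# Hedged sketch of the shape of a DropboxMetadata entry for the new
# workflow; the destination tag name is a hypothetical placeholder.
new_entry = {
    "SiPixelLAMCS_pcl": {
        "destinationDatabase": "oracle://cms_orcon_prod/CMS_CONDITIONS",
        "destinationTags": {"SiPixelLorentzAngle_FPix_v0_prompt": {}},
        "since": None,
    }
}
```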

T0 Operations cmsTalk thread
https://cms-talk.web.cern.ch/t/replay-test-of-cmssw-12-6-4/20428

@francescobrivio
Contributor Author

test syntax please

@francescobrivio
Contributor Author

@germanfgv I have two questions for you:

  1. The run to be replayed is from 2022 and was taken with 12_4_X; I hope it is fine for this PR to repack it in 12_6_X?
  2. I copied all the "file catalog configurations" from your other 12_6_X replay; could you confirm they are ok?

Thanks a lot!

@mmusich
Contributor

mmusich commented Feb 16, 2023

@tsusa @ferencek @mroguljic FYI

@germanfgv
Contributor

We need to split the express processing, as 12_6_X won't be able to use 12_4_X files. This is actually what you are doing, as you are not overwriting any of the 12_4_X versions. But this also means that the override configuration should be the one that works with 12_4_X:

overrideCatalog="trivialcatalog_file:/cvmfs/cms.cern.ch/SITECONF/T0_CH_CERN/PhEDEx/storage.xml?protocol=eos"
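
In the replay configuration this corresponds to something like the following sketch, assuming the setOverrideCatalog helper of the T0 config API:

```python
# Sketch of the catalog override in ReplayOfflineConfiguration.py,
# assuming the setOverrideCatalog helper of the T0 config API.
from T0.RunConfig.Tier0Config import createTier0Config, setOverrideCatalog

tier0Config = createTier0Config()
setOverrideCatalog(tier0Config,
                   "trivialcatalog_file:/cvmfs/cms.cern.ch/SITECONF/T0_CH_CERN/PhEDEx/storage.xml?protocol=eos")
```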

@francescobrivio
Contributor Author

We need to split the express processing, as 12_6_X won't be able to use 12_4_X files. This is actually what you are doing, as you are not overwriting any of the 12_4_X versions. But this also means that the override configuration should be the one that works with 12_4_X:

overrideCatalog="trivialcatalog_file:/cvmfs/cms.cern.ch/SITECONF/T0_CH_CERN/PhEDEx/storage.xml?protocol=eos"

Ok I can revert this line!
Is this what is causing the syntax check failure?

@mmusich
Contributor

mmusich commented Feb 16, 2023

We need to split the express processing, as 12_6_X wont be able to use 12_4_X files.

mmh, then this replay is pointless, right?
We don't have the new PCL workflow in 12.4.x (yet), and I am not sure it makes sense to backport all the way to 12.4.x

@francescobrivio
Contributor Author

francescobrivio commented Feb 16, 2023

We need to split the express processing, as 12_6_X wont be able to use 12_4_X files.

mmh, then this replay is pointless, right? We don't have the new PCL workflow in 12.4.x (yet), and I am not sure it makes sense to backport all the way to 12.4.x

Ah, you might be right indeed 😞 I was hoping to simply repack in 12_4_X and then run Express in 12_6_X... @germanfgv is that possible?

If not the only option is to hope that during the upcoming MWGR we can collect a few cosmics runs with Trk HV ON and use those for the replay...

@francescobrivio
Contributor Author

test syntax please

@malbouis
Contributor

If not the only option is to hope that during the upcoming MWGR we can collect a few cosmics runs with Trk HV ON and use those for the replay...

Maybe, in case it is not possible to run the Express processing, we could consider running it in Prompt for this replay?

@mmusich
Contributor

mmusich commented Feb 16, 2023

we could consider running it in Prompt for this replay?

I am not sure the system is designed for that. E.g. how will the uploads to the GUI and DB happen?

@malbouis
Contributor

we could consider running it in Prompt for this replay?

I am not sure the system is designed for that. E.g. how will the uploads to the GUI and DB happen?

that's right, there would be no upload, just the AlCaReco and AlCaPrompt files produced (although I'd have to think it through more carefully).

@mmusich
Contributor

mmusich commented Feb 16, 2023

that's right, there would be no upload, just the AlCaReco and AlCaPrompt files produced

I guess that, assuming it works technically, this can serve as a sort of high-statistics validation that cannot be achieved in relvals. A real replay to test the actual configuration would still be needed down the line.

@germanfgv
Contributor

We can run the replay: we can repack with 12_4_X and perform the Express reco step with 12_6_4. What I'm not completely sure about is whether the catalog override is going to work as I expect, given that 12_4_X requires the TFC while 12_6_4 uses the Rucio file catalog.

I'll run the replay manually to follow it closely. Worst case scenario, we don't override the catalog for this replay.
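
A hedged sketch of that split: setDefaultCMSSWVersion is part of the T0 config API, while the reco_version keyword below is a hypothetical stand-in for whatever per-stream release override the API actually provides:

```python
# Hedged sketch of splitting the release between repacking and Express reco.
# setDefaultCMSSWVersion is a real T0 config helper; reco_version is a
# hypothetical keyword standing in for the actual per-stream override.
from T0.RunConfig.Tier0Config import (createTier0Config,
                                      setDefaultCMSSWVersion,
                                      addExpressConfig)

tier0Config = createTier0Config()
setDefaultCMSSWVersion(tier0Config, "CMSSW_12_4_X")  # placeholder for the actual 12_4 repack release

addExpressConfig(tier0Config, "ExpressCosmics",
                 scenario="cosmicsEra_Run3",
                 reco_version="CMSSW_12_6_4")        # hypothetical keyword: reco in 12_6_4
```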

@francescobrivio
Contributor Author

We can run the replay: we can repack with 12_4_X and perform the Express reco step with 12_6_4. What I'm not completely sure about is whether the catalog override is going to work as I expect, given that 12_4_X requires the TFC while 12_6_4 uses the Rucio file catalog.

I'll run the replay manually to follow it closely. Worst case scenario, we don't override the catalog for this replay.

Thanks a lot German! When you start it, please let us know the monitoring ID and links!

@germanfgv
Contributor

TFC override didn't work, so I simply disabled it to run this test. It seems to be working so far:

https://monit-grafana.cern.ch/d/t_jr45h7k/cms-tier0-replayid-monitoring?orgId=11&var-Bin=5m&var-ReplayID=230217102402&var-JobType=All&var-WorkflowType=All

@francescobrivio
Contributor Author

Ok, we checked the output:

  • the DB file is produced in /eos/cms/store/unmerged/tier0_harvest/2023/Run359688@SiPixelLAMCS_pcl@b88ff5ae-f869-4c1a-a2da-0fa2eb564a91*
    • but unfortunately it's empty (I guess no condition was produced), so there is no upload to the DB (see the inspection sketch after this list)
  • the AlCaRecos and AlCaPrompts can be found in:
    • FEVT: /eos/cms/tier0/store/backfill/1/express/Tier0_REPLAY_2023/ExpressCosmics/FEVT/Express-v17102402/000/359
    • ALCAPROMPT: /eos/cms/tier0/store/backfill/1/express/Tier0_REPLAY_2023/StreamExpressCosmics/ALCAPROMPT/PromptCalibProdSiPixelLAMCS-Express-v17102402
    • ALCARECO: /eos/cms/tier0/store/backfill/1/express/Tier0_REPLAY_2023/StreamExpressCosmics/ALCARECO/SiPixelCalCosmics-Express-v17102402
    • DQM: https://tinyurl.com/2pvuzzua
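
A quick way to confirm that the PCL sqlite file is indeed empty is to count rows in the conditions tables; a minimal sketch with the stdlib (the file name below is a placeholder for the actual Run359688@SiPixelLAMCS_pcl@... file):

```python
# Minimal sketch: check whether a PCL sqlite output actually contains
# conditions. The path below is a placeholder for the real file.
import sqlite3

db_path = "Run359688_SiPixelLAMCS_pcl.db"  # placeholder path

conn = sqlite3.connect(db_path)
cur = conn.cursor()
# A CMS conditions sqlite file keeps one row per tag / IOV / payload
# in these tables; an "empty" PCL output has no rows in any of them.
for table in ("TAG", "IOV", "PAYLOAD"):
    try:
        (count,) = cur.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
        print(f"{table}: {count} rows")
    except sqlite3.OperationalError:
        print(f"{table}: table not present")
conn.close()
```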

Could the experts ( @mmusich @tsusa @ferencek @mroguljic ) take a look and see whether meaningful feedback can be gathered from this? Thanks a lot!

@mmusich
Contributor

mmusich commented Feb 17, 2023

but unfortunately it's empty (I guess no condition was produced), so there is no upload to the DB

I think that's expected; normally this is supposed to be used in the multi-run harvester to give meaningful results (FPix acceptance in cosmics is tiny).

take a look and see whether meaningful feedback can be gathered from this? Thanks a lot!

given the minuscule amount of data actually processed, it looks OK from a quick look. Tanja might want to check in more detail.

@francescobrivio
Contributor Author

Thanks @mmusich!
Ok, I then believe that this replay proves the new PCL workflow does not break anything in Tier0, so we can consider it validated and put it in production (to be discussed at the next JointOps meeting).
In the meantime @tsusa, if you have any feedback/comments please let us know!

@tvami
Contributor

tvami commented Feb 22, 2023

@francescobrivio this can be closed now, right?

@francescobrivio
Contributor Author

Integrated in #4794

@tsusa

tsusa commented Feb 22, 2023

Thanks @mmusich! Ok, I then believe that this replay proves the new PCL workflow does not break anything in Tier0, so we can consider it validated and put it in production (to be discussed at the next JointOps meeting). In the meantime @tsusa, if you have any feedback/comments please let us know!

@francescobrivio, as Marco said the statistics were very small, but things look as expected.

@francescobrivio francescobrivio deleted the add_FPIX_LA_PCL branch March 14, 2024 07:54