Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(DAQ) File based protocol update #40099

Merged
merged 11 commits into from
Nov 25, 2022
Merged

Conversation

smorovic
Copy link
Contributor

PR description:

Two changes are implemented for the file-based output protocol:

Early "initemp" file marker:

  • A file marker is created in constructor (.initemp of .ini depending on the stream type and content availability at construction). From 12_6_X, as noticed in tests, beginRun (or globalBeginRun) in GlobalEvFOutputModule can be called after event processing in the source starts, so creating such markers at beginRun is no longer sufficient. Marker needs to be created early for hltd daemon to know which streams are in the run, and therefore wait for output completion until lumisection can be closed in the merging system. Standard INI file is still created with information from beginRun, except in case of DQMHistograms where it is empty, so creation was moved to constructor.

  • Changes are implemented in the GlobalEvFOutputModule (data streams), DQMFileSaverPB (DQMHistograms stream) and L1/HLTriggerJsonMonitoring (L1/HLTRates streams). Note: in combination with correspoding changes in hltd, this patch is required for CMSSW_12_6_X being used in the HLT environment, as collecting output doesn't work with current version of the sw.

Discard LS:

  • appearance of a file marker in hltd run directory will trigger discard of specific LS data locally in CMSSW. Motivation for this is freeing space in temporary ramdisks in HLT to allow data processing to be unblocked. With concurrent lumisections (N = 2 by default in HLT), LS potentially doesn't get closed until N new lumisections are queued, so a range of several lumisections is checked by the input source. In the output module, a new file marker check (this is a fast file operation done in ramdisk) is performed before each event is written. Discarding is implemented only for the data streams, since special streams are negligible in size.

PR validation:

Tested and verified in a small DAQ cluster providing environment (services) analogous to a production BU-FU setup.

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40099/33081

  • This PR adds an extra 68KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40099/33082

  • This PR adds an extra 68KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40099/33083

  • This PR adds an extra 68KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @smorovic (Srecko Morovic) for master.

It involves the following packages:

  • DQMServices/FileIO (dqm)
  • EventFilter/Utilities (daq)
  • HLTrigger/JSONMonitoring (hlt)

@Martin-Grunewald, @emanueleusai, @emeschi, @ahmad3213, @cmsbuild, @missirol, @jfernan2, @syuvivida, @pmandrik, @micsucmed, @smorovic, @rvenditti can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @missirol, @silviodonato, @barvic, @fwyzard this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@smorovic
Copy link
Contributor Author

please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-41a425/29167/summary.html
COMMIT: fb80c07
CMSSW: CMSSW_12_6_X_2022-11-21-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/40099/29167/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 10 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3417239
  • DQMHistoTests: Total failures: 12
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3417205
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 206 log files, 48 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@missirol
Copy link
Contributor

+hlt

@missirol
Copy link
Contributor

@cms-sw/dqm-l2, please prioritize the review of this PR. It is a bugfix that should get into 12_6_0_pre5, if possible.

@rvenditti
Copy link
Contributor

Hi, given that this is for the online, do you think it could be useful to do playback test on DQM machines? If yes, can we use 12_6 + this PR + DQM stream files from present data taking produced in 12_4 ?

@smorovic
Copy link
Contributor Author

Hi @rvenditti.
this touches only .ini file creation for the DQM modules and from what I can see, anelasticDQM.py (hltd script that handles output) doesn't do anything with those.
Also, do you set fakeFilterUnitMode to true for DQM? In that case none of the changed code is used anyway.
IMHO, DQM test is not needed, and especially in case that parameter is true.

For what concerns HLT use of the module, I made a test including DQMFileSaverPB module (with the latest version of this PR).

In case you will try it, I am not sure how 12_4_X streamer files will behave with 12_6. Streamer format hasn't been changed (as far as I know), but with ROOT and DataFormats there could be differences.

@pmandrik
Copy link
Contributor

Hi, @rvenditti , yes, clients load DQM.Integration.config.environment_cfi, e.g.

process.load("DQM.Integration.config.environment_cfi")

where
dqmSaverPB.fakeFilterUnitMode = True

@rvenditti
Copy link
Contributor

Thanks @smorovic and @pmandrik , so we will not do the playback test and proceed with the usual PR review.

@missirol
Copy link
Contributor

@cms-sw/dqm-l2 , please let us know if you can sign this PR soon.

@cms-sw/orp-l2, I'm hoping this can still make it in 12_6_0_pre5, as discussed yesterday.

@rvenditti
Copy link
Contributor

rvenditti commented Nov 24, 2022

+dqm

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@missirol
Copy link
Contributor

missirol commented Nov 25, 2022

This PR was not included in 12_6_0_pre5. Since I asked for the backport to 12_6_X, I opened it in #40156.

@smorovic, you could remove "(12_6_X)" from the title of this PR, just for clarity.

Edit: done by Andrea, thanks.

@perrotta perrotta changed the title (DAQ) File based protocol update (12_6_X) (DAQ) File based protocol update Nov 25, 2022
@perrotta
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 168c9f8 into cms-sw:master Nov 25, 2022
@smorovic smorovic deleted the 12_6_X-discardls branch August 24, 2023 08:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants