Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Pixel DQM modules prohibiting concurrent lumis #32805

Merged
merged 1 commit into from
Feb 4, 2021

Conversation

sroychow
Copy link
Contributor

@sroychow sroychow commented Feb 3, 2021

PR description:

In this PR, SiPixelRawDataErrorSource & SiPixelDigiSource modules are updated to allow concurrent LS.

PR validation:

Test with : runTheMatrix.py -t 4 -l 4.53

No differences observed when comparing the harvested DQM output files with dqmMemoryStats.py script.

if this PR is a backport please specify the original PR and why you need to backport that PR:

No backport is needed.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 3, 2021

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-32805/20981

  • This PR adds an extra 24KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 3, 2021

A new Pull Request was created by @sroychow (Suvankar Roy Chowdhury) for master.

It involves the following packages:

DQM/SiPixelMonitorDigi
DQM/SiPixelMonitorRawData

@andrius-k, @kmaeshima, @ErnestaP, @ahmad3213, @cmsbuild, @jfernan2, @rvenditti can you please review it and eventually sign? Thanks.
@arossi83, @hdelanno, @sroychow, @fioriNTU, @jandrea, @idebruyn, @threus this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@jfernan2
Copy link
Contributor

jfernan2 commented Feb 3, 2021

This PR solves issue mentioned in comment #25090 (comment)
Modules prohibiting concurrent lumis: SiPixelRawDataErrorSource & SiPixelDigiSource

@jfernan2
Copy link
Contributor

jfernan2 commented Feb 3, 2021

please test

@mmusich
Copy link
Contributor

mmusich commented Feb 3, 2021

Modules prohibiting concurrent lumis: SiPixelRawDataErrorSource & SiPixelDigiSource

just to clarify futher: #25090 (comment) also mentions some L1T modules.
As far as I can tell, these are still blocking

@jfernan2
Copy link
Contributor

jfernan2 commented Feb 3, 2021

Yes, L1T conveners and DQM contacts have been notified too on 16/12/2020 but no answer yet.
FYI: @abrinke1

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 3, 2021

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2a8f50/12688/summary.html
COMMIT: e97c3d4
CMSSW: CMSSW_11_3_X_2021-02-03-0800/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/32805/12688/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 37
  • DQMHistoTests: Total histograms compared: 2752926
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2752901
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 36 files compared)
  • Checked 156 log files, 37 edm output root files, 37 DQM output files

@jfernan2
Copy link
Contributor

jfernan2 commented Feb 4, 2021

+1
There are changes in wf 11624.911 in L1TEMU folder, but unrelated to this PR
https://tinyurl.com/y6dfprgu

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 4, 2021

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@qliphy
Copy link
Contributor

qliphy commented Feb 4, 2021

+1

@cmsbuild cmsbuild merged commit 9f05175 into cms-sw:master Feb 4, 2021
Copy link
Contributor

@mmusich mmusich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sroychow, thanks for this.
Sorry for late feedback... if we ever touch again this module, few stylistic suggestions

#include "DQM/SiPixelMonitorDigi/interface/SiPixelDigiModule.h"

#include <DQMServices/Core/interface/DQMOneEDAnalyzer.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being picky here, I think <> are reserved to system includes and cmssw includes should use ""

int thisls = lb.id().luminosityBlock();
const bool resetCounters = luminosityBlockCache(lb.index());

float averageBPIXFed = float(nBPIXDigis) / 32.;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future memory and readability 32 and 8 can be made named constants.

float averageFPIXFed = float(nFPIXDigis) / 8.;

if (averageDigiOccupancy) {
for (int i = 0; i != 40; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This 40 can be taken from here

MAXSiPixelFEDID = 40, // increase from 39 for the pilot blade fed

@makortel
Copy link
Contributor

makortel commented Feb 4, 2021

Thanks for the update, but I'm afraid the changes here are not sufficient for the intended behavior in the presence of concurrent lumis (assuming I understood the code correctly).

If I understood correctly, the aim is to (have an ability to) reset some counters at every N lumis. In this PR the decision of whether to reset is done in globalBeginLumi, and the actual reset in globalEndLumi. This approach can lead to unintended behavior, e.g. with the following sequence of lumis and events (for the sake of simplicity let's say N=2)

lumi 1 begin
lumi 1 event 1
lumi 2 begin    # mark that counters are reset at the end of lumi 3
lumi 1 event 2
lumi 2 event 1
lumi 2 end      # reset counters, fill histograms
lumi 3 begin
lumi 1 event 3  # counter increments get accounted for the same group as lumi 3
lumi 3 event 1
lumi 1 end
lumi 3 event 2
...

One way to do the counting properly would be to have a separate struct for the counters

struct Counters {
  int nBPIXDigis = 0;
  ...
};

and use that as the lumi cache, but sharing one instance of it between lumis along

class SiPixelDigiSource ... {
  std::shared_ptr<Counters> latestCounters_;
};

std::shared_ptr<Counters> SiPixelDigiSource::globalBeginLuminosityBlock(...) {
  unsigned int currentLS = lumi.id().luminosityBlock();
  if (currentLS % 10 == 0) {
    // reset counters by making a new object
    latestCounters_ = std::make_shared<Counters>();
  }
  return latestCounters_;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants