EventSetup Records with large payloads #33436

Closed
makortel opened this issue Apr 15, 2021 · 18 comments

Comments

@makortel
Contributor

Enabling concurrent IOVs risks increasing memory usage, because the payloads for all active IOVs need to be kept in memory as long as events from those IOVs are being processed. One way to limit this memory increase is to disable concurrency for EventSetup Records that have large payloads (and hopefully have long IOVs). The purpose of this issue is to identify such Records.

@cmsbuild
Contributor

A new Issue was created by @makortel Matti Kortelainen.

@Dr15Jones, @dpiparo, @silviodonato, @smuzaffar, @makortel, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel
Contributor Author

assign core, alca

@cmsbuild
Contributor

New categories assigned: core,alca

@Dr15Jones,@smuzaffar,@christopheralanwest,@tlampen,@pohsun,@yuanchao,@makortel,@francescobrivio,@malbouis you have been requested to review this Pull request/Issue and eventually sign? Thanks

@makortel
Contributor Author

@cms-sw/alca-l2 I can quickly think of

  • IdealGeometryRecord
  • SiPixelTemplateDBObjectRcd

as Records that can have large payloads. Could you comment on whether this is correct, and on what other such Records we have? (Let's say "large" is more than 10 MB.)

@christopheralanwest
Contributor

assign db

I think that @ggovi is probably the best person to answer this question.

@cmsbuild
Contributor

New categories assigned: db

@ggovi you have been requested to review this Pull request/Issue and eventually sign? Thanks

@mmusich
Contributor

mmusich commented Apr 15, 2021

@makortel

can quickly think of

IdealGeometryRecord
SiPixelTemplateDBObjectRcd

These are not even close to the absolute largest, which is the per-pixel gain calibration used for offline reconstruction.
Here is a list of the worst offenders (5MB and above) extracted from the last open IOV of all tags in Prompt Reco:

  239M  SiPixelGainCalibrationOfflineRcd,-
   46M  EcalPulseCovariancesRcd,-
   35M  SiPixel2DTemplateDBObjectRcd,numerator
   22M  SiStripPedestalsRcd,-
   21M  DQMReferenceHistogramRootFileRcd,-
   20M  SiStripNoisesRcd,-
   20M  GBRWrapperRcd,PFGCorrectionBar
   20M  GBRWrapperRcd,PFGCorrectionEndHighR9
   20M  GBRWrapperRcd,PFGCorrectionEndLowR9
   20M  GBRWrapperRcd,PFEcalResolution
   13M  GBRWrapperRcd,PFLCCorrection
  9.8M  GBRWrapperRcd,PFLCorrectionBar
  9.8M  GBRWrapperRcd,PFLCorrectionEnd
  9.3M  CSCDBNoiseMatrixRcd,-
  7.3M  GBRWrapperRcd,wgbrph_EBCorrection
  6.7M  GBRDWrapperRcd,gedphoton_EECorrection_50ns
  6.1M  L1MuCSCPtLutRcd,-
  6.0M  SiPixelGainCalibrationForHLTRcd,-
  5.9M  GBRWrapperRcd,wgbrph_EBUncertainty
  5.8M  IdealGeometryRecord,-
  5.6M  GBRDWrapperRcd,gedphoton_EECorrection_25ns
  5.3M  GBRWrapperRcd,PFResolution
  5.3M  GBRWrapperRcd,PFGlobalCorrection

you can find the complete list here:
https://gist.github.com/mmusich/be5cfc4208f7146a333830f11d0a423e
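To apply a size threshold to a listing in this format, something along the following lines could work. This is only a minimal sketch, not the tool used to produce the list: the helper names `parse_listing` and `records_above` are hypothetical, only a few of the entries above are copied in, and the unit suffixes are assumed to be binary multiples (as in `du -h` style output).

```python
# A few entries copied from the listing above, for illustration only.
LISTING = """\
  239M  SiPixelGainCalibrationOfflineRcd,-
   46M  EcalPulseCovariancesRcd,-
   35M  SiPixel2DTemplateDBObjectRcd,numerator
  9.8M  GBRWrapperRcd,PFLCorrectionBar
  5.8M  IdealGeometryRecord,-
"""

# Assumption: suffixes are binary multiples, expressed here in MB.
UNIT_TO_MB = {"K": 1.0 / 1024, "M": 1.0, "G": 1024.0}


def parse_listing(text):
    """Return (record, label, size_mb) tuples from 'SIZE  Record,label' lines."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        size, key = line.split()
        record, label = key.split(",", 1)
        size_mb = float(size[:-1]) * UNIT_TO_MB[size[-1]]
        entries.append((record, label, size_mb))
    return entries


def records_above(entries, threshold_mb):
    """Return the sorted, de-duplicated Record names with a payload above the threshold."""
    return sorted({record for record, _, size_mb in entries if size_mb > threshold_mb})


large = records_above(parse_listing(LISTING), threshold_mb=10.0)
print(large)
```

With the 10 MB threshold suggested earlier in the thread, only the first three entries of this excerpt pass the cut.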

@ggovi
Contributor

ggovi commented Apr 15, 2021

Thanks Marco for this prompt answer. We then need to identify the threshold. How will the exclusion list be implemented? Hard-coded or configurable?

@ggovi
Contributor

ggovi commented Apr 15, 2021

Another possibility is to avoid keeping the payloads in memory at all, given that they are all cached permanently in Frontier...

@makortel
Contributor Author

makortel commented Apr 16, 2021

Here is a list of the worst offenders (5MB and above) extracted from the last open IOV of all tags in Prompt Reco:
...
you can find the complete list here:
https://gist.github.com/mmusich/be5cfc4208f7146a333830f11d0a423e

Thanks @mmusich!

Does the list contain only the payloads in the CondDB? The EventSetup products created within CMSSW contribute to the memory requirement too. Does anyone have a hunch about those, or do they have to be found with a profiler?

@makortel
Contributor Author

How will the exclusion list be implemented? Hard-coded or configurable?

The simplest way is to disable the concurrent IOV support for the relevant Records in the C++ code, along the lines of

// Setting allowConcurrentIOVs_ to false disables concurrent IOVs for this Record
class Dummy2Record : public edm::eventsetup::EventSetupRecordImplementation<Dummy2Record> {
public:
  static constexpr bool allowConcurrentIOVs_ = false;
};

The level of concurrency can also be set in the configuration per Record (for those Records for which concurrency is not disabled), along the lines of

process.options.eventSetup = cms.untracked.PSet(
    numberOfConcurrentIOVs = cms.untracked.uint32(2), # default concurrency
    forceNumberOfConcurrentIOVs = cms.untracked.PSet(
        SiPixelGainCalibrationOfflineRcd = cms.untracked.uint32(1),
        EcalPulseCovariancesRcd = cms.untracked.uint32(1),
        ...
    )
)

I believe hardcoding is good enough to get most of the threading-efficiency benefits (also, I can't think of a natural place for a configuration that would automatically propagate to all applications).
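If the configurable route were taken instead, the per-Record overrides could in principle be derived from a table of payload sizes rather than typed by hand. The following is a minimal sketch under that assumption: `force_serial_overrides` is a hypothetical helper, the sizes are a few values copied from the list earlier in the thread, and the output is a plain dict plus printed lines in the style of the configuration snippet above, not a real `cms.untracked.PSet`.

```python
def force_serial_overrides(sizes_mb, threshold_mb=10.0):
    """Map every Record whose payload exceeds the threshold to a concurrency of 1."""
    return {
        record: 1
        for record, size in sorted(sizes_mb.items())
        if size > threshold_mb
    }


# Payload sizes in MB, copied from the listing earlier in the thread.
sizes_mb = {
    "SiPixelGainCalibrationOfflineRcd": 239.0,
    "EcalPulseCovariancesRcd": 46.0,
    "IdealGeometryRecord": 5.8,
}

overrides = force_serial_overrides(sizes_mb)

# Print entries in the style of the forceNumberOfConcurrentIOVs PSet body.
for record, n_iovs in overrides.items():
    print(f"{record} = cms.untracked.uint32({n_iovs}),")
```

The printed lines would still have to be pasted into (or otherwise fed to) the actual configuration, which is where the propagation problem mentioned above comes back.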

@makortel
Contributor Author

Another possibility is to avoid keeping the payloads in memory at all, given that they are all cached permanently in Frontier...

I probably misunderstood, but I believe requesting the payloads from Frontier for each event would (significantly) decrease the event processing throughput.

@mmusich
Contributor

mmusich commented Apr 16, 2021

Does the list contain only the payloads in the CondDB?

correct

The EventSetup products created within CMSSW contribute to the memory requirement too. Does anyone have a hunch about those, or do they have to be found with a profiler?

I am wondering if it would be possible to get the data for the (non-persisted) records by modifying this?
Otherwise yes, I think it needs to be profiled.

@makortel
Contributor Author

If I got it right (from condDbBrowser), the largest payload with a non-Run IOV is EcalPedestalsRcd (with time IOV), with a size of 2.3 MB. So the maximum possible memory increase is not necessarily that large in practice (currently the framework synchronizes at Run boundaries anyway, and AFAIK we don't really have jobs processing multiple Runs).

Actually, how easy would it be to get a list of tags that have non-run IOVs?

@mmusich
Contributor

mmusich commented Apr 16, 2021

Actually, how easy would it be to get a list of tags that have non-run IOVs?

Straightforward.
These are the only records in Prompt Reco with non-Run IOVs:

| Record | Label | Tag | Time Type | Synchronization |
| ------ | ----- | --- | --------- | --------------- |
| BeamSpotObjectsRcd | - | BeamSpotObjects_PCL_byLumi_v0_prompt | Lumi | pcl |
| DTHVStatusRcd | - | DTHVStatus_V05_hlt | Time | express |
| DTKeyedConfigContainerRcd | - | DTKeyedConfig_V06_hlt | Hash | hlt |
| EcalLaserAPDPNRatiosRcd | - | EcalLaserAPDPNRatios_prompt_v2 | Time | pcl |
| LHCInfoRcd | - | LHCInfoEndFill_prompt_v2 | Time | prompt |
| LumiCorrectionsRcd | - | LumiPCC_Corrections_prompt | Lumi | pcl |
| SiPixelQualityFromDbRcd | - | SiPixelQuality_byPCL_prompt_v2 | Lumi | pcl |
| SiStripDetVOffRcd | - | SiStripDetVOff_v6_prompt | Time | prompt |

For the record, one can get it with:

import CondCore.Utilities.conddblib as conddb

# Connect to the production conditions database
con = conddb.connect(url = conddb.make_url("pro"))
session = con.session()
IOV     = session.get_dbtype(conddb.IOV)
TAG     = session.get_dbtype(conddb.Tag)
GT      = session.get_dbtype(conddb.GlobalTag)
GTMAP   = session.get_dbtype(conddb.GlobalTagMap)
RUNINFO = session.get_dbtype(conddb.RunInfo)

# All (record, label, tag) entries of the Global Tag, ordered by record and label
GTMap = session.query(GTMAP.record, GTMAP.label, GTMAP.tag_name).\
        filter(GTMAP.global_tag_name == "112X_dataRun3_Prompt_v5").\
        order_by(GTMAP.record, GTMAP.label).\
        all()

print("| Record | Label | Tag | Time Type | Synchronization |")
print("| ------ | ----- | --- | --------- | --------------- |")
for element in GTMap:
    Record = element[0]
    Label  = element[1]
    Tag    = element[2]

    # TagInfo = (synchronization, time_type) of the tag
    TagInfo = session.query(TAG.synchronization, TAG.time_type).filter(TAG.name == Tag).all()[0]
    # Keep only the tags whose IOVs are not Run-based
    if TagInfo[1] != "Run":
        print("|", Record, "|", Label, "|", Tag, "|", TagInfo[1], "|", TagInfo[0], "|")

@makortel
Contributor Author

Thanks @mmusich! Correlating those to your earlier list gives

  • BeamSpotObjectsRcd: 64 kB
  • DTHVStatusRcd: 176 kB
  • DTKeyedConfigContainerRcd: 64 kB
  • EcalLaserAPDPNRatiosRcd: 1.2 MB
  • LHCInfoRcd: 100 kB
  • LumiCorrectionsRcd: 64 kB
  • SiPixelQualityFromDbRcd: 64 kB
  • SiStripDetVOffRcd: 136 kB

so ~1.9 MB in total. That alone sounds like something I'd expect us to be able to live with (i.e. at most a 2 MB memory increase per job during any IOV transition period). This number still misses all the ESProducts constructed within the job, but I'd imagine even a factor of 10 increase to be tolerable.
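The arithmetic behind that estimate can be checked with a short sketch. The function `extra_memory_mb` and its parameters are hypothetical, and the model is deliberately pessimistic: it assumes that every listed Record holds one payload per concurrent IOV at the same moment, that 1 MB = 1000 kB, and that transient ESProducts can be folded in as a simple multiplicative factor.

```python
# Payload sizes in kB for the Records with non-Run IOVs, as listed above.
payload_kb = {
    "BeamSpotObjectsRcd": 64,
    "DTHVStatusRcd": 176,
    "DTKeyedConfigContainerRcd": 64,
    "EcalLaserAPDPNRatiosRcd": 1200,  # 1.2 MB, taking 1 MB = 1000 kB
    "LHCInfoRcd": 100,
    "LumiCorrectionsRcd": 64,
    "SiPixelQualityFromDbRcd": 64,
    "SiStripDetVOffRcd": 136,
}


def extra_memory_mb(payload_kb, concurrent_iovs=2, transient_factor=1.0):
    """Worst-case extra memory in MB compared to keeping a single IOV in memory.

    Each additional concurrent IOV is assumed to add one full copy of every
    payload; transient_factor scales the result to account for ESProducts
    built inside the job (an assumption, not a measurement).
    """
    total_mb = sum(payload_kb.values()) / 1000.0
    return (concurrent_iovs - 1) * total_mb * transient_factor


print(f"DB payloads only: {extra_memory_mb(payload_kb):.2f} MB")
print(f"with a factor of 10: {extra_memory_mb(payload_kb, transient_factor=10.0):.1f} MB")
```

With two concurrent IOVs this gives just under 2 MB for the DB payloads alone, and just under 20 MB if the transient products are taken to be ten times larger.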

@makortel
Contributor Author

Given that the largest possible increase from DB payloads would be around 2 MB, and that the transient ESProducts in non-Run IOV records are unlikely to be (many) orders of magnitude larger, we could enable concurrent IOVs by default (when concurrent lumis are enabled), and deal with possible problems if they arise.

@makortel
Contributor Author

+1

makortel added a commit to makortel/cmssw that referenced this issue Mar 30, 2022
I was supposed to do this at the same time as
cms-sw#35302
that followed
cms-sw#34231
and the conclusion in
cms-sw#33436