Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AlcaBeamMonitor hoards memory #42995

Open
mmusich opened this issue Oct 11, 2023 · 13 comments
Open

AlcaBeamMonitor hoards memory #42995

mmusich opened this issue Oct 11, 2023 · 13 comments

Comments

@mmusich
Copy link
Contributor

mmusich commented Oct 11, 2023

Dumping in a gitHub issue the content of an old Hypernews message (dating back from 2017) and some private conversations:

One noticeable memory "hoarder" is the AlcaBeamMonitor: it collects a copy of all vertices and all selected tracks (in a vector of BSTrkParameters) for the full lumisection. In the 2K events in this comparison this module accumulated ~60MB of this.
The values get reset at the end of lumi after being used in the fit.
I suppose, with larger pileups and longer lumis we could get around 0.2-0.4 GB from this module, unless it starts running on rate-controlled paths.

A suggested mitigation from @slava77 was:

Perhaps a simple mitigation could be to save BSTrkParameters in float; that would be a x2 saving for free (considering that the underlying tracks are floats). Beyond that, some more developments may be needed.

but:

track vxyz,pxyz are floats only in storage, while in memory it's a double; the covariance is a float though. So, technically some fine precision can be lost in BSTrkParameters. It's worth to test if that matters though.

Cc: @gennai @francescobrivio @lguzzi @dzuolo

@cmsbuild
Copy link
Contributor

A new Issue was created by @mmusich Marco Musich.

@smuzaffar, @Dr15Jones, @makortel, @sextonkennedy, @rappoccio, @antoniovilela can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel
Copy link
Contributor

assign dqm,db

@cmsbuild
Copy link
Contributor

New categories assigned: dqm,db

@rvenditti,@syuvivida,@tjavaid,@nothingface0,@antoniovagnerini,@francescobrivio,@saumyaphor4252,@perrotta,@consuegs you have been requested to review this Pull request/Issue and eventually sign? Thanks

@slava77
Copy link
Contributor

slava77 commented Oct 12, 2023

type tracking

@makortel
Copy link
Contributor

The memory hoarding in AlcaBeamMonitor came up in #46040 (comment) (with ~200k events) and #46040 (comment). The hoarding appears to be ~58 kB/event.

Can this be improved for 2025?

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 17, 2024

cms-bot internal usage

@makortel
Copy link
Contributor

type performance-improvements

@francescobrivio
Copy link
Contributor

I had totally forgotten about this, I'm having a look now.

Following the original suggestion

Perhaps a simple mitigation could be to save BSTrkParameters in float; that would be a x2 saving for free

I could try to replace with floats the BSTrkParameters values which are filled here:

BSTrkParameters BSTrk(fz0, fsigmaz0, fd0, fsigmad0, fphi0, fpt, 0., 0.);
BSTrk.setVx(fvx);
BSTrk.setVy(fvy);

with track quantities:
fpt = track->pt();
feta = track->eta();
fphi0 = track->phi();
fcharge = track->charge();
fnormchi2 = track->normalizedChi2();
fd0 = track->d0();
if (refBS)
fd0bs = -1 * track->dxy(refBS->position());
else
fd0bs = 0.;
fsigmad0 = track->d0Error();
fz0 = track->dz();
fsigmaz0 = track->dzError();
ftheta = track->theta();
fvx = track->vx();
fvy = track->vy();

What I don't understand is:

(considering that the underlying tracks are floats)

since from https://github.com/cms-sw/cmssw/blob/master/DataFormats/TrackReco/interface/TrackBase.h, it seems to me that actually all track methods are returning double...what am I missing?

@mmusich
Copy link
Contributor Author

mmusich commented Oct 18, 2024

what am I missing?

Quoting:

track vxyz,pxyz are floats only in storage, while in memory it's a double; the covariance is a float though.

see member data type of the TrackBase class:

private:
/// hit pattern
HitPattern hitPattern_;
/// perigee 5x5 covariance matrix
float covariance_[covarianceSize];
/// errors for time and velocity (separate from cov for now)
float covt0t0_, covbetabeta_;
/// chi-squared
float chi2_;
/// innermost (reference) point on track
Point vertex_;
/// time at the reference point on track
float t0_;
/// momentum vector at innermost point
Vector momentum_;
/// norm of the particle velocity at innermost point on track
/// can multiply by momentum_.Unit() to get velocity vector
float beta_;
/// algo mask, bit set for the algo where it was reconstructed + each algo a track was found overlapping by the listmerger
std::bitset<algoSize> algoMask_;
/// number of degrees of freedom
float ndof_;
/// electric charge
char charge_;
/// track algorithm
uint8_t algorithm_;
/// track algorithm
uint8_t originalAlgorithm_;
/// track quality
uint8_t quality_;
/// number of loops made during the building of the trajectory of a looper particle
// I use signed char because I don't expect more than 128 loops and I could use a negative value for a special purpose.
signed char nLoops_;
/// Stop Reason
uint8_t stopReason_;
};

@makortel
Copy link
Contributor

From the profile in #46040 (comment), specifically https://mkortela.web.cern.ch/mkortela/cgi-bin/navigator/issue46040/test_17.1000_live/390 shows that of the 57.5 MB over 1000 events

  • 20.0 MB is in void std::vector<BSTrkParameters> (link)
  • 29.5 is in std::vector<std::vector<reco::Vertex>> (link)

So while halving the size of BSTrkParameters would be an improvement, it would not be very significant (on the order of 17 %).

The container of reco::Vertex seems to be this one

std::vector<reco::VertexCollection> vertices_;

filled here
//------ Primary Vertices
Handle<VertexCollection> PVCollection;
if (iEvent.getByToken(primaryVertexLabel_, PVCollection)) {
beamSpotInfo->vertices_.push_back(*PVCollection.product());
}

and used apparently only in
for (vector<VertexCollection>::iterator itPV = beamSpotInfo->vertices_.begin();
itPV != beamSpotInfo->vertices_.end();
itPV++) {
if (!itPV->empty()) {
for (VertexCollection::const_iterator pv = itPV->begin(); pv != itPV->end(); pv++) {
if (pv->isFake() || pv->tracksSize() < 10)
continue;
if (*itV == "x") {
vertexResults.push_back(pair<double, double>(pv->x(), pv->xError()));
} else if (*itV == "y") {
vertexResults.push_back(pair<double, double>(pv->y(), pv->yError()));
} else if (*itV == "z") {
vertexResults.push_back(pair<double, double>(pv->z(), pv->zError()));
} else if (*itV != "sigmaX" && *itV != "sigmaY" && *itV != "sigmaZ") {
LogInfo("AlcaBeamMonitor") << "The histosMap_ has been built with the name " << *itV
<< " that I can't recognize!";
//assert(0);
}
}
}
}

The latter loop seems to really need only x,y,z and their uncertainties for a subset of the vertices, how about moving that extraction to analyze() and storing only those?

@mmusich
Copy link
Contributor Author

mmusich commented Oct 18, 2024

The latter loop seems to really need only x,y,z and their uncertainties for a subset of the vertices, how about moving that extraction to analyze() and storing only those?

would something like mmusich@6b1946c work?

@makortel
Copy link
Contributor

The latter loop seems to really need only x,y,z and their uncertainties for a subset of the vertices, how about moving that extraction to analyze() and storing only those?

would something like mmusich@6b1946c work?

Looks pretty much along what I had in mind, thanks!

@mmusich
Copy link
Contributor Author

mmusich commented Oct 18, 2024

Looks pretty much along what I had in mind, thanks!

I followed up at #46451. I guess looking further in the BSTrkParameters suggestion will lead to further gains.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants