Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dataformat compatibility issue for HI SiStrip clusters in RAW #39106

Closed
jpata opened this issue Aug 19, 2022 · 10 comments
Closed

dataformat compatibility issue for HI SiStrip clusters in RAW #39106

jpata opened this issue Aug 19, 2022 · 10 comments

Comments

@jpata
Copy link
Contributor

jpata commented Aug 19, 2022

@makortel pointed out in an email thread that edmNew::DetSetVector<SiStripApproximateCluster> introduced here
https://github.com/cms-sw/cmssw/pull/33546/files#diff-f5eea8df4ad3215e1d77c11f1e6dbbdb8a6bcef9d8f05516d0d37a898df47b6fR30 for HI datataking cannot necessarily be guaranteed to be compatible in the future due to the complexity of the edmNew::DetSetVector structure.

I'm concerned that if we store the edmNew::DetSetVector<...> and expect that to be readable in all future CMSSW releases, we could cause ourselves difficult problems in the future (like inability to evolve edmNew::DetSetVector if that would become necessary, or a significant overhaul of our data storage layer).

We consider classes holding basic data types, and std::vector's of those to be simple enough to for infinite backwards compatibility guarantee. Currently the most complex data format stored in RAW (from L1T) contains effectively vector<vector<vector>> (but the simpler the better).

I think a specific data format class that resembles the edmNew::DetSetVector containing e.g.

std::vector<std::pair<unsigned, unsigned>>; // first element for DetID, second element for index to first element of that DetID in the vector below
std::vector<SiStripApproximateCluster>; // actual objects stored in linear fashion

should be

  • simple-enough for infinite backwards compatibility
  • straightforward to convert from edmNew::DetSetVector
  • straightforward to convert to edmNew::DetSetVector (if such data format would be needed for downstream modules in reconstruction)

This is a required feature, and will need to be corrected before datataking. Due to the amount of changes, looks like it will have to land some time after the last open of 12_5 (August 24) and the final release date, which we realize is far from ideal, but it seems there is no other option.
At this point, as far as I understand, it will be a technical change (i.e. no physics changes).

I'm creating this issue to notify @cms-sw/orp-l2 about this upcoming post-deadline change.

cc @cms-sw/core-l2 @mandrenguyen @icali

@jpata
Copy link
Contributor Author

jpata commented Aug 19, 2022

assign reconstruction

@cmsbuild
Copy link
Contributor

New categories assigned: reconstruction

@jpata,@clacaputo,@mandrenguyen you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Copy link
Contributor

A new Issue was created by @jpata Joosep Pata.

@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@jpata jpata changed the title dataformat backwards compatibility issue for HI SiStrip clusters in RAW dataformat forward compatibility issue for HI SiStrip clusters in RAW Aug 24, 2022
@jpata
Copy link
Contributor Author

jpata commented Aug 24, 2022

EDIT: I realized we are talking here about infinite forward compatibility (not backwards compatiblity) if I'm not mistaken, so I edited the title accordingly.

@davidlange6
Copy link
Contributor

davidlange6 commented Aug 24, 2022 via email

@jpata
Copy link
Contributor Author

jpata commented Aug 24, 2022

ah.. I may be confused.

"Backward compatibility is a design that is compatible with previous versions of itself. Forward compatibility is a design that is compatible with future versions of itself."

Now we are trying to make sure that RAW we generate today will be readable in the future if something changes in upcoming CMSSW EDM, if I'm not mistaken.

Quoting Matti:

More specifically, is the approximated SiStrip cluster collection expected to have the same backwards-compatibility guarantees that we have for RAW data itself (i.e. it would need to be readable by all future CMSSW releases)?

@makortel
Copy link
Contributor

Now we are trying to make sure that RAW we generate today will be readable in the future if something changes in upcoming CMSSW EDM, if I'm not mistaken.

Correct. Across major versions of CMSSW the backwards compatibility is guaranteed only for RAW data. In practice the situation tends to be much better and all data tiers (event contents?) are backwards compatible most of the time, but this is achieved only as "reasonable effort" basis.

I think we don't formally guarantee forward compatibility, even if it was utilized in the UL MC campaigns to run the HLT step in an earlier major release than the rest of the steps (so the HLT step had to read RAW from a file produced by a later major release). Such an ability also speaks for keeping all data format classes in RAW simple and independent of the rest.

@jpata jpata changed the title dataformat forward compatibility issue for HI SiStrip clusters in RAW dataformat compatibility issue for HI SiStrip clusters in RAW Aug 24, 2022
@mandrenguyen
Copy link
Contributor

+1
Addressed by #42545

@cmsbuild
Copy link
Contributor

cmsbuild commented Dec 1, 2023

This issue is fully signed and ready to be closed.

@makortel
Copy link
Contributor

makortel commented Dec 1, 2023

@cmsbuild, please close

@cmsbuild cmsbuild closed this as completed Dec 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants