Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[UBSAN][L1] Fix read out of bounds issue #46372

Closed
wants to merge 1 commit into from

Conversation

smuzaffar
Copy link
Contributor

@smuzaffar smuzaffar commented Oct 14, 2024

This should fix the UBSAN error like [a]. There are few issues with this code

I am opening this PR but would like @cms-sw/l1-l2 to review this code and may be propose a batter change

[a] https://cmssdt.cern.ch/SDT/cgi-bin/buildlogs/raw/el8_amd64_gcc12/CMSSW_14_2_UBSAN_X_2024-10-11-2300/pyRelValMatrixLogs/run/23634.0_TTbar_14TeV+2026D95/step2_TTbar_14TeV+2026D95.log

src/L1Trigger/L1CaloTrigger/interface/Phase2L1RCT.h:1257:24: runtime error: index 25 out of bounds for type 'ap_uint [24]'
    #0 0x14e16aba5450 in p2eg::getBremsValuesNeg(p2eg::crystal (*) [20], ap_uint<5>, ap_uint<5>) src/L1Trigger/L1CaloTrigger/interface/Phase2L1RCT.h:1257
    #1 0x14e16abedbad in p2eg::getClusterFromRegion3x4(p2eg::crystal (*) [20]) src/L1Trigger/L1CaloTrigger/interface/Phase2L1RCT.h:1441
    #2 0x14e16aa9cd7c in Phase2L1CaloEGammaEmulator::produce(edm::Event&, edm::EventSetup const&) src/L1Trigger/L1CaloTrigger/plugins/Phase2L1CaloEGammaEmulator.cc:337

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 14, 2024

cms-bot internal usage

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-46372/42213

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @smuzaffar for master.

It involves the following packages:

  • L1Trigger/L1CaloTrigger (l1, upgrade)

@Moanwar, @aloeliger, @cmsbuild, @epalencia, @srimanob, @subirsarkar can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @missirol, @mmusich this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@smuzaffar
Copy link
Contributor Author

please test

@smuzaffar
Copy link
Contributor Author

test parameters:

  • workflow = 23634.0

@smuzaffar
Copy link
Contributor Author

please test for CMSSW_14_2_UBSAN_X

@cmsbuild
Copy link
Contributor

+1

Size: This PR adds an extra 24KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-08e216/42161/summary.html
COMMIT: 5fa788a
CMSSW: CMSSW_14_2_X_2024-10-14-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/46372/42161/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

@cmsbuild
Copy link
Contributor

@smuzaffar
Copy link
Contributor Author

@cms-sw/l1-l2 , can you please review this PR ?

@aloeliger
Copy link
Contributor

Some of this goes beyond my ability to comment on. @skkwan Could you please take a look at this and comment on these changes?

@skkwan
Copy link
Contributor

skkwan commented Oct 16, 2024

Hi @smuzaffar, thanks for raising this issue and thanks Andrew for the tag. The out-of-bounds error can be resolved by increasing the temp array size and I'm happy to open a PR with just a few changes to fix it, if that's OK with you and @aloeliger. The reasoning for the +7 index in temp[i + 1][j + 7] comes from the window size for the bremsstrahlung correction; I made a few schematics to illustrate why it needs to be +7 and not +4. [1]

Regarding the for loops, our reasoning is to keep the emulator as close as possible to the firmware, because it also helps us debug the performance in the future. There are multiple places where the emulator has suboptimal usage of C++ resources, but it accurately reflects the firmware code which meets timing and resource constraints. Sascha Savin ([email protected]) can also elaborate on how the firmware will evolve if it's helpful. Thanks!

[1] https://www.dropbox.com/scl/fi/epzhvcon2ohd7ouvj56bw/2024-10-15-Indexing.pdf?rlkey=d6szhm3wdv8afdiwtht13git3&dl=0

@smuzaffar
Copy link
Contributor Author

thanks @skkwan for looking in to it. Feel free to open a PR to apply the changes you feel necessary to avoid the out of bound issue

@skkwan
Copy link
Contributor

skkwan commented Oct 22, 2024

Hi @smuzaffar, @aloeliger, I've opened PR #46472 regarding the out-of-bounds. I've checked that it doesn't affect the original outputs, but please let me know if this solves the error in your tests. Thanks.

@smuzaffar
Copy link
Contributor Author

closing this in favor of #46472

@smuzaffar smuzaffar closed this Oct 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants