-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use file size in file offset -> virtual offset translation #875
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
We were seeing an issue where normalization + file offset symbolization was reporting an incorrect symbolization result for certain addresses in a certain binary. In short, in a binary with segments: > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align > LOAD 0x949be40 0x000000000969be40 0x000000000969be40 0x11d6c930 0x11d6c930 R E 0x200000 > LOAD 0x1b208780 0x000000001b409780 0x000000001b409780 0x01b5e8 0x01c1f0 RW 0x1000 > LOAD 0x1b224980 0x000000001b426980 0x000000001b426980 0x343888 0xca594c RW 0x1000 > LOAD 0x1b600000 0x000000001c200000 0x000000001c200000 0x4a358ac 0x4a358ac R E 0x200000 We were normalizing absolute address 0x1c23b4d0 to file offset 0x1b63b4d0. That's based on the calculation > 0x1c23b4d0 - 0x1c200000 /*VirtAddr*/ + 0x1b600000 /*Offset*/ = 0x1b63b4d0 When subsequently symbolizing the file offset we were looking up the wrong segment, because we took into account the segment's memory size: > 0x1b63b4d0 is in [0x1b224980 /*Offset*/, 0x1b224980 + 0xca594c /*MemSiz*/ = 0x1beca2cc] Because this first fitting segment comes before the one we used as part of normalization, we continued symbolizing virtual offset > 0x1b63b4d0 - 0x1b224980 /*Offset*/ + 0x1b426980 /*VirtAddr*/ = 0x1b83d4d0 which lead to a wrong symbolization. By matching up segments based on the file size instead, we pick the correct segment instead: > 0x1b63b4d0 is *not* in [0x1b224980 /*Offset*/, 0x1b224980 + 0x0343888 /*FileSiz*/ = 0x1b568208] > 0x1b63b4d0 is in [0x1b600000 /*Offset*/, 0x1b600000 + 0x4a358ac /*FileSiz*/ = 0x200358ac] So we end up symbolizing virtual offset > 0x1b63b4d0 - 0x1b600000 /*Offset*/ + 0x1c200000 /*VirtAddr*/ = 0x1c23b4d0 which leads to the correct symbolization result. Hence, fix the logic to work with the segment's file size instead. Signed-off-by: Daniel Müller <[email protected]>
2aa5ca4
to
a545999
Compare
anakryiko
approved these changes
Oct 29, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
1954f9d
to
53b1623
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #875 +/- ##
=======================================
Coverage 95.39% 95.39%
=======================================
Files 55 55
Lines 10522 10522
=======================================
Hits 10037 10037
Misses 485 485 ☔ View full report in Codecov by Sentry. |
d-e-s-o
added a commit
to d-e-s-o/blazesym
that referenced
this pull request
Oct 30, 2024
Add a regression test for the change introduced with pull request libbpf#875. It's unfortunately very involved and cumbersome to come up with a complete end-to-end test and so with this change we add a unit test based on partial data from the binary in question. Signed-off-by: Daniel Müller <[email protected]>
d-e-s-o
added a commit
that referenced
this pull request
Oct 30, 2024
Add a regression test for the change introduced with pull request #875. It's unfortunately very involved and cumbersome to come up with a complete end-to-end test and so with this change we add a unit test based on partial data from the binary in question. Signed-off-by: Daniel Müller <[email protected]>
danielocfb
pushed a commit
to d-e-s-o/blazesym
that referenced
this pull request
Oct 31, 2024
Add an end-to-end test for the fix provided by pull request libbpf#875. The test basically normalizes an address in a specially crafted binary and symbolizes the resulting file offset. It fails without commit 1a4e107 ("Use file size in file offset -> virtual offset translation"), because then the file offset to virtual offset translation produces a virtual offset that can't be symbolized to the expected _start function. Signed-off-by: Daniel Müller <[email protected]>
danielocfb
pushed a commit
to d-e-s-o/blazesym
that referenced
this pull request
Oct 31, 2024
Add an end-to-end test for the fix provided by pull request libbpf#875. The test basically normalizes an address in a specially crafted binary and symbolizes the resulting file offset. It fails without commit 1a4e107 ("Use file size in file offset -> virtual offset translation"), because then the file offset to virtual offset translation produces a virtual offset that can't be symbolized to the expected _start function. Signed-off-by: Daniel Müller <[email protected]>
danielocfb
pushed a commit
to d-e-s-o/blazesym
that referenced
this pull request
Oct 31, 2024
Add an end-to-end test for the fix provided by pull request libbpf#875. The test basically normalizes an address in a specially crafted binary and symbolizes the resulting file offset. It fails without commit 1a4e107 ("Use file size in file offset -> virtual offset translation"), because then the file offset to virtual offset translation produces a virtual offset that can't be symbolized to the expected _start function. For the record, the binary looks roughly as follows: $ readelf --segments --wide test-block.bin > Program Headers: > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align > LOAD 0x000000 0x00000000000ff000 0x00000000000ff000 0x000200 0x301000 RW 0x1000 > LOAD 0x000200 0x0000000000400200 0x0000000000400200 0x000030 0x000030 R 0x1000 > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00005b 0x00005b R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x000050 0x000050 R 0x1000 > [...] Signed-off-by: Daniel Müller <[email protected]>
danielocfb
pushed a commit
to d-e-s-o/blazesym
that referenced
this pull request
Oct 31, 2024
Add an end-to-end test for the fix provided by pull request libbpf#875. The test basically normalizes an address in a specially crafted binary and symbolizes the resulting file offset. It fails without commit 1a4e107 ("Use file size in file offset -> virtual offset translation"), because then the file offset to virtual offset translation produces a virtual offset that can't be symbolized to the expected _start function. For the record, the binary looks roughly as follows: $ readelf --segments --wide test-block.bin > Program Headers: > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align > LOAD 0x000000 0x00000000000ff000 0x00000000000ff000 0x000200 0x301000 RW 0x1000 > LOAD 0x000200 0x0000000000400200 0x0000000000400200 0x000030 0x000030 R 0x1000 > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00005b 0x00005b R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x000050 0x000050 R 0x1000 > [...] Signed-off-by: Daniel Müller <[email protected]>
danielocfb
pushed a commit
to d-e-s-o/blazesym
that referenced
this pull request
Oct 31, 2024
Add an end-to-end test for the fix provided by pull request libbpf#875. The test basically normalizes an address in a specially crafted binary and symbolizes the resulting file offset. It fails without commit 1a4e107 ("Use file size in file offset -> virtual offset translation"), because then the file offset to virtual offset translation produces a virtual offset that can't be symbolized to the expected _start function. For the record, the binary looks roughly as follows: $ readelf --segments --wide test-block.bin > Program Headers: > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align > LOAD 0x000000 0x00000000000ff000 0x00000000000ff000 0x0001c8 0x301000 RW 0x1000 > LOAD 0x0001c8 0x00000000004001c8 0x00000000004001c8 0x000030 0x000030 R 0x1000 > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00005b 0x00005b R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x000038 0x000038 R 0x1000 > [...] Signed-off-by: Daniel Müller <[email protected]>
danielocfb
pushed a commit
that referenced
this pull request
Oct 31, 2024
Add an end-to-end test for the fix provided by pull request #875. The test basically normalizes an address in a specially crafted binary and symbolizes the resulting file offset. It fails without commit 1a4e107 ("Use file size in file offset -> virtual offset translation"), because then the file offset to virtual offset translation produces a virtual offset that can't be symbolized to the expected _start function. For the record, the binary looks roughly as follows: $ readelf --segments --wide test-block.bin > Program Headers: > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align > LOAD 0x000000 0x00000000000ff000 0x00000000000ff000 0x0001c8 0x301000 RW 0x1000 > LOAD 0x0001c8 0x00000000004001c8 0x00000000004001c8 0x000030 0x000030 R 0x1000 > LOAD 0x001000 0x0000000000401000 0x0000000000401000 0x00005b 0x00005b R E 0x1000 > LOAD 0x002000 0x0000000000402000 0x0000000000402000 0x000038 0x000038 R 0x1000 > [...] Signed-off-by: Daniel Müller <[email protected]>
javierhonduco
added a commit
to javierhonduco/lightswitch
that referenced
this pull request
Nov 15, 2024
Saw libbpf/blazesym#875 by pure chance and we have the same bug in our normalization logic. Thanks so much, @danielocfb! Test Plan ========= Integration test pass. Would be good to add a regression tests for this.
javierhonduco
added a commit
to javierhonduco/lightswitch
that referenced
this pull request
Nov 15, 2024
Saw libbpf/blazesym#875 by pure chance and we have the same bug in our normalization logic. Thanks so much, @danielocfb! Test Plan ========= Integration test pass. Would be good to add a regression tests for this.
javierhonduco
added a commit
to javierhonduco/lightswitch
that referenced
this pull request
Nov 15, 2024
Saw libbpf/blazesym#875 by pure chance and we have the same bug in our normalization logic. Thanks so much, @danielocfb! Test Plan ========= Integration test pass. Would be good to add a regression tests for this.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
We were seeing an issue where normalization + file offset symbolization was reporting an incorrect symbolization result for certain addresses in a certain binary.
In short, in a binary with segments:
We were normalizing absolute address 0x1c23b4d0 to file offset 0x1b63b4d0. That's based on the calculation
When subsequently symbolizing the file offset we were looking up the wrong segment, because we took into account the segment's memory size:
Because this first fitting segment comes before the one we used as part of normalization, we continued symbolizing virtual offset
which lead to a wrong symbolization. By matching up segments based on the file size instead, we pick the correct segment instead:
So we end up symbolizing virtual offset
which leads to the correct symbolization result.