-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LLDB][ELF] Fix section unification to not just use names. #90099
Conversation
Section unification cannot just use names, because it's valid for ELF binaries to have multiple sections with the same name. We should check other section properties too. rdar://124467787
@llvm/pr-subscribers-lldb Author: Alastair Houghton (al45tair) ChangesSection unification cannot just use names, because it's valid for ELF binaries to have multiple sections with the same name. We should check other section properties too. Fixes #88001. rdar://124467787 Full diff: https://github.com/llvm/llvm-project/pull/90099.diff 1 Files Affected:
diff --git a/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp b/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
index 0d95a1c12bde35..60fc68c96bcc1c 100644
--- a/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
+++ b/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
@@ -1854,6 +1854,49 @@ class VMAddressProvider {
};
}
+namespace {
+ // We have to do this because ELF doesn't have section IDs, and also
+ // doesn't require section names to be unique. (We use the section index
+ // for section IDs, but that isn't guaranteed to be the same in separate
+ // debug images.)
+ SectionSP FindMatchingSection(const SectionList §ion_list,
+ SectionSP section) {
+ SectionSP sect_sp;
+
+ addr_t vm_addr = section->GetFileAddress();
+ ConstString name = section->GetName();
+ offset_t file_size = section->GetFileSize();
+ offset_t byte_size = section->GetByteSize();
+ SectionType type = section->GetType();
+ bool thread_specific = section->IsThreadSpecific();
+ uint32_t permissions = section->GetPermissions();
+ uint32_t alignment = section->GetLog2Align();
+
+ for (auto sect_iter = section_list.begin();
+ sect_iter != section_list.end();
+ ++sect_iter) {
+ if ((*sect_iter)->GetName() == name
+ && (*sect_iter)->GetType() == type
+ && (*sect_iter)->IsThreadSpecific() == thread_specific
+ && (*sect_iter)->GetPermissions() == permissions
+ && (*sect_iter)->GetFileSize() == file_size
+ && (*sect_iter)->GetByteSize() == byte_size
+ && (*sect_iter)->GetFileAddress() == vm_addr
+ && (*sect_iter)->GetLog2Align() == alignment) {
+ sect_sp = *sect_iter;
+ break;
+ } else {
+ sect_sp = FindMatchingSection((*sect_iter)->GetChildren(),
+ section);
+ if (sect_sp)
+ break;
+ }
+ }
+
+ return sect_sp;
+ }
+}
+
void ObjectFileELF::CreateSections(SectionList &unified_section_list) {
if (m_sections_up)
return;
@@ -2067,10 +2110,8 @@ unsigned ObjectFileELF::ParseSymbols(Symtab *symtab, user_id_t start_id,
SectionList *module_section_list =
module_sp ? module_sp->GetSectionList() : nullptr;
- // Local cache to avoid doing a FindSectionByName for each symbol. The "const
- // char*" key must came from a ConstString object so they can be compared by
- // pointer
- std::unordered_map<const char *, lldb::SectionSP> section_name_to_section;
+ // Cache the section mapping
+ std::unordered_map<lldb::SectionSP, lldb::SectionSP> section_map;
unsigned i;
for (i = 0; i < num_symbols; ++i) {
@@ -2275,14 +2316,15 @@ unsigned ObjectFileELF::ParseSymbols(Symtab *symtab, user_id_t start_id,
if (symbol_section_sp && module_section_list &&
module_section_list != section_list) {
- ConstString sect_name = symbol_section_sp->GetName();
- auto section_it = section_name_to_section.find(sect_name.GetCString());
- if (section_it == section_name_to_section.end())
+ auto section_it = section_map.find(symbol_section_sp);
+ if (section_it == section_map.end()) {
section_it =
- section_name_to_section
- .emplace(sect_name.GetCString(),
- module_section_list->FindSectionByName(sect_name))
- .first;
+ section_map
+ .emplace(symbol_section_sp,
+ FindMatchingSection(*module_section_list,
+ symbol_section_sp))
+ .first;
+ }
if (section_it->second)
symbol_section_sp = section_it->second;
}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a test for this with obj2yaml?
&& (*sect_iter)->GetType() == type | ||
&& (*sect_iter)->IsThreadSpecific() == thread_specific | ||
&& (*sect_iter)->GetPermissions() == permissions | ||
&& (*sect_iter)->GetFileSize() == file_size | ||
&& (*sect_iter)->GetByteSize() == byte_size | ||
&& (*sect_iter)->GetFileAddress() == vm_addr | ||
&& (*sect_iter)->GetLog2Align() == alignment) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it could be an operator ==
in the Section class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to do that, because I didn't want to give the impression that this was a general purpose way to compare two Section
s. While it checks many of the section attributes, it doesn't check all of them (intentionally in the case of which object they come from), and I think that would be surprising behaviour from an ==
operator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I correct in understanding that this code is used when unifying the sections of a separate debug file with a stripped file? If so, then I believe this comparison is too strict. An object file produced by objcopy --only-keep-debug
will only have placeholder sections instead of the sections that contained actual code. That means (at least) their file size and file offset will be different from that in the original file.
I think it would be sufficient to use the file address (called virtual address in ELF), possibly confirmed by section name, as the unification key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did wonder about exactly how strict we want to be (and exactly which things to compare). This isn't testing file offset, but you're right that it does check file size and maybe it shouldn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@labath Let's get rid of the file sizes; the other things should all be the same, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The section type can also change. You can see that with .text, because there we have a name-based fallback, but with anything else, it changes from "code" to "regular":
Showing 1 subsections
Index: 0
ID: 0x6
Name: .foo
Type: regular
Permissions: r-x
Thread specific: no
$ bin/lldb-test object-file /tmp/a.out | grep -e foo -C 3
Showing 1 subsections
Index: 0
ID: 0x6
Name: .foo
Type: code
Permissions: r-x
Thread specific: no
The other fields are probably fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I've removed the section type test as well.
Not sure what you're thinking here. We can't use |
Fixed a couple of nits from review, and fixed up formatting. Also added a test. rdar://124467787
ccad626
to
9653351
Compare
this test seems to indicate that's possible (the trick appears to me in |
I've actually just included a binary file that exhibits the problem. |
As pointed out by @labath, if you use `objcopy --only-keep-debug`, only placeholder sections will be present so the file sizes won't match. We already ignore file offsets when doing the comparison. rdar://124467787
I saw that, but a textual test is definitely preferable, particularly after the linux xz fiasco. This wouldn't be the first test binary in our repo, but in this case, I think it actually adds a lot of value, since yaml2obj operates on the same level as what you are testing (so you can see the input of the test eyeball-verify that the expected output makes sense). |
:-) Fair point (though this is very much not like the xz incident — I'm a colleague of Jonas's at Apple, albeit in a different team, with a fairly longstanding history of contributing to OSS projects, and doing something like that would be… very career limiting). I'll have a go at changing things to use |
Rather than including a binary, use `yaml2obj` for the duplicate section name test. rdar://124467787
The new test looks great. |
@labath @JDevlieghere Are we happy with this PR now? If so, I'll cherry pick it to the various Apple forks; I need it in order to merge swiftlang/swift#72061, which we need for both my fully statically linked Swift SDK work and for Amazon Linux 2023 support in Swift. |
The section type can change when using `objcopy --only-keep-debug`, apparently. rdar://124467787
83e4594
to
5e28741
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me as well.
Section unification cannot just use names, because it's valid for ELF binaries to have multiple sections with the same name. We should check other section properties too.
Fixes #88001.
rdar://124467787