-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lld][ELF] Demote symbols in discarded sections to Undefined. #68777
Conversation
@llvm/pr-subscribers-lld-elf @llvm/pr-subscribers-lld Author: Bevin Hansson (bevin-hansson) ChangesAfter linker script processing, it may be the case that sections This patch goes through the symbols after placement and changes See #58891. Full diff: https://github.com/llvm/llvm-project/pull/68777.diff 5 Files Affected:
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 6272276e94b2d35..a49e72ee96ef466 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -3062,6 +3062,30 @@ void LinkerDriver::link(opt::InputArgList &args) {
script->addOrphanSections();
}
+ // Explicitly demote symbols which didn't get placed. If we don't, any
+ // symbol in a discarded section will still be considered defined, even
+ // though it didn't end up in the output, and we get silently broken
+ // binaries.
+ if (script->hasSectionsCommand) {
+ llvm::TimeTraceScope timeScope("Demote symbols in discarded sections");
+ parallelForEach(symtab.getSymbols(), [](Symbol *sym) {
+ if (Defined *d = dyn_cast<Defined>(sym))
+ if (d->section && !d->section->isLive()) {
+ uint32_t secIdx = 0;
+ if (d->file)
+ secIdx = d->file->getSections().drop_while([=](auto *s) {
+ return s != d->section;
+ }).size();
+ // If we don't change the binding from WEAK to GLOBAL here, the
+ // undefined symbol reporting will think this is undefined weak and
+ // not give a warning.
+ Undefined(d->file, sym->getName(), sym->isWeak() ? (uint8_t)STB_GLOBAL
+ : sym->binding,
+ sym->stOther, sym->type, secIdx).overwrite(*sym);
+ }
+ });
+ }
+
{
llvm::TimeTraceScope timeScope("Merge/finalize input sections");
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index ee27cc15e040a49..44111161619605a 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -507,8 +507,7 @@ int64_t RelocationScanner::computeMipsAddend(const RelTy &rel, RelExpr expr,
template <class ELFT>
static std::string maybeReportDiscarded(Undefined &sym) {
auto *file = dyn_cast_or_null<ObjFile<ELFT>>(sym.file);
- if (!file || !sym.discardedSecIdx ||
- file->getSections()[sym.discardedSecIdx] != &InputSection::discarded)
+ if (!file || !sym.discardedSecIdx)
return "";
ArrayRef<typename ELFT::Shdr> objSections =
file->template getELFShdrs<ELFT>();
diff --git a/lld/ELF/Symbols.cpp b/lld/ELF/Symbols.cpp
index 07061d3a1223e9e..3f16ede772cb4a4 100644
--- a/lld/ELF/Symbols.cpp
+++ b/lld/ELF/Symbols.cpp
@@ -321,7 +321,7 @@ void elf::maybeWarnUnorderableSymbol(const Symbol *sym) {
//
// Note, ld.bfd --symbol-ordering-file= does not warn on undefined symbols,
// but we don't have to be compatible here.
- if (sym->isUndefined() &&
+ if (sym->isUndefined() && !cast<Undefined>(sym)->discardedSecIdx &&
config->unresolvedSymbols == UnresolvedPolicy::Ignore)
return;
@@ -330,9 +330,12 @@ void elf::maybeWarnUnorderableSymbol(const Symbol *sym) {
auto report = [&](StringRef s) { warn(toString(file) + s + sym->getName()); };
- if (sym->isUndefined())
- report(": unable to order undefined symbol: ");
- else if (sym->isShared())
+ if (sym->isUndefined()) {
+ if (cast<Undefined>(sym)->discardedSecIdx)
+ report(": unable to order discarded symbol: ");
+ else
+ report(": unable to order undefined symbol: ");
+ } else if (sym->isShared())
report(": unable to order shared symbol: ");
else if (d && !d->section)
report(": unable to order absolute symbol: ");
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index ce72189015348a4..7593bb14e1ffda8 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -714,6 +714,22 @@ template <class ELFT> void Writer<ELFT>::copyLocalSymbols() {
// No reason to keep local undefined symbol in symtab.
if (!dr)
continue;
+
+ // Demote locals which did not end up in any partition. This is similar
+ // to what we do in Driver.cpp, but that only works on globals.
+ // Don't do this for section symbols; they are expected to be lost.
+ if (script->hasSectionsCommand && dr->section && !dr->isSection() &&
+ !dr->section->isLive()) {
+ uint32_t secIdx = 0;
+ if (dr->file)
+ secIdx = dr->file->getSections().drop_while([=](auto *s) {
+ return s != dr->section;
+ }).size();
+ Undefined(dr->file, b->getName(), b->binding,
+ b->stOther, b->type, secIdx).overwrite(*b);
+ continue;
+ }
+
if (includeInSymtab(*b) && shouldKeepInSymtab(*dr))
in.symTab->addSymbol(b);
}
diff --git a/lld/test/ELF/linkerscript/discard-symbol.s b/lld/test/ELF/linkerscript/discard-symbol.s
new file mode 100644
index 000000000000000..739cd585af00372
--- /dev/null
+++ b/lld/test/ELF/linkerscript/discard-symbol.s
@@ -0,0 +1,16 @@
+# REQUIRES: x86
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t
+# RUN: echo "SECTIONS { /DISCARD/ : { *(.aaa*) } }" > %t.script
+# RUN: not ld.lld -o %t1 --script %t.script %t | FileCheck %s
+
+# CHECK: error: relocation refers to a symbol in a discarded section: aab
+# CHECK-NEXT: defined in {{.*}}/discard-symbol.s.tmp
+# CHECK-NEXT: referenced by {{.*}}/discard-symbol.s.tmp:(.zzz+0x0)
+
+.global aab
+.section .aaa,"a"
+aab:
+ .quad 0
+
+.section .zzz,"a"
+ .quad aab
|
lld/ELF/Writer.cpp
Outdated
|
||
// Demote locals which did not end up in any partition. This is similar | ||
// to what we do in Driver.cpp, but that only works on globals. | ||
// Don't do this for section symbols; they are expected to be lost. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit unsure about this. I had to prevent this for section symbols, because a few tests fail otherwise, mostly ones related to arm exidx sections. I don't understand the mechanisms at play enough to know exactly what is supposed to happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised about test failures in .ARM.exidx. As there shouldn't be any relocations to a .ARM.exidx input section. There will be relocations from a .ARM.exidx section, but given that when a section is discarded, any section with a SHF_LINK_ORDER dependency on it is discarded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can disable the check and push so you can see what tests are failing on it.
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does GNU ld permit this, or does it result in dangling relocations? If GNU ld does permit this then there's a chance that this might break some programs where the dangling relocations are benign. I guess that we may need to add a command line option to suppress the errors, although maybe not initially.
How does this work with uses of /DISCARD/ in a ld -r relocatable link? I guess a follow up question is what is it supposed to do for ld -r relocatable links. It would still be user error to /DISCARD/ a section where some other section had relocations to that section, but I don't think undefined symbols are checked for.
A possible alternative implementation could be to test each relocation in scanOne to see if the symbol is defined in a discarded section. You could then make a specific error message, or do the work to make the symbol undefined at that point. Not sure how well that would work out, but could be worth a prototype.
@@ -0,0 +1,16 @@ | |||
# REQUIRES: x86 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test looks only to check global symbols. The patch makes modifications for local symbols and symbol ordering files, please can you add test cases for these as well?
Will be worth adding a test case for relocatable links too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is true. The reason I couldn't add a test for the local symbol case is because when you have local symbols, llvm-mc seems to emit them as section symbols with an offset addend rather than an actual symbol relocation. So it runs afoul of the exception for section symbols, and doesn't trigger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the file after Rebase on top of 557299c
lld/ELF/Driver.cpp
Outdated
// symbol in a discarded section will still be considered defined, even | ||
// though it didn't end up in the output, and we get silently broken | ||
// binaries. | ||
if (script->hasSectionsCommand) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be worth adding a script->seenDiscard variable so that this potentially expensive operation is only done when someone has used /DISCARD/ ?
lld/ELF/Writer.cpp
Outdated
|
||
// Demote locals which did not end up in any partition. This is similar | ||
// to what we do in Driver.cpp, but that only works on globals. | ||
// Don't do this for section symbols; they are expected to be lost. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised about test failures in .ARM.exidx. As there shouldn't be any relocations to a .ARM.exidx input section. There will be relocations from a .ARM.exidx section, but given that when a section is discarded, any section with a SHF_LINK_ORDER dependency on it is discarded.
When I tested a while back, GNU ld seems to keep the section if there is still a reference to it, even though it was marked /DISCARD/. I don't think this is possible in lld, since the relocation processing is done after the linker script handling, and we've already made the call to keep or not keep the section at that point.
I haven't tested -r. I would assume that discarding something in an -r invocation today does not trigger these types of errors, so they wouldn't with this patch either.
Hm, would it? Could you not discard a section that contains something which is referenced from the current file, and have that reference be resolved from some other object at the final link?
Yeah, it doesn't have to go through this logic. It's just that the logic for emitting these errors already existed (and only worked for Undefined), so I went that path. |
I had a quick check and GNU ld on my machine at least gives a similar error message to what you are proposing.
It looks like GNU ld does trigger these errors with ld -r at least with a trivial test case. LLD seems to give a warning in this case and outputs a R_*_NONE relocation in its place.
I think it depends on whether the linker resolves the relocation at relocatable link time, and whether it is a global definition or not. For example if we make a local symbol undefined that is likely to be unrecoverable.
I've found out why the tests are failing anyway. The ELF/linkerscript/discard-section.s contains a local reference that we would explicitly want to catch, so I think this test would need updating. The ARM.exidx tests are failing due a missing test for liveness in scanRelocations. The .ARM.exidx SyntheticSection has a list of input .ARM.exidx sections. When some, but not all of these input sections are discarded we need the liveness check. This should fix that:
With this you should be able remove the special case for section symbols. It should be possible to use yaml2obj to make a test case with a relocation to a local symbol, but with the special case removed I don't think you'll need them. |
Interesting. I'll have to try it again myself.
Ah, that's true. Unresolved locals would be a problem.
Great, big thanks for finding this! I pushed the fix and test additions, along with some other suggested changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for the updates. This looks good to me know. I think it is worth the maintainer @MaskRay taking a look first before merging.
After linker script processing, it may be the case that sections were discarded via /DISCARD/, but relocations to symbols in those sections remain. Currently, such symbols are still considered Defined even though their section is not live and will not be placed anywhere. This results in a silently broken binary. This patch goes through the symbols after placement and changes them from Defined to Undefined if their section is no longer live at that point. During relocation processing, we will catch such undefined symbols and report an error. See llvm#58891.
528f8f5
to
8fc4eaa
Compare
I rebased, adapted the test and addressed the comments. |
secIdx = idx; | ||
break; | ||
} | ||
idx++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the section index code in both places. It was not giving the right index. I'm not sure why I didn't notice it earlier. I'm pretty sure it worked before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a performance issue. If an object file has many discarded symbols, the quadratic time complexity will not be acceptable. I have fixed this in https://github.com/maskray/llvm-project/tree/lld-symbols-in-discard
I think we need a different approach to implement symbol demoting. I re-implemented the stuff in https://github.com/maskray/llvm-project/tree/lld-symbols-in-discard and added you as a co-author :) |
secIdx = idx; | ||
break; | ||
} | ||
idx++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a performance issue. If an object file has many discarded symbols, the quadratic time complexity will not be acceptable. I have fixed this in https://github.com/maskray/llvm-project/tree/lld-symbols-in-discard
|
||
// Demote locals which did not end up in any partition. This is similar | ||
// to what we do in Driver.cpp, but that only works on globals. | ||
if (script->seenDiscard && dr->section && !dr->section->isLive()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not need this copy. This can be unified with Driver.cpp. See https://github.com/maskray/llvm-project/tree/lld-symbols-in-discard
# RUN: ld.lld -r -T a.lds a.o b.o -o /dev/null 2>&1 | count 0 | ||
# RUN: not ld.lld --threads=1 -T a.lds a.o b.o -z undefs -o /dev/null 2>&1 | FileCheck %s --check-prefix=SECTION --implicit-check-not=error: | ||
# RUN: not ld.lld --threads=1 -T a.lds a.o b.o -o /dev/null 2>&1 | FileCheck %s --check-prefixes=SECTION,SYMBOL --implicit-check-not=error: | ||
# RUN: ld.lld -r -T a.lds a.o b.o -o /dev/null 2>&1 | FileCheck %s --check-prefix=WARNING --implicit-check-not=error: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--implicit-check-not=warning:
exit code 0 means that there cannot be any error.
@@ -3061,6 +3061,37 @@ void LinkerDriver::link(opt::InputArgList &args) { | |||
script->addOrphanSections(); | |||
} | |||
|
|||
// Demote symbols defined relative to input sections that are discarded by | |||
// /DISCARD/ so that relocations referencing them will get reported. | |||
if (script->seenDiscard) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a corner case but I think is worth testing. The demotion logic applies --gc-sections
as well but we only run the code when /DISCARD/
is seen, so there will be some behavior differences.
I have updated lld/test/ELF/gc-sections-tls.s
in https://github.com/MaskRay/llvm-project/tree/lld-symbols-in-discard to test this.
Actually, we probably should do #69295 so that all non-local symbols in discarded sections are demoted, regardless of whether /DISCARD/
is seen? @smithp35
Your alternate approach seems good, I was also worried about the performance impact of looking through all the symbols like that. The TLS case was a bit unexpected to me. I wouldn't have thought that the demotion could cause any issues related to --gc-sections, since sections with references to them cannot be discarded by it. |
Superseded by #69295. |
After linker script processing, it may be the case that sections
were discarded via /DISCARD/, but relocations to symbols in those
sections remain. Currently, such symbols are still considered
Defined even though their section is not live and will not be
placed anywhere. This results in a silently broken binary.
This patch goes through the symbols after placement and changes
them from Defined to Undefined if their section is no longer
live at that point. During relocation processing, we will catch
such undefined symbols and report an error.
See #58891.