[ELF] Relax R_RISCV_ALIGN
Alternative to D125036. Implement R_RISCV_ALIGN relaxation so that we can handle
-mrelax object files (i.e. -mno-relax is no longer needed) and create a
framework for future relaxation.

`relaxAux` is placed in a union with InputSectionBase::jumpInstrMod, storing
auxiliary information for relaxation. In the first pass, `relaxAux` is allocated.
The main data structure is `relocDeltas`: when referencing `relocations[i]`, the
actual offset is `r_offset - (i ? relocDeltas[i-1] : 0)`.
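
The offset rule above can be sketched as a small standalone helper. This is only an illustration: `adjustedOffset` and the plain `std::vector` inputs are hypothetical and not part of lld, and the sketch assumes `relocDeltas[i]` holds the cumulative number of bytes removed up to and including relocation `i`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not lld code): returns the post-relaxation offset of
// relocation i. rOffsets holds the original r_offset values in increasing
// order; relocDeltas[i] is assumed to be the cumulative number of bytes
// removed up to and including relocation i.
uint64_t adjustedOffset(const std::vector<uint64_t> &rOffsets,
                        const std::vector<uint32_t> &relocDeltas, size_t i) {
  assert(rOffsets.size() == relocDeltas.size() && i < rOffsets.size());
  // Only bytes removed by *earlier* relocations shift relocation i.
  return rOffsets[i] - (i ? relocDeltas[i - 1] : 0);
}
```

The first relocation never moves, since nothing precedes it; every later relocation moves down by the delta recorded for its predecessor.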

`relaxOnce` performs one relaxation pass. It computes `relocDeltas` for every
text section, then adjusts st_value/st_size for symbols relative to that section
based on `SymbolAnchor`. `bytesDropped` is set so that `assignAddresses` knows
that the size has changed.
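
For R_RISCV_ALIGN, the number of bytes contributed to `relocDeltas` in one pass can be sketched as follows. The arithmetic mirrors the `relax` function in this patch; the names `powerOf2Ceil` and `alignBytesToRemove` are hypothetical stand-ins (the patch uses `llvm::PowerOf2Ceil` inline), and the sketch assumes the addend is the size in bytes of the NOP padding emitted by the assembler.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two >= x, for x > 0. Stand-in for llvm::PowerOf2Ceil.
uint64_t powerOf2Ceil(uint64_t x) {
  uint64_t p = 1;
  while (p < x)
    p <<= 1;
  return p;
}

// Hypothetical helper: number of padding bytes to delete for an R_RISCV_ALIGN
// located at address loc whose addend is the padding size in bytes.
uint64_t alignBytesToRemove(uint64_t loc, uint64_t addend) {
  const uint64_t nextLoc = loc + addend;           // end of the NOP padding
  const uint64_t align = powerOf2Ceil(addend + 2); // requested alignment
  const uint64_t aligned = (loc + align - 1) & ~(align - 1);
  assert(aligned <= nextLoc && "R_RISCV_ALIGN needs expanding the content");
  // All bytes beyond the alignment boundary are removed.
  return nextLoc - aligned;
}
```

For example, with 6 bytes of padding (8-byte alignment) starting at 0x1004, the aligned address is 0x1008 and the padding ends at 0x100a, so 2 bytes are removed.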

Run `relaxOnce` in the `finalizeAddressDependentContent` loop to wait for
convergence of text sections and other address dependent sections (e.g.
SHT_RELR). Note: extracting `relaxOnce` into a separate loop works for many cases
but has issues in some linker script edge cases.
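
The convergence idea can be sketched as a fixed-point loop. This is a simplified illustration, not the actual `finalizeAddressDependentContent` loop, which also handles thunks, address assignment, and other address-dependent sections; `runUntilConverged` and `maxPasses` are hypothetical names.

```cpp
#include <cassert>
#include <functional>

// Hypothetical driver: call relaxOnce(pass) with pass = 0, 1, 2, ... until it
// reports that nothing changed, or until maxPasses passes have run. Returns
// the number of passes executed.
int runUntilConverged(const std::function<bool(int)> &relaxOnce,
                      int maxPasses) {
  for (int pass = 0; pass < maxPasses; ++pass)
    if (!relaxOnce(pass))
      return pass + 1; // this pass made no change: converged
  return maxPasses;    // gave up without observing convergence
}
```

A pass that reports no change is still counted, because it is the pass that proves the layout is stable.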

After convergence, compute section contents: shrink the NOP sequence of each
R_RISCV_ALIGN as appropriate. Instead of deleting bytes, we run a sequence of
memcpy on the content delimited by relocation locations. For R_RISCV_ALIGN, let
the next memcpy skip the desired number of bytes. Section content computation is
parallelizable, but let's ensure the implementation is mature before
optimizations. Technically we can save a copy if we interleave some code with
`OutputSection::writeTo`, but let's not pollute the generic code (we don't have
templated relocation resolving, so using conditions can impose overhead on
non-RISC-V targets).
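
The memcpy-based rebuild can be sketched as below. This is a simplified model rather than the code in `riscvFinalizeRelax`: the `Cut` struct and `compact` function are hypothetical, and the sketch leaves out the NOP rewriting that the patch performs when the removed byte count is not a multiple of 4.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical description of one deletion: drop `remove` bytes starting at
// `offset` in the old content. Cuts must be sorted and non-overlapping.
struct Cut {
  uint64_t offset;
  uint32_t remove;
};

// Build the new content by copying the pieces between cuts, instead of
// erasing bytes in place.
std::vector<uint8_t> compact(const std::vector<uint8_t> &old,
                             const std::vector<Cut> &cuts) {
  size_t total = 0;
  for (const Cut &c : cuts)
    total += c.remove;
  assert(total <= old.size());

  std::vector<uint8_t> out(old.size() - total);
  uint8_t *p = out.data();
  uint64_t offset = 0; // next unread byte in the old content
  for (const Cut &c : cuts) {
    assert(offset <= c.offset && c.offset + c.remove <= old.size());
    // Copy from the last location up to the start of this cut.
    const uint64_t size = c.offset - offset;
    if (size)
      std::memcpy(p, old.data() + offset, size);
    p += size;
    // Let the next memcpy skip the removed bytes.
    offset = c.offset + c.remove;
  }
  if (offset != old.size())
    std::memcpy(p, old.data() + offset, old.size() - offset);
  return out;
}
```

Each byte of the old content is read at most once, so the rebuild is linear in the section size regardless of how many cuts there are.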

Tested:
`make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- LLVM=1 defconfig all` builds a Linux kernel with -mrelax that is bootable.
A FreeBSD RISCV64 system built with -mrelax is bootable.
bash/curl/firefox/libevent/vim/tmux built with -mrelax work.

Differential Revision: https://reviews.llvm.org/D127581
MaskRay committed Jul 7, 2022
1 parent ef7aed3 commit 6611d58
Showing 10 changed files with 515 additions and 42 deletions.
237 changes: 231 additions & 6 deletions lld/ELF/Arch/RISCV.cpp
@@ -7,9 +7,11 @@
//===----------------------------------------------------------------------===//

#include "InputFiles.h"
#include "OutputSections.h"
#include "Symbols.h"
#include "SyntheticSections.h"
#include "Target.h"
#include "llvm/Support/TimeProfiler.h"

using namespace llvm;
using namespace llvm::object;
@@ -36,6 +38,7 @@ class RISCV final : public TargetInfo {
const uint8_t *loc) const override;
void relocate(uint8_t *loc, const Relocation &rel,
uint64_t val) const override;
bool relaxOnce(int pass) const override;
};

} // end anonymous namespace
@@ -271,12 +274,7 @@ RelExpr RISCV::getRelExpr(const RelType type, const Symbol &s,
case R_RISCV_TPREL_ADD:
return R_NONE;
case R_RISCV_ALIGN:
- // Not just a hint; always padded to the worst-case number of NOPs, so may
- // not currently be aligned, and without linker relaxation support we can't
- // delete NOPs to realign.
- errorOrWarn(getErrorLocation(loc) + "relocation R_RISCV_ALIGN requires "
- "unimplemented linker relaxation; recompile with -mno-relax");
- return R_NONE;
+ return R_RELAX_HINT;
default:
error(getErrorLocation(loc) + "unknown relocation (" + Twine(type) +
") against symbol " + toString(s));
@@ -476,6 +474,233 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
}
}

namespace {
struct SymbolAnchor {
uint64_t offset;
Defined *d;
bool end; // true for the anchor of st_value+st_size
};
} // namespace

struct elf::RISCVRelaxAux {
// This records symbol start and end offsets which will be adjusted according
// to the nearest relocDeltas element.
SmallVector<SymbolAnchor, 0> anchors;
// For relocations[i], the actual offset is r_offset - (i ? relocDeltas[i-1] :
// 0).
std::unique_ptr<uint32_t[]> relocDeltas;
};

static void initSymbolAnchors() {
SmallVector<InputSection *, 0> storage;
for (OutputSection *osec : outputSections) {
if (!(osec->flags & SHF_EXECINSTR))
continue;
for (InputSection *sec : getInputSections(*osec, storage)) {
sec->relaxAux = make<RISCVRelaxAux>();
if (sec->relocations.size())
sec->relaxAux->relocDeltas =
std::make_unique<uint32_t[]>(sec->relocations.size());
}
}
// Store anchors (st_value and st_value+st_size) for symbols relative to text
// sections.
for (InputFile *file : ctx->objectFiles)
for (Symbol *sym : file->getSymbols()) {
auto *d = dyn_cast<Defined>(sym);
if (!d || d->file != file)
continue;
if (auto *sec = dyn_cast_or_null<InputSection>(d->section))
if (sec->flags & SHF_EXECINSTR && sec->relaxAux) {
// If sec is discarded, relaxAux will be nullptr.
sec->relaxAux->anchors.push_back({d->value, d, false});
sec->relaxAux->anchors.push_back({d->value + d->size, d, true});
}
}
// Sort anchors by offset so that we can find the closest relocation
// efficiently. For a zero size symbol, ensure that its start anchor precedes
// its end anchor. For two symbols with anchors at the same offset, their
// order does not matter.
for (OutputSection *osec : outputSections) {
if (!(osec->flags & SHF_EXECINSTR))
continue;
for (InputSection *sec : getInputSections(*osec, storage)) {
llvm::sort(sec->relaxAux->anchors, [](auto &a, auto &b) {
return std::make_pair(a.offset, a.end) <
std::make_pair(b.offset, b.end);
});
}
}
}

static bool relax(InputSection &sec) {
const uint64_t secAddr = sec.getVA();
auto &aux = *sec.relaxAux;
bool changed = false;

// Restore original st_value for symbols relative to this section.
ArrayRef<SymbolAnchor> sa = makeArrayRef(aux.anchors);
uint32_t delta = 0;
for (auto it : llvm::enumerate(sec.relocations)) {
for (; sa.size() && sa[0].offset <= it.value().offset; sa = sa.slice(1))
if (!sa[0].end)
sa[0].d->value += delta;
delta = aux.relocDeltas[it.index()];
}
for (const SymbolAnchor &sa : sa)
if (!sa.end)
sa.d->value += delta;
sa = makeArrayRef(aux.anchors);
delta = 0;

for (auto it : llvm::enumerate(sec.relocations)) {
Relocation &r = it.value();
const size_t i = it.index();
const uint64_t loc = secAddr + r.offset - delta;
uint32_t &cur = aux.relocDeltas[i], remove = 0;
switch (r.type) {
case R_RISCV_ALIGN: {
const uint64_t nextLoc = loc + r.addend;
const uint64_t align = PowerOf2Ceil(r.addend + 2);
// All bytes beyond the alignment boundary should be removed.
remove = nextLoc - ((loc + align - 1) & -align);
assert(static_cast<int32_t>(remove) >= 0 &&
"R_RISCV_ALIGN needs expanding the content");
break;
}
}

// For all anchors whose offsets are <= r.offset, they are preceded by
// the previous relocation whose `relocDeltas` value equals `delta`.
// Decrease their st_value and update their st_size.
if (remove) {
for (; sa.size() && sa[0].offset <= r.offset; sa = sa.slice(1)) {
if (sa[0].end)
sa[0].d->size = sa[0].offset - delta - sa[0].d->value;
else
sa[0].d->value -= delta;
}
}
delta += remove;
if (delta != cur) {
cur = delta;
changed = true;
}
}

for (const SymbolAnchor &a : sa) {
if (a.end)
a.d->size = a.offset - delta - a.d->value;
else
a.d->value -= delta;
}
// Inform assignAddresses that the size has changed.
if (!isUInt<16>(delta))
fatal("section size decrease is too large");
sec.bytesDropped = delta;
return changed;
}

// When relaxing just R_RISCV_ALIGN, relocDeltas is usually changed only once in
// the absence of a linker script. For call and load/store R_RISCV_RELAX, code
// shrinkage may reduce displacement and make more relocations eligible for
// relaxation. Code shrinkage may increase displacement to a call/load/store
// target at a higher fixed address, invalidating an earlier relaxation. Any
// change in section sizes can have cascading effect and require another
// relaxation pass.
bool RISCV::relaxOnce(int pass) const {
llvm::TimeTraceScope timeScope("RISC-V relaxOnce");
if (config->relocatable)
return false;

if (pass == 0)
initSymbolAnchors();

SmallVector<InputSection *, 0> storage;
bool changed = false;
for (OutputSection *osec : outputSections) {
if (!(osec->flags & SHF_EXECINSTR))
continue;
for (InputSection *sec : getInputSections(*osec, storage))
changed |= relax(*sec);
}
return changed;
}

void elf::riscvFinalizeRelax(int passes) {
llvm::TimeTraceScope timeScope("Finalize RISC-V relaxation");
log("relaxation passes: " + Twine(passes));
SmallVector<InputSection *, 0> storage;
for (OutputSection *osec : outputSections) {
if (!(osec->flags & SHF_EXECINSTR))
continue;
for (InputSection *sec : getInputSections(*osec, storage)) {
RISCVRelaxAux &aux = *sec->relaxAux;
if (!aux.relocDeltas)
continue;

auto &rels = sec->relocations;
ArrayRef<uint8_t> old = sec->rawData;
size_t newSize =
old.size() - aux.relocDeltas[sec->relocations.size() - 1];
uint8_t *p = context().bAlloc.Allocate<uint8_t>(newSize);
uint64_t offset = 0;
int64_t delta = 0;
sec->rawData = makeArrayRef(p, newSize);
sec->bytesDropped = 0;

// Update section content: remove NOPs for R_RISCV_ALIGN and rewrite
// instructions for relaxed relocations.
for (size_t i = 0, e = rels.size(); i != e; ++i) {
uint32_t remove = aux.relocDeltas[i] - delta;
delta = aux.relocDeltas[i];
if (remove == 0)
continue;

// Copy from last location to the current relocated location.
const Relocation &r = rels[i];
uint64_t size = r.offset - offset;
memcpy(p, old.data() + offset, size);
p += size;

// For R_RISCV_ALIGN, we will place `offset` in a location (among NOPs)
// to satisfy the alignment requirement. If `remove` is a multiple of 4,
// it is as if we have skipped some NOPs. Otherwise we are in the middle
// of a 4-byte NOP, and we need to rewrite the NOP sequence.
int64_t skip = 0;
if (r.type == R_RISCV_ALIGN) {
if (remove % 4 != 0) {
skip = r.addend - remove;
int64_t j = 0;
for (; j + 4 <= skip; j += 4)
write32le(p + j, 0x00000013); // nop
if (j != skip) {
assert(j + 2 == skip);
write16le(p + j, 0x0001); // c.nop
}
}
}

p += skip;
offset = r.offset + skip + remove;
}
memcpy(p, old.data() + offset, old.size() - offset);

// Subtract the previous relocDeltas value from the relocation offset.
// For a pair of R_RISCV_CALL/R_RISCV_RELAX with the same offset, decrease
// their r_offset by the same delta.
delta = 0;
for (size_t i = 0, e = rels.size(); i != e;) {
uint64_t cur = rels[i].offset;
do {
rels[i].offset -= delta;
} while (++i != e && rels[i].offset == cur);
delta = aux.relocDeltas[i - 1];
}
}
}
}

TargetInfo *elf::getRISCVTargetInfo() {
static RISCV target;
return &target;
4 changes: 4 additions & 0 deletions lld/ELF/InputSection.cpp
@@ -622,6 +622,8 @@ uint64_t InputSectionBase::getRelocTargetVA(const InputFile *file, RelType type,
return sym.getVA(a);
case R_ADDEND:
return a;
case R_RELAX_HINT:
return 0;
case R_ARM_SBREL:
return sym.getVA(a) - getARMStaticBase(sym);
case R_GOT:
@@ -987,6 +989,8 @@ void InputSectionBase::relocateAlloc(uint8_t *buf, uint8_t *bufEnd) {
*rel.sym, rel.expr),
bits);
switch (rel.expr) {
case R_RELAX_HINT:
continue;
case R_RELAX_GOT_PC:
case R_RELAX_GOT_PC_NOPIC:
target.relaxGot(bufLoc, rel, targetVA);
29 changes: 19 additions & 10 deletions lld/ELF/InputSection.h
@@ -10,7 +10,9 @@
#define LLD_ELF_INPUT_SECTION_H

#include "Relocations.h"
#include "lld/Common/CommonLinkerContext.h"
#include "lld/Common/LLVM.h"
#include "lld/Common/Memory.h"
#include "llvm/ADT/CachedHashString.h"
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/TinyPtrVector.h"
@@ -97,6 +99,8 @@ class SectionBase {
link(link), info(info) {}
};

struct RISCVRelaxAux;

// This corresponds to a section of an input file.
class InputSectionBase : public SectionBase {
public:
@@ -129,11 +133,10 @@ class InputSectionBase : public SectionBase {
return cast_or_null<ObjFile<ELFT>>(file);
}

- // If basic block sections are enabled, many code sections could end up with
- // one or two jump instructions at the end that could be relaxed to a smaller
- // instruction. The members below help trimming the trailing jump instruction
- // and shrinking a section.
- uint8_t bytesDropped = 0;
+ // Used by --optimize-bb-jumps and RISC-V linker relaxation temporarily to
+ // indicate the number of bytes which is not counted in the size. This should
+ // be reset to zero after uses.
+ uint16_t bytesDropped = 0;

// Whether the section needs to be padded with a NOP filler due to
// deleteFallThruJmpInsn.
@@ -201,11 +204,17 @@ class InputSectionBase : public SectionBase {
// This vector contains such "cooked" relocations.
SmallVector<Relocation, 0> relocations;

- // These are modifiers to jump instructions that are necessary when basic
- // block sections are enabled. Basic block sections creates opportunities to
- // relax jump instructions at basic block boundaries after reordering the
- // basic blocks.
- JumpInstrMod *jumpInstrMod = nullptr;
+ union {
+ // These are modifiers to jump instructions that are necessary when basic
+ // block sections are enabled. Basic block sections creates opportunities
+ // to relax jump instructions at basic block boundaries after reordering the
+ // basic blocks.
+ JumpInstrMod *jumpInstrMod = nullptr;
+
+ // Auxiliary information for RISC-V linker relaxation. RISC-V does not use
+ // jumpInstrMod.
+ RISCVRelaxAux *relaxAux;
+ };

// A function compiled with -fsplit-stack calling a function
// compiled without -fsplit-stack needs its prologue adjusted. Find
9 changes: 5 additions & 4 deletions lld/ELF/Relocations.cpp
@@ -958,8 +958,8 @@ bool RelocationScanner::isStaticLinkTimeConstant(RelExpr e, RelType type,
const Symbol &sym,
uint64_t relOff) const {
// These expressions always compute a constant
- if (oneof<R_GOTPLT, R_GOT_OFF, R_MIPS_GOT_LOCAL_PAGE, R_MIPS_GOTREL,
- R_MIPS_GOT_OFF, R_MIPS_GOT_OFF32, R_MIPS_GOT_GP_PC,
+ if (oneof<R_GOTPLT, R_GOT_OFF, R_RELAX_HINT, R_MIPS_GOT_LOCAL_PAGE,
+ R_MIPS_GOTREL, R_MIPS_GOT_OFF, R_MIPS_GOT_OFF32, R_MIPS_GOT_GP_PC,
R_AARCH64_GOT_PAGE_PC, R_GOT_PC, R_GOTONLY_PC, R_GOTPLTONLY_PC,
R_PLT_PC, R_PLT_GOTPLT, R_PPC32_PLTREL, R_PPC64_CALL_PLT,
R_PPC64_RELAX_TOC, R_RISCV_ADD, R_AARCH64_GOT_PAGE>(e))
@@ -2118,7 +2118,9 @@ bool ThunkCreator::normalizeExistingThunk(Relocation &rel, uint64_t src) {
// made no changes. If the target requires range extension thunks, currently
// ARM, then any future change in offset between caller and callee risks a
// relocation out of range error.
- bool ThunkCreator::createThunks(ArrayRef<OutputSection *> outputSections) {
+ bool ThunkCreator::createThunks(uint32_t pass,
+ ArrayRef<OutputSection *> outputSections) {
+ this->pass = pass;
bool addressesChanged = false;

if (pass == 0 && target->getThunkSectionSpacing())
@@ -2180,7 +2182,6 @@ bool ThunkCreator::createThunks(ArrayRef<OutputSection *> outputSections) {

// Merge all created synthetic ThunkSections back into OutputSection
mergeThunks(outputSections);
- ++pass;
return addressesChanged;
}
