-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MachineOutliner][NFC] Refactor #105398
[MachineOutliner][NFC] Refactor #105398
Conversation
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-backend-aarch64 Author: Kyungwoo Lee (kyulee-com) ChangesThis is NFC for the prep of the global outlining that uses CGData. This depends on #101461. Full diff: https://github.com/llvm/llvm-project/pull/105398.diff 5 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/MachineOutliner.h b/llvm/include/llvm/CodeGen/MachineOutliner.h
index eaba6c9b18f2bb..84937a8b563ac0 100644
--- a/llvm/include/llvm/CodeGen/MachineOutliner.h
+++ b/llvm/include/llvm/CodeGen/MachineOutliner.h
@@ -234,11 +234,11 @@ struct OutlinedFunction {
unsigned FrameConstructionID = 0;
/// Return the number of candidates for this \p OutlinedFunction.
- unsigned getOccurrenceCount() const { return Candidates.size(); }
+ virtual unsigned getOccurrenceCount() const { return Candidates.size(); }
/// Return the number of bytes it would take to outline this
/// function.
- unsigned getOutliningCost() const {
+ virtual unsigned getOutliningCost() const {
unsigned CallOverhead = 0;
for (const Candidate &C : Candidates)
CallOverhead += C.getCallOverhead();
@@ -272,6 +272,7 @@ struct OutlinedFunction {
}
OutlinedFunction() = delete;
+ virtual ~OutlinedFunction() = default;
};
} // namespace outliner
} // namespace llvm
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 882cadea223695..a833a541e4e025 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -2088,14 +2088,22 @@ class TargetInstrInfo : public MCInstrInfo {
/// Returns a \p outliner::OutlinedFunction struct containing target-specific
/// information for a set of outlining candidates. Returns std::nullopt if the
- /// candidates are not suitable for outlining.
+ /// candidates are not suitable for outlining. \p MinRep is the minimum
+ /// number of times the instruction sequence must be repeated.
virtual std::optional<outliner::OutlinedFunction> getOutliningCandidateInfo(
const MachineModuleInfo &MMI,
- std::vector<outliner::Candidate> &RepeatedSequenceLocs) const {
+ std::vector<outliner::Candidate> &RepeatedSequenceLocs,
+ unsigned MipRep) const {
llvm_unreachable(
"Target didn't implement TargetInstrInfo::getOutliningCandidateInfo!");
}
+ virtual std::optional<outliner::OutlinedFunction> getOutliningCandidateInfo(
+ const MachineModuleInfo &MMI,
+ std::vector<outliner::Candidate> &RepeatedSequenceLocs) const {
+ return getOutliningCandidateInfo(MMI, RepeatedSequenceLocs, /*MipRep=*/2);
+ }
+
/// Optional target hook to create the LLVM IR attributes for the outlined
/// function. If overridden, the overriding function must call the default
/// implementation.
diff --git a/llvm/lib/CodeGen/MachineOutliner.cpp b/llvm/lib/CodeGen/MachineOutliner.cpp
index 4b56a467b8d076..eecf27613a2c31 100644
--- a/llvm/lib/CodeGen/MachineOutliner.cpp
+++ b/llvm/lib/CodeGen/MachineOutliner.cpp
@@ -456,8 +456,9 @@ struct MachineOutliner : public ModulePass {
/// \param Mapper Contains outlining mapping information.
/// \param[out] FunctionList Filled with a list of \p OutlinedFunctions
/// each type of candidate.
- void findCandidates(InstructionMapper &Mapper,
- std::vector<OutlinedFunction> &FunctionList);
+ void
+ findCandidates(InstructionMapper &Mapper,
+ std::vector<std::unique_ptr<OutlinedFunction>> &FunctionList);
/// Replace the sequences of instructions represented by \p OutlinedFunctions
/// with calls to functions.
@@ -465,7 +466,9 @@ struct MachineOutliner : public ModulePass {
/// \param M The module we are outlining from.
/// \param FunctionList A list of functions to be inserted into the module.
/// \param Mapper Contains the instruction mappings for the module.
- bool outline(Module &M, std::vector<OutlinedFunction> &FunctionList,
+ /// \param[out] OutlinedFunctionNum The outlined function number.
+ bool outline(Module &M,
+ std::vector<std::unique_ptr<OutlinedFunction>> &FunctionList,
InstructionMapper &Mapper, unsigned &OutlinedFunctionNum);
/// Creates a function for \p OF and inserts it into the module.
@@ -583,7 +586,8 @@ void MachineOutliner::emitOutlinedFunctionRemark(OutlinedFunction &OF) {
}
void MachineOutliner::findCandidates(
- InstructionMapper &Mapper, std::vector<OutlinedFunction> &FunctionList) {
+ InstructionMapper &Mapper,
+ std::vector<std::unique_ptr<OutlinedFunction>> &FunctionList) {
FunctionList.clear();
SuffixTree ST(Mapper.UnsignedVec, OutlinerLeafDescendants);
@@ -684,7 +688,7 @@ void MachineOutliner::findCandidates(
continue;
}
- FunctionList.push_back(*OF);
+ FunctionList.push_back(std::make_unique<OutlinedFunction>(*OF));
}
}
@@ -827,10 +831,9 @@ MachineFunction *MachineOutliner::createOutlinedFunction(
return &MF;
}
-bool MachineOutliner::outline(Module &M,
- std::vector<OutlinedFunction> &FunctionList,
- InstructionMapper &Mapper,
- unsigned &OutlinedFunctionNum) {
+bool MachineOutliner::outline(
+ Module &M, std::vector<std::unique_ptr<OutlinedFunction>> &FunctionList,
+ InstructionMapper &Mapper, unsigned &OutlinedFunctionNum) {
LLVM_DEBUG(dbgs() << "*** Outlining ***\n");
LLVM_DEBUG(dbgs() << "NUMBER OF POTENTIAL FUNCTIONS: " << FunctionList.size()
<< "\n");
@@ -838,23 +841,23 @@ bool MachineOutliner::outline(Module &M,
// Sort by priority where priority := getNotOutlinedCost / getOutliningCost.
// The function with highest priority should be outlined first.
- stable_sort(FunctionList,
- [](const OutlinedFunction &LHS, const OutlinedFunction &RHS) {
- return LHS.getNotOutlinedCost() * RHS.getOutliningCost() >
- RHS.getNotOutlinedCost() * LHS.getOutliningCost();
- });
+ stable_sort(FunctionList, [](const std::unique_ptr<OutlinedFunction> &LHS,
+ const std::unique_ptr<OutlinedFunction> &RHS) {
+ return LHS->getNotOutlinedCost() * RHS->getOutliningCost() >
+ RHS->getNotOutlinedCost() * LHS->getOutliningCost();
+ });
// Walk over each function, outlining them as we go along. Functions are
// outlined greedily, based off the sort above.
auto *UnsignedVecBegin = Mapper.UnsignedVec.begin();
LLVM_DEBUG(dbgs() << "WALKING FUNCTION LIST\n");
- for (OutlinedFunction &OF : FunctionList) {
+ for (auto &OF : FunctionList) {
#ifndef NDEBUG
- auto NumCandidatesBefore = OF.Candidates.size();
+ auto NumCandidatesBefore = OF->Candidates.size();
#endif
// If we outlined something that overlapped with a candidate in a previous
// step, then we can't outline from it.
- erase_if(OF.Candidates, [&UnsignedVecBegin](Candidate &C) {
+ erase_if(OF->Candidates, [&UnsignedVecBegin](Candidate &C) {
return std::any_of(UnsignedVecBegin + C.getStartIdx(),
UnsignedVecBegin + C.getEndIdx() + 1, [](unsigned I) {
return I == static_cast<unsigned>(-1);
@@ -862,36 +865,36 @@ bool MachineOutliner::outline(Module &M,
});
#ifndef NDEBUG
- auto NumCandidatesAfter = OF.Candidates.size();
+ auto NumCandidatesAfter = OF->Candidates.size();
LLVM_DEBUG(dbgs() << "PRUNED: " << NumCandidatesBefore - NumCandidatesAfter
<< "/" << NumCandidatesBefore << " candidates\n");
#endif
// If we made it unbeneficial to outline this function, skip it.
- if (OF.getBenefit() < OutlinerBenefitThreshold) {
- LLVM_DEBUG(dbgs() << "SKIP: Expected benefit (" << OF.getBenefit()
+ if (OF->getBenefit() < OutlinerBenefitThreshold) {
+ LLVM_DEBUG(dbgs() << "SKIP: Expected benefit (" << OF->getBenefit()
<< " B) < threshold (" << OutlinerBenefitThreshold
<< " B)\n");
continue;
}
- LLVM_DEBUG(dbgs() << "OUTLINE: Expected benefit (" << OF.getBenefit()
+ LLVM_DEBUG(dbgs() << "OUTLINE: Expected benefit (" << OF->getBenefit()
<< " B) > threshold (" << OutlinerBenefitThreshold
<< " B)\n");
// It's beneficial. Create the function and outline its sequence's
// occurrences.
- OF.MF = createOutlinedFunction(M, OF, Mapper, OutlinedFunctionNum);
- emitOutlinedFunctionRemark(OF);
+ OF->MF = createOutlinedFunction(M, *OF, Mapper, OutlinedFunctionNum);
+ emitOutlinedFunctionRemark(*OF);
FunctionsCreated++;
OutlinedFunctionNum++; // Created a function, move to the next name.
- MachineFunction *MF = OF.MF;
+ MachineFunction *MF = OF->MF;
const TargetSubtargetInfo &STI = MF->getSubtarget();
const TargetInstrInfo &TII = *STI.getInstrInfo();
// Replace occurrences of the sequence with calls to the new function.
LLVM_DEBUG(dbgs() << "CREATE OUTLINED CALLS\n");
- for (Candidate &C : OF.Candidates) {
+ for (Candidate &C : OF->Candidates) {
MachineBasicBlock &MBB = *C.getMBB();
MachineBasicBlock::iterator StartIt = C.begin();
MachineBasicBlock::iterator EndIt = std::prev(C.end());
@@ -1180,7 +1183,7 @@ bool MachineOutliner::doOutline(Module &M, unsigned &OutlinedFunctionNum) {
// Prepare instruction mappings for the suffix tree.
populateMapper(Mapper, M);
- std::vector<OutlinedFunction> FunctionList;
+ std::vector<std::unique_ptr<OutlinedFunction>> FunctionList;
// Find all of the outlining candidates.
findCandidates(Mapper, FunctionList);
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 697ae510a95655..156ab6568f833e 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -8687,7 +8687,8 @@ static bool outliningCandidatesV8_3OpsConsensus(const outliner::Candidate &a,
std::optional<outliner::OutlinedFunction>
AArch64InstrInfo::getOutliningCandidateInfo(
const MachineModuleInfo &MMI,
- std::vector<outliner::Candidate> &RepeatedSequenceLocs) const {
+ std::vector<outliner::Candidate> &RepeatedSequenceLocs,
+ unsigned MinRep) const {
unsigned SequenceSize = 0;
for (auto &MI : RepeatedSequenceLocs[0])
SequenceSize += getInstSizeInBytes(MI);
@@ -8801,7 +8802,7 @@ AArch64InstrInfo::getOutliningCandidateInfo(
llvm::erase_if(RepeatedSequenceLocs, hasIllegalSPModification);
// If the sequence doesn't have enough candidates left, then we're done.
- if (RepeatedSequenceLocs.size() < 2)
+ if (RepeatedSequenceLocs.size() < MinRep)
return std::nullopt;
}
@@ -9048,7 +9049,7 @@ AArch64InstrInfo::getOutliningCandidateInfo(
}
// If we dropped all of the candidates, bail out here.
- if (RepeatedSequenceLocs.size() < 2) {
+ if (RepeatedSequenceLocs.size() < MinRep) {
RepeatedSequenceLocs.clear();
return std::nullopt;
}
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.h b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
index a1f2fbff016312..762fb9873065e6 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
@@ -473,7 +473,8 @@ class AArch64InstrInfo final : public AArch64GenInstrInfo {
bool OutlineFromLinkOnceODRs) const override;
std::optional<outliner::OutlinedFunction> getOutliningCandidateInfo(
const MachineModuleInfo &MMI,
- std::vector<outliner::Candidate> &RepeatedSequenceLocs) const override;
+ std::vector<outliner::Candidate> &RepeatedSequenceLocs,
+ unsigned MinRep) const override;
void mergeOutliningCandidateAttributes(
Function &F, std::vector<outliner::Candidate> &Candidates) const override;
outliner::InstrType getOutliningTypeImpl(const MachineModuleInfo &MMI,
|
virtual std::optional<outliner::OutlinedFunction> getOutliningCandidateInfo( | ||
const MachineModuleInfo &MMI, | ||
std::vector<outliner::Candidate> &RepeatedSequenceLocs) const { | ||
std::vector<outliner::Candidate> &RepeatedSequenceLocs, | ||
unsigned MipRep) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsigned MipRep) const { | |
unsigned MinRep) const { |
Since characters are cheap, can we expand this like MinRepeates
? Or maybe MinSequenceCount
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, can we do unsigned MinRep=2
to set a default value instead of creating a new function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just made MinRepeates
explicit for getOutliningCandidateInfo
. I don't want it to be a field/property as TargetInstrInfo as this value can be overridden by the client (either for creating candidates for the local outliner vs. for the global outliner).
llvm/lib/CodeGen/MachineOutliner.cpp
Outdated
@@ -684,7 +688,7 @@ void MachineOutliner::findCandidates( | |||
continue; | |||
} | |||
|
|||
FunctionList.push_back(*OF); | |||
FunctionList.push_back(std::make_unique<OutlinedFunction>(*OF)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work?
FunctionList.push_back(std::make_unique<OutlinedFunction>(*OF)); | |
FunctionList.emplace_back(*OF); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't quite right, but either case created implicit copy from the local OutlinedFunction instance. Instead, I also make it unique when it's created in getOutliningCandidateInfo
so that we can simply move it here.
Can you elaborate more in the PR/commit message about why you're making some of these changes? Things like the 2 -> MinRep. Taken as an individual change it would be good to have a rational without depending on future changes to make sense of it. |
Yeah. I updated the summary with some details + the link for the next candidate PR. |
Can someone take a look at this NFC (hopefully straightforward) code? |
It sounds like there is an assumption that there will be future overrides by GlobalOutlinedFunction, but it isn't clear (in this patch) when or even if this will ever happen. Maybe it would be better to do this in the same patch where someone is trying to override GlobalOutlinedFunction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In multiple places there is typo: MinRepeates
-> MinRepeats
?
Yeah. I removed the code and updated the summary accordingly. |
Thanks for the catch! I blindly copied & pasted it from the prior suggestion. |
llvm/lib/CodeGen/MachineOutliner.cpp
Outdated
@@ -670,21 +674,24 @@ void MachineOutliner::findCandidates( | |||
const TargetInstrInfo *TII = | |||
CandidatesForRepeatedSeq[0].getMF()->getSubtarget().getInstrInfo(); | |||
|
|||
std::optional<OutlinedFunction> OF = | |||
TII->getOutliningCandidateInfo(*MMI, CandidatesForRepeatedSeq); | |||
unsigned MinRepeats = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want to hoist this value above L669 since it also references 2
. Alternatively, you could leave the hardcoding of 2 everywhere and just parameterize getOutliningCandidateInfo
(since that's the function needed from the global outliner). My preference is to hardcode it because I'm not sure if there are any other parts of the codebase that also make the assumption of 2 that need to stored in a variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hoisted it. I keep using the variable instead of a hard-coded value for better readability. I believe the efficiency-wise, it's the same as the compiler will optimize it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Thank you everyone for your comments! I believe I have addressed all the comments. If there's anything else that needs attention, I will take care of it in the next PR. |
- When `getOutliningCandidateInfo()` returns `std::nullopt` (meaning no `OutlinedFunction` is created), there is no need to clear the input argument, `RepeatedSequenceLocs`, as it's already beaing cleared in the main loop of `findCandidates()`. - Replaced `2` by `MinRepeats` as I missed this instance from llvm#105398
- When `getOutliningCandidateInfo()` returns `std::nullopt` (meaning no `OutlinedFunction` is created), there is no need to clear the input argument, `RepeatedSequenceLocs`, as it's already beaing cleared in the main loop of `findCandidates()`. - Replaced `2` by `MinRepeats` as I missed this instance from llvm#105398
#106171) - When `getOutliningCandidateInfo()` returns `std::nullopt` (meaning no `OutlinedFunction` is created), there is no need to clear the input argument, `RepeatedSequenceLocs`, as it's already being cleared in the main loop of `findCandidates()`. - Replaced `2` by `MinRepeats`, which I missed from #105398
This patch prepares the NFC groundwork for global outlining using CGData, which will follow llvm#90074. - The `MinRepeats` parameter is now explicitly passed to the `getOutliningCandidateInfo` function, rather than relying on a default value of 2. For local outlining, the minimum number of repetitions is typically 2, but for the global outlining (mentioned above), we will optimistically create a single `Candidate` for each `OutlinedFunction` if stable hashes match a specific code sequence. This parameter is adjusted accordingly in global outlining scenarios. - I have also implemented `unique_ptr` for `OutlinedFunction` to ensure safe and efficient memory management within `FunctionList`, avoiding unnecessary implicit copies. This depends on llvm#101461. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
llvm#106171) - When `getOutliningCandidateInfo()` returns `std::nullopt` (meaning no `OutlinedFunction` is created), there is no need to clear the input argument, `RepeatedSequenceLocs`, as it's already being cleared in the main loop of `findCandidates()`. - Replaced `2` by `MinRepeats`, which I missed from llvm#105398
This commit introduces support for outlining functions across modules using codegen data generated from previous codegen. The codegen data currently manages the outlined hash tree, which records outlining instances that occurred locally in the past. The machine outliner now operates in one of three modes: 1. CGDataMode::None: This is the default outliner mode that uses the suffix tree to identify (local) outlining candidates within a module. This mode is also used by (full)LTO to maintain optimal behavior with the combined module. 2. CGDataMode::Write (`-codegen-data-generate`): This mode is identical to the default mode, but it also publishes the stable hash sequences of instructions in the outlined functions into a local outlined hash tree. It then encodes this into the `__llvm_outline` section, which will be dead-stripped at link time. 3. CGDataMode::Read (`-codegen-data-use-path={.cgdata}`): This mode reads a codegen data file (.cgdata) and initializes a global outlined hash tree. This tree is used to generate global outlining candidates. Note that the codegen data file has been post-processed with the raw `__llvm_outline` sections from all native objects using the `llvm-cgdata` tool (or a linker, `LLD`, or a new ThinLTO pipeline later). This depends on #105398. After this PR, LLD (#90166) and Clang (#90304) will follow for each client side support. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
This commit introduces support for outlining functions across modules using codegen data generated from previous codegen. The codegen data currently manages the outlined hash tree, which records outlining instances that occurred locally in the past. The machine outliner now operates in one of three modes: 1. CGDataMode::None: This is the default outliner mode that uses the suffix tree to identify (local) outlining candidates within a module. This mode is also used by (full)LTO to maintain optimal behavior with the combined module. 2. CGDataMode::Write (`-codegen-data-generate`): This mode is identical to the default mode, but it also publishes the stable hash sequences of instructions in the outlined functions into a local outlined hash tree. It then encodes this into the `__llvm_outline` section, which will be dead-stripped at link time. 3. CGDataMode::Read (`-codegen-data-use-path={.cgdata}`): This mode reads a codegen data file (.cgdata) and initializes a global outlined hash tree. This tree is used to generate global outlining candidates. Note that the codegen data file has been post-processed with the raw `__llvm_outline` sections from all native objects using the `llvm-cgdata` tool (or a linker, `LLD`, or a new ThinLTO pipeline later). This depends on llvm#105398. After this PR, LLD (llvm#90166) and Clang (llvm#90304) will follow for each client side support. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
This patch prepares the NFC groundwork for global outlining using CGData, which will follow #90074.
MinRepeats
parameter is now explicitly passed to thegetOutliningCandidateInfo
function, rather than relying on a default value of 2. For local outlining, the minimum number of repetitions is typically 2, but for the global outlining (mentioned above), we will optimistically create a singleCandidate
for eachOutlinedFunction
if stable hashes match a specific code sequence. This parameter is adjusted accordingly in global outlining scenarios.unique_ptr
forOutlinedFunction
to ensure safe and efficient memory management withinFunctionList
, avoiding unnecessary implicit copies.This depends on #101461.
This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.