[SPIR-V] Fix BB ordering & register lifetime #111026
Conversation
@VyacheslavLevytskyy FYI, this should fix the ordering issues & others. A bit messy for now tho.
@llvm/pr-subscribers-backend-spir-v

Author: Nathan Gauër (Keenuts)

Changes

The "topological" sorting was behaving incorrectly in some cases: the exit of a loop could have a lower rank than a node in the loop. This causes issues when structurizing some patterns, and also codegen issues, as we could generate BBs in the incorrect order with regard to the SPIR-V spec.

Fixing this ordering alone broke other parts of the structurizer, which only worked by luck. Had to fix those.

Added more test cases, especially to test basic patterns.

I also needed to tweak/disable some tests for 2 reasons:
- SPIR-V now requires reg2mem/mem2reg to run, meaning dead stores are optimized away. Some tests require tweaks to avoid having the whole function removed.
- Mem2Reg will generate variables & load/stores. This generates G_BITCAST in several cases, and there is currently something wrong we do with G_BITCAST which causes the MIR verifier to complain. Until this is resolved, I disabled the -verify-machineinstrs flag on those tests.
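To make the ordering rule concrete, here is a toy, LLVM-free C++ sketch of the worklist scheme the patch switches to: a block is emitted only once all of its non-back-edge predecessors have been emitted, a loop exit additionally waits for the loop's latch, and a block's rank is one more than the highest rank among the blocks it waits on. The tiny CFG, the block names, and the latchToWaitFor shortcut are invented for illustration only; the real pass derives all of this from LoopInfo and the dominator tree.

#include <algorithm>
#include <cstdio>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

int main() {
  // Tiny CFG: entry -> header; header -> exit, body; body -> header (back-edge).
  std::map<std::string, std::vector<std::string>> preds = {
      {"entry", {}},
      {"header", {"entry", "body"}},
      {"body", {"header"}},
      {"exit", {"header"}}};
  std::map<std::string, std::vector<std::string>> succs = {
      {"entry", {"header"}},
      {"header", {"exit", "body"}}, // "exit" deliberately listed first
      {"body", {"header"}},
      {"exit", {}}};
  // The loop back-edge, ignored when checking whether a block is ready.
  std::set<std::pair<std::string, std::string>> backEdges = {{"body", "header"}};
  // Loop-exit rule: "exit" leaves the loop whose latch is "body", so it must
  // also wait for "body" and takes its rank from it.
  std::map<std::string, std::string> latchToWaitFor = {{"exit", "body"}};

  std::map<std::string, std::size_t> rank;
  std::set<std::string> queued = {"entry"};
  std::queue<std::string> worklist;
  worklist.push("entry");

  while (!worklist.empty()) {
    std::string bb = worklist.front();
    worklist.pop();

    bool ready = true;
    std::size_t r = 0;
    for (const std::string &p : preds[bb]) {
      if (backEdges.count({p, bb}))
        continue; // ignore back-edges
      auto it = rank.find(p);
      if (it == rank.end()) {
        ready = false; // a forward predecessor has not been emitted yet
        break;
      }
      r = std::max(r, it->second + 1);
    }
    auto latch = latchToWaitFor.find(bb);
    if (ready && latch != latchToWaitFor.end()) {
      auto it = rank.find(latch->second);
      if (it == rank.end())
        ready = false; // the loop is not finished yet
      else
        r = std::max(r, it->second + 1);
    }
    if (!ready) {
      worklist.push(bb); // retry once the missing blocks are emitted
      continue;
    }

    rank[bb] = r;
    std::printf("emit %-6s rank %zu\n", bb.c_str(), r);
    for (const std::string &s : succs[bb])
      if (queued.insert(s).second)
        worklist.push(s);
  }
  return 0;
}

Compiled and run, this prints the blocks in the order entry, header, body, exit. With the old recursive rank computation, a loop exit could end up ordered before blocks of the loop it leaves, which is the bug described above.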
Patch is 141.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111026.diff

97 Files Affected:
diff --git a/llvm/lib/Target/SPIRV/SPIRVMergeRegionExitTargets.cpp b/llvm/lib/Target/SPIRV/SPIRVMergeRegionExitTargets.cpp
index 9930d067173df7..c22492ec43b095 100644
--- a/llvm/lib/Target/SPIRV/SPIRVMergeRegionExitTargets.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVMergeRegionExitTargets.cpp
@@ -130,6 +130,13 @@ class SPIRVMergeRegionExitTargets : public FunctionPass {
assert(false && "Unhandled terminator type.");
}
+ AllocaInst *CreateVariable(Function &F, Type *Type,
+ BasicBlock::iterator Position) {
+ const DataLayout &DL = F.getDataLayout();
+ return new AllocaInst(Type, DL.getAllocaAddrSpace(), nullptr, "reg",
+ Position);
+ }
+
// Run the pass on the given convergence region, ignoring the sub-regions.
// Returns true if the CFG changed, false otherwise.
bool runOnConvergenceRegionNoRecurse(LoopInfo &LI,
@@ -152,6 +159,9 @@ class SPIRVMergeRegionExitTargets : public FunctionPass {
auto NewExitTarget = BasicBlock::Create(F->getContext(), "new.exit", F);
IRBuilder<> Builder(NewExitTarget);
+ AllocaInst *Variable = CreateVariable(*F, Builder.getInt32Ty(),
+ F->begin()->getFirstInsertionPt());
+
// CodeGen output needs to be stable. Using the set as-is would order
// the targets differently depending on the allocation pattern.
// Sorting per basic-block ordering in the function.
@@ -176,18 +186,16 @@ class SPIRVMergeRegionExitTargets : public FunctionPass {
std::vector<std::pair<BasicBlock *, Value *>> ExitToVariable;
for (auto Exit : SortedExits) {
llvm::Value *Value = createExitVariable(Exit, TargetToValue);
+ IRBuilder<> B2(Exit);
+ B2.SetInsertPoint(Exit->getFirstInsertionPt());
+ B2.CreateStore(Value, Variable);
ExitToVariable.emplace_back(std::make_pair(Exit, Value));
}
- // Gather the correct value depending on the exit we came from.
- llvm::PHINode *node =
- Builder.CreatePHI(Builder.getInt32Ty(), ExitToVariable.size());
- for (auto [BB, Value] : ExitToVariable) {
- node->addIncoming(Value, BB);
- }
+ llvm::Value *Load = Builder.CreateLoad(Builder.getInt32Ty(), Variable);
// Creating the switch to jump to the correct exit target.
- llvm::SwitchInst *Sw = Builder.CreateSwitch(node, SortedExitTargets[0],
+ llvm::SwitchInst *Sw = Builder.CreateSwitch(Load, SortedExitTargets[0],
SortedExitTargets.size() - 1);
for (size_t i = 1; i < SortedExitTargets.size(); i++) {
BasicBlock *BB = SortedExitTargets[i];
diff --git a/llvm/lib/Target/SPIRV/SPIRVStructurizer.cpp b/llvm/lib/Target/SPIRV/SPIRVStructurizer.cpp
index 211a060ee103bc..5b6d31782c2093 100644
--- a/llvm/lib/Target/SPIRV/SPIRVStructurizer.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVStructurizer.cpp
@@ -87,6 +87,8 @@ BasicBlock *getExitFor(const ConvergenceRegion *CR) {
// Returns the merge block designated by I if I is a merge instruction, nullptr
// otherwise.
BasicBlock *getDesignatedMergeBlock(Instruction *I) {
+ if (I == nullptr)
+ return nullptr;
IntrinsicInst *II = dyn_cast<IntrinsicInst>(I);
if (II == nullptr)
return nullptr;
@@ -102,6 +104,8 @@ BasicBlock *getDesignatedMergeBlock(Instruction *I) {
// Returns the continue block designated by I if I is an OpLoopMerge, nullptr
// otherwise.
BasicBlock *getDesignatedContinueBlock(Instruction *I) {
+ if (I == nullptr)
+ return nullptr;
IntrinsicInst *II = dyn_cast<IntrinsicInst>(I);
if (II == nullptr)
return nullptr;
@@ -284,18 +288,6 @@ void replaceBranchTargets(BasicBlock *BB, BasicBlock *OldTarget,
assert(false && "Unhandled terminator type.");
}
-// Replaces basic bloc operands |OldSrc| or OpPhi instructions in |BB| by
-// |NewSrc|. This function does not simplify the OpPhi instruction once
-// transformed.
-void replacePhiTargets(BasicBlock *BB, BasicBlock *OldSrc, BasicBlock *NewSrc) {
- for (PHINode &Phi : BB->phis()) {
- int index = Phi.getBasicBlockIndex(OldSrc);
- if (index == -1)
- continue;
- Phi.setIncomingBlock(index, NewSrc);
- }
-}
-
} // anonymous namespace
// Given a reducible CFG, produces a structurized CFG in the SPIR-V sense,
@@ -447,48 +439,40 @@ class SPIRVStructurizer : public FunctionPass {
// clang-format on
std::vector<Edge>
createAliasBlocksForComplexEdges(std::vector<Edge> Edges) {
- std::unordered_map<BasicBlock *, BasicBlock *> Seen;
+ std::unordered_set<BasicBlock *> Seen;
std::vector<Edge> Output;
Output.reserve(Edges.size());
for (auto &[Src, Dst] : Edges) {
- auto [iterator, inserted] = Seen.insert({Src, Dst});
- if (inserted) {
- Output.emplace_back(Src, Dst);
- continue;
+ auto [iterator, inserted] = Seen.insert(Src);
+ if (!inserted) {
+ // Src already a source node. Cannot have 2 edges from A to B.
+ // Creating alias source block.
+ BasicBlock *NewSrc =
+ BasicBlock::Create(F.getContext(), "new.src", &F);
+ replaceBranchTargets(Src, Dst, NewSrc);
+ IRBuilder<> Builder(NewSrc);
+ Builder.CreateBr(Dst);
+ Src = NewSrc;
}
- // The exact same edge was already seen. Ignoring.
- if (iterator->second == Dst)
- continue;
-
- // The same Src block branches to 2 distinct blocks. This will be an
- // issue for the generated OpPhi. Creating alias block.
- BasicBlock *NewSrc =
- BasicBlock::Create(F.getContext(), "new.exit.src", &F);
- replaceBranchTargets(Src, Dst, NewSrc);
- replacePhiTargets(Dst, Src, NewSrc);
-
- IRBuilder<> Builder(NewSrc);
- Builder.CreateBr(Dst);
-
- Seen.emplace(NewSrc, Dst);
- Output.emplace_back(NewSrc, Dst);
+ Output.emplace_back(Src, Dst);
}
return Output;
}
+ AllocaInst *CreateVariable(Function &F, Type *Type,
+ BasicBlock::iterator Position) {
+ const DataLayout &DL = F.getDataLayout();
+ return new AllocaInst(Type, DL.getAllocaAddrSpace(), nullptr, "reg",
+ Position);
+ }
+
// Given a construct defined by |Header|, and a list of exiting edges
// |Edges|, creates a new single exit node, fixing up those edges.
BasicBlock *createSingleExitNode(BasicBlock *Header,
std::vector<Edge> &Edges) {
- auto NewExit = BasicBlock::Create(F.getContext(), "new.exit", &F);
- IRBuilder<> ExitBuilder(NewExit);
-
- std::vector<BasicBlock *> Dsts;
- std::unordered_map<BasicBlock *, ConstantInt *> DstToIndex;
-
// Given 2 edges: Src1 -> Dst, Src2 -> Dst:
// If Dst has an PHI node, and Src1 and Src2 are both operands, both Src1
// and Src2 cannot be hidden by NewExit. Create 2 new nodes: Alias1,
@@ -496,6 +480,10 @@ class SPIRVStructurizer : public FunctionPass {
// Dst PHI node to look for Alias1 and Alias2.
std::vector<Edge> FixedEdges = createAliasBlocksForComplexEdges(Edges);
+ std::vector<BasicBlock *> Dsts;
+ std::unordered_map<BasicBlock *, ConstantInt *> DstToIndex;
+ auto NewExit = BasicBlock::Create(F.getContext(), "new.exit", &F);
+ IRBuilder<> ExitBuilder(NewExit);
for (auto &[Src, Dst] : FixedEdges) {
if (DstToIndex.count(Dst) != 0)
continue;
@@ -506,33 +494,34 @@ class SPIRVStructurizer : public FunctionPass {
if (Dsts.size() == 1) {
for (auto &[Src, Dst] : FixedEdges) {
replaceBranchTargets(Src, Dst, NewExit);
- replacePhiTargets(Dst, Src, NewExit);
}
ExitBuilder.CreateBr(Dsts[0]);
return NewExit;
}
- PHINode *PhiNode =
- ExitBuilder.CreatePHI(ExitBuilder.getInt32Ty(), FixedEdges.size());
-
+ AllocaInst *Variable = CreateVariable(F, ExitBuilder.getInt32Ty(),
+ F.begin()->getFirstInsertionPt());
for (auto &[Src, Dst] : FixedEdges) {
- PhiNode->addIncoming(DstToIndex[Dst], Src);
+ IRBuilder<> B2(Src);
+ B2.SetInsertPoint(Src->getFirstInsertionPt());
+ B2.CreateStore(DstToIndex[Dst], Variable);
replaceBranchTargets(Src, Dst, NewExit);
- replacePhiTargets(Dst, Src, NewExit);
}
+ llvm::Value *Load =
+ ExitBuilder.CreateLoad(ExitBuilder.getInt32Ty(), Variable);
+
// If we can avoid an OpSwitch, generate an OpBranch. Reason is some
// OpBranch are allowed to exist without a new OpSelectionMerge if one of
// the branch is the parent's merge node, while OpSwitches are not.
if (Dsts.size() == 2) {
- Value *Condition = ExitBuilder.CreateCmp(CmpInst::ICMP_EQ,
- DstToIndex[Dsts[0]], PhiNode);
+ Value *Condition =
+ ExitBuilder.CreateCmp(CmpInst::ICMP_EQ, DstToIndex[Dsts[0]], Load);
ExitBuilder.CreateCondBr(Condition, Dsts[0], Dsts[1]);
return NewExit;
}
- SwitchInst *Sw =
- ExitBuilder.CreateSwitch(PhiNode, Dsts[0], Dsts.size() - 1);
+ SwitchInst *Sw = ExitBuilder.CreateSwitch(Load, Dsts[0], Dsts.size() - 1);
for (auto It = Dsts.begin() + 1; It != Dsts.end(); ++It) {
Sw->addCase(DstToIndex[*It], *It);
}
@@ -576,7 +565,7 @@ class SPIRVStructurizer : public FunctionPass {
// Creates a new basic block in F with a single OpUnreachable instruction.
BasicBlock *CreateUnreachable(Function &F) {
- BasicBlock *BB = BasicBlock::Create(F.getContext(), "new.exit", &F);
+ BasicBlock *BB = BasicBlock::Create(F.getContext(), "unreachable", &F);
IRBuilder<> Builder(BB);
Builder.CreateUnreachable();
return BB;
@@ -1127,6 +1116,18 @@ class SPIRVStructurizer : public FunctionPass {
continue;
Modified = true;
+
+ if (Merge == nullptr) {
+ Merge = *successors(Header).begin();
+ IRBuilder<> Builder(Header);
+ Builder.SetInsertPoint(Header->getTerminator());
+
+ auto MergeAddress = BlockAddress::get(Merge->getParent(), Merge);
+ SmallVector<Value *, 1> Args = {MergeAddress};
+ Builder.CreateIntrinsic(Intrinsic::spv_selection_merge, {}, {Args});
+ continue;
+ }
+
Instruction *SplitInstruction = Merge->getTerminator();
if (isMergeInstruction(SplitInstruction->getPrevNode()))
SplitInstruction = SplitInstruction->getPrevNode();
diff --git a/llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp b/llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp
index e5384b2eb2c2c1..133a98375d840f 100644
--- a/llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp
@@ -29,6 +29,7 @@
#include "llvm/MC/TargetRegistry.h"
#include "llvm/Pass.h"
#include "llvm/Target/TargetOptions.h"
+#include "llvm/Transforms/Scalar/Reg2Mem.h"
#include "llvm/Transforms/Utils.h"
#include <optional>
@@ -162,6 +163,8 @@ void SPIRVPassConfig::addIRPasses() {
TargetPassConfig::addIRPasses();
if (TM.getSubtargetImpl()->isVulkanEnv()) {
+ addPass(createRegToMemWrapperPass());
+
// 1. Simplify loop for subsequent transformations. After this steps, loops
// have the following properties:
// - loops have a single entry edge (pre-header to loop header).
@@ -169,13 +172,21 @@ void SPIRVPassConfig::addIRPasses() {
// - loops have a single back-edge.
addPass(createLoopSimplifyPass());
- // 2. Merge the convergence region exit nodes into one. After this step,
+ // 2. Removes registers whose lifetime spans across basic blocks. Also
+ // removes phi nodes. This will greatly simplify the next steps.
+ addPass(createRegToMemWrapperPass());
+
+ // 3. Merge the convergence region exit nodes into one. After this step,
// regions are single-entry, single-exit. This will help determine the
// correct merge block.
addPass(createSPIRVMergeRegionExitTargetsPass());
- // 3. Structurize.
+ // 4. Structurize.
addPass(createSPIRVStructurizerPass());
+
+ // 5. Reduce the amount of variables required by pushing some operations
+ // back to virtual registers.
+ addPass(createPromoteMemoryToRegisterPass());
}
addPass(createSPIRVRegularizerPass());
diff --git a/llvm/lib/Target/SPIRV/SPIRVUtils.cpp b/llvm/lib/Target/SPIRV/SPIRVUtils.cpp
index dff33b16b9cfcf..f9b361e163c909 100644
--- a/llvm/lib/Target/SPIRV/SPIRVUtils.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVUtils.cpp
@@ -460,53 +460,98 @@ PartialOrderingVisitor::getReachableFrom(BasicBlock *Start) {
return Output;
}
-size_t PartialOrderingVisitor::visit(BasicBlock *BB, size_t Rank) {
- if (Visited.count(BB) != 0)
- return Rank;
+bool PartialOrderingVisitor::CanBeVisited(BasicBlock *BB) const {
+ for (BasicBlock *P : predecessors(BB)) {
+ // Ignore back-edges.
+ if (DT.dominates(BB, P))
+ continue;
- Loop *L = LI.getLoopFor(BB);
- const bool isLoopHeader = LI.isLoopHeader(BB);
+ // One of the predecessor hasn't been visited. Not ready yet.
+ if (BlockToOrder.count(P) == 0)
+ return false;
- if (BlockToOrder.count(BB) == 0) {
- OrderInfo Info = {Rank, Visited.size()};
- BlockToOrder.emplace(BB, Info);
- } else {
- BlockToOrder[BB].Rank = std::max(BlockToOrder[BB].Rank, Rank);
+ // If the block is a loop exit, the loop must be finished before
+ // we can continue.
+ Loop *L = LI.getLoopFor(P);
+ if (L == nullptr || L->contains(BB))
+ continue;
+
+ // SPIR-V requires a single back-edge. And the backend first
+ // step transforms loops into the simplified format. If we have
+ // more than 1 back-edge, something is wrong.
+ assert(L->getNumBackEdges() <= 1);
+
+ // If the loop has no latch, loop's rank won't matter, so we can
+ // proceed.
+ BasicBlock *Latch = L->getLoopLatch();
+ assert(Latch);
+ if (Latch == nullptr)
+ continue;
+
+ // The latch is not ready yet, let's wait.
+ if (BlockToOrder.count(Latch) == 0)
+ return false;
}
- for (BasicBlock *Predecessor : predecessors(BB)) {
- if (isLoopHeader && L->contains(Predecessor)) {
+ return true;
+}
+
+size_t PartialOrderingVisitor::GetNodeRank(BasicBlock *BB) const {
+ size_t result = 0;
+
+ for (BasicBlock *P : predecessors(BB)) {
+ // Ignore back-edges.
+ if (DT.dominates(BB, P))
continue;
- }
- if (BlockToOrder.count(Predecessor) == 0) {
- return Rank;
+ auto Iterator = BlockToOrder.end();
+ Loop *L = LI.getLoopFor(P);
+ BasicBlock *Latch = L ? L->getLoopLatch() : nullptr;
+
+ // If the predecessor is either outside a loop, or part of
+ // the same loop, simply take its rank + 1.
+ if (L == nullptr || L->contains(BB) || Latch == nullptr) {
+ Iterator = BlockToOrder.find(P);
+ } else {
+ // Otherwise, take the loop's rank (highest rank in the loop) as base.
+ // Since loops have a single latch, highest rank is easy to find.
+ // If the loop has no latch, then it doesn't matter.
+ Iterator = BlockToOrder.find(Latch);
}
+
+ assert(Iterator != BlockToOrder.end());
+ result = std::max(result, Iterator->second.Rank + 1);
}
- Visited.insert(BB);
+ return result;
+}
+
+size_t PartialOrderingVisitor::visit(BasicBlock *BB, size_t Unused) {
+ ToVisit.push(BB);
+ Queued.insert(BB);
- SmallVector<BasicBlock *, 2> OtherSuccessors;
- SmallVector<BasicBlock *, 2> LoopSuccessors;
+ while (ToVisit.size() != 0) {
+ BasicBlock *BB = ToVisit.front();
+ ToVisit.pop();
- for (BasicBlock *Successor : successors(BB)) {
- // Ignoring back-edges.
- if (DT.dominates(Successor, BB))
+ if (!CanBeVisited(BB)) {
+ ToVisit.push(BB);
continue;
+ }
- if (isLoopHeader && L->contains(Successor)) {
- LoopSuccessors.push_back(Successor);
- } else
- OtherSuccessors.push_back(Successor);
- }
+ size_t Rank = GetNodeRank(BB);
+ OrderInfo Info = {Rank, BlockToOrder.size()};
+ BlockToOrder.emplace(BB, Info);
- for (BasicBlock *BB : LoopSuccessors)
- Rank = std::max(Rank, visit(BB, Rank + 1));
+ for (BasicBlock *S : successors(BB)) {
+ if (Queued.count(S) != 0)
+ continue;
+ ToVisit.push(S);
+ Queued.insert(S);
+ }
+ }
- size_t OutputRank = Rank;
- for (BasicBlock *Item : OtherSuccessors)
- OutputRank = std::max(OutputRank, visit(Item, Rank + 1));
- return OutputRank;
+ return 0;
}
PartialOrderingVisitor::PartialOrderingVisitor(Function &F) {
diff --git a/llvm/lib/Target/SPIRV/SPIRVUtils.h b/llvm/lib/Target/SPIRV/SPIRVUtils.h
index 83e717e6ea58fd..11fd3a5c61dcae 100644
--- a/llvm/lib/Target/SPIRV/SPIRVUtils.h
+++ b/llvm/lib/Target/SPIRV/SPIRVUtils.h
@@ -18,6 +18,7 @@
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/TypedPointerType.h"
+#include <queue>
#include <string>
#include <unordered_set>
@@ -62,7 +63,9 @@ class SPIRVSubtarget;
class PartialOrderingVisitor {
DomTreeBuilder::BBDomTree DT;
LoopInfo LI;
- std::unordered_set<BasicBlock *> Visited = {};
+
+ std::unordered_set<BasicBlock *> Queued = {};
+ std::queue<BasicBlock *> ToVisit = {};
struct OrderInfo {
size_t Rank;
@@ -80,6 +83,9 @@ class PartialOrderingVisitor {
// Visits |BB| with the current rank being |Rank|.
size_t visit(BasicBlock *BB, size_t Rank);
+ size_t GetNodeRank(BasicBlock *BB) const;
+ bool CanBeVisited(BasicBlock *BB) const;
+
public:
// Build the visitor to operate on the function F.
PartialOrderingVisitor(Function &F);
diff --git a/llvm/test/CodeGen/SPIRV/HlslBufferLoad.ll b/llvm/test/CodeGen/SPIRV/HlslBufferLoad.ll
index fe960f0d6f2f9a..2c9ad1a657a1a5 100644
--- a/llvm/test/CodeGen/SPIRV/HlslBufferLoad.ll
+++ b/llvm/test/CodeGen/SPIRV/HlslBufferLoad.ll
@@ -1,4 +1,4 @@
-; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv-vulkan-library %s -o - | FileCheck %s
+; RUN: llc -O0 -mtriple=spirv-vulkan-library %s -o - | FileCheck %s
; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv-vulkan-library %s -o - -filetype=obj | spirv-val %}
; CHECK-DAG: OpDecorate [[IntBufferVar:%[0-9]+]] DescriptorSet 16
@@ -18,13 +18,13 @@
; CHECK: {{%[0-9]+}} = OpFunction {{%[0-9]+}} DontInline {{%[0-9]+}}
; CHECK-NEXT: OpLabel
define void @RWBufferLoad() #0 {
-; CHECK-NEXT: [[buffer:%[0-9]+]] = OpLoad [[RWBufferTypeInt]] [[IntBufferVar]]
+; CHECK: [[buffer:%[0-9]+]] = OpLoad [[RWBufferTypeInt]] [[IntBufferVar]]
%buffer0 = call target("spirv.Image", i32, 5, 2, 0, 0, 2, 24)
@llvm.spv.handle.fromBinding.tspirv.Image_f32_5_2_0_0_2_24(
i32 16, i32 7, i32 1, i32 0, i1 false)
; Make sure we use the same variable with multiple loads.
-; CHECK-NEXT: [[buffer:%[0-9]+]] = OpLoad [[RWBufferTypeInt]] [[IntBufferVar]]
+; CHECK: [[buffer:%[0-9]+]] = OpLoad [[RWBufferTypeInt]] [[IntBufferVar]]
%buffer1 = call target("spirv.Image", i32, 5, 2, 0, 0, 2, 24)
@llvm.spv.handle.fromBinding.tspirv.Image_f32_5_2_0_0_2_24(
i32 16, i32 7, i32 1, i32 0, i1 false)
@@ -36,7 +36,7 @@ define void @RWBufferLoad() #0 {
define void @UseDifferentGlobalVar() #0 {
; Make sure we use a different variable from the first function. They have
; different types.
-; CHECK-NEXT: [[buffer:%[0-9]+]] = OpLoad [[RWBufferTypeFloat]] [[FloatBufferVar]]
+; CHECK: [[buffer:%[0-9]+]] = OpLoad [[RWBufferTypeFloat]] [[FloatBufferVar]]
%buffer0 = call target("spirv.Image", float, 5, 2, 0, 0, 2, 3)
@llvm.spv.handle.fromBinding.tspirv.Image_f32_5_2_0_0_2_3(
i32 16, i32 7, i32 1, i32 0, i1 false)
@@ -48,7 +48,7 @@ define void @UseDifferentGlobalVar() #0 {
define void @ReuseGlobalVarFromFirstFunction() #0 {
; Make sure we use the same variable as the first function. They should be the
; same in case one function calls the other.
-; CHECK-NEXT: [[buffer:%[0-9]+]] = OpLoad [[RWBufferTypeInt]] [[IntBufferVar]]
+; CHECK: [[buffer:%[0-9]+]] = OpLoad [[RWBufferTypeInt]] [[IntBufferVar]]
%buffer1 = call target("spirv.Image", i32, 5, 2, 0, 0, 2, 24)
@llvm.spv.handle.fromBinding.tspirv.Image_f32_5_2_0_0_2_24(
i32 16, i32 7, i32 1, i32 0, i1 false)
diff --git a/llvm/test/CodeGen/SPIRV/OpVariable_order.ll b/llvm/test/CodeGen/SPIRV/OpVariable_order.ll
index 6057bf38d4c4c4..c68250697c4a7b 100644
--- a/llvm/test/Code...
[truncated]
@Keenuts Sorry for the delay with reviewing. Looks good overall, just some minor things/questions.
The "topological" sorting was behaving incorrectly in some cases: the exit of a loop could have a lower rank than a node in the loop. This causes issues when structurizing some patterns, and also codegen issues as we could generate BBs in the incorrect order in regard to the SPIR-V spec. Fixing this ordering alone broke other parts of the structurizer, which by luck worked. Had to fix those. Added more test cases, especially to test basic patterns. I also needed to tweak/disable some tests for 2 reasons: - SPIR-V now required reg2mem/mem2reg to run. Meaning dead stores are optimized away. Some tests require tweaks to avoid having the whole function removed. - Mem2Reg will generate variable & load/stores. This generates G_BITCAST in several cases. And there is currently something wrong we do with G_BITCAST which causes MIR verifier to complain. Until this is resolved, I disabled -verify-machineinstrs flag on those tests. Signed-off-by: Nathan Gauër <[email protected]>
Thanks all, all feedback addressed. Rebased on main.
It seems enabling mem2reg has introduced what looks like a bug when doing SPIR-V codegen:

define noundef i32 @firstbituhigh_i32(i32 noundef %a) {
entry:
%0 = call i32 @llvm.spv.track.constant.i32.i32(i32 0, metadata i32 0)
%"reg2mem alloca point" = call i32 @llvm.spv.bitcast.i32.i32(i32 %0)
call void (i32, ...) @llvm.spv.assign.name.i32(i32 %"reg2mem alloca point", i32 845636978, i32 544040301, i32 1869376609, i32 1881170275, i32 1953393007, i32 0)
call void @llvm.spv.assign.type.i32(i32 %"reg2mem alloca point", metadata i32 poison)
// ^^ these lines seem to be added by mem2reg pass
%elt.firstbituhigh = call i32 @llvm.spv.firstbituhigh.i32(i32 %a)
call void (i32, ...) @llvm.spv.assign.name.i32(i32 %elt.firstbituhigh, i32 779381861, i32 1936877926, i32 1953063540, i32 1734961269, i32 104)
call void @llvm.spv.assign.type.i32(i32 %elt.firstbituhigh, metadata i32 poison)
ret i32 %elt.firstbituhigh
}

The bitcast added by mem2reg manifests in this bitcast (%25):

%1 = OpTypeInt 32 0
%16 = OpConstant %1 0
%24 = OpFunction %1 None %2 ; -- Begin function firstbituhigh_i32
%23 = OpFunctionParameter %1
%84 = OpLabel
%25 = OpBitcast %1 %16
%26 = OpExtInst %1 %27 FindUMsb %23
OpReturnValue %26
OpFunctionEnd

This is affecting this test in a PR I have been working on: https://github.com/llvm/llvm-project/pull/111082/files#diff-b0dbfa444f4440132c78ce0a3d8ecd397efab944b4f5d83679691361d6cf0afa. After rebasing, the bitcast appears in all of the tests in that file.
Without looking into the case, just as an idea: the resolution should probably be better done in lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp to avoid this.

@spall Can you please create an issue with a small reproducer, or just insert it here?
I created an issue here: #114482
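For what it's worth, here is a rough sketch of the kind of early fold being suggested above: drop an llvm.spv.bitcast call whose source and destination types are already identical, so no OpBitcast is emitted for it later. The helper name and its placement are hypothetical; this only illustrates the idea and is not the actual fix for #114482.

#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/IntrinsicsSPIRV.h"

using namespace llvm;

// Hypothetical helper, e.g. callable from SPIRVEmitIntrinsics: erase a
// @llvm.spv.bitcast call whose operand already has the destination type.
static bool foldTrivialSpvBitcast(IntrinsicInst *II) {
  if (II->getIntrinsicID() != Intrinsic::spv_bitcast)
    return false;
  Value *Src = II->getArgOperand(0);
  if (Src->getType() != II->getType())
    return false; // a real cast, keep it
  II->replaceAllUsesWith(Src);
  II->eraseFromParent();
  return true;
}

In the reproducer above, %"reg2mem alloca point" is an i32-to-i32 llvm.spv.bitcast of a constant, so a fold along these lines should keep the stray %25 = OpBitcast %1 %16 out of the emitted module.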
Fixes an "undefined reference to `llvm::createRegToMemWrapperPass()'" linker error introduced by cba7055 ("[SPIR-V] Fix BB ordering & register lifetime (llvm#111026)", 2024-10-30).
The "topological" sorting was behaving incorrectly in some cases: the exit of a loop could have a lower rank than a node in the loop. This causes issues when structurizing some patterns, and also codegen issues as we could generate BBs in the incorrect order in regard to the SPIR-V spec. Fixing this ordering alone broke other parts of the structurizer, which by luck worked. Had to fix those. Added more test cases, especially to test basic patterns. I also needed to tweak/disable some tests for 2 reasons: - SPIR-V now required reg2mem/mem2reg to run. Meaning dead stores are optimized away. Some tests require tweaks to avoid having the whole function removed. - Mem2Reg will generate variable & load/stores. This generates G_BITCAST in several cases. And there is currently something wrong we do with G_BITCAST which causes MIR verifier to complain. Until this is resolved, I disabled -verify-machineinstrs flag on those tests. --------- Signed-off-by: Nathan Gauër <[email protected]>
Fixes an "undefined reference to `llvm::createRegToMemWrapperPass()'" linker error introduced by cba7055 ("[SPIR-V] Fix BB ordering & register lifetime (llvm#111026)", 2024-10-30).
The "topological" sorting was behaving incorrectly in some cases:
the exit of a loop could have a lower rank than a node in the loop.
This causes issues when structurizing some patterns, and also codegen
issues as we could generate BBs in the incorrect order in regard to the
SPIR-V spec.
Fixing this ordering alone broke other parts of the structurizer, which
by luck worked. Had to fix those.
Added more test cases, especially to test basic patterns.
I also needed to tweak/disable some tests for 2 reasons:
- SPIR-V now requires reg2mem/mem2reg to run, meaning dead stores
  are optimized away. Some tests require tweaks to avoid having the
  whole function removed.
- Mem2Reg will generate variables & load/stores. This generates
  G_BITCAST in several cases, and there is currently something wrong
  we do with G_BITCAST which causes the MIR verifier to complain.
  Until this is resolved, I disabled the -verify-machineinstrs flag on
  those tests.
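For readers skimming the patch above, here is a condensed sketch of the exit-selection pattern that both SPIRVMergeRegionExitTargets and the structurizer now emit: instead of an OpPhi collecting a tag from every exiting edge, each exiting block stores an integer tag into a single alloca (a "register" spilled to memory), and the new exit block reloads it and switches on it. The signature and the tag assignment below are simplified for illustration; the real code also sorts targets for stable output, retargets the exiting branches, and special-cases one and two destinations.

#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// ExitEdges maps each exiting block to the target it originally branched to.
static BasicBlock *
createSingleExit(Function &F,
                 ArrayRef<std::pair<BasicBlock *, BasicBlock *>> ExitEdges) {
  IRBuilder<> Builder(F.getContext());

  // One i32 variable, created in the entry block, plays the role of the phi.
  const DataLayout &DL = F.getDataLayout();
  auto *Variable =
      new AllocaInst(Builder.getInt32Ty(), DL.getAllocaAddrSpace(), nullptr,
                     "reg", F.begin()->getFirstInsertionPt());

  // Give every distinct original target an integer tag.
  SmallVector<BasicBlock *, 4> Targets;
  DenseMap<BasicBlock *, ConstantInt *> Tag;
  for (const auto &[Exit, Target] : ExitEdges)
    if (Tag.try_emplace(Target, Builder.getInt32(Targets.size())).second)
      Targets.push_back(Target);

  // Each exiting block records its intended target before branching.
  // (The real pass then retargets these branches to the new exit block.)
  for (const auto &[Exit, Target] : ExitEdges) {
    Builder.SetInsertPoint(Exit->getTerminator());
    Builder.CreateStore(Tag[Target], Variable);
  }

  // The single new exit reloads the tag and dispatches on it.
  auto *NewExit = BasicBlock::Create(F.getContext(), "new.exit", &F);
  Builder.SetInsertPoint(NewExit);
  Value *Load = Builder.CreateLoad(Builder.getInt32Ty(), Variable);
  SwitchInst *Sw = Builder.CreateSwitch(Load, Targets[0], Targets.size() - 1);
  for (size_t I = 1; I < Targets.size(); ++I)
    Sw->addCase(Tag[Targets[I]], Targets[I]);
  return NewExit;
}

Because the cross-block value now lives in memory instead of a phi, the later PromoteMemoryToRegister run (step 5 in the updated pass pipeline) can promote whatever is safe back into virtual registers.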