[AMDGCN] Use ZExt when handling indices in insertment element #85718
…tion. This patch fixes llvm#85717 Signed-off-by: Peter Rong <[email protected]>
@llvm/pr-subscribers-backend-amdgpu
Author: Peter Rong (DataCorrupted)
Changes
When i1 true is used as an index, SExt extends it to i32 -1. This would cause BitVector to overflow.
This patch fixes #85717
Full diff: https://github.com/llvm/llvm-project/pull/85718.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp b/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
index f1cc4b524460e2..bddf3d958a1ae6 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
@@ -1749,7 +1749,7 @@ static bool isInterestingPHIIncomingValue(const Value *V) {
// Non constant index/out of bounds index -> folding is unlikely.
// The latter is more of a sanity check because canonical IR should just
// have replaced those with poison.
- if (!Idx || Idx->getSExtValue() >= FVT->getNumElements())
+ if (!Idx || Idx->getZExtValue() >= FVT->getNumElements())
return false;
const auto *VecSrc = IE->getOperand(0);
@@ -1761,7 +1761,7 @@ static bool isInterestingPHIIncomingValue(const Value *V) {
return false;
CurVal = VecSrc;
- EltsCovered.set(Idx->getSExtValue());
+ EltsCovered.set(Idx->getZExtValue());
// All elements covered.
if (EltsCovered.all())
diff --git a/llvm/test/CodeGen/AMDGPU/pr85717.ll b/llvm/test/CodeGen/AMDGPU/pr85717.ll
new file mode 100644
index 00000000000000..d0483125d9355e
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/pr85717.ll
@@ -0,0 +1,24 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+;RUN: llc -mtriple=amdgcn -mcpu=gfx1036 < %s | FileCheck %s
+define void @test(i1 %Bool, ptr %Ptr, <32 x float> %Vec1, <32 x float> %Vec2) {
+; CHECK-LABEL: test:
+; CHECK: ; %bb.0: ; %BB0
+; CHECK-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CHECK-NEXT: s_mov_b32 vcc_lo, exec_lo
+; CHECK-NEXT: .LBB0_1: ; %BB1
+; CHECK-NEXT: ; =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: s_cbranch_vccnz .LBB0_1
+; CHECK-NEXT: ; %bb.2: ; %DummyReturnBlock
+; CHECK-NEXT: s_setpc_b64 s[30:31]
+BB0:
+ %I = insertelement <32 x float> %Vec1, float 4.200000e+01, i1 true
+ br label %BB1
+
+BB1: ; preds = %BB0, %BB1, %BB2
+ %PHI = phi <32 x float> [ %I, %BB2 ], [ %Vec2, %BB1 ], [ zeroinitializer, %BB0 ]
+ store <32 x float> %PHI, ptr %Ptr, align 128
+ br i1 %Bool, label %BB1, label %BB2
+
+BB2: ; preds = %BB1
+ br label %BB1
+}
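The difference between the two extensions can be reproduced without any LLVM headers. The following is a minimal sketch, not the code from the patch: `NarrowInt`, `zextValue`, `sextValue`, `oldCheckRejects`, `newCheckRejects`, and `markCovered` are hypothetical stand-ins for `ConstantInt::getZExtValue`, `ConstantInt::getSExtValue`, the bounds check, and the bit-vector update shown in the diff. It assumes a two's complement target and widens to 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a constant integer that is `bits` wide (1..63).
// Only the low `bits` bits of `raw` are meaningful.
struct NarrowInt {
  unsigned bits;
  uint64_t raw;
};

// The index from the regression test: `i1 true`.
const NarrowInt kI1True{1, 1};

// Zero-extension: the high bits are filled with zeros.
uint64_t zextValue(NarrowInt v) {
  assert(v.bits >= 1 && v.bits < 64);
  const uint64_t mask = (uint64_t{1} << v.bits) - 1;
  return v.raw & mask;
}

// Sign-extension: the high bits are filled with copies of the top bit.
int64_t sextValue(NarrowInt v) {
  const uint64_t z = zextValue(v);
  const uint64_t signBit = uint64_t{1} << (v.bits - 1);
  // (z ^ signBit) - signBit is the usual branch-free sign-extension idiom.
  return static_cast<int64_t>((z ^ signBit) - signBit);
}

// Shape of the old check: a signed index compared against the element count.
// A negative index is smaller than any count, so it is NOT rejected.
bool oldCheckRejects(NarrowInt idx, unsigned numElts) {
  return sextValue(idx) >= static_cast<int64_t>(numElts);
}

// Shape of the new check: the index is treated as unsigned.
bool newCheckRejects(NarrowInt idx, unsigned numElts) {
  return zextValue(idx) >= numElts;
}

// Bounds-checked stand-in for setting one bit in a coverage bit vector.
// Returns false when the position lies outside the vector.
bool markCovered(std::vector<bool> &covered, uint64_t pos) {
  if (pos >= covered.size())
    return false;
  covered[pos] = true;
  return true;
}
```

With `kI1True`, `sextValue` yields -1 and `zextValue` yields 1. The signed check lets -1 through, and once -1 is converted to an unsigned position it lands far outside a 32-element vector, which matches the overflow described above. The unsigned path marks element 1 instead.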
llvm/test/CodeGen/AMDGPU/pr85717.ll
Outdated
@@ -0,0 +1,24 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
;RUN: llc -mtriple=amdgcn -mcpu=gfx1036 < %s | FileCheck %s
define void @test(i1 %Bool, ptr %Ptr, <32 x float> %Vec1, <32 x float> %Vec2) {
Name test pr85717? Can this merge in with the existing test for this?
Moved to test/CodeGen/AMDGPU/amdgpu-codegenprepare-break-large-phis.ll and adapted to a four-element vector to avoid excessive OPT-NEXT lines. Please check whether the test is desirable.
Signed-off-by: Peter Rong <[email protected]>
Signed-off-by: Peter Rong <[email protected]>
@@ -1197,3 +1197,54 @@ reallyfinally:
store <5 x double> %val, ptr %out, align 1
ret void
}

define amdgpu_kernel void @zext_i1_as_index(i1 %Bool, ptr %Ptr, <4 x float> %Vec1, <4 x float> %Vec2) {
; OPT-LABEL: @test(
Label name doesn't match. Also use the issue number
Done
Signed-off-by: Peter Rong <[email protected]>
Kindly let me know if there is anything else.
…5718)
When i1 true is used as an index, SExt extends it to i32 -1. This would cause BitVector to overflow.
The language reference specifies that the index shall be treated as an unsigned number; this patch makes the code follow that. (https://llvm.org/docs/LangRef.html#insertelement-instruction)
This patch fixes llvm#85717
---------
Signed-off-by: Peter Rong <[email protected]>