Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clang/AMDGPU: Fix accidental behavior change for __builtin_amdgcn_ldexph #66340

Merged
merged 1 commit into from
Sep 14, 2023

Conversation

arsenm
Copy link
Contributor

@arsenm arsenm commented Sep 14, 2023

eece6ba swapped this to use the generic intrinsic which doesn't have the same behavior for exponent values outside of the range of short. The instruction will implicitly truncate, but the generic instruction clamps the value. The builtin should probably have used a short argument type to begin with.

eece6ba swapped this to use the generic intrinsic
which doesn't have the same behavior for exponent values outside of the range of
short. The instruction will implicitly truncate, but the generic instruction clamps
the value. The builtin should probably have used a short argument type to begin with.
@arsenm arsenm requested review from yxsamliu, cdevadas and a team September 14, 2023 08:34
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen labels Sep 14, 2023
@llvmbot
Copy link
Member

llvmbot commented Sep 14, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Changes eece6ba swapped this to use the generic intrinsic which doesn't have the same behavior for exponent values outside of the range of short. The instruction will implicitly truncate, but the generic instruction clamps the value. The builtin should probably have used a short argument type to begin with. -- Full diff: https://github.com//pull/66340.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+10-2)
  • (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-vi.cl (+2-1)
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 037a2f9f7b15322..52868ca260290b7 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -17392,14 +17392,22 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
   case AMDGPU::BI__builtin_amdgcn_log_clampf:
     return emitUnaryBuiltin(*this, E, Intrinsic::amdgcn_log_clamp);
   case AMDGPU::BI__builtin_amdgcn_ldexp:
-  case AMDGPU::BI__builtin_amdgcn_ldexpf:
-  case AMDGPU::BI__builtin_amdgcn_ldexph: {
+  case AMDGPU::BI__builtin_amdgcn_ldexpf: {
     llvm::Value *Src0 = EmitScalarExpr(E->getArg(0));
     llvm::Value *Src1 = EmitScalarExpr(E->getArg(1));
     llvm::Function *F =
         CGM.getIntrinsic(Intrinsic::ldexp, {Src0->getType(), Src1->getType()});
     return Builder.CreateCall(F, {Src0, Src1});
   }
+  case AMDGPU::BI__builtin_amdgcn_ldexph: {
+    // The raw instruction has a different behavior for out of bounds exponent
+    // values (implicit truncation instead of saturate to short_min/short_max).
+    llvm::Value *Src0 = EmitScalarExpr(E->getArg(0));
+    llvm::Value *Src1 = EmitScalarExpr(E->getArg(1));
+    llvm::Function *F =
+        CGM.getIntrinsic(Intrinsic::ldexp, {Src0->getType(), Int16Ty});
+    return Builder.CreateCall(F, {Src0, Builder.CreateTrunc(Src1, Int16Ty)});
+  }
   case AMDGPU::BI__builtin_amdgcn_frexp_mant:
   case AMDGPU::BI__builtin_amdgcn_frexp_mantf:
   case AMDGPU::BI__builtin_amdgcn_frexp_manth:
diff --git a/clang/test/CodeGenOpenCL/builtins-amdgcn-vi.cl b/clang/test/CodeGenOpenCL/builtins-amdgcn-vi.cl
index 756f90b282a9a2a..ff8618a5c727e21 100644
--- a/clang/test/CodeGenOpenCL/builtins-amdgcn-vi.cl
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn-vi.cl
@@ -52,7 +52,8 @@ void test_cos_f16(global half* out, half a)
 }
 
 // CHECK-LABEL: @test_ldexp_f16
-// CHECK: call half @llvm.ldexp.f16.i32
+// CHECK: [[TRUNC:%[0-9a-z]+]] = trunc i32
+// CHECK: call half @llvm.ldexp.f16.i16(half %a, i16 [[TRUNC]])
 void test_ldexp_f16(global half* out, half a, int b)
 {
   *out = __builtin_amdgcn_ldexph(a, b);

@arsenm arsenm merged commit ddc3346 into llvm:main Sep 14, 2023
@arsenm arsenm deleted the builtin-amdgcn-ldexph-fix branch September 14, 2023 15:17
kstoimenov pushed a commit to kstoimenov/llvm-project that referenced this pull request Sep 14, 2023
ZijunZhaoCCK pushed a commit to ZijunZhaoCCK/llvm-project that referenced this pull request Sep 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants