Emit aliases to FP16 conversion routines #45649

Merged (3 commits) on Jul 20, 2022

vchuravy
Member

  • Prefix Float16 intrinsics
  • Define aliases to FP16 crt in the OJIT

Instead of replacing them late in codegen let LLVM emit these symbols,
but intercept them in the ORC JIT.

I haven't had a chance to test this properly and it is likely that
we will need to emit these aliases also into the system-image since
loading that will not see these aliases here.
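
As a rough illustration of the idea (not the ORC API itself), the sketch below models a symbol table in which the compiler-runtime names resolve to the prefixed runtime routines. The five name pairs are the ones used in this PR; `ToySymbolTable`, `define`, `alias`, `lookup`, and `make_fp16_table` are hypothetical names, and the "routines" are placeholder objects rather than real functions.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical toy symbol table (not the ORC API): definitions map a name to
// an address, aliases map a name to another name that is followed on lookup.
class ToySymbolTable {
public:
    void define(const std::string &name, const void *address) { definitions_[name] = address; }
    void alias(const std::string &name, const std::string &target) { aliases_[name] = target; }

    // Returns the address for `name`, following at most one alias hop,
    // or nullptr if nothing is defined under the resolved name.
    const void *lookup(const std::string &name) const {
        auto alias_it = aliases_.find(name);
        const std::string &resolved = alias_it != aliases_.end() ? alias_it->second : name;
        auto def_it = definitions_.find(resolved);
        return def_it != definitions_.end() ? def_it->second : nullptr;
    }

private:
    std::unordered_map<std::string, const void *> definitions_;
    std::unordered_map<std::string, std::string> aliases_;
};

// Placeholder objects standing in for the three prefixed runtime routines.
static const char h2f_routine = 'h';
static const char f2h_routine = 'f';
static const char d2h_routine = 'd';

// Builds a table holding the five alias pairs from this PR.
ToySymbolTable make_fp16_table() {
    ToySymbolTable table;
    table.define("julia__gnu_h2f_ieee", &h2f_routine);
    table.define("julia__gnu_f2h_ieee", &f2h_routine);
    table.define("julia__truncdfhf2", &d2h_routine);
    table.alias("__gnu_h2f_ieee", "julia__gnu_h2f_ieee");
    table.alias("__extendhfsf2", "julia__gnu_h2f_ieee");
    table.alias("__gnu_f2h_ieee", "julia__gnu_f2h_ieee");
    table.alias("__truncsfhf2", "julia__gnu_f2h_ieee");
    table.alias("__truncdfhf2", "julia__truncdfhf2");
    return table;
}
```

In this model, code that references `__extendhfsf2` ends up at the same address as `julia__gnu_h2f_ieee`, so the generated code can keep using the standard runtime-library names while the runtime decides where they point.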

@vchuravy vchuravy added this to the 1.8 milestone Jun 11, 2022
@vchuravy vchuravy added backport 1.8 Change should be backported to release-1.8 float16 labels Jun 11, 2022
@vchuravy
Member Author

Argh, for whatever reason one can't create aliases to things outside the compilation unit... that's disappointing.

One alternative is --defsym or a linker script, but that gets rather ugly...
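
The backported patches further down in this thread sidestep the restriction by emitting a small weak definition under the runtime-library name that simply calls the prefixed routine. A hand-written equivalent might look like the sketch below. `demo_impl_h2f` and `demo_crt_h2f` are hypothetical names chosen to avoid clashing with real compiler-runtime symbols, the decoder is a plain illustration rather than Julia's `half_to_float`, and `__attribute__((weak))` assumes GCC or Clang.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the prefixed runtime routine: decodes an
// IEEE binary16 bit pattern into a float.
extern "C" float demo_impl_h2f(uint16_t h) {
    const bool negative = (h & 0x8000u) != 0;
    const int exponent = (h >> 10) & 0x1F;
    const int mantissa = h & 0x3FF;
    float magnitude;
    if (exponent == 0) {
        // Zero or subnormal: mantissa * 2^-24.
        magnitude = std::ldexp(static_cast<float>(mantissa), -24);
    } else if (exponent == 0x1F) {
        // Infinity or NaN.
        magnitude = mantissa ? std::numeric_limits<float>::quiet_NaN()
                             : std::numeric_limits<float>::infinity();
    } else {
        // Normal: (1024 + mantissa) * 2^(exponent - 25).
        magnitude = std::ldexp(static_cast<float>(1024 + mantissa), exponent - 25);
    }
    return negative ? -magnitude : magnitude;
}

// Forwarding definition under the name that callers link against.
// With weak linkage, a strong definition of the same symbol elsewhere
// would take precedence at link time.
extern "C" __attribute__((weak)) float demo_crt_h2f(uint16_t h) {
    return demo_impl_h2f(h);
}
```

With these definitions, `demo_crt_h2f(0x429A)` yields 3.30078125, the value the test suite expects for `Float16(3.3)`. The cost compared with a true alias is one extra call per conversion.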

Base automatically changed from revert-45249-jn/44829c to master June 12, 2022 20:45
@vchuravy vchuravy marked this pull request as ready for review June 12, 2022 20:46
@vchuravy vchuravy force-pushed the vc/fp16 branch 2 times, most recently from aed6adf to ab1aa83 Compare June 12, 2022 21:46
@vchuravy
Member Author

Okay the error on AArch64 is interesting:

Error in testset intrinsics:
Test Failed at /buildworker/worker/tester_linuxaarch64/build/share/julia/test/intrinsics.jl:174
  Expression: extendhfsf2(Float16(3.3)) == 3.3007812f0
   Evaluated: -0.00091171265f0 == 3.3007812f0
Error in testset intrinsics:
Test Failed at /buildworker/worker/tester_linuxaarch64/build/share/julia/test/intrinsics.jl:175
  Expression: gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
   Evaluated: -0.0009498596f0 == 3.3007812f0
Error in testset intrinsics:
Test Failed at /buildworker/worker/tester_linuxaarch64/build/share/julia/test/intrinsics.jl:176
  Expression: truncsfhf2(3.3f0) == Float16(3.3)
   Evaluated: Float16(0.225) == Float16(3.3)
Error in testset intrinsics:
Test Failed at /buildworker/worker/tester_linuxaarch64/build/share/julia/test/intrinsics.jl:177
  Expression: gnu_f2h_ieee(3.3f0) == Float16(3.3)
   Evaluated: Float16(0.225) == Float16(3.3)
Error in testset intrinsics:
Test Failed at /buildworker/worker/tester_linuxaarch64/build/share/julia/test/intrinsics.jl:178
  Expression: truncdfhf2(3.3) == Float16(3.3)
   Evaluated: Float16(0.225) == Float16(3.3)

I am explicitly testing the Int16 ABI here, and it seems that on aarch64 it doesn't apply, which kind of makes sense: GCC has had _Float16 there forever.
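
For readers unfamiliar with the distinction: under the Int16 convention a half value crosses the call boundary as its raw 16-bit pattern in an integer slot, whereas under the _Float16 convention it is passed as a floating-point value. If caller and callee disagree, the callee decodes unrelated bits, which would be consistent with the nonsense values above. The sketch below shows what an integer-style truncation routine computes for the test input. `demo_f2h_bits` is a hypothetical name, this is not Julia's `float_to_half`, and it assumes the default round-to-nearest-even floating-point environment.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical routine with an integer-style signature: the half result is
// returned as its raw IEEE binary16 bit pattern, not as a floating-point type.
uint16_t demo_f2h_bits(float value) {
    const unsigned sign = std::signbit(value) ? 0x8000u : 0u;
    const float a = std::fabs(value);
    unsigned bits;
    if (std::isnan(a)) {
        bits = 0x7E00u;  // quiet NaN
    } else if (std::isinf(a)) {
        bits = 0x7C00u;  // infinity
    } else if (a == 0.0f) {
        bits = 0u;
    } else {
        int e;
        std::frexp(a, &e);  // a = m * 2^e with m in [0.5, 1)
        if (e + 14 >= 1) {
            // Normal half range: scale so the significand lands in [1024, 2048],
            // round to nearest even, then let a carry bump the exponent field.
            const int q = static_cast<int>(std::nearbyint(std::ldexp(a, 11 - e)));
            bits = static_cast<unsigned>(((e + 14) << 10) + (q - 1024));
            if (bits >= 0x7C00u)
                bits = 0x7C00u;  // overflow rounds to infinity
        } else {
            // Subnormal half range: the result counts multiples of 2^-24.
            bits = static_cast<unsigned>(std::nearbyint(std::ldexp(a, 24)));
        }
    }
    return static_cast<uint16_t>(sign | bits);
}
```

Here `demo_f2h_bits(3.3f)` returns `0x429A`, the bit pattern of `Float16(3.3)`; the failing tests expected that result but received `Float16(0.225)`.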

@vchuravy
Member Author

This looks good to me. @vtjnash would appreciate a quick review

@KristofferC
Member

> This looks good to me

Buildbot Windows disagrees?

@vchuravy
Member Author

One of them is an OOM, and the other one feels like an OOM as well https://build.julialang.org/#/builders/72/builds/7343
I retriggered the latter one.

@staticfloat
Member

I'm not sure why we would be OOM'ing, the machines have 32GB of memory available.

Looking at the memory graphs of win64bot1 and win32bot3, I see the following:

[Screenshot: memory graphs of win64bot1 and win32bot3, 2022-06-13]

I'm not trying hard to attribute one particular dip with another here; just showing that overall, while we are using significant amounts of memory, we aren't in OOM territory yet. I'm willing to bet that the win32 OOM is an address space exhaustion more than an OOM (similar to what we've been seeing elsewhere on linux32) and that win64 is something else entirely.

@vchuravy
Member Author

Sigh. Thanks, Elliot, for checking.

@vchuravy
Member Author

Can't reproduce the windows failure locally :/

@t-bltg
Contributor

t-bltg commented Jun 29, 2022

Fixes #45433, thanks.

@t-bltg
Contributor

t-bltg commented Jun 29, 2022

Here is the PR backported to 1.6.6, 1.7.3 and 1.8.0-rc1. I've rebuilt all three Julia versions (+ master), and all fp16-related tests pass locally on Ubuntu 22.04 (libgcc 12).

TODO: update these with latest changes, see below.
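
A recurring change in the runtime_intrinsics.c hunks of these patches replaces `sizeof(a) == sizeof(float)` with `sizeof(a) <= sizeof(float)` in the `fp_select`, `fp_select2`, and `fpiseq` macros, presumably so that operands narrower than `float` also take the single-precision path. The sketch below only illustrates what that comparison changes for a 2-byte operand. The template names are hypothetical, `uint16_t` stands in for 16-bit storage, and the usual 4-byte `float` and 8-byte `double` are assumed.

```cpp
#include <cstdint>

// Assumption of this sketch: the common 4-byte float and 8-byte double.
static_assert(sizeof(float) == 4 && sizeof(double) == 8, "unexpected floating-point sizes");

// Old rule: only an operand exactly as wide as float selects the float variant.
template <typename T>
constexpr bool uses_float_path_old() { return sizeof(T) == sizeof(float); }

// New rule: any operand no wider than float selects the float variant.
template <typename T>
constexpr bool uses_float_path_new() { return sizeof(T) <= sizeof(float); }
```

Under the old rule a 2-byte operand falls through to the double-precision variant; under the new rule it selects the `float` one, while `float` and `double` operands behave as before.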

1.6.6
--- src/APInt-C.cpp  2022-06-29 15:37:58.943951000 +0000
+++ src/APInt-C.cpp  2022-06-29 15:39:56.742904521 +0000
@@ -316,7 +316,7 @@
 void LLVMFPtoInt(unsigned numbits, void *pa, unsigned onumbits, integerPart *pr, bool isSigned, bool *isExact) {
     double Val;
     if (numbits == 16)
-        Val = __gnu_h2f_ieee(*(uint16_t*)pa);
+        Val = julia__gnu_h2f_ieee(*(uint16_t*)pa);
     else if (numbits == 32)
         Val = *(float*)pa;
     else if (numbits == 64)
@@ -391,7 +391,7 @@
         val = a.roundToDouble(true);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)
@@ -408,7 +408,7 @@
         val = a.roundToDouble(false);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)

--- src/aotcompile.cpp  2022-06-29 15:37:58.943951000 +0000
+++ src/aotcompile.cpp  2022-06-29 16:27:31.074065990 +0000
@@ -51,6 +51,7 @@
 #include <llvm/Support/CodeGen.h>
 #endif
 
+#include <llvm/IR/IRBuilder.h>
 #include <llvm/IR/LegacyPassManagers.h>
 #include <llvm/Transforms/Utils/Cloning.h>
 
@@ -276,6 +277,24 @@
     *ci_out = codeinst;
 }
 
+static void injectCRTAlias(Module &M, StringRef name, StringRef alias, FunctionType *FT)
+{
+    Function *target = M.getFunction(alias);
+    if (!target) {
+        target = Function::Create(FT, Function::ExternalLinkage, alias, M);
+    }
+    // Weak so that this does not get discarded
+    // maybe use llvm.compiler.used instead?
+    Function *interposer = Function::Create(FT, Function::WeakAnyLinkage, name, M);
+
+    llvm::IRBuilder<> builder(BasicBlock::Create(M.getContext(), "top", interposer));
+    SmallVector<Value *, 4> CallArgs;
+    for (auto &arg : interposer->args())
+        CallArgs.push_back(&arg);
+    auto val = builder.CreateCall(target, CallArgs);
+    builder.CreateRet(val);
+}
+
 // takes the running content that has collected in the shadow module and dump it to disk
 // this builds the object file portion of the sysimage files for fast startup, and can
 // also be used be extern consumers like GPUCompiler.jl to obtain a module containing
@@ -554,6 +573,20 @@
                                      "jl_RTLD_DEFAULT_handle_pointer"));
     }
 
+    // We would like to emit an alias or a weakref alias to redirect these symbols
+    // but LLVM doesn't let us emit a GlobalAlias to a declaration...
+    // So for now we inject a definition of these functions that calls our runtime functions.
+    injectCRTAlias(*data->M, "__gnu_h2f_ieee", "julia__gnu_h2f_ieee",
+            FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+    injectCRTAlias(*data->M, "__extendhfsf2", "julia__gnu_h2f_ieee",
+            FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+    injectCRTAlias(*data->M, "__gnu_f2h_ieee", "julia__gnu_f2h_ieee",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+    injectCRTAlias(*data->M, "__truncsfhf2", "julia__gnu_f2h_ieee",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+    injectCRTAlias(*data->M, "__truncdfhf2", "julia__truncdfhf2",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getDoubleTy(Context) }, false));
+
     // do the actual work
     auto add_output = [&] (Module &M, StringRef unopt_bc_Name, StringRef bc_Name, StringRef obj_Name, StringRef asm_Name) {
         PM.run(M);

--- src/julia.expmap  2022-06-29 15:37:58.987952000 +0000
+++ src/julia.expmap  2022-06-29 15:40:28.643715568 +0000
@@ -42,12 +42,6 @@
     environ;
     __progname;
 
-    /* compiler run-time intrinsics */
-    __gnu_h2f_ieee;
-    __extendhfsf2;
-    __gnu_f2h_ieee;
-    __truncdfhf2;
-
   local:
     *;
 };

--- src/julia_internal.h  2022-06-29 15:37:58.991953000 +0000
+++ src/julia_internal.h  2022-06-29 15:42:47.155284019 +0000
@@ -1363,8 +1363,9 @@
   #define JL_GC_ASSERT_LIVE(x) (void)(x)
 #endif
 
-float __gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
-uint16_t __gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__truncdfhf2(double param) JL_NOTSAFEPOINT;
 
 #ifdef __cplusplus
 }

--- src/runtime_intrinsics.c  2022-06-29 15:37:59.003953000 +0000
+++ src/runtime_intrinsics.c  2022-06-29 15:43:48.056873802 +0000
@@ -169,9 +169,9 @@
     }
 
 #define fp_select(a, func) \
-    sizeof(a) == sizeof(float) ? func##f((float)a) : func(a)
+    sizeof(a) <= sizeof(float) ? func##f((float)a) : func(a)
 #define fp_select2(a, b, func) \
-    sizeof(a) == sizeof(float) ? func##f(a, b) : func(a, b)
+    sizeof(a) <= sizeof(float) ? func##f(a, b) : func(a, b)
 
 // fast-function generators //
 
@@ -215,11 +215,11 @@
 static inline void name(unsigned osize, void *pa, void *pr) JL_NOTSAFEPOINT \
 { \
     uint16_t a = *(uint16_t*)pa; \
-    float A = __gnu_h2f_ieee(a); \
+    float A = julia__gnu_h2f_ieee(a); \
     if (osize == 16) { \
         float R; \
         OP(&R, A); \
-        *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
     } else { \
         OP((uint16_t*)pr, A); \
     } \
@@ -243,11 +243,11 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     float R = OP(A, B); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 // float or integer inputs, bool output
@@ -268,8 +268,8 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     return OP(A, B); \
 }
@@ -309,12 +309,12 @@
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
     uint16_t c = *(uint16_t*)pc; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
-    float C = __gnu_h2f_ieee(c); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
+    float C = julia__gnu_h2f_ieee(c); \
     runtime_nbits = 16; \
     float R = OP(A, B, C); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 
@@ -832,7 +832,7 @@
 fpiseq_n(float, 32)
 fpiseq_n(double, 64)
 #define fpiseq(a,b) \
-    sizeof(a) == sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
+    sizeof(a) <= sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
 
 #define fpislt_n(c_type, nbits)                                         \
     static inline int fpislt##nbits(c_type a, c_type b) JL_NOTSAFEPOINT \
@@ -903,7 +903,7 @@
         if (!(osize < 8 * sizeof(a))) \
             jl_error("fptrunc: output bitsize must be < input bitsize"); \
         else if (osize == 16) \
-            *(uint16_t*)pr = __gnu_f2h_ieee(a); \
+            *(uint16_t*)pr = julia__gnu_f2h_ieee(a); \
         else if (osize == 32) \
             *(float*)pr = a; \
         else if (osize == 64) \

--- src/jitlayers.cpp  2022-06-29 15:37:58.975952000 +0000
+++ src/jitlayers.cpp  2022-06-29 15:45:50.344097088 +0000
@@ -737,12 +737,26 @@
     }
 
     JD.addToLinkOrder(GlobalJD, orc::JITDylibLookupFlags::MatchExportedSymbolsOnly);
+
+    orc::SymbolAliasMap jl_crt = {
+        { mangle("__gnu_h2f_ieee"), { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__extendhfsf2"),  { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__gnu_f2h_ieee"), { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncsfhf2"),   { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncdfhf2"),   { mangle("julia__truncdfhf2"),   JITSymbolFlags::Exported } }
+    };
+    cantFail(GlobalJD.define(orc::symbolAliases(jl_crt)));
 }
 
-void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+orc::SymbolStringPtr JuliaOJIT::mangle(StringRef Name)
 {
     std::string MangleName = getMangledName(Name);
-    cantFail(JD.define(orc::absoluteSymbols({{ES.intern(MangleName), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
+    return ES.intern(MangleName);
+}
+
+void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+{
+    cantFail(JD.define(orc::absoluteSymbols({{mangle(Name), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
 }
 
 void JuliaOJIT::addModule(std::unique_ptr<Module> M)

--- src/jitlayers.h  2022-06-29 15:37:58.975952000 +0000
+++ src/jitlayers.h  2022-06-29 15:46:24.985016703 +0000
@@ -185,6 +185,7 @@
                          const object::ObjectFile &Obj,
                          const RuntimeDyld::LoadedObjectInfo &LoadedObjectInfo);
 #endif
+    orc::SymbolStringPtr mangle(StringRef Name);
     void addGlobalMapping(StringRef Name, uint64_t Addr);
     void addModule(std::unique_ptr<Module> M);
 #if JL_LLVM_VERSION < 120000

--- src/intrinsics.cpp  2022-06-29 16:28:06.923128000 +0000
+++ src/intrinsics.cpp  2022-06-29 16:30:30.343357962 +0000
@@ -1476,22 +1476,17 @@
 
 #if !defined(_OS_DARWIN_)   // xcode already links compiler-rt
 
-extern "C" JL_DLLEXPORT float __gnu_h2f_ieee(uint16_t param)
+extern "C" JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param)
 {
     return half_to_float(param);
 }
 
-extern "C" JL_DLLEXPORT float __extendhfsf2(uint16_t param)
-{
-    return half_to_float(param);
-}
-
-extern "C" JL_DLLEXPORT uint16_t __gnu_f2h_ieee(float param)
+extern "C" JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param)
 {
     return float_to_half(param);
 }
 
-extern "C" JL_DLLEXPORT uint16_t __truncdfhf2(double param)
+extern "C" JL_DLLEXPORT uint16_t julia__truncdfhf2(double param)
 {
     return float_to_half((float)param);
 }

--- test/intrinsics.jl  2022-06-29 15:37:59.139956000 +0000
+++ test/intrinsics.jl  2022-06-29 15:49:07.285356548 +0000
@@ -152,3 +152,27 @@
     @test_intrinsic Core.Intrinsics.fptosi Int Float16(3.3) 3
     @test_intrinsic Core.Intrinsics.fptoui UInt Float16(3.3) UInt(3)
 end
+
+if Sys.ARCH == :aarch64
+    # On AArch64 we are following the `_Float16` ABI. But these functions expect `Int16`.
+    # TODO: Should we have `Chalf == Int16` and `Cfloat16 == Float16`?
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    truncsfhf2(x::Float32) = reinterpret(Float16, ccall("extern __truncsfhf2", llvmcall, Int16, (Float32,), x))
+    gnu_f2h_ieee(x::Float32) = reinterpret(Float16, ccall("extern __gnu_f2h_ieee", llvmcall, Int16, (Float32,), x))
+    truncdfhf2(x::Float64) = reinterpret(Float16, ccall("extern __truncdfhf2", llvmcall, Int16, (Float64,), x))
+else
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Float16,), x)
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Float16,), x)
+    truncsfhf2(x::Float32) = ccall("extern __truncsfhf2", llvmcall, Float16, (Float32,), x)
+    gnu_f2h_ieee(x::Float32) = ccall("extern __gnu_f2h_ieee", llvmcall, Float16, (Float32,), x)
+    truncdfhf2(x::Float64) = ccall("extern __truncdfhf2", llvmcall, Float16, (Float64,), x)
+end
+
+@testset "Float16 intrinsics (crt)" begin
+    @test extendhfsf2(Float16(3.3)) == 3.3007812f0
+    @test gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
+    @test truncsfhf2(3.3f0) == Float16(3.3)
+    @test gnu_f2h_ieee(3.3f0) == Float16(3.3)
+    @test truncdfhf2(3.3) == Float16(3.3)
+end
1.7.3
--- src/APInt-C.cpp  2022-06-29 15:38:07.412161000 +0000
+++ src/APInt-C.cpp  2022-06-29 16:03:07.396275264 +0000
@@ -316,7 +316,7 @@
 void LLVMFPtoInt(unsigned numbits, void *pa, unsigned onumbits, integerPart *pr, bool isSigned, bool *isExact) {
     double Val;
     if (numbits == 16)
-        Val = __gnu_h2f_ieee(*(uint16_t*)pa);
+        Val = julia__gnu_h2f_ieee(*(uint16_t*)pa);
     else if (numbits == 32)
         Val = *(float*)pa;
     else if (numbits == 64)
@@ -391,7 +391,7 @@
         val = a.roundToDouble(true);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)
@@ -408,7 +408,7 @@
         val = a.roundToDouble(false);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)

--- src/aotcompile.cpp  2022-06-29 15:38:07.416161000 +0000
+++ src/aotcompile.cpp  2022-06-29 16:36:32.101927553 +0000
@@ -50,6 +50,7 @@
 #include <llvm/MC/MCCodeEmitter.h>
 #include <llvm/Support/CodeGen.h>
 
+#include <llvm/IR/IRBuilder.h>
 #include <llvm/IR/LegacyPassManagers.h>
 #include <llvm/Transforms/Utils/Cloning.h>
 
@@ -446,6 +447,24 @@
     jl_safe_printf("ERROR: failed to emit output file %s\n", err.c_str());
 }
 
+static void injectCRTAlias(Module &M, StringRef name, StringRef alias, FunctionType *FT)
+{
+    Function *target = M.getFunction(alias);
+    if (!target) {
+        target = Function::Create(FT, Function::ExternalLinkage, alias, M);
+    }
+    // Weak so that this does not get discarded
+    // maybe use llvm.compiler.used instead?
+    Function *interposer = Function::Create(FT, Function::WeakAnyLinkage, name, M);
+
+    llvm::IRBuilder<> builder(BasicBlock::Create(M.getContext(), "top", interposer));
+    SmallVector<Value *, 4> CallArgs;
+    for (auto &arg : interposer->args())
+        CallArgs.push_back(&arg);
+    auto val = builder.CreateCall(target, CallArgs);
+    builder.CreateRet(val);
+}
+
 
 // takes the running content that has collected in the shadow module and dump it to disk
 // this builds the object file portion of the sysimage files for fast startup
@@ -551,6 +570,20 @@
                                      "jl_RTLD_DEFAULT_handle_pointer"));
     }
 
+    // We would like to emit an alias or a weakref alias to redirect these symbols
+    // but LLVM doesn't let us emit a GlobalAlias to a declaration...
+    // So for now we inject a definition of these functions that calls our runtime functions.
+    injectCRTAlias(*data->M, "__gnu_h2f_ieee", "julia__gnu_h2f_ieee",
+            FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+    injectCRTAlias(*data->M, "__extendhfsf2", "julia__gnu_h2f_ieee",
+            FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+    injectCRTAlias(*data->M, "__gnu_f2h_ieee", "julia__gnu_f2h_ieee",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+    injectCRTAlias(*data->M, "__truncsfhf2", "julia__gnu_f2h_ieee",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+    injectCRTAlias(*data->M, "__truncdfhf2", "julia__truncdfhf2",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getDoubleTy(Context) }, false));
+
     // do the actual work
     auto add_output = [&] (Module &M, StringRef unopt_bc_Name, StringRef bc_Name, StringRef obj_Name, StringRef asm_Name) {
         PM.run(M);

--- src/julia.expmap  2022-06-29 15:38:07.444162000 +0000
+++ src/julia.expmap  2022-06-29 16:02:38.471479518 +0000
@@ -42,12 +42,6 @@
     environ;
     __progname;
 
-    /* compiler run-time intrinsics */
-    __gnu_h2f_ieee;
-    __extendhfsf2;
-    __gnu_f2h_ieee;
-    __truncdfhf2;
-
   local:
     *;
 };

--- src/julia_internal.h  2022-06-29 15:38:07.448162000 +0000
+++ src/julia_internal.h  2022-06-29 16:03:58.453680503 +0000
@@ -1427,8 +1427,9 @@
   #define JL_GC_ASSERT_LIVE(x) (void)(x)
 #endif
 
-float __gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
-uint16_t __gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__truncdfhf2(double param) JL_NOTSAFEPOINT;
 
 #ifdef __cplusplus
 }

--- src/runtime_intrinsics.c  2022-06-29 15:38:07.456162000 +0000
+++ src/runtime_intrinsics.c  2022-06-29 16:05:46.116645907 +0000
@@ -338,9 +338,9 @@
     }
 
 #define fp_select(a, func) \
-    sizeof(a) == sizeof(float) ? func##f((float)a) : func(a)
+    sizeof(a) <= sizeof(float) ? func##f((float)a) : func(a)
 #define fp_select2(a, b, func) \
-    sizeof(a) == sizeof(float) ? func##f(a, b) : func(a, b)
+    sizeof(a) <= sizeof(float) ? func##f(a, b) : func(a, b)
 
 // fast-function generators //
 
@@ -384,11 +384,11 @@
 static inline void name(unsigned osize, void *pa, void *pr) JL_NOTSAFEPOINT \
 { \
     uint16_t a = *(uint16_t*)pa; \
-    float A = __gnu_h2f_ieee(a); \
+    float A = julia__gnu_h2f_ieee(a); \
     if (osize == 16) { \
         float R; \
         OP(&R, A); \
-        *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
     } else { \
         OP((uint16_t*)pr, A); \
     } \
@@ -412,11 +412,11 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     float R = OP(A, B); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 // float or integer inputs, bool output
@@ -437,8 +437,8 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     return OP(A, B); \
 }
@@ -478,12 +478,12 @@
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
     uint16_t c = *(uint16_t*)pc; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
-    float C = __gnu_h2f_ieee(c); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
+    float C = julia__gnu_h2f_ieee(c); \
     runtime_nbits = 16; \
     float R = OP(A, B, C); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 
@@ -1001,7 +1001,7 @@
 fpiseq_n(float, 32)
 fpiseq_n(double, 64)
 #define fpiseq(a,b) \
-    sizeof(a) == sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
+    sizeof(a) <= sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
 
 bool_fintrinsic(eq,eq_float)
 bool_fintrinsic(ne,ne_float)
@@ -1050,7 +1050,7 @@
         if (!(osize < 8 * sizeof(a))) \
             jl_error("fptrunc: output bitsize must be < input bitsize"); \
         else if (osize == 16) \
-            *(uint16_t*)pr = __gnu_f2h_ieee(a); \
+            *(uint16_t*)pr = julia__gnu_f2h_ieee(a); \
         else if (osize == 32) \
             *(float*)pr = a; \
         else if (osize == 64) \

--- src/jitlayers.cpp  2022-06-29 15:38:07.440162000 +0000
+++ src/jitlayers.cpp  2022-06-29 16:38:19.841056942 +0000
@@ -728,12 +728,26 @@
     }
 
     JD.addToLinkOrder(GlobalJD, orc::JITDylibLookupFlags::MatchExportedSymbolsOnly);
+
+    orc::SymbolAliasMap jl_crt = {
+        { mangle("__gnu_h2f_ieee"), { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__extendhfsf2"),  { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__gnu_f2h_ieee"), { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncsfhf2"),   { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncdfhf2"),   { mangle("julia__truncdfhf2"),   JITSymbolFlags::Exported } }
+    };
+    cantFail(GlobalJD.define(orc::symbolAliases(jl_crt)));
 }
 
-void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+orc::SymbolStringPtr JuliaOJIT::mangle(StringRef Name)
 {
     std::string MangleName = getMangledName(Name);
-    cantFail(JD.define(orc::absoluteSymbols({{ES.intern(MangleName), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
+    return ES.intern(MangleName);
+}
+
+void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+{
+    cantFail(JD.define(orc::absoluteSymbols({{mangle(Name), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
 }
 
 void JuliaOJIT::addModule(std::unique_ptr<Module> M)

--- src/jitlayers.h  2022-06-29 15:38:07.440162000 +0000
+++ src/jitlayers.h  2022-06-29 16:08:04.044478978 +0000
@@ -182,6 +182,7 @@
                          const object::ObjectFile &Obj,
                          const RuntimeDyld::LoadedObjectInfo &LoadedObjectInfo);
 #endif
+    orc::SymbolStringPtr mangle(StringRef Name);
     void addGlobalMapping(StringRef Name, uint64_t Addr);
     void addModule(std::unique_ptr<Module> M);
 #if JL_LLVM_VERSION < 120000

--- src/intrinsics.cpp  2022-06-29 16:26:53.104938000 +0000
+++ src/intrinsics.cpp  2022-06-29 16:31:32.729189496 +0000
@@ -1635,22 +1635,17 @@
 
 #if !defined(_OS_DARWIN_)   // xcode already links compiler-rt
 
-extern "C" JL_DLLEXPORT float __gnu_h2f_ieee(uint16_t param)
+extern "C" JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param)
 {
     return half_to_float(param);
 }
 
-extern "C" JL_DLLEXPORT float __extendhfsf2(uint16_t param)
-{
-    return half_to_float(param);
-}
-
-extern "C" JL_DLLEXPORT uint16_t __gnu_f2h_ieee(float param)
+extern "C" JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param)
 {
     return float_to_half(param);
 }
 
-extern "C" JL_DLLEXPORT uint16_t __truncdfhf2(double param)
+extern "C" JL_DLLEXPORT uint16_t julia__truncdfhf2(double param)
 {
     float res = (float)param;
     uint32_t resi;

--- test/intrinsics.jl  2022-06-29 15:38:07.584165000 +0000
+++ test/intrinsics.jl  2022-06-29 16:56:50.640396691 +0000
@@ -284,3 +284,27 @@
         @test r2 isa IntWrap && r2.x === 103 === r[].x && r2 !== r[]
     end
 end)()
+
+if Sys.ARCH == :aarch64
+    # On AArch64 we are following the `_Float16` ABI. But these functions expect `Int16`.
+    # TODO: Should we have `Chalf == Int16` and `Cfloat16 == Float16`?
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    truncsfhf2(x::Float32) = reinterpret(Float16, ccall("extern __truncsfhf2", llvmcall, Int16, (Float32,), x))
+    gnu_f2h_ieee(x::Float32) = reinterpret(Float16, ccall("extern __gnu_f2h_ieee", llvmcall, Int16, (Float32,), x))
+    truncdfhf2(x::Float64) = reinterpret(Float16, ccall("extern __truncdfhf2", llvmcall, Int16, (Float64,), x))
+else
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Float16,), x)
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Float16,), x)
+    truncsfhf2(x::Float32) = ccall("extern __truncsfhf2", llvmcall, Float16, (Float32,), x)
+    gnu_f2h_ieee(x::Float32) = ccall("extern __gnu_f2h_ieee", llvmcall, Float16, (Float32,), x)
+    truncdfhf2(x::Float64) = ccall("extern __truncdfhf2", llvmcall, Float16, (Float64,), x)
+end
+
+@testset "Float16 intrinsics (crt)" begin
+    @test extendhfsf2(Float16(3.3)) == 3.3007812f0
+    @test gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
+    @test truncsfhf2(3.3f0) == Float16(3.3)
+    @test gnu_f2h_ieee(3.3f0) == Float16(3.3)
+    @test truncdfhf2(3.3) == Float16(3.3)
+end
1.8.0-rc1
--- src/aotcompile.cpp  2022-06-29 15:38:07.416161000 +0000
+++ src/aotcompile.cpp  2022-06-29 16:36:32.101927553 +0000
@@ -50,6 +50,7 @@
 #include <llvm/MC/MCCodeEmitter.h>
 #include <llvm/Support/CodeGen.h>
 
+#include <llvm/IR/IRBuilder.h>
 #include <llvm/IR/LegacyPassManagers.h>
 #include <llvm/Transforms/Utils/Cloning.h>
 
@@ -446,6 +447,24 @@
     jl_safe_printf("ERROR: failed to emit output file %s\n", err.c_str());
 }
 
+static void injectCRTAlias(Module &M, StringRef name, StringRef alias, FunctionType *FT)
+{
+    Function *target = M.getFunction(alias);
+    if (!target) {
+        target = Function::Create(FT, Function::ExternalLinkage, alias, M);
+    }
+    // Weak so that this does not get discarded
+    // maybe use llvm.compiler.used instead?
+    Function *interposer = Function::Create(FT, Function::WeakAnyLinkage, name, M);
+
+    llvm::IRBuilder<> builder(BasicBlock::Create(M.getContext(), "top", interposer));
+    SmallVector<Value *, 4> CallArgs;
+    for (auto &arg : interposer->args())
+        CallArgs.push_back(&arg);
+    auto val = builder.CreateCall(target, CallArgs);
+    builder.CreateRet(val);
+}
+
 
 // takes the running content that has collected in the shadow module and dump it to disk
 // this builds the object file portion of the sysimage files for fast startup
@@ -551,6 +570,20 @@
                                      "jl_RTLD_DEFAULT_handle_pointer"));
     }
 
+    // We would like to emit an alias or a weakref alias to redirect these symbols
+    // but LLVM doesn't let us emit a GlobalAlias to a declaration...
+    // So for now we inject a definition of these functions that calls our runtime functions.
+    injectCRTAlias(*data->M, "__gnu_h2f_ieee", "julia__gnu_h2f_ieee",
+            FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+    injectCRTAlias(*data->M, "__extendhfsf2", "julia__gnu_h2f_ieee",
+            FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+    injectCRTAlias(*data->M, "__gnu_f2h_ieee", "julia__gnu_f2h_ieee",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+    injectCRTAlias(*data->M, "__truncsfhf2", "julia__gnu_f2h_ieee",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+    injectCRTAlias(*data->M, "__truncdfhf2", "julia__truncdfhf2",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getDoubleTy(Context) }, false));
+
     // do the actual work
     auto add_output = [&] (Module &M, StringRef unopt_bc_Name, StringRef bc_Name, StringRef obj_Name, StringRef asm_Name) {
         PM.run(M);

--- src/jitlayers.cpp  2022-06-29 15:38:07.440162000 +0000
+++ src/jitlayers.cpp  2022-06-29 16:38:19.841056942 +0000
@@ -728,12 +728,26 @@
     }
 
     JD.addToLinkOrder(GlobalJD, orc::JITDylibLookupFlags::MatchExportedSymbolsOnly);
+
+    orc::SymbolAliasMap jl_crt = {
+        { mangle("__gnu_h2f_ieee"), { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__extendhfsf2"),  { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__gnu_f2h_ieee"), { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncsfhf2"),   { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncdfhf2"),   { mangle("julia__truncdfhf2"),   JITSymbolFlags::Exported } }
+    };
+    cantFail(GlobalJD.define(orc::symbolAliases(jl_crt)));
 }
 
-void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+orc::SymbolStringPtr JuliaOJIT::mangle(StringRef Name)
 {
     std::string MangleName = getMangledName(Name);
-    cantFail(JD.define(orc::absoluteSymbols({{ES.intern(MangleName), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
+    return ES.intern(MangleName);
+}
+
+void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+{
+    cantFail(JD.define(orc::absoluteSymbols({{mangle(Name), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
 }
 
 void JuliaOJIT::addModule(std::unique_ptr<Module> M)

--- src/jitlayers.h 2022-06-29 18:41:05.689863399 +0200
+++ src/jitlayers.h  2022-06-29 18:45:27.071795560 +0200
@@ -204,6 +204,7 @@
     void RegisterJITEventListener(JITEventListener *L);
 #endif
 
+    orc::SymbolStringPtr mangle(StringRef Name);
     void addGlobalMapping(StringRef Name, uint64_t Addr);
     void addModule(std::unique_ptr<Module> M);
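
For readers unfamiliar with ORC, the effect of the `SymbolAliasMap` added in the jitlayers.cpp hunk can be pictured with a toy symbol table. This is only a sketch under stated assumptions: `ToySymbolTable`, `make_demo_table` and the addresses are invented for illustration, and none of it uses or mirrors the real ORC API. It shows only that two compiler-rt names can resolve to the same prefixed runtime routine.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Toy model of alias resolution. NOT the ORC API; all names are hypothetical.
struct ToySymbolTable {
    std::unordered_map<std::string, std::uintptr_t> definitions; // name -> address
    std::unordered_map<std::string, std::string> aliases;        // alias -> target name

    // Resolve `name`, following a bounded chain of aliases.
    // Returns true and writes the address to `out` on success.
    bool lookup(std::string name, std::uintptr_t &out) const {
        for (int hops = 0; hops < 8; ++hops) {
            auto def = definitions.find(name);
            if (def != definitions.end()) {
                out = def->second;
                return true;
            }
            auto alias = aliases.find(name);
            if (alias == aliases.end())
                return false;      // neither defined nor aliased
            name = alias->second;  // follow the alias and retry
        }
        return false;              // chain too long or cyclic
    }
};

// Mirrors the five aliases from the diff; the addresses are made up.
inline ToySymbolTable make_demo_table() {
    ToySymbolTable t;
    t.definitions["julia__gnu_h2f_ieee"] = 0x1000;
    t.definitions["julia__gnu_f2h_ieee"] = 0x2000;
    t.definitions["julia__truncdfhf2"]   = 0x3000;
    t.aliases["__gnu_h2f_ieee"] = "julia__gnu_h2f_ieee";
    t.aliases["__extendhfsf2"]  = "julia__gnu_h2f_ieee";
    t.aliases["__gnu_f2h_ieee"] = "julia__gnu_f2h_ieee";
    t.aliases["__truncsfhf2"]   = "julia__gnu_f2h_ieee";
    t.aliases["__truncdfhf2"]   = "julia__truncdfhf2";
    return t;
}
```

In this model, looking up `__extendhfsf2` and `__gnu_h2f_ieee` yields the same address, which is the property the alias map is meant to provide for JIT-compiled code.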

--- test/intrinsics.jl  2022-06-29 15:38:07.584165000 +0000
+++ test/intrinsics.jl  2022-06-29 16:56:50.640396691 +0000
@@ -284,3 +284,27 @@
         @test r2 isa IntWrap && r2.x === 103 === r[].x && r2 !== r[]
     end
 end)()
+
+if Sys.ARCH == :aarch64
+    # On AArch64 we are following the `_Float16` ABI. But these functions expect `Int16`.
+    # TODO: Should we have `Chalf == Int16` and `Cfloat16 == Float16`?
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    truncsfhf2(x::Float32) = reinterpret(Float16, ccall("extern __truncsfhf2", llvmcall, Int16, (Float32,), x))
+    gnu_f2h_ieee(x::Float32) = reinterpret(Float16, ccall("extern __gnu_f2h_ieee", llvmcall, Int16, (Float32,), x))
+    truncdfhf2(x::Float64) = reinterpret(Float16, ccall("extern __truncdfhf2", llvmcall, Int16, (Float64,), x))
+else
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Float16,), x)
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Float16,), x)
+    truncsfhf2(x::Float32) = ccall("extern __truncsfhf2", llvmcall, Float16, (Float32,), x)
+    gnu_f2h_ieee(x::Float32) = ccall("extern __gnu_f2h_ieee", llvmcall, Float16, (Float32,), x)
+    truncdfhf2(x::Float64) = ccall("extern __truncdfhf2", llvmcall, Float16, (Float64,), x)
+end
+
+@testset "Float16 intrinsics (crt)" begin
+    @test extendhfsf2(Float16(3.3)) == 3.3007812f0
+    @test gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
+    @test truncsfhf2(3.3f0) == Float16(3.3)
+    @test gnu_f2h_ieee(3.3f0) == Float16(3.3)
+    @test truncdfhf2(3.3) == Float16(3.3)
+end

mstorsjo pushed a commit to llvm/llvm-project that referenced this pull request Jul 16, 2022
When using weak symbols, the WinCOFFObjectWriter keeps a list (`WeakDefaults`)
that's used to make names unique. This list should be reset when the object
writer is reset, because otherwise reuse of the object writer can result in
freed symbols being accessed. With some added output, this becomes clear when
using `llc` in `--compile-twice` mode:

```
$ ./llc --compile-twice -mtriple=x86_64-pc-win32 trivial.ll -filetype=obj

DefineSymbol::WeakDefaults
 - .weak.foo.default
 - .weak.bar.default

DefineSymbol::WeakDefaults
 - .weak.foo.default
 - áÑJij⌂  p§┼Ø┐☺
 - .debug_macinfo.dw
 - .weak.bar.default
```

This does not seem to leak into the output object file though, so I couldn't
come up with a test. I added one that just does `--compile-twice` (and verified
that it does access freed memory), which should result in detecting the
invalid memory accesses when running under ASAN.

Observed in a Julia PR where we started using weak symbols:
JuliaLang/julia#45649

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D129840
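
The reuse pattern described in that commit message can be sketched in a few lines. This is a simplified illustration, not the WinCOFFObjectWriter code: `ToyObjectWriter` and its members are hypothetical, and because the toy owns its strings, a missing `reset()` would show up as stale duplicate entries rather than as accesses to freed memory.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical writer that keeps per-object bookkeeping between runs.
class ToyObjectWriter {
public:
    // Record the default-symbol name generated for a weak symbol.
    void defineWeak(const std::string &name) {
        weakDefaults_.push_back(".weak." + name + ".default");
    }

    // Must drop ALL per-object state, including the weak-defaults list;
    // otherwise entries from the previous object leak into the next one.
    void reset() { weakDefaults_.clear(); }

    const std::vector<std::string> &weakDefaults() const { return weakDefaults_; }

private:
    std::vector<std::string> weakDefaults_;
};

// Emulates compiling the same module twice with one reused writer and
// returns how many entries the list holds after the second run.
inline std::size_t entriesAfterSecondRun() {
    ToyObjectWriter writer;
    for (int run = 0; run < 2; ++run) {
        writer.reset();
        writer.defineWeak("foo");
        writer.defineWeak("bar");
    }
    return writer.weakDefaults().size();
}
```

With the list cleared in `reset()`, the second run sees exactly the two entries it created itself.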
@maleadt maleadt changed the base branch from master to tb/llvm July 16, 2022 18:36
Base automatically changed from tb/llvm to master July 17, 2022 18:54
@maleadt
Member

maleadt commented Jul 19, 2022

CI failures:

  • freebsd: fcntl(): Bad file descriptor, also happens on master
  • linux x64: also happens on other PRs
    Pkg                                               (5) |         failed at 2022-07-19T11:41:06.090
    Test Failed at /cache/build/default-amdci5-4/julialang/julia-master/julia-ab30809523/share/julia/stdlib/v1.9/Pkg/test/registry.jl:359
    Expression: isfile(joinpath(DEPOT_PATH[1], "registries", "General.tar.gz")) != something(unpack, false)
    Evaluated: false != false
    Test Failed at /cache/build/default-amdci5-4/julialang/julia-master/julia-ab30809523/share/julia/stdlib/v1.9/Pkg/test/registry.jl:378
    Expression: isempty(readdir(joinpath(DEPOT_PATH[1], "registries")))
    Evaluated: isempty(["General"])
    
  • macos x64: unlikely to be related, and didn't occur on retry
    InteractiveUtils                                 (13) |         failed at 2022-07-19T06:27:39.462
    ProcessExitedException(13)
    

Crucially, both Windows bots are happy 🎉 So this looks good to go for me.

@vchuravy vchuravy removed the DO NOT MERGE Do not merge this PR! label Jul 19, 2022
@vchuravy
Member Author

vchuravy commented Jul 19, 2022

Thanks Tim for getting this across the finish line! LGTM!

vtjnash and others added 3 commits July 19, 2022 12:43
- Put the interposer in llvm.compiler.used.
- Inject the aliases after optimization:
  our multiversioning pass interacts badly with the llvm.compiler.used gvar.

Co-authored-by: Tim Besard <[email protected]>
Co-authored-by: Valentin Churavy <[email protected]>
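
As a rough source-level analogue of the interposer mentioned in the commit notes above, the following sketch defines a weak forwarding function and marks it as used so the compiler keeps it even without visible callers. It assumes GCC or Clang attribute syntax on an ELF-like target. The `demo_` names are hypothetical (chosen to avoid clashing with real compiler-rt symbols), and the body of `demo_runtime_h2f` is a placeholder, not a real half-to-float conversion.

```cpp
#include <cstdint>

extern "C" {

// Counts calls so the forwarding can be observed. Illustration only.
int demo_runtime_calls = 0;

// Hypothetical stand-in for a prefixed runtime routine.
// Placeholder body: it does NOT perform a real conversion.
float demo_runtime_h2f(std::uint16_t bits) {
    ++demo_runtime_calls;
    return static_cast<float>(bits);
}

// Weak definition: a strong definition of the same name elsewhere would
// win at link time. `used` asks the compiler not to discard the function.
__attribute__((weak, used))
float demo__gnu_h2f_ieee(std::uint16_t bits) {
    return demo_runtime_h2f(bits);
}

} // extern "C"
```

Calling `demo__gnu_h2f_ieee` simply forwards to `demo_runtime_h2f`, which is the shape of the function that `injectCRTAlias` builds in IR.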
@vchuravy vchuravy merged commit adf2e1b into master Jul 20, 2022
@vchuravy vchuravy deleted the vc/fp16 branch July 20, 2022 00:16
Member

@vtjnash vtjnash left a comment

LGTM

@t-bltg
Contributor

t-bltg commented Jul 22, 2022

@KristofferC, if you need it, here is a manual backport of the merged PR for all concerned versions, with tests passing:

1.6.7
--- src/APInt-C.cpp  2022-06-29 17:37:58.943951000 +0200
+++ src/APInt-C.cpp  2022-06-29 17:39:56.742904521 +0200
@@ -316,7 +316,7 @@
 void LLVMFPtoInt(unsigned numbits, void *pa, unsigned onumbits, integerPart *pr, bool isSigned, bool *isExact) {
     double Val;
     if (numbits == 16)
-        Val = __gnu_h2f_ieee(*(uint16_t*)pa);
+        Val = julia__gnu_h2f_ieee(*(uint16_t*)pa);
     else if (numbits == 32)
         Val = *(float*)pa;
     else if (numbits == 64)
@@ -391,7 +391,7 @@
         val = a.roundToDouble(true);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)
@@ -408,7 +408,7 @@
         val = a.roundToDouble(false);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)

--- src/aotcompile.cpp  2022-06-29 17:37:58.943951000 +0200
+++ src/aotcompile.cpp  2022-07-22 10:09:59.465318017 +0200
@@ -51,8 +51,10 @@
 #include <llvm/Support/CodeGen.h>
 #endif
 
+#include <llvm/IR/IRBuilder.h>
 #include <llvm/IR/LegacyPassManagers.h>
 #include <llvm/Transforms/Utils/Cloning.h>
+#include <llvm/Transforms/Utils/ModuleUtils.h>
 
 
 using namespace llvm;
@@ -276,6 +278,23 @@
     *ci_out = codeinst;
 }
 
+static void injectCRTAlias(Module &M, StringRef name, StringRef alias, FunctionType *FT)
+{
+    Function *target = M.getFunction(alias);
+    if (!target) {
+        target = Function::Create(FT, Function::ExternalLinkage, alias, M);
+    }
+    Function *interposer = Function::Create(FT, Function::WeakAnyLinkage, name, M);
+    appendToCompilerUsed(M, {interposer});
+
+    llvm::IRBuilder<> builder(BasicBlock::Create(M.getContext(), "top", interposer));
+    SmallVector<Value *, 4> CallArgs;
+    for (auto &arg : interposer->args())
+        CallArgs.push_back(&arg);
+    auto val = builder.CreateCall(target, CallArgs);
+    builder.CreateRet(val);
+}
+
 // takes the running content that has collected in the shadow module and dump it to disk
 // this builds the object file portion of the sysimage files for fast startup, and can
 // also be used be extern consumers like GPUCompiler.jl to obtain a module containing
@@ -556,7 +575,22 @@
 
     // do the actual work
     auto add_output = [&] (Module &M, StringRef unopt_bc_Name, StringRef bc_Name, StringRef obj_Name, StringRef asm_Name) {
+        // We would like to emit an alias or a weakref alias to redirect these symbols
+        // but LLVM doesn't let us emit a GlobalAlias to a declaration...
+        // So for now we inject a definition of these functions that calls our runtime functions.
+        injectCRTAlias(M, "__gnu_h2f_ieee", "julia__gnu_h2f_ieee",
+                FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+        injectCRTAlias(M, "__extendhfsf2", "julia__gnu_h2f_ieee",
+                FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+        injectCRTAlias(M, "__gnu_f2h_ieee", "julia__gnu_f2h_ieee",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+        injectCRTAlias(M, "__truncsfhf2", "julia__gnu_f2h_ieee",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+        injectCRTAlias(M, "__truncdfhf2", "julia__truncdfhf2",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getDoubleTy(Context) }, false));
+
         PM.run(M);
+
         if (unopt_bc_fname)
             emit_result(unopt_bc_Archive, unopt_bc_Buffer, unopt_bc_Name, outputs);
         if (bc_fname)

--- src/julia.expmap  2022-06-29 17:37:58.987952000 +0200
+++ src/julia.expmap  2022-06-29 17:40:28.643715568 +0200
@@ -42,12 +42,6 @@
     environ;
     __progname;
 
-    /* compiler run-time intrinsics */
-    __gnu_h2f_ieee;
-    __extendhfsf2;
-    __gnu_f2h_ieee;
-    __truncdfhf2;
-
   local:
     *;
 };

--- src/julia_internal.h  2022-06-29 17:37:58.991953000 +0200
+++ src/julia_internal.h  2022-06-29 17:42:47.155284019 +0200
@@ -1363,8 +1363,9 @@
   #define JL_GC_ASSERT_LIVE(x) (void)(x)
 #endif
 
-float __gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
-uint16_t __gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__truncdfhf2(double param) JL_NOTSAFEPOINT;
 
 #ifdef __cplusplus
 }

--- src/runtime_intrinsics.c  2022-06-29 17:37:59.003953000 +0200
+++ src/runtime_intrinsics.c  2022-07-19 18:37:28.928908192 +0200
@@ -169,9 +169,9 @@
     }
 
 #define fp_select(a, func) \
-    sizeof(a) == sizeof(float) ? func##f((float)a) : func(a)
+    sizeof(a) <= sizeof(float) ? func##f((float)a) : func(a)
 #define fp_select2(a, b, func) \
-    sizeof(a) == sizeof(float) ? func##f(a, b) : func(a, b)
+    sizeof(a) <= sizeof(float) ? func##f(a, b) : func(a, b)
 
 // fast-function generators //
 
@@ -215,11 +215,11 @@
 static inline void name(unsigned osize, void *pa, void *pr) JL_NOTSAFEPOINT \
 { \
     uint16_t a = *(uint16_t*)pa; \
-    float A = __gnu_h2f_ieee(a); \
+    float A = julia__gnu_h2f_ieee(a); \
     if (osize == 16) { \
         float R; \
         OP(&R, A); \
-        *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
     } else { \
         OP((uint16_t*)pr, A); \
     } \
@@ -243,11 +243,11 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     float R = OP(A, B); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 // float or integer inputs, bool output
@@ -268,8 +268,8 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     return OP(A, B); \
 }
@@ -309,12 +309,12 @@
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
     uint16_t c = *(uint16_t*)pc; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
-    float C = __gnu_h2f_ieee(c); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
+    float C = julia__gnu_h2f_ieee(c); \
     runtime_nbits = 16; \
     float R = OP(A, B, C); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 
@@ -832,7 +832,7 @@
 fpiseq_n(float, 32)
 fpiseq_n(double, 64)
 #define fpiseq(a,b) \
-    sizeof(a) == sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
+    sizeof(a) <= sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
 
 #define fpislt_n(c_type, nbits)                                         \
     static inline int fpislt##nbits(c_type a, c_type b) JL_NOTSAFEPOINT \
@@ -903,7 +903,7 @@
         if (!(osize < 8 * sizeof(a))) \
             jl_error("fptrunc: output bitsize must be < input bitsize"); \
         else if (osize == 16) \
-            *(uint16_t*)pr = __gnu_f2h_ieee(a); \
+            *(uint16_t*)pr = julia__gnu_f2h_ieee(a); \
         else if (osize == 32) \
             *(float*)pr = a; \
         else if (osize == 64) \

--- src/jitlayers.cpp  2022-06-29 17:37:58.975952000 +0200
+++ src/jitlayers.cpp  2022-06-29 17:45:50.344097088 +0200
@@ -737,12 +737,26 @@
     }
 
     JD.addToLinkOrder(GlobalJD, orc::JITDylibLookupFlags::MatchExportedSymbolsOnly);
+
+    orc::SymbolAliasMap jl_crt = {
+        { mangle("__gnu_h2f_ieee"), { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__extendhfsf2"),  { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__gnu_f2h_ieee"), { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncsfhf2"),   { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncdfhf2"),   { mangle("julia__truncdfhf2"),   JITSymbolFlags::Exported } }
+    };
+    cantFail(GlobalJD.define(orc::symbolAliases(jl_crt)));
 }
 
-void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+orc::SymbolStringPtr JuliaOJIT::mangle(StringRef Name)
 {
     std::string MangleName = getMangledName(Name);
-    cantFail(JD.define(orc::absoluteSymbols({{ES.intern(MangleName), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
+    return ES.intern(MangleName);
+}
+
+void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+{
+    cantFail(JD.define(orc::absoluteSymbols({{mangle(Name), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
 }
 
 void JuliaOJIT::addModule(std::unique_ptr<Module> M)

--- src/jitlayers.h  2022-06-29 17:37:58.975952000 +0200
+++ src/jitlayers.h  2022-06-29 17:46:24.985016703 +0200
@@ -185,6 +185,7 @@
                          const object::ObjectFile &Obj,
                          const RuntimeDyld::LoadedObjectInfo &LoadedObjectInfo);
 #endif
+    orc::SymbolStringPtr mangle(StringRef Name);
     void addGlobalMapping(StringRef Name, uint64_t Addr);
     void addModule(std::unique_ptr<Module> M);
 #if JL_LLVM_VERSION < 120000

--- src/intrinsics.cpp  2022-06-29 18:28:06.923128000 +0200
+++ src/intrinsics.cpp  2022-06-29 18:30:30.343357962 +0200
@@ -1476,22 +1476,17 @@
 
 #if !defined(_OS_DARWIN_)   // xcode already links compiler-rt
 
-extern "C" JL_DLLEXPORT float __gnu_h2f_ieee(uint16_t param)
+extern "C" JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param)
 {
     return half_to_float(param);
 }
 
-extern "C" JL_DLLEXPORT float __extendhfsf2(uint16_t param)
-{
-    return half_to_float(param);
-}
-
-extern "C" JL_DLLEXPORT uint16_t __gnu_f2h_ieee(float param)
+extern "C" JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param)
 {
     return float_to_half(param);
 }
 
-extern "C" JL_DLLEXPORT uint16_t __truncdfhf2(double param)
+extern "C" JL_DLLEXPORT uint16_t julia__truncdfhf2(double param)
 {
     return float_to_half((float)param);
 }

--- test/intrinsics.jl  2022-06-29 17:37:59.139956000 +0200
+++ test/intrinsics.jl  2022-06-29 17:49:07.285356548 +0200
@@ -152,3 +152,27 @@
     @test_intrinsic Core.Intrinsics.fptosi Int Float16(3.3) 3
     @test_intrinsic Core.Intrinsics.fptoui UInt Float16(3.3) UInt(3)
 end
+
+if Sys.ARCH == :aarch64
+    # On AArch64 we are following the `_Float16` ABI. But these functions expect `Int16`.
+    # TODO: Should we have `Chalf == Int16` and `Cfloat16 == Float16`?
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    truncsfhf2(x::Float32) = reinterpret(Float16, ccall("extern __truncsfhf2", llvmcall, Int16, (Float32,), x))
+    gnu_f2h_ieee(x::Float32) = reinterpret(Float16, ccall("extern __gnu_f2h_ieee", llvmcall, Int16, (Float32,), x))
+    truncdfhf2(x::Float64) = reinterpret(Float16, ccall("extern __truncdfhf2", llvmcall, Int16, (Float64,), x))
+else
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Float16,), x)
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Float16,), x)
+    truncsfhf2(x::Float32) = ccall("extern __truncsfhf2", llvmcall, Float16, (Float32,), x)
+    gnu_f2h_ieee(x::Float32) = ccall("extern __gnu_f2h_ieee", llvmcall, Float16, (Float32,), x)
+    truncdfhf2(x::Float64) = ccall("extern __truncdfhf2", llvmcall, Float16, (Float64,), x)
+end
+
+@testset "Float16 intrinsics (crt)" begin
+    @test extendhfsf2(Float16(3.3)) == 3.3007812f0
+    @test gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
+    @test truncsfhf2(3.3f0) == Float16(3.3)
+    @test gnu_f2h_ieee(3.3f0) == Float16(3.3)
+    @test truncdfhf2(3.3) == Float16(3.3)
+end
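
The expected value `3.3007812f0` in the testset above follows directly from the IEEE 754 binary16 layout. Below is a minimal decoding sketch. It is not Julia's `half_to_float`, and `half_bits_to_float` is a name made up for this example. `Float16(3.3)` has the bit pattern `0x429A` (exponent field 16, fraction 666), which decodes to (1024 + 666) · 2^(16 − 25) = 3.30078125.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Decode an IEEE 754 binary16 bit pattern into a float (always exact).
// Hypothetical helper for illustration; not the runtime's implementation.
inline float half_bits_to_float(std::uint16_t h) {
    const bool negative = (h >> 15) != 0;
    const int exponent = (h >> 10) & 0x1F;  // 5-bit biased exponent
    const int fraction = h & 0x3FF;         // 10-bit fraction

    float magnitude;
    if (exponent == 0) {
        // Zero or subnormal: fraction * 2^-24
        magnitude = std::ldexp(static_cast<float>(fraction), -24);
    } else if (exponent == 31) {
        // Infinity or NaN
        magnitude = fraction ? std::numeric_limits<float>::quiet_NaN()
                             : std::numeric_limits<float>::infinity();
    } else {
        // Normal: (1024 + fraction) * 2^(exponent - 25)
        magnitude = std::ldexp(static_cast<float>(fraction | 0x400), exponent - 25);
    }
    return negative ? -magnitude : magnitude;
}
```

On AArch64 the test passes the value as `Int16`, so the callee receives the same 16 bits that this function takes as its argument.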
1.7.3
--- src/APInt-C.cpp  2022-06-29 17:38:07.412161000 +0200
+++ src/APInt-C.cpp  2022-06-29 18:03:07.396275264 +0200
@@ -316,7 +316,7 @@
 void LLVMFPtoInt(unsigned numbits, void *pa, unsigned onumbits, integerPart *pr, bool isSigned, bool *isExact) {
     double Val;
     if (numbits == 16)
-        Val = __gnu_h2f_ieee(*(uint16_t*)pa);
+        Val = julia__gnu_h2f_ieee(*(uint16_t*)pa);
     else if (numbits == 32)
         Val = *(float*)pa;
     else if (numbits == 64)
@@ -391,7 +391,7 @@
         val = a.roundToDouble(true);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)
@@ -408,7 +408,7 @@
         val = a.roundToDouble(false);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)

--- src/aotcompile.cpp  2022-06-29 17:38:07.416161000 +0200
+++ src/aotcompile.cpp  2022-07-22 10:08:25.371800696 +0200
@@ -50,8 +50,10 @@
 #include <llvm/MC/MCCodeEmitter.h>
 #include <llvm/Support/CodeGen.h>
 
+#include <llvm/IR/IRBuilder.h>
 #include <llvm/IR/LegacyPassManagers.h>
 #include <llvm/Transforms/Utils/Cloning.h>
+#include <llvm/Transforms/Utils/ModuleUtils.h>
 
 
 using namespace llvm;
@@ -446,6 +448,23 @@
     jl_safe_printf("ERROR: failed to emit output file %s\n", err.c_str());
 }
 
+static void injectCRTAlias(Module &M, StringRef name, StringRef alias, FunctionType *FT)
+{
+    Function *target = M.getFunction(alias);
+    if (!target) {
+        target = Function::Create(FT, Function::ExternalLinkage, alias, M);
+    }
+    Function *interposer = Function::Create(FT, Function::WeakAnyLinkage, name, M);
+    appendToCompilerUsed(M, {interposer});
+
+    llvm::IRBuilder<> builder(BasicBlock::Create(M.getContext(), "top", interposer));
+    SmallVector<Value *, 4> CallArgs;
+    for (auto &arg : interposer->args())
+        CallArgs.push_back(&arg);
+    auto val = builder.CreateCall(target, CallArgs);
+    builder.CreateRet(val);
+}
+
 
 // takes the running content that has collected in the shadow module and dump it to disk
 // this builds the object file portion of the sysimage files for fast startup
@@ -553,7 +572,22 @@
 
     // do the actual work
     auto add_output = [&] (Module &M, StringRef unopt_bc_Name, StringRef bc_Name, StringRef obj_Name, StringRef asm_Name) {
+        // We would like to emit an alias or a weakref alias to redirect these symbols
+        // but LLVM doesn't let us emit a GlobalAlias to a declaration...
+        // So for now we inject a definition of these functions that calls our runtime functions.
+        injectCRTAlias(M, "__gnu_h2f_ieee", "julia__gnu_h2f_ieee",
+                FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+        injectCRTAlias(M, "__extendhfsf2", "julia__gnu_h2f_ieee",
+                FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+        injectCRTAlias(M, "__gnu_f2h_ieee", "julia__gnu_f2h_ieee",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+        injectCRTAlias(M, "__truncsfhf2", "julia__gnu_f2h_ieee",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+        injectCRTAlias(M, "__truncdfhf2", "julia__truncdfhf2",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getDoubleTy(Context) }, false));
+
         PM.run(M);
+
         if (unopt_bc_fname)
             emit_result(unopt_bc_Archive, unopt_bc_Buffer, unopt_bc_Name, outputs);
         if (bc_fname)

--- src/julia.expmap  2022-06-29 17:38:07.444162000 +0200
+++ src/julia.expmap  2022-06-29 18:02:38.471479518 +0200
@@ -42,12 +42,6 @@
     environ;
     __progname;
 
-    /* compiler run-time intrinsics */
-    __gnu_h2f_ieee;
-    __extendhfsf2;
-    __gnu_f2h_ieee;
-    __truncdfhf2;
-
   local:
     *;
 };

--- src/julia_internal.h  2022-06-29 17:38:07.448162000 +0200
+++ src/julia_internal.h  2022-06-29 18:03:58.453680503 +0200
@@ -1427,8 +1427,9 @@
   #define JL_GC_ASSERT_LIVE(x) (void)(x)
 #endif
 
-float __gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
-uint16_t __gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__truncdfhf2(double param) JL_NOTSAFEPOINT;
 
 #ifdef __cplusplus
 }

--- src/runtime_intrinsics.c  2022-06-29 17:38:07.456162000 +0200
+++ src/runtime_intrinsics.c  2022-06-29 18:05:46.116645907 +0200
@@ -338,9 +338,9 @@
     }
 
 #define fp_select(a, func) \
-    sizeof(a) == sizeof(float) ? func##f((float)a) : func(a)
+    sizeof(a) <= sizeof(float) ? func##f((float)a) : func(a)
 #define fp_select2(a, b, func) \
-    sizeof(a) == sizeof(float) ? func##f(a, b) : func(a, b)
+    sizeof(a) <= sizeof(float) ? func##f(a, b) : func(a, b)
 
 // fast-function generators //
 
@@ -384,11 +384,11 @@
 static inline void name(unsigned osize, void *pa, void *pr) JL_NOTSAFEPOINT \
 { \
     uint16_t a = *(uint16_t*)pa; \
-    float A = __gnu_h2f_ieee(a); \
+    float A = julia__gnu_h2f_ieee(a); \
     if (osize == 16) { \
         float R; \
         OP(&R, A); \
-        *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
     } else { \
         OP((uint16_t*)pr, A); \
     } \
@@ -412,11 +412,11 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     float R = OP(A, B); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 // float or integer inputs, bool output
@@ -437,8 +437,8 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     return OP(A, B); \
 }
@@ -478,12 +478,12 @@
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
     uint16_t c = *(uint16_t*)pc; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
-    float C = __gnu_h2f_ieee(c); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
+    float C = julia__gnu_h2f_ieee(c); \
     runtime_nbits = 16; \
     float R = OP(A, B, C); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 
@@ -1001,7 +1001,7 @@
 fpiseq_n(float, 32)
 fpiseq_n(double, 64)
 #define fpiseq(a,b) \
-    sizeof(a) == sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
+    sizeof(a) <= sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
 
 bool_fintrinsic(eq,eq_float)
 bool_fintrinsic(ne,ne_float)
@@ -1050,7 +1050,7 @@
         if (!(osize < 8 * sizeof(a))) \
             jl_error("fptrunc: output bitsize must be < input bitsize"); \
         else if (osize == 16) \
-            *(uint16_t*)pr = __gnu_f2h_ieee(a); \
+            *(uint16_t*)pr = julia__gnu_f2h_ieee(a); \
         else if (osize == 32) \
             *(float*)pr = a; \
         else if (osize == 64) \

--- src/jitlayers.cpp  2022-06-29 17:38:07.440162000 +0200
+++ src/jitlayers.cpp  2022-06-29 18:38:19.841056942 +0200
@@ -728,12 +728,26 @@
     }
 
     JD.addToLinkOrder(GlobalJD, orc::JITDylibLookupFlags::MatchExportedSymbolsOnly);
+
+    orc::SymbolAliasMap jl_crt = {
+        { mangle("__gnu_h2f_ieee"), { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__extendhfsf2"),  { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__gnu_f2h_ieee"), { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncsfhf2"),   { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncdfhf2"),   { mangle("julia__truncdfhf2"),   JITSymbolFlags::Exported } }
+    };
+    cantFail(GlobalJD.define(orc::symbolAliases(jl_crt)));
 }
 
-void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+orc::SymbolStringPtr JuliaOJIT::mangle(StringRef Name)
 {
     std::string MangleName = getMangledName(Name);
-    cantFail(JD.define(orc::absoluteSymbols({{ES.intern(MangleName), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
+    return ES.intern(MangleName);
+}
+
+void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+{
+    cantFail(JD.define(orc::absoluteSymbols({{mangle(Name), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
 }
 
 void JuliaOJIT::addModule(std::unique_ptr<Module> M)

--- src/jitlayers.h  2022-06-29 17:38:07.440162000 +0200
+++ src/jitlayers.h  2022-06-29 18:08:04.044478978 +0200
@@ -182,6 +182,7 @@
                          const object::ObjectFile &Obj,
                          const RuntimeDyld::LoadedObjectInfo &LoadedObjectInfo);
 #endif
+    orc::SymbolStringPtr mangle(StringRef Name);
     void addGlobalMapping(StringRef Name, uint64_t Addr);
     void addModule(std::unique_ptr<Module> M);
 #if JL_LLVM_VERSION < 120000

--- src/intrinsics.cpp  2022-06-29 18:26:53.104938000 +0200
+++ src/intrinsics.cpp  2022-06-29 18:31:32.729189496 +0200
@@ -1635,22 +1635,17 @@
 
 #if !defined(_OS_DARWIN_)   // xcode already links compiler-rt
 
-extern "C" JL_DLLEXPORT float __gnu_h2f_ieee(uint16_t param)
+extern "C" JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param)
 {
     return half_to_float(param);
 }
 
-extern "C" JL_DLLEXPORT float __extendhfsf2(uint16_t param)
-{
-    return half_to_float(param);
-}
-
-extern "C" JL_DLLEXPORT uint16_t __gnu_f2h_ieee(float param)
+extern "C" JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param)
 {
     return float_to_half(param);
 }
 
-extern "C" JL_DLLEXPORT uint16_t __truncdfhf2(double param)
+extern "C" JL_DLLEXPORT uint16_t julia__truncdfhf2(double param)
 {
     float res = (float)param;
     uint32_t resi;

--- test/intrinsics.jl  2022-06-29 17:38:07.584165000 +0200
+++ test/intrinsics.jl  2022-06-29 18:56:50.640396691 +0200
@@ -284,3 +284,27 @@
         @test r2 isa IntWrap && r2.x === 103 === r[].x && r2 !== r[]
     end
 end)()
+
+if Sys.ARCH == :aarch64
+    # On AArch64 we are following the `_Float16` ABI. But these functions expect `Int16`.
+    # TODO: Should we have `Chalf == Int16` and `Cfloat16 == Float16`?
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    truncsfhf2(x::Float32) = reinterpret(Float16, ccall("extern __truncsfhf2", llvmcall, Int16, (Float32,), x))
+    gnu_f2h_ieee(x::Float32) = reinterpret(Float16, ccall("extern __gnu_f2h_ieee", llvmcall, Int16, (Float32,), x))
+    truncdfhf2(x::Float64) = reinterpret(Float16, ccall("extern __truncdfhf2", llvmcall, Int16, (Float64,), x))
+else
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Float16,), x)
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Float16,), x)
+    truncsfhf2(x::Float32) = ccall("extern __truncsfhf2", llvmcall, Float16, (Float32,), x)
+    gnu_f2h_ieee(x::Float32) = ccall("extern __gnu_f2h_ieee", llvmcall, Float16, (Float32,), x)
+    truncdfhf2(x::Float64) = ccall("extern __truncdfhf2", llvmcall, Float16, (Float64,), x)
+end
+
+@testset "Float16 intrinsics (crt)" begin
+    @test extendhfsf2(Float16(3.3)) == 3.3007812f0
+    @test gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
+    @test truncsfhf2(3.3f0) == Float16(3.3)
+    @test gnu_f2h_ieee(3.3f0) == Float16(3.3)
+    @test truncdfhf2(3.3) == Float16(3.3)
+end
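
The truncation tests above (`truncsfhf2(3.3f0) == Float16(3.3)`) rely on narrowing with round-to-nearest, ties-to-even. The sketch below shows one common way to do that on the raw bits. It is an illustration under stated assumptions, not the runtime's `float_to_half`: the name `float_to_half_bits` is invented, NaN payloads are not preserved, and the subnormal path assumes the default floating-point rounding mode.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Narrow a float to IEEE 754 binary16 bits, rounding to nearest-even.
// Hypothetical helper for illustration only.
inline std::uint16_t float_to_half_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    const std::uint32_t sign = (u >> 16) & 0x8000u;
    const std::uint32_t mag = u & 0x7FFFFFFFu;

    if (mag > 0x7F800000u)   // NaN: return a quiet NaN (payload dropped)
        return static_cast<std::uint16_t>(sign | 0x7E00u);
    if (mag >= 0x47800000u)  // |f| >= 65536, or infinity
        return static_cast<std::uint16_t>(sign | 0x7C00u);
    if (mag < 0x38800000u) { // |f| < 2^-14: half subnormal or zero
        float absf;
        std::memcpy(&absf, &mag, sizeof absf);
        // Scale so one unit equals the smallest half subnormal, then round.
        const long m = std::lrint(std::ldexp(absf, 24));
        return static_cast<std::uint16_t>(sign | static_cast<std::uint32_t>(m));
    }
    std::uint32_t h = mag - 0x38000000u;  // rebias exponent from 127 to 15
    h += 0x0FFFu + ((h >> 13) & 1u);      // round to nearest, ties to even
    return static_cast<std::uint16_t>(sign | (h >> 13));
}
```

A carry out of the fraction during rounding moves into the exponent field, so values just below 65536 correctly become infinity. For `3.3f` the result is `0x429A`, the bit pattern of `Float16(3.3)`.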
1.8.0-rc3
--- src/APInt-C.cpp  2022-06-29 15:38:07.412161000 +0000
+++ src/APInt-C.cpp  2022-06-29 16:03:07.396275264 +0000
@@ -316,7 +316,7 @@
 void LLVMFPtoInt(unsigned numbits, void *pa, unsigned onumbits, integerPart *pr, bool isSigned, bool *isExact) {
     double Val;
     if (numbits == 16)
-        Val = __gnu_h2f_ieee(*(uint16_t*)pa);
+        Val = julia__gnu_h2f_ieee(*(uint16_t*)pa);
     else if (numbits == 32)
         Val = *(float*)pa;
     else if (numbits == 64)
@@ -391,7 +391,7 @@
         val = a.roundToDouble(true);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)
@@ -408,7 +408,7 @@
         val = a.roundToDouble(false);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)

--- src/aotcompile.cpp  2022-06-29 15:38:07.416161000 +0000
+++ src/aotcompile.cpp  2022-07-19 16:43:52.586543207 +0000
@@ -50,8 +50,10 @@
 #include <llvm/MC/MCCodeEmitter.h>
 #include <llvm/Support/CodeGen.h>
 
+#include <llvm/IR/IRBuilder.h>
 #include <llvm/IR/LegacyPassManagers.h>
 #include <llvm/Transforms/Utils/Cloning.h>
+#include <llvm/Transforms/Utils/ModuleUtils.h>
 
 
 using namespace llvm;
@@ -446,6 +448,23 @@
     jl_safe_printf("ERROR: failed to emit output file %s\n", err.c_str());
 }
 
+static void injectCRTAlias(Module &M, StringRef name, StringRef alias, FunctionType *FT)
+{
+    Function *target = M.getFunction(alias);
+    if (!target) {
+        target = Function::Create(FT, Function::ExternalLinkage, alias, M);
+    }
+    Function *interposer = Function::Create(FT, Function::WeakAnyLinkage, name, M);
+    appendToCompilerUsed(M, {interposer});
+
+    llvm::IRBuilder<> builder(BasicBlock::Create(M.getContext(), "top", interposer));
+    SmallVector<Value *, 4> CallArgs;
+    for (auto &arg : interposer->args())
+        CallArgs.push_back(&arg);
+    auto val = builder.CreateCall(target, CallArgs);
+    builder.CreateRet(val);
+}
+
 
 // takes the running content that has collected in the shadow module and dump it to disk
 // this builds the object file portion of the sysimage files for fast startup
@@ -553,7 +572,22 @@
 
     // do the actual work
     auto add_output = [&] (Module &M, StringRef unopt_bc_Name, StringRef bc_Name, StringRef obj_Name, StringRef asm_Name) {
+        // We would like to emit an alias or a weakref alias to redirect these symbols
+        // but LLVM doesn't let us emit a GlobalAlias to a declaration...
+        // So for now we inject a definition of these functions that calls our runtime functions.
+        injectCRTAlias(M, "__gnu_h2f_ieee", "julia__gnu_h2f_ieee",
+                FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+        injectCRTAlias(M, "__extendhfsf2", "julia__gnu_h2f_ieee",
+                FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+        injectCRTAlias(M, "__gnu_f2h_ieee", "julia__gnu_f2h_ieee",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+        injectCRTAlias(M, "__truncsfhf2", "julia__gnu_f2h_ieee",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+        injectCRTAlias(M, "__truncdfhf2", "julia__truncdfhf2",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getDoubleTy(Context) }, false));
+
         PM.run(M);
+
         if (unopt_bc_fname)
             emit_result(unopt_bc_Archive, unopt_bc_Buffer, unopt_bc_Name, outputs);
         if (bc_fname)

--- src/julia.expmap  2022-06-29 15:38:07.444162000 +0000
+++ src/julia.expmap  2022-06-29 16:02:38.471479518 +0000
@@ -42,12 +42,6 @@
     environ;
     __progname;
 
-    /* compiler run-time intrinsics */
-    __gnu_h2f_ieee;
-    __extendhfsf2;
-    __gnu_f2h_ieee;
-    __truncdfhf2;
-
   local:
     *;
 };

--- src/julia_internal.h  2022-06-29 15:38:07.448162000 +0000
+++ src/julia_internal.h  2022-06-29 16:03:58.453680503 +0000
@@ -1427,8 +1427,9 @@
   #define JL_GC_ASSERT_LIVE(x) (void)(x)
 #endif
 
-float __gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
-uint16_t __gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__truncdfhf2(double param) JL_NOTSAFEPOINT;
 
 #ifdef __cplusplus
 }

--- src/runtime_intrinsics.c  2022-06-29 15:38:07.456162000 +0000
+++ src/runtime_intrinsics.c  2022-06-29 16:05:46.116645907 +0000
@@ -188,22 +188,17 @@
     return h;
 }
 
-JL_DLLEXPORT float __gnu_h2f_ieee(uint16_t param)
+JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param)
 {
     return half_to_float(param);
 }
 
-JL_DLLEXPORT float __extendhfsf2(uint16_t param)
-{
-    return half_to_float(param);
-}
-
-JL_DLLEXPORT uint16_t __gnu_f2h_ieee(float param)
+JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param)
 {
     return float_to_half(param);
 }
 
-JL_DLLEXPORT uint16_t __truncdfhf2(double param)
+JL_DLLEXPORT uint16_t julia__truncdfhf2(double param)
 {
     float res = (float)param;
     uint32_t resi;
@@ -338,9 +338,9 @@
     }
 
 #define fp_select(a, func) \
-    sizeof(a) == sizeof(float) ? func##f((float)a) : func(a)
+    sizeof(a) <= sizeof(float) ? func##f((float)a) : func(a)
 #define fp_select2(a, b, func) \
-    sizeof(a) == sizeof(float) ? func##f(a, b) : func(a, b)
+    sizeof(a) <= sizeof(float) ? func##f(a, b) : func(a, b)
 
 // fast-function generators //
 
@@ -384,11 +384,11 @@
 static inline void name(unsigned osize, void *pa, void *pr) JL_NOTSAFEPOINT \
 { \
     uint16_t a = *(uint16_t*)pa; \
-    float A = __gnu_h2f_ieee(a); \
+    float A = julia__gnu_h2f_ieee(a); \
     if (osize == 16) { \
         float R; \
         OP(&R, A); \
-        *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
     } else { \
         OP((uint16_t*)pr, A); \
     } \
@@ -412,11 +412,11 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     float R = OP(A, B); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 // float or integer inputs, bool output
@@ -437,8 +437,8 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     return OP(A, B); \
 }
@@ -478,12 +478,12 @@
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
     uint16_t c = *(uint16_t*)pc; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
-    float C = __gnu_h2f_ieee(c); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
+    float C = julia__gnu_h2f_ieee(c); \
     runtime_nbits = 16; \
     float R = OP(A, B, C); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 
@@ -1001,7 +1001,7 @@
 fpiseq_n(float, 32)
 fpiseq_n(double, 64)
 #define fpiseq(a,b) \
-    sizeof(a) == sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
+    sizeof(a) <= sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
 
 bool_fintrinsic(eq,eq_float)
 bool_fintrinsic(ne,ne_float)
@@ -1050,7 +1050,7 @@
         if (!(osize < 8 * sizeof(a))) \
             jl_error("fptrunc: output bitsize must be < input bitsize"); \
         else if (osize == 16) \
-            *(uint16_t*)pr = __gnu_f2h_ieee(a); \
+            *(uint16_t*)pr = julia__gnu_f2h_ieee(a); \
         else if (osize == 32) \
             *(float*)pr = a; \
         else if (osize == 64) \

--- src/jitlayers.cpp  2022-06-29 15:38:07.440162000 +0000
+++ src/jitlayers.cpp  2022-06-29 16:38:19.841056942 +0000
@@ -728,12 +728,26 @@
     }
 
     JD.addToLinkOrder(GlobalJD, orc::JITDylibLookupFlags::MatchExportedSymbolsOnly);
+
+    orc::SymbolAliasMap jl_crt = {
+        { mangle("__gnu_h2f_ieee"), { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__extendhfsf2"),  { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__gnu_f2h_ieee"), { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncsfhf2"),   { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncdfhf2"),   { mangle("julia__truncdfhf2"),   JITSymbolFlags::Exported } }
+    };
+    cantFail(GlobalJD.define(orc::symbolAliases(jl_crt)));
 }
 
-void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+orc::SymbolStringPtr JuliaOJIT::mangle(StringRef Name)
 {
     std::string MangleName = getMangledName(Name);
-    cantFail(JD.define(orc::absoluteSymbols({{ES.intern(MangleName), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
+    return ES.intern(MangleName);
+}
+
+void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+{
+    cantFail(JD.define(orc::absoluteSymbols({{mangle(Name), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
 }
 
 void JuliaOJIT::addModule(std::unique_ptr<Module> M)

--- src/jitlayers.h 2022-06-29 18:41:05.689863399 +0200
+++ src/jitlayers.h  2022-06-29 18:45:27.071795560 +0200
@@ -204,6 +204,7 @@
     void RegisterJITEventListener(JITEventListener *L);
 #endif
 
+    orc::SymbolStringPtr mangle(StringRef Name);
     void addGlobalMapping(StringRef Name, uint64_t Addr);
     void addModule(std::unique_ptr<Module> M);

--- test/intrinsics.jl  2022-06-29 15:38:07.584165000 +0000
+++ test/intrinsics.jl  2022-06-29 16:56:50.640396691 +0000
@@ -284,3 +284,27 @@
         @test r2 isa IntWrap && r2.x === 103 === r[].x && r2 !== r[]
     end
 end)()
+
+if Sys.ARCH == :aarch64
+    # On AArch64 we are following the `_Float16` ABI. But these functions expect `Int16`.
+    # TODO: Should we have `Chalf == Int16` and `Cfloat16 == Float16`?
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    truncsfhf2(x::Float32) = reinterpret(Float16, ccall("extern __truncsfhf2", llvmcall, Int16, (Float32,), x))
+    gnu_f2h_ieee(x::Float32) = reinterpret(Float16, ccall("extern __gnu_f2h_ieee", llvmcall, Int16, (Float32,), x))
+    truncdfhf2(x::Float64) = reinterpret(Float16, ccall("extern __truncdfhf2", llvmcall, Int16, (Float64,), x))
+else
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Float16,), x)
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Float16,), x)
+    truncsfhf2(x::Float32) = ccall("extern __truncsfhf2", llvmcall, Float16, (Float32,), x)
+    gnu_f2h_ieee(x::Float32) = ccall("extern __gnu_f2h_ieee", llvmcall, Float16, (Float32,), x)
+    truncdfhf2(x::Float64) = ccall("extern __truncdfhf2", llvmcall, Float16, (Float64,), x)
+end
+
+@testset "Float16 intrinsics (crt)" begin
+    @test extendhfsf2(Float16(3.3)) == 3.3007812f0
+    @test gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
+    @test truncsfhf2(3.3f0) == Float16(3.3)
+    @test gnu_f2h_ieee(3.3f0) == Float16(3.3)
+    @test truncdfhf2(3.3) == Float16(3.3)
+end

Or should I open a PR for each backport?

EDIT: I missed #46110, sorry.
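
For reference, the constant `3.3007812f0` that the new tests compare against can be checked by hand. This is a minimal sketch, not the runtime's `half_to_float`: it assumes IEEE 754 binary16 layout (1 sign bit, 5 exponent bits with bias 15, 10 mantissa bits), and the helper names `half_bits_to_float` and `check_test_constant` are made up for illustration. `Float16(3.3)` rounds to the bit pattern `0x429A`, which decodes to exactly 3.30078125.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Julia's runtime): decode an IEEE 754
// binary16 bit pattern into a float. Every binary16 value is exactly
// representable as a float, so no rounding happens here.
inline float half_bits_to_float(uint16_t h)
{
    const bool negative = ((h >> 15) & 0x1) != 0;
    const int exponent = (h >> 10) & 0x1f; // biased exponent, bias is 15
    const int mantissa = h & 0x3ff;        // 10 explicit mantissa bits

    float magnitude;
    if (exponent == 0) {
        // Zero or subnormal: mantissa * 2^-24, no implicit leading one.
        magnitude = std::ldexp(static_cast<float>(mantissa), -24);
    }
    else if (exponent == 0x1f) {
        // All-ones exponent encodes infinity (mantissa == 0) or NaN.
        magnitude = mantissa ? std::numeric_limits<float>::quiet_NaN()
                             : std::numeric_limits<float>::infinity();
    }
    else {
        // Normal number: (1 + mantissa / 2^10) * 2^(exponent - 15).
        magnitude = std::ldexp(1.0f + mantissa / 1024.0f, exponent - 15);
    }
    return negative ? -magnitude : magnitude;
}

// 0x429A: sign 0, exponent 16, mantissa 666, i.e. (1 + 666/1024) * 2.
inline void check_test_constant()
{
    assert(half_bits_to_float(0x429A) == 3.30078125f);
}
```

Calling `check_test_constant()` from a `main` is enough to confirm the expected value on any platform, independently of which `__extendhfsf2` symbol ends up being linked.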

@maleadt
Member

maleadt commented Jul 22, 2022

There's already a backport PR for 1.8: #46110. I didn't think we were going to backport this to 1.7 or 1.6 though? It would also need a backport of https://reviews.llvm.org/D129840 to all relevant LLVM branches, which hasn't happened yet.

vchuravy added a commit that referenced this pull request Aug 2, 2022
Backport "Emit aliases for FP16 conversion routines" (#45649) to 1.8
flemairen6 pushed a commit to ljfitz/llvm-project that referenced this pull request Aug 5, 2022
When using weak symbols, the WinCOFFObjectWriter keeps a list (`WeakDefaults`)
that's used to make names unique. This list should be reset when the object
writer is reset, because otherwise reuse of the object writer can result in
freed symbols being accessed. With some added output, this becomes clear when
using `llc` in `--compile-twice` mode:

```
$ ./llc --compile-twice -mtriple=x86_64-pc-win32 trivial.ll -filetype=obj

DefineSymbol::WeakDefaults
 - .weak.foo.default
 - .weak.bar.default

DefineSymbol::WeakDefaults
 - .weak.foo.default
 - áÑJij⌂  p§┼Ø┐☺
 - .debug_macinfo.dw
 - .weak.bar.default
```

This does not seem to leak into the output object file though, so I couldn't
come up with a test. I added one that just does `--compile-twice` (and verified
that it does access freed memory), which should result in detecting the
invalid memory accesses when running under ASAN.

Observed in a Julia PR where we started using weak symbols:
JuliaLang/julia#45649

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D129840
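
The failure mode described in that commit message can be reduced to a toy model. The sketch below is not LLVM's `WinCOFFObjectWriter`: `ToyWriter`, `Symbol`, and `countAfterTwoPasses` are hypothetical names. It only assumes what the message states, namely that the writer keeps a side list of weak defaults next to the symbols it owns, and that `reset()` must clear both. To stay free of undefined behavior it only counts the stale entries and never dereferences them.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a symbol owned by an object writer.
struct Symbol {
    std::string name;
};

// Toy writer with an owning symbol table and a non-owning side list,
// loosely modelled on the `WeakDefaults` list named in the commit message.
class ToyWriter {
public:
    explicit ToyWriter(bool clearWeakDefaultsOnReset)
        : clearWeakDefaultsOnReset_(clearWeakDefaultsOnReset) {}

    void defineWeak(const std::string &name)
    {
        symbols_.push_back(
            std::make_unique<Symbol>(Symbol{".weak." + name + ".default"}));
        weakDefaults_.push_back(symbols_.back().get());
    }

    void reset()
    {
        symbols_.clear(); // frees every Symbol
        // Without this, weakDefaults_ keeps pointers to freed symbols.
        if (clearWeakDefaultsOnReset_)
            weakDefaults_.clear();
    }

    std::size_t weakDefaultCount() const { return weakDefaults_.size(); }

private:
    bool clearWeakDefaultsOnReset_;
    std::vector<std::unique_ptr<Symbol>> symbols_;
    std::vector<const Symbol *> weakDefaults_; // non-owning
};

// Emulates compiling the same two-symbol module twice with one writer.
inline std::size_t countAfterTwoPasses(bool clearWeakDefaultsOnReset)
{
    ToyWriter writer(clearWeakDefaultsOnReset);
    for (int pass = 0; pass < 2; ++pass) {
        if (pass > 0)
            writer.reset();
        writer.defineWeak("foo");
        writer.defineWeak("bar");
    }
    return writer.weakDefaultCount();
}
```

With the reset incomplete, the second pass sees four entries (two of them dangling), which mirrors the four-entry listing above; with the list cleared on reset it sees only the two fresh ones.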
@KristofferC KristofferC removed the backport 1.8 Change should be backported to release-1.8 label Aug 7, 2022
vchuravy pushed a commit to JuliaLang/llvm-project that referenced this pull request Aug 23, 2022
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this pull request Oct 7, 2022
@staticfloat
Member

This was marked for backporting onto 1.6, but we ran into errors because, as Tim said, the LLVM backport has not happened. If someone wants this backported to 1.6, they will need to get a proper release-1.6 LLVM bump going with the backported fixes (as Tim pointed out).

Labels
backport 1.6 Change should be backported to release-1.6 float16
7 participants