Undefined symbols: ___truncsfbf2 when building on macOS #52067

Open
eschnett opened this issue Nov 7, 2023 · 22 comments
Labels
building Build system, or building Julia or its dependencies system:mac Affects only macOS

Comments

@eschnett
Contributor

eschnett commented Nov 7, 2023

I am building the current master version of Julia from scratch on macOS (Darwin redshift.pi.local 23.1.0 Darwin Kernel Version 23.1.0: Mon Oct 9 21:27:27 PDT 2023; root:xnu-10002.41.9~6/RELEASE_X86_64 x86_64 i386 Darwin). I see this error:

$ make
    LINK usr/lib/libjulia-internal.1.11.0.dylib
ld: Undefined symbols:
  ___truncsfbf2, referenced from:
      _julia__truncsfbf2 in runtime_intrinsics.o
      _julia__truncdfbf2 in runtime_intrinsics.o
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [Makefile:388: /Users/eschnett/src/julia-master/usr/lib/libjulia-internal.1.11.0.dylib] Error 1
make: *** [Makefile:97: julia-src-release] Error 2

I have both GCC and Clang installed via MacPorts.

It seems that Julia uses this Clang (which clang++: /opt/local/bin/clang++). I checked, and it seems that only GCC provides this function (/opt/local/lib/gcc13/gcc/x86_64-apple-darwin23/13.2.0/libgcc.a:truncsfbf2.o: 0000000000000000 T ___truncsfbf2), but Clang does not provide it.

$ clang++ --version
clang version 17.0.4
Target: x86_64-apple-darwin23.1.0
Thread model: posix
InstalledDir: /opt/local/libexec/llvm-17/bin
@gbaraldi
Member

gbaraldi commented Nov 7, 2023

They are building their clang wrong :|

@oscardssmith
Member

By "they", do you mean Apple?

@gbaraldi
Member

gbaraldi commented Nov 7, 2023

No, I mean MacPorts. I was able to reproduce this locally, so homebrew is wrong here as well.

@gbaraldi
Member

gbaraldi commented Nov 7, 2023

I see llvm/llvm-project@489bda6, but I'm not sure. @fxcoudert, any idea?

@maleadt
Member

maleadt commented Nov 8, 2023

I'm confused that this generates a libcall to __truncsfbf2, as in runtime_intrinsics we do all the conversion ourselves.

@giordano giordano added building Build system, or building Julia or its dependencies system:mac Affects only macOS labels Nov 8, 2023
@fxcoudert
Contributor

homebrew is wrong here as well

I'm puzzled by this statement. I don't see that we do any special treatment for this symbol, or anything related, in our LLVM build. But we do ship LLVM 17, and I seem to find only LLVM 16 on Yggdrasil, so maybe that's a difference in behaviour between those two versions?

@gbaraldi
Member

gbaraldi commented Nov 8, 2023

I think this is some weirdness happening with compiler-rt + Apple, where they aren't building the bf16 builtins even though it seems they should be built.

@fxcoudert
Contributor

fxcoudert commented Nov 8, 2023

I'm no expert in the bf16 type, but the behaviour of the compilers seems consistent:

meau /tmp $ cat a.c
__bf16 intend(float x) {
    return (__bf16) x;
}
meau /tmp $ gcc-13 -c a.c -W && nm a.o                         
0000000000000020 s EH_frame1
                 U ___truncsfbf2
0000000000000000 T _intend
0000000000000000 t ltmp0
0000000000000020 s ltmp1
meau /tmp $ clang -c a.c -W && nm a.o                           
a.c:2:21: error: cannot type-cast to __bf16
    return (__bf16) x;
                    ^
1 error generated.
meau /tmp $ /opt/homebrew/opt/llvm/bin/clang -c a.c -W && nm a.o                         
0000000000000000 T _intend
0000000000000000 t ltmp0
0000000000000018 s ltmp1
  • GCC emits code through this library function, and provides the library function.
  • Apple's clang does not allow type conversion (__bf16 is a storage type only).
  • LLVM 17 emits code without the library function, and does not provide the library function.

So if things worked before, either:

  • LLVM has regressed, it used to provide the function, but does not anymore
  • Julia started emitting that call, while it didn't before

@maleadt
Member

maleadt commented Nov 9, 2023

__bf16 intend(float x) {
return (__bf16) x;
}

That's not how our intrinsics convert to __bf16, though. Instead, we do the conversion as a raw uint16 and cast the pointer:

#define BFLOAT16_TYPE __bf16
#define BFLOAT16_TO_UINT16(x) (*(uint16_t*)&(x))
#define BFLOAT16_FROM_UINT16(x) (*(__bf16*)&(x))

Or at least that's the intention.
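
For illustration, the raw-uint16 approach described above could look roughly like the following. This is a minimal sketch, not the code in runtime_intrinsics.c: the helper names `float_to_bf16_bits` and `bf16_bits_to_float` are hypothetical, and round-to-nearest-even is an assumed rounding mode. Since only integer operations are involved, the source contains no float-to-`__bf16` conversion for the compiler to lower into a library call.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: float -> bfloat16 bit pattern using integer ops only. */
static inline uint16_t float_to_bf16_bits(float x)
{
    uint32_t bits;
    if (x != x)                       /* NaN: return a quiet NaN pattern */
        return 0x7fc0;
    memcpy(&bits, &x, sizeof bits);   /* read the IEEE-754 bits of the float */
    /* round to nearest, ties to even, on the 16 low bits being discarded */
    bits += 0x7fffu + ((bits >> 16) & 1u);
    return (uint16_t)(bits >> 16);
}

/* Hypothetical helper: bfloat16 bit pattern -> float (exact, no rounding). */
static inline float bf16_bits_to_float(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

The resulting `uint16_t` would then be reinterpreted as `__bf16` through a macro such as `BFLOAT16_FROM_UINT16`, which is the pointer cast mentioned above.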

@KristofferC
Member

FWIW, I also encountered this after a homebrew upgrade:

❯ clang --version                                             
Homebrew clang version 17.0.6
Target: x86_64-apple-darwin23.3.0
Thread model: posix
InstalledDir: /usr/local/opt/llvm/bin

@DilumAluthge
Member

DilumAluthge commented Mar 6, 2024

So the clang shipped by Homebrew (on macOS) and MacPorts is broken? But what about the clang shipped by Xcode?

Should our build system try to detect if the user has Xcode (or at least the Xcode Command Line Tools) installed, and if so use that clang instead?

@KristofferC
Member

Changing to Xcode's clang made it work. I don't think the MacPorts one is broken, but AFAIU it doesn't have support for bfloat16 and we fail to detect that.

@gbaraldi
Member

gbaraldi commented Mar 6, 2024

The issue is more subtle than that. The MacPorts/brew clang supports __bf16 and emits its code correctly, but the system compiler-rt that it uses doesn't have the libcalls that it expects to be there. I'll write a C reproducer.

@gbaraldi
Member

gbaraldi commented Mar 6, 2024

llvm/llvm-project#84192 should fix it. But it will need to trickle down to distributions. So we probably need to bump the guard for darwin

@weedge

weedge commented Mar 14, 2024

Use the clang++ compiler from Command Line Tools for Xcode:
-DCMAKE_CXX_COMPILER=/Library/Developer/CommandLineTools/usr/bin/clang++
Download it from https://developer.apple.com/download/all/?q=Command%20Line%20Tools%20for%20Xcode


@fingolfin
Member

This issue is currently blocking an update for libjulia_jll on Yggdrasil. Does anyone have a suggestion for a workaround? E.g. perhaps we can #if 0 some section of code there (the generated libraries are not actually used at runtime, just for linking, so the content of functions is generally irrelevant).

@gbaraldi
Member

This code needs an #ifdef _OS_DARWIN_ until llvm/llvm-project#84192 is part of some release. Then it needs to check that version.

@fingolfin
Member

@gbaraldi could you provide a hint as to what "this code" is, i.e. which code needs that #ifdef?

@gbaraldi
Member

gbaraldi commented Aug 31, 2024

#if ((defined(__GNUC__) && __GNUC__ > 12) || \
(defined(__clang__) && __clang_major__ > 16)) && \
!defined(_CPU_PPC64_) && !defined(_CPU_PPC_) && \
!defined(_OS_WINDOWS_)
#define BFLOAT16_TYPE __bf16
#define BFLOAT16_TO_UINT16(x) (*(uint16_t*)&(x))
#define BFLOAT16_FROM_UINT16(x) (*(__bf16*)&(x))
// on older compilers, we need to emulate the platform-specific ABI.
// for more details, see similar code above that deals with Float16.
#elif defined(_CPU_X86_) || (defined(_CPU_X86_64_) && !defined(_OS_WINDOWS_))
#define BFLOAT16_TYPE __m128
#define BFLOAT16_TO_UINT16(x) take_from_xmm(x)
#define BFLOAT16_FROM_UINT16(x) return_in_xmm(x)
#elif defined(_CPU_PPC64_) || defined(_CPU_PPC_)
#define BFLOAT16_TYPE uint16_t
#define BFLOAT16_TO_UINT16(x) (x)
#define BFLOAT16_FROM_UINT16(x) (x)
#else
#define BFLOAT16_TYPE float
#define BFLOAT16_TO_UINT16(x) ((uint16_t)*(uint32_t*)&(x))
#define BFLOAT16_FROM_UINT16(x) ({ uint32_t tmp = (uint32_t)(x); *(float*)&tmp; })
#endif
Though I'm not sure exactly which ABI we need to use here.

I know compilers make a complete mess of this
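
A guard along the lines suggested earlier in the thread might look like the sketch below. It is only an illustration under stated assumptions: `USE_NATIVE_BF16` and `uses_native_bf16` are hypothetical names, `__APPLE__` and `_WIN32` stand in for Julia's `_OS_DARWIN_` and `_OS_WINDOWS_`, the PowerPC exclusions are omitted for brevity, and no Clang version check for the fix is included because the thread does not say which release will contain llvm/llvm-project#84192.

```c
/* Hypothetical guard: use the compiler's __bf16 type only where the
 * matching runtime libcalls are expected to be available. */
#if ((defined(__GNUC__) && __GNUC__ > 12) || \
     (defined(__clang__) && __clang_major__ > 16)) && \
    !defined(__APPLE__) && !defined(_WIN32)
#define USE_NATIVE_BF16 1
#else
#define USE_NATIVE_BF16 0   /* fall back to an emulated representation */
#endif

/* Reports which branch the preprocessor selected (1 = native __bf16). */
static inline int uses_native_bf16(void)
{
    return USE_NATIVE_BF16;
}
```

Excluding every Apple target is broader than strictly necessary, since GCC on macOS ships the function in libgcc; a narrower condition would need to distinguish the compiler and its runtime library.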

@vessokolev

I don't think this problem is specific to macOS. The same issue exists on Linux. Try compiling the latest Julia code using LLVM 19. You will get similar error messages:

/opt/software/binutils/2/2.41-gold/bin/ld: ./gc-heap-snapshot.o: in function `_gc_heap_snapshot_record_hidden_edge':
/project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/usr/include/llvm/ADT/StringMap.h:330:(.text+0x2432): undefined reference to `llvm::StringMapImpl::LookupBucketFor(llvm::StringRef)'
/opt/software/binutils/2/2.41-gold/bin/ld: ./gc-heap-snapshot.o: in function `std::pair<llvm::StringMapIterator<unsigned long>, bool> llvm::StringMap<unsigned long, llvm::MallocAllocator>::try_emplace<unsigned long>(llvm::StringRef, unsigned long&&)':
/project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/usr/include/llvm/ADT/StringMap.h:330:(.text._ZN4llvm9StringMapImNS_15MallocAllocatorEE11try_emplaceIJmEEESt4pairINS_17StringMapIteratorImEEbENS_9StringRefEDpOT_[_ZN4llvm9StringMapImNS_15MallocAllocatorEE11try_emplaceIJmEEESt4pairINS_17StringMapIteratorImEEbENS_9StringRefEDpOT_]+0x1f): undefined reference to `llvm::StringMapImpl::LookupBucketFor(llvm::StringRef)'
/opt/software/binutils/2/2.41-gold/bin/ld: ./processor.o: in function `ijl_get_cpu_features':
/project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/src/processor.cpp:974:(.text+0x2dc6): undefined reference to `llvm::sys::getHostCPUFeatures(llvm::StringMap<bool, llvm::MallocAllocator>&)'
/opt/software/binutils/2/2.41-gold/bin/ld: ./coverage.o: in function `std::pair<llvm::StringMapIterator<llvm::SmallVector<unsigned long (*) [32], 0u> >, bool> llvm::StringMap<llvm::SmallVector<unsigned long (*) [32], 0u>, llvm::MallocAllocator>::try_emplace<>(llvm::StringRef)':
/project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/usr/include/llvm/ADT/StringMap.h:330:(.text._ZN4llvm9StringMapINS_11SmallVectorIPA32_mLj0EEENS_15MallocAllocatorEE11try_emplaceIJEEESt4pairINS_17StringMapIteratorIS4_EEbENS_9StringRefEDpOT_[_ZN4llvm9StringMapINS_11SmallVectorIPA32_mLj0EEENS_15MallocAllocatorEE11try_emplaceIJEEESt4pairINS_17StringMapIteratorIS4_EEbENS_9StringRefEDpOT_]+0x1b): undefined reference to `llvm::StringMapImpl::LookupBucketFor(llvm::StringRef)'
/opt/software/binutils/2/2.41-gold/bin/ld: ./runtime_intrinsics.o: in function `julia__truncsfbf2':
/project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/src/runtime_intrinsics.c:379:(.text+0x571): undefined reference to `__truncsfbf2'
/opt/software/binutils/2/2.41-gold/bin/ld: ./runtime_intrinsics.o: in function `julia__truncdfbf2':
/project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/src/runtime_intrinsics.c:385:(.text+0x5fc): undefined reference to `__truncsfbf2'
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [Makefile:391: /project/soft-raid1-admin/build/vkolev/compile/julia-1.11.1/usr/lib/libjulia-internal.so.1.11.1] Error 1
make: *** [Makefile:101: julia-src-release] Error 2

@TidbitSoftware

TidbitSoftware commented Oct 30, 2024

For what it's worth, I'm getting the same issue compiling HDF5 on an Intel-based Mac with the latest versions of Xcode and CLT installed (with either selected). I did update Homebrew today, so I'm not sure whether a dependency in there could be causing this.

Edit: On another Intel-based machine, I was able to verify that the issue was not with CLT or Homebrew, as neither has been updated for about 6 months. Rolling back to a previous version of the HDF5 source worked, so it's an issue with the implementation (not properly conditioning for Intel-based machines, I believe).
