LLVM assertion failed on mac llvm10 debug build #13205

Closed
will opened this issue Mar 20, 2023 · 5 comments
Labels
kind:bug, topic:compiler:codegen, topic:compiler:debugger

@will
Contributor

will commented Mar 20, 2023

Bug Report

Using the prebuilt 1.7.3 binaries from the releases page, I get an error on the macOS build (but not on Linux) when doing a --debug build (but not a normal or release build). This may be related to #12589, but the LLVM versions are different.

This is on CrunchyData/bridge-cli@cafe97f, where I just added a Nix flake to pull in the precompiled version from this repo's release page. You should be able to run nix develop and then crystal build src/cli.cr --debug to get the same error:

$ crystal build src/cli.cr --debug
Assertion failed: (Idx < NumElements && "Invalid element idx!"), function getElementOffset, file /var/cache/omnibus/src/llvm/llvm-10.0.0.src/include/llvm/IR/DataLayout.h, line 608.
fish: Job 1, 'crystal build src/cli.cr --debug' terminated by signal SIGABRT (Abort)

$ crystal --version
Crystal 1.7.3 [d61a01e18] (2023-03-07)

LLVM: 10.0.0
Default target: aarch64-apple-darwin

On Linux, where it's Crystal 1.7.3 [d61a01e18] (2023-03-07) with LLVM 13.0.1, the problem doesn't reproduce.

I'm not sure if it's useful, but here is some lldb output:

(lldb) settings set -- target.run-args build src/cli.cr --debug
(lldb) run
Process 73997 launched: '/nix/store/qqr0s2ijmn4ls6i342f4c1hnybmfysf1-crystal-bin/bin/.crystal-wrapped' (arm64)
Assertion failed: (Idx < NumElements && "Invalid element idx!"), function getElementOffset, file /var/cache/omnibus/src/llvm/llvm-10.0.0.src/include/llvm/IR/DataLayout.h, line 608.
Process 73997 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = hit program assert
    frame #4: 0x0000000101db0e10 .crystal-wrapped`LLVMOffsetOfElement.cold.2 + 40
.crystal-wrapped`LLVMOffsetOfElement.cold.3:
->  0x101db0e10 <+0>:  stp    x29, x30, [sp, #-0x10]!
    0x101db0e14 <+4>:  mov    x29, sp
    0x101db0e18 <+8>:  adrp   x0, 1828
    0x101db0e1c <+12>: add    x0, x0, #0xd0b            ; "cast"

First 20 frames of the backtrace:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = hit program assert
    frame #0: 0x00000001977f2868 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000197829cec libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x00000001977622c8 libsystem_c.dylib`abort + 180
    frame #3: 0x0000000197761620 libsystem_c.dylib`__assert_rtn + 272
  * frame #4: 0x0000000101db0e10 .crystal-wrapped`LLVMOffsetOfElement.cold.2 + 40
    frame #5: 0x000000010130c504 .crystal-wrapped`LLVMOffsetOfElement + 84
    frame #6: 0x000000010059e038 .crystal-wrapped`*Crystal::CodeGenVisitor#get_debug_type<Crystal::Type+, Crystal::Type+>:(LibLLVM::MetadataRef | Nil) + 12200
    frame #7: 0x000000010059df58 .crystal-wrapped`*Crystal::CodeGenVisitor#get_debug_type<Crystal::Type+, Crystal::Type+>:(LibLLVM::MetadataRef | Nil) + 11976
    frame #8: 0x000000010059d3ac .crystal-wrapped`*Crystal::CodeGenVisitor#get_debug_type<Crystal::Type+, Crystal::Type+>:(LibLLVM::MetadataRef | Nil) + 8988
    frame #9: 0x000000010059df58 .crystal-wrapped`*Crystal::CodeGenVisitor#get_debug_type<Crystal::Type+, Crystal::Type+>:(LibLLVM::MetadataRef | Nil) + 11976
    frame #10: 0x000000010059d3ac .crystal-wrapped`*Crystal::CodeGenVisitor#get_debug_type<Crystal::Type+, Crystal::Type+>:(LibLLVM::MetadataRef | Nil) + 8988
    frame #11: 0x000000010059d3ac .crystal-wrapped`*Crystal::CodeGenVisitor#get_debug_type<Crystal::Type+, Crystal::Type+>:(LibLLVM::MetadataRef | Nil) + 8988
    frame #12: 0x000000010059df58 .crystal-wrapped`*Crystal::CodeGenVisitor#get_debug_type<Crystal::Type+, Crystal::Type+>:(LibLLVM::MetadataRef | Nil) + 11976
    frame #13: 0x000000010059df58 .crystal-wrapped`*Crystal::CodeGenVisitor#get_debug_type<Crystal::Type+, Crystal::Type+>:(LibLLVM::MetadataRef | Nil) + 11976
    frame #14: 0x000000010059df58 .crystal-wrapped`*Crystal::CodeGenVisitor#get_debug_type<Crystal::Type+, Crystal::Type+>:(LibLLVM::MetadataRef | Nil) + 11976
    frame #15: 0x00000001005d7ec0 .crystal-wrapped`*Crystal::CodeGenVisitor#codegen_fun_signature_non_external<String, Crystal::Def+, Crystal::Type+, Bool, Bool>:Array(Crystal::Arg) + 1800
    frame #16: 0x00000001005d49d8 .crystal-wrapped`*Crystal::CodeGenVisitor#codegen_fun<String, Crystal::Def+, Crystal::Type+, Bool, Crystal::CodeGenVisitor::ModuleInfo, Bool, Bool>:LLVM::Function + 900
    frame #17: 0x00000001005ea20c .crystal-wrapped`*Crystal::CodeGenVisitor#target_def_fun<Crystal::Def+, Crystal::Type+>:LLVM::Function + 2156
    frame #18: 0x00000001005e25ec .crystal-wrapped`*Crystal::CodeGenVisitor#visit<Crystal::Call>:Bool + 1204
    frame #19: 0x00000001005e3110 .crystal-wrapped`*Crystal::CodeGenVisitor#visit<Crystal::Call>:Bool + 4056
    frame #20: 0x00000001005c624c .crystal-wrapped`*Crystal::ASTNode+@Crystal::ASTNode#accept<Crystal::CodeGenVisitor>:Nil + 9812

There are almost 500 frames, so I attached the full trace: [backtrace.txt](https://github.com/crystal-lang/crystal/files/11020997/backtrace.txt)

@will will added the kind:bug A bug in the code. Does not apply to documentation, specs, etc. label Mar 20, 2023
@HertzDevil
Contributor

HertzDevil commented Mar 28, 2023

I have reduced it to:

lib LibFoo
  union U
    x : Int8
    y : Int16
  end
end

abstract class Foo
end

class Bar < Foo
  @x : LibFoo::U?
end

Bar.new

@beta-ziliani
Member

With LLVM 11.1 on an x86 Mac, it doesn't reproduce here.

@HertzDevil
Contributor

Just built LLVM in debug mode and this spec hits the same assertion failure:

it "codegens correct debug info for untyped expression (#4007 and #4008)" do
  codegen(%(
    require "prelude"
    int = 3
    case int
    when 0
      puts 0
    when 1, 2, Int32
      puts "1 | 2 | Int32"
    else
      puts int
    end
    ), debug: Crystal::Debug::All)
end

The reduction above broke in OP's setup because of Socket::IPAddress. I don't know whether the prelude has something else like that; this spec failure could be a different bug altogether.

@HertzDevil
Contributor

HertzDevil commented Apr 11, 2023

Managed to hit this again with a debug build of LLVM 16.0.0. This time the struct type is LibC::PthreadMutexT, which is a lib union on x86_64-linux-gnu. The reduction is just:

lib Foo
  union Bar
    a : Int32
    b : Int16
  end
end

Foo::Bar.new

It makes sense because C unions generated by Crystal only have 1 LLVM struct member for alignment and 1 optional member for padding. In this case %"union.Foo::Bar" = type { [1 x i32] }, so getting the "second" member of Foo::Bar's LLVM type corresponding to @b fails. The relevant code is:

if (ivar_type = ivar.type?) && (ivar_debug_type = get_debug_type(ivar_type))
  offset = @program.target_machine.data_layout.offset_of_element(struct_type, idx &+ (type.struct? ? 0 : 1))
  size = @program.target_machine.data_layout.size_in_bits(llvm_embedded_type(ivar_type))
  # FIXME structs like LibC::PthreadMutexT generate huge offset values
  next if offset > UInt64::MAX // 8u64
  member = di_builder.create_member_type(nil, name[1..-1], nil, 1, size, size, 8u64 * offset, LLVM::DIFlags::Zero, ivar_debug_type)
  element_types << member
end
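
To make the failure mode concrete, here is a minimal C sketch of the lookup. It is a toy model, not LLVM's or the Crystal compiler's code: toy_layout, toy_member_offset, and the is_union flag are hypothetical names invented for illustration. The layout records one byte offset per LLVM-level element, so a union lowered to a single element has nothing at index 1. The sketch answers 0 for unions and reports an out-of-range index instead of asserting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a struct layout: one byte offset per LLVM-level element. */
typedef struct {
    const uint64_t *element_offsets;
    size_t num_elements;
    bool is_union; /* hypothetical flag: the source type is a lib union */
} toy_layout;

/* Stores the byte offset of source-level member `idx` in *out.
   Returns false when `idx` has no matching element, which is the
   condition the "Invalid element idx!" assertion rejects. */
bool toy_member_offset(const toy_layout *layout, size_t idx, uint64_t *out) {
    assert(layout != NULL && out != NULL);
    if (layout->is_union) {
        *out = 0; /* every union member starts at the beginning of the storage */
        return true;
    }
    if (idx >= layout->num_elements) {
        return false; /* no element at this index */
    }
    *out = layout->element_offsets[idx];
    return true;
}
```

With a one-element layout, asking for member 1 of a union yields offset 0, while the same request on a non-union layout returns false rather than tripping an assertion.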

All lib unions with 3 or more members break debug builds regardless of the member types, and LibC::PthreadMutexT is also one of them. The bug doesn't reproduce on macOS with the default prelude and an empty source, because only GNU libc defines LibC::PthreadMutexT as a union; on the other hand, LibC::In6Addr includes a lib union on all platforms except WASI, and this union contains 3 members (just 2 on Windows), which might explain why require "socket" breaks everywhere. See also #7335.

More generally, offsetof on an extern union just breaks, even outside debug builds (#9744):

lib Foo
  union Bar
    x : Int32
    y : Int16
    z : Int8
  end
end

offsetof(Foo::Bar, @x) # => 0
offsetof(Foo::Bar, @y) # => 110689

# internal error
# offsetof(Foo::Bar, @z) # Invalid Int32: "93921398392816" (ArgumentError)

This offset must always be 0, because no lib union member is more strictly aligned than the union itself, so every member starts at the beginning of the union.
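
The same rule can be checked in plain C, where a lib union corresponds to a C union. The sketch below assumes foo_bar is a C counterpart of Foo::Bar; the names foo_bar and check_foo_bar_offsets are illustrative and do not come from the Crystal code base. It asserts that offsetof is 0 for every member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed C counterpart of the lib union Foo::Bar shown above. */
union foo_bar {
    int32_t x;
    int16_t y;
    int8_t z;
};

/* All members of a C union share the same starting address,
   so each offsetof expression below evaluates to 0. */
void check_foo_bar_offsets(void) {
    assert(offsetof(union foo_bar, x) == 0);
    assert(offsetof(union foo_bar, y) == 0);
    assert(offsetof(union foo_bar, z) == 0);
}
```

Calling check_foo_bar_offsets() returns silently, which is the behaviour one would expect offsetof(Foo::Bar, @y) and offsetof(Foo::Bar, @z) to match.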

@HertzDevil
Contributor

This was confirmed to be fixed with LLVM 10.0.1, and I don't think LLVM 10.0.0 as used in the OP would be any different.
