LLVM mis-optimize due to returntwice function #17288
Error in TLS access should error much more reliably and earlier than this. Does |
It does. |
LLVM seems to be generating wrong asm. I reduced it to something that doesn't need libcuda (load a dummy shared library instead) but I can only get segfault when running it under |
OK, with this I could get a segfault outside
immutable CuError
code::Int
info::Nullable{String}
CuError(code) = new(code, Nullable{String}())
end
const libcuda = Ref{Ptr{Void}}(C_NULL)
export CuModuleData
"Create a CUDA module from a string containing PTX code."
function CuModuleData(data)
handle_ref = Ref{Ptr{Void}}()
options = Dict{Cint,Any}()
options[0] = Array(UInt8, 1024*1024)
optionKeys, optionValues = encode(options)
try
throw(CuError(ccall(Libdl.dlsym(libcuda[], :cuModuleLoadDataEx), Cint,
(Ptr{Ptr{Void}}, Ptr{Cchar}, Cuint, Ref{Cint}, Ref{Ptr{Void}}), handle_ref, data, length(optionKeys), optionKeys, optionValues)))
catch err
err == 1
end
end
@noinline function encode(options::Dict{Cint,Any})
return Cint[], Ptr{Void}[]
end
CuModuleData("") |
Thanks; yes this case also segfaults on both aforementioned systems. |
Reproduced with the following C code in clang, so this is an LLVM bug. The C program should print
AFAICT what's happening is that the no-return branch (the one that ends with
What's happening in the Julia code is basically replacing
//
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>
jmp_buf env;
__attribute__((noinline)) int f2(int v)
{
__asm__ volatile("":::"memory");
return v * v;
}
__attribute__((noinline)) int f(int a)
{
int b = random();
int c = random();
int d = random();
int e = random();
int f = random();
int g = random();
int h = random();
int i = random();
double k = f2(b) + f2(c + f2(d + f2(e + f2(f + f2(g + f2(h + i))))));
k *= b;
k -= c;
k += i;
if (setjmp(env) == 0) {
printf("%d\n", a + 4);
b = random();
c = random();
d = random();
e = random();
f = random();
g = random();
h = random();
i = random();
k = f2(b) + f2(c + f2(d + f2(e + f2(f + f2(g + f2(h + i))))));
k *= b;
k -= c;
k += i;
printf("%d\n", a + 4);
b = random();
c = random();
d = random();
e = random();
f = random();
g = random();
h = random();
i = random();
k = f2(b) + f2(c + f2(d + f2(e + f2(f + f2(g + f2(h + i))))));
k *= b;
k -= c;
k += i;
printf("%d\n", a + 4);
b = random();
c = random();
d = random();
e = random();
f = random();
g = random();
h = random();
i = random();
k = f2(b) + f2(c + f2(d + f2(e + f2(f + f2(g + f2(h + i))))));
k *= b;
k -= c;
k += i;
printf("%d\n", a + 4);
printf("%f\n", k);
longjmp(env, 1);
}
else {
printf("%d\n", a);
}
return a;
}
int main()
{
return f(0);
} |
I'm also pretty sure it is not UB to make |
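On the undefined-behavior question, some background that I believe is accurate: the C standard only leaves indeterminate those non-volatile automatic variables that are modified between the `setjmp` call and the `longjmp`. The reproducer's `a` is never written after `setjmp`, so reading it in the `else` branch should be well-defined. A minimal sketch of that rule follows; `retry_count` and its variables are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical helper: jumps back to the setjmp point `retries` times.
 * `attempts` is written between setjmp and longjmp, so it must be volatile
 * to have a defined value after the jump. `a` and `retries` are never
 * written after setjmp, so they keep their values without volatile. */
int retry_count(int a, int retries)
{
    jmp_buf env;
    volatile int attempts = 0;

    (void)setjmp(env);            /* execution resumes here after each longjmp */

    if (attempts < retries) {
        attempts++;               /* modified after setjmp: needs volatile */
        longjmp(env, 1);
    }
    return a + attempts;          /* reading `a` here is well-defined */
}
```

Under these assumptions, a compiler that clobbers `a` across the jump is miscompiling the function rather than exploiting undefined behavior.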
Reported upstream as https://llvm.org/bugs/show_bug.cgi?id=28431. This is likely a problem in isel, so I probably can't help much with debugging it further. |
And the following Julia code reproduces the same issue on 0.4 too.
@noinline f2(v) = v
@noinline p(v) = println(v)
@noinline r() = rand(Int)
function f(a)
b0 = r()
c0 = r()
d0 = r()
e0 = r()
f0 = r()
g0 = r()
h0 = r()
i0 = r()
k::Float64 = f2(b0) + f2(c0 + f2(d0 + f2(e0 + f2(f0 + f2(g0 + f2(h0 + i0))))));
k *= b0
k -= c0
k += i0
try
b = r()
c = r()
d = r()
e = r()
f = r()
g = r()
h = r()
i = r()
k += f2(b) + f2(c + f2(d + f2(e + f2(f + f2(g + f2(h + i))))));
k *= b
k -= c
k += i
p(a + 4)
b = r()
c = r()
d = r()
e = r()
f = r()
g = r()
h = r()
i = r()
k += f2(b) + f2(c + f2(d + f2(e + f2(f + f2(g + f2(h + i))))));
k *= b
k -= c
k += i
p(a + 4)
b = r()
c = r()
d = r()
e = r()
f = r()
g = r()
h = r()
i = r()
k += f2(b) + f2(c + f2(d + f2(e + f2(f + f2(g + f2(h + i))))));
k *= b
k -= c
k += i
p(a + 4)
b = r()
c = r()
d = r()
e = r()
f = r()
g = r()
h = r()
i = r()
k += f2(b) + f2(c + f2(d + f2(e + f2(f + f2(g + f2(h + i))))));
k *= b
k -= c
k += i
p(a + 4)
b = r()
c = r()
d = r()
e = r()
f = r()
g = r()
h = r()
i = r()
k += f2(b) + f2(c + f2(d + f2(e + f2(f + f2(g + f2(h + i))))));
k *= b
k -= c
k += i
p(a + 4)
b = r()
c = r()
d = r()
e = r()
f = r()
g = r()
h = r()
i = r()
k += f2(b) + f2(c + f2(d + f2(e + f2(f + f2(g + f2(h + i))))));
k *= b
k -= c
k += i
p(a + 4)
throw(k)
catch
p(a)
end
end
# @code_llvm f(0)
# @code_native f(0)
f(0) |
Impressive sleuthing, thanks for looking into this! |
It's more active than many other llvm issues I've seen.
Not really. Maybe try to move something to other functions to keep the register pressure low in the function with try-catch? |
Also, it's probably better to check the return status directly instead of throwing it eagerly and rethrowing it after checking its type |
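A rough sketch of the "check the status directly" suggestion, written in C to match the reproducer above. Everything here is hypothetical: `load_module_checked`, `loader_fn`, and `fake_loader` are illustration-only names, not libcuda or CUDAdrv API. The value 209 is borrowed from the `ERROR_NO_BINARY_FOR_GPU` comparison mentioned in the original report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; 209 mirrors the value quoted in the report. */
enum { STATUS_OK = 0, STATUS_NO_BINARY = 209 };

/* Hypothetical loader signature: returns a status, fills in *handle. */
typedef int (*loader_fn)(const char *data, void **handle);

/* Call the loader once and classify the status without raising anything.
 * Returns 1 on success, 0 for the expected recoverable failure, and -1 for
 * any other error (only then would the caller raise an exception). */
int load_module_checked(loader_fn load, const char *data, void **handle)
{
    assert(load != NULL && handle != NULL);
    int status = load(data, handle);
    if (status == STATUS_OK)
        return 1;
    if (status == STATUS_NO_BINARY)
        return 0;
    return -1;
}

/* Stand-in loader for demonstration: the first byte of data picks the result. */
static int fake_loader(const char *data, void **handle)
{
    static int dummy_module;
    *handle = NULL;
    if (data[0] == 'n')
        return STATUS_NO_BINARY;
    if (data[0] == 'x')
        return 1;                 /* some other, unexpected error code */
    *handle = &dummy_module;
    return STATUS_OK;
}
```

With this shape, the common path never enters a try/catch frame, so the function that inspects the status contains no `setjmp` at all.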
Yeah, but this is one of the few places where I'm catching the error and rethrowing it (just to add some additional diagnostic information, so it can be slow). In general, it is directly thrown from within |
I just ran into this, in the particularly nasty form of any error at the REPL causing a segfault. |
This game is ridiculous, we make the code easier for llvm to optimize and llvm mis-optimizes it..... |
I added a workaround in #17543 that should have the least performance impact; it fixes the segfault for my reduced example above, at least. However, it blocks the possibility of inlining the allocation ultra-fast path in the near future, which is a very natural next step and is why the pool addresses are used in the first place. |
LLVM has really bad support for returns_twice functions and can incorrectly move memory operations (both at the IR and machine code level) due to a missing control flow edge. By outlining the exception body, we can hide these functions from LLVM completely (they only exist in C code) and prevent all miscompilation.
This also makes it much easier to check the correctness of the heap to stack allocation optimization, especially since not all memory operation intrinsics in LLVM have a volatile counterpart.
This will obviously inhibit some valid optimizations too. These are mainly forwarding of memory operations from the caller to the exception body (since the other way around is almost always invalid) and can be improved with some simple IPO.
This also makes it unnecessary to mark any memory operations on the stack with `volatile`; this should also improve optimization in certain cases.
Since we are scanning all the instructions in the outlined code anyway, this also includes a simple optimization to delete exception frames that can't trigger.
This implements a tweaked version of https://discourse.julialang.org/t/avoid-llvm-setjmp-bug/1140
Fix #17288
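The outlining idea described in this commit message can be sketched in plain C. This is an illustrative assumption of the general shape, not the actual Julia runtime code: `sketch_try`, `sketch_throw`, `try_body_fn`, and `sum_until` are hypothetical names. The point is that the try body becomes an ordinary function that never sees `setjmp`, while a single small helper is the only caller of `setjmp`.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical signature for an outlined try body. */
typedef void (*try_body_fn)(void *data);

/* Innermost active handler; NULL when no try frame is active. */
static jmp_buf *current_handler = NULL;

/* Raise an "exception": jump back to the innermost active handler. */
void sketch_throw(void)
{
    assert(current_handler != NULL);
    longjmp(*current_handler, 1);
}

/* The only function that calls setjmp. The outlined body is an ordinary
 * call from the optimizer's point of view.
 * Returns 0 if the body finished normally, 1 if it threw. */
int sketch_try(try_body_fn body, void *data)
{
    jmp_buf env;
    jmp_buf *volatile prev = current_handler;  /* saved before setjmp */
    int threw;

    if (setjmp(env) == 0) {
        current_handler = &env;
        body(data);
        threw = 0;
    } else {
        threw = 1;
    }
    current_handler = prev;                    /* pop the frame on both paths */
    return threw;
}

/* Example outlined body: all state shared with the caller lives in memory
 * behind a pointer, not in locals of the function that called setjmp. */
struct acc { int value; int fail_at; };

static void sum_until(void *p)
{
    struct acc *a = p;
    for (int i = 0; i < 10; i++) {
        if (i == a->fail_at)
            sketch_throw();
        a->value += i;
    }
}
```

A caller would write `struct acc a = {0, 3}; int threw = sketch_try(sum_until, &a);` and then inspect `a.value`. Because the body communicates only through memory, forwarding of values between the caller and the body is lost, which is the optimization cost the message mentions.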
Cf. https://reviews.llvm.org/D75967. Thanks @yuyichao for providing such a nice reproducer! Theoretically we could consider including that in our LLVM patchset, regardless of whether upstream accepts it (depending on whether we are sure enough that I didn't break anything else). I am not very confident that LLVM has no other setjmp-related bugs; the MachineIR world is a weird, foreign, wild-west place. |
Something similar to a subset of that patch seems to have been merged in https://reviews.llvm.org/D77767 |
Is this bug actually still open? Do we have a function that can reproduce it still? |
FWIW, the Julia level reproducer doesn't reproduce for me anymore:
|
One of my packages (CUDAdrv) has recently started failing on julia master, with a segfault in typemap.c. I've bisected this issue to e2bd129 (all backtraces and line numbers below are on that commit's tree). I'm not sure where to start debugging this, so I'm at least reporting it here already.
This causes a segfault in jl_typemap_level_assoc_exact:
Running in GDB makes it segfault somewhere else, but I assume due to the same problem (jl_typeof(NULL)):
... with this comparison (against 209 == CUDAdrv.ERROR_NO_BINARY_FOR_GPU) originating from:
I've not been able to reduce the test case, as reduced versions did not reliably trigger the segfault on all my systems anymore, while the full CUDAdrv test suite does. I've tested on two Linux64 systems (one Debian 8, one Arch), with fresh builds without any Makefile flags.
@yuyichao any ideas what might be causing this, or where to look for clues?