Optimize TLS access in sysimg. #14168
c.c. @vtjnash
lgtm (appveyor failure looks unrelated due to triggering this assert https://github.com/JuliaLang/julia/blob/yyc/tls-opt/src/codegen.cpp#L948)
Yeah, it happened a few times recently and I've just restarted the build.
a4baa39 to 7960d40
I'm finally able to test what exactly LLVM can optimize when we access TLS variables inside control flow, and it seems that LLVM can't hoist a const function call as well as I had hoped. The functions I tested with are

    @eval f(b) = if b
        nothing
    else
        $(Expr(:the_exception))
    end
    @eval g(n) = for i in 1:n
        $(Expr(:the_exception))
    end

On the current master, LLVM emits a call to […] Clang doesn't have this problem with otherwise identical-looking code and metadata ([…]
7960d40 to a5ef077
Fixed non-MCJIT compilation. Local tests passed with threading enabled on both LLVM 3.3 and 3.7.
* Instead of letting the linker resolve `jl_get_ptls_states` to the wrapper function in `libjulia.so`, use our own symbol table (`jl_sysimg_gvars`) to resolve the address to the actual getter function at initialization time.
* Emit call to `jl_get_ptls_states` at the beginning of the function. For some reason, LLVM couldn't hoist the function call out of the loop but is smart enough to move it down to a branch if it is not needed in other branches, or move it closer to where the result is actually needed.
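The second point can be illustrated with a minimal C sketch. This is not Julia's codegen; `tls_states_t`, `get_tls_states`, `bump_in_loop`, `bump_hoisted` and the `getter_calls` counter are hypothetical names used only to make the number of getter calls observable. It contrasts calling the getter on every loop iteration with calling it once at function entry and reusing the pointer.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-thread state struct. */
typedef struct { long counter; } tls_states_t;

static _Thread_local tls_states_t states;

/* Instrumentation for this sketch only: counts getter invocations. */
static int getter_calls = 0;

/* Hypothetical getter, playing the role of a TLS state accessor. */
tls_states_t *get_tls_states(void)
{
    getter_calls++;
    return &states;
}

/* The getter is called on every iteration. */
void bump_in_loop(int n)
{
    for (int i = 0; i < n; i++)
        get_tls_states()->counter++;
}

/* The getter is called once at function entry; the loop reuses the pointer. */
void bump_hoisted(int n)
{
    tls_states_t *ptls = get_tls_states();
    for (int i = 0; i < n; i++)
        ptls->counter++;
}
```

Calling `bump_in_loop(5)` invokes the getter five times, while `bump_hoisted(5)` invokes it once. Emitting the call at function entry gives this form directly, without relying on the optimizer to hoist it.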
a5ef077 to 877bae0
Added some comments since the ways we emit TLS references are a little complicated now. @vtjnash Mind having another look at this version? (especially about moving the call to […]
lgtm
should we be changing the […]
From my understanding, it shouldn't make a difference anymore. We currently provide two TLS variables, one in the […] The […] The […]
yeah msvc ignores it and gives a warning, which is one reason I'm asking. It might have an equivalent which is spelled differently, but it sounds like there are far fewer supported TLS models on windows so maybe not.
Also note that the TLS model we should use, as I mentioned above, should be the default one anyway, so we should be able to just remove it (the command line argument).
windows only really supports IE, so msvc is probably just going to complain that the flag is meaningless / impossible
Instead of letting the linker resolve `jl_get_ptls_states` to the wrapper function in `libjulia.so`, use our own symbol table (`jl_sysimg_gvars`) to resolve the address to the actual getter function at initialization time.

This saves one load and one jump for each TLS address access.
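As a rough illustration of the idea (a sketch under assumptions, not the actual sysimg loader), the C code below keeps a small table mapping a symbol name to a function-pointer slot and fills that slot with the real getter's address once at initialization. All names here (`slot_entry`, `image_slots`, `image_getter`, `init_image_slots`, `real_getter`) are hypothetical and do not correspond to Julia's internals.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-thread state struct. */
typedef struct { int dummy; } tls_states_t;

static _Thread_local tls_states_t states;

typedef tls_states_t *(*getter_fn)(void);

/* The "actual getter": returns the address of this thread's state. */
tls_states_t *real_getter(void)
{
    return &states;
}

/* Slot that precompiled code calls through; empty until initialization. */
static getter_fn image_getter = NULL;

/* Hypothetical symbol table: maps a name to the slot that needs patching. */
typedef struct {
    const char *name;
    getter_fn *slot;
} slot_entry;

static const slot_entry image_slots[] = {
    { "get_tls_states", &image_getter },
};

/* Run once at startup: store the real getter's address in each matching
 * slot. Returns the number of slots that were patched. */
int init_image_slots(void)
{
    int patched = 0;
    for (size_t i = 0; i < sizeof image_slots / sizeof image_slots[0]; i++) {
        if (strcmp(image_slots[i].name, "get_tls_states") == 0) {
            *image_slots[i].slot = real_getter;
            patched++;
        }
    }
    return patched;
}

/* What precompiled code would do: one indirect call through the slot,
 * with no intermediate wrapper function. */
tls_states_t *image_code_get_states(void)
{
    return image_getter();
}
```

After `init_image_slots()` has run, `image_code_get_states()` reaches the real getter through a single pointer load and an indirect call, instead of first jumping into a wrapper that then has to load and jump to the getter.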
Using the same benchmark I used in the PR that introduced `jl_get_ptls_states`. Only the `gcframe (sysimg)` and `gcframe` ones for threading on (shown as sysimg and JIT respectively) are included since nothing else is changed by this PR.

It's a little funny to see that the sysimg version is now consistently faster than the JIT version (by ~`0.22` clock cycles...).

Comparing the assembly of the JIT version with function address inlined

to the sysimg version with the function address loaded from the data section

Maybe it's the instruction order and/or the difference in code size that matters?