-
-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TLS memory errors with aligned thread-local variables #497
Comments
Generally, Re: non-C/C++ languages. It might sound disappointing, but it is not advised to attempt to bring them to devkitPro homebrew environments, as they constitute an endless source of build maintainability and reproducibility problems; not to mention problems caused by necessary and unavoidable API/ABI changes in our upstream, which encourages users of those 3rd party languages to damage their toolchain installations, and ultimately produces unhappy users that are unable to compile simple programs. |
Thanks for the information! It is very interesting, but this explanation doesn't provide an answer to why there is a difference between low or high optimization levels.
I'd lie if I said I knew one. I have no idea why, but a math library (deep down in a dependency tree for other things I was testing) used alignment in TLS variables. It is important to note however how much this kind of functionality is used in Rust programs. Because of how easy it is to manage multi-threaded applications in Rust, many libraries include into their systems non-optional threading functionality. We are working to make managing 3DS preemptive threads just as easy, but to have native support for these kinds of things we need to make some compromises. For now, we have implemented (a quite wasteful, yet highly compatible) heap-based TLS system. Having out-of-the-box support for Rust's most popular libraries is one of our objectives.
We know, I know. It was clear since the start that maintaining our toolchain would be the hardest part (especially long term), which is why I always took some precautions before attempting any important decisions. We aim at abstracting It's a big task, but we strive to overcome it, thanks to our efforts and those of the many contributors that have already joined us. |
@fincs would it be possible to set something like this in the linker script?
The first Thread t = (Thread)memalign(*(size_t*)(__tdata_lma),allocsize+tlssize); Obviously, the rest of the section would be offset by the extra data but I think that could be worked around in the implementation? From a brief test, I can print the value and it appears to correctly reflect the max alignment, so maybe this would be a way to support arbitrary aligns in TLS? I'm not sure how most TLS implementations handle this, but the linker seems to me like the only place that would have the right information to insert for runtime usage...
I don't know too much about the details of the algorithm, or the performance implications, but it comes up in this change to add threading support to a matrix multiplication library: bluss/matrixmultiply@ I assume that the algorithm primarily targets x86 platforms but is meant to be portable and efficient, and it looks like they are using the TLS as some kind of buffer for for parallel calculations, maybe so the whole thing fits in a cache line or something like that? I don't know too much about the details or performance implications there. |
After a fair bit of trial and error, I think I have a way to make the TLS implementation work for over-aligned variables like these. I believe it's possible to make the changes in a backwards-compatible way as well, if necessary, and that the implementation I have does not incur a significant additional runtime cost compared to the existing implementation (other than the additional memory usage you mentioned, which should only affect executables that explicitly use these kinds of over-aligned TLS variables). @fincs would you accept PRs here and in https://github.com/devkitPro/devkitarm-crtls to make this possible? |
Sure, go for it. If all, I'd try to avoid placing the |
What I have now puts it in the Do you think storing the value in a different section is okay, or should I try to work around 3dsxtool to make a "valid" relocation (like |
I anticipated you would run into this problem. It is indeed not correct to try to expose the desired value as a symbol. Symbols are for addresses to "objects", not raw non-address values; and the linker (and 3dsxtool) will get confused otherwise, thinking there is an object at said address.
Seems fine to me. I had a look at LD documentation and it seems that the section must be defined prior to using the |
Yeah, I didn't expect it to work when I first tried it, but I think the reason it can work is this:
Because the I'll see if I can post a version of what I have tonight or tomorrow for you to review and we can go from there. |
Born from: rust3ds/ctru-rs#60
Description
After some testing on our Rust libraries for this system, we've noticed a weird bug with the use of thread-local storage. With some investigation, we have pinpointed the problem to the use of aligned thread-local variables. As our tests show, the offset of the variable pointers are wrong when one or more aligned TLS variables exist, causing memory leaks, access of uninitialized memory etc. This issue appears in both Rust and C programs.
Furthermore, this issue seems to only affect (at least in our examples) programs with a
debug
optimization level.E.g. :
debug
mode in Rust reproduces the bug.release
mode in Rust doesn't reproduce the bug.-Og
in C reproduces the bug.-O2
in C doesn't reproduce the bug.Here is a test program to try the issue. Remember to use
-Og
to replicate:Notes about our Rust toolchain
Our Rust programs mainly use
libctru
(to substitute missingnewlib
functions), and thearm-none-eabi-gcc
linker. Most of the linking flags are the same (basically those in a normal Makefile but in the Rust target-specification). Either way, it's hard to believe this one to be an issue in either of the compilers (rustc
andgcc
) as the issue reproduces in both. It's likely an issue withlibctru
or the linker.The text was updated successfully, but these errors were encountered: