Use __tdata_align to align thread local storage #504
libctru/source/system/syscalls.c
Outdated
@@ -168,7 +164,7 @@ void initThreadVars(struct Thread_tag *thread)
 	tv->thread_ptr = thread;
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Warray-bounds"
-	tv->tls_tp = (thread != NULL ? (u8*)thread->stacktop : __tls_start) - 8; // Arm ELF TLS ABI mandates an 8-byte header
+	tv->tls_tp = (thread != NULL ? (u8*)thread->stacktop : __tls_start);
This is technically a change in behavior. 8 additional bytes are allocated for each thread to be used as the header, which seemed necessary for the alignment to be correct as I was experimenting, but there might be a way to avoid it.
The double-pointer-sized TLS header is only used by the more general, shared-library-oriented TLS models, which we don't use (we effectively use local-exec). That is why libctru (and libnx) fake the existence of the header by simply subtracting its size from the real start of the TLS area. I don't believe it is necessary to introduce this extra header at all, and I am not sure why you had alignment problems without it. The only thing I can think of is that the stack on 32-bit machines typically needs 8-byte alignment, so that should be the baseline.
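As a rough illustration of the header trick described above (not libctru's actual code): assuming local-exec accesses resolve to thread pointer + 8 + offset within the TLS area, placing the thread pointer 8 bytes before the real start of the TLS area makes offset 0 land on the first TLS byte, without ever allocating the header. The names `TLS_HEADER_SIZE`, `fake_tls_tp` and `tls_var_addr` are hypothetical, and the model only covers TLS alignments of 8 bytes or less.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the Arm ELF TLS ABI reserves two pointer-sized words
 * (8 bytes on 32-bit Arm) in front of the TLS data. */
#define TLS_HEADER_SIZE 8u

/* Hypothetical helper: derive the thread pointer for a TLS area that
 * starts at tls_start, without allocating the header itself. Integer
 * arithmetic is used so no out-of-bounds pointer is ever formed. */
static uintptr_t fake_tls_tp(uintptr_t tls_start)
{
    return tls_start - TLS_HEADER_SIZE;
}

/* Hypothetical model of a local-exec access: the address of the variable
 * located `offset` bytes into the TLS area (alignment <= 8 assumed). */
static uintptr_t tls_var_addr(uintptr_t tp, size_t offset)
{
    return tp + TLS_HEADER_SIZE + offset;
}
```

With this model, the header bytes are never read, so the memory just below the TLS area can belong to something else entirely (for example, the top of the thread's stack).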
It took a bit of trial and error to get right, since the location of the header also needs to be aligned properly, but I think I figured it out, and pushed my changes.
This appears to work with all the test cases I listed above in the description as well.
Since only ARM_TLS_LE32 is used in practice with this library, the 8-byte TLS header goes unused, so we can just fake it by subtracting 8 from the data offset and using that as tls_tp instead.
Hey, guys, so this is going to sound kind of random, buuuut... I'm a developer on the Sega Dreamcast indie SDK, KallistiOS, and I recently finished up my own quest to implement static TLS on the thing, and followed in the footsteps you guys laid out here and some of the alignment issues you ran into. Wound up making a really big multithreaded test suite just doing terrible things with .tdata and .tbss manually, automatically, and over-aligned variables to get things working. First of all, thanks. Secondly, I just wanted to make sure I'm not missing something and you guys don't have any issues, but don't you also have to align the .tbss data similarly? Check what I had to do here: Anyway, kudos from the DC scene. Looking forward to working with devkitPro some day with my own engine. :)
@gyrovorbis wow, I'm amazed that this somehow helped someone on another project! Glad you were able to find it and found it useful for your purposes as well. As for the alignment of
It looks like your code is doing some of the work in its
Fixes #497
Accompanying PR: devkitPro/devkitarm-crtls#6
Cases I was able to test:
Create static __thread variables of the following types and verify their values are initialized as expected, both in the main thread and one created with threadCreate: