Flaky test pthreads_exit on ARM #2075
Comments
It is clear that the block starting at 0x4eb74000 is not freed.
So it is the memory for thread 17799's dstack that is not freed.
So basically, when a thread is newly created and still in its initialization phase while the app is exiting, the new thread simply waits without freeing its own dstack.
So this is a regression? When the thread init code was first written, the dstack was not allocated yet at this exit-check point. We changed it to allocate the dstack at the parent clone point so we could store data on it.
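To make the leak concrete, here is a minimal sketch of the pattern described above, not DynamoRIO's actual code. All names (`thread_rec_t`, `parent_prepare_thread`, `child_thread_init`, `process_exiting`, `live_dstacks`) are hypothetical, and the sketch assumes the new thread is not yet executing on the dstack at the exit-check point, so it can safely release it there.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-thread record; the parent fills in dstack at clone time. */
typedef struct {
    void *dstack;
} thread_rec_t;

static atomic_bool process_exiting; /* set once the app starts exiting */
static atomic_int live_dstacks;     /* dstacks allocated and not yet freed */

/* Parent side: allocate the child's dstack before the child runs. */
static bool parent_prepare_thread(thread_rec_t *t, size_t size)
{
    assert(size > 0);
    t->dstack = malloc(size);
    if (t->dstack == NULL)
        return false;
    atomic_fetch_add(&live_dstacks, 1);
    return true;
}

static void release_dstack(thread_rec_t *t)
{
    if (t->dstack != NULL) {
        free(t->dstack);
        t->dstack = NULL;
        atomic_fetch_sub(&live_dstacks, 1);
    }
}

/* Child side: returns false if the process is exiting and the thread
 * should simply wait. Because the dstack now exists before this check,
 * it has to be released here or it shows up as an unfreed block. */
static bool child_thread_init(thread_rec_t *t)
{
    if (atomic_load(&process_exiting)) {
        release_dstack(t);
        return false;
    }
    return true; /* normal initialization would continue from here */
}
```

Dropping the `release_dstack()` call in the exit branch reproduces the shape of the report: `live_dstacks` stays at 1 after the child bails out.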
I can get this test to fail on AArch64, too. With a debug build of 38920e6 on Fedora 24:
After a while:
It's not the same error every time.
On some AArch64 systems, at least, all of these are flaky:
Here's the stack when the ASSERT in mutex_delete() fails:
So thread 28134 still holds the thread_initexit_lock. What is the call stack of that thread?
I don't seem to be able to reproduce this now. There's the same failure, but by the time I've managed to connect a debugger, owner is zero. However, I can still give you a backtrace for the other thread when that happens:
Both on AArch64 and on x86_64, without DEBUG, pthreads_test sometimes runs for about a second, and sometimes for about a minute. The driver script reports the test as having passed, but is it really working as intended?
To handle a thread exiting on attach, adds a timeout to wait_event() and its corresponding implementations: os_wait_event() on Windows and ksynch_wait() on UNIX. Uses the timeout to check whether a thread exited, and if so, to move on.

Augments the test from #2600 to test this race.

Abandons the api.detach_spawn test on Windows: i#2611 covers fixing the tricky problems on Windows. Leaves in place some fixes toward #2611:
+ Fixes a race where we put interception_code hooks in place before marking them +x
+ Increases MAX_THREADS_WAITING_FOR_DR_INIT

Fixes clang 32-bit missing __moddi3 by adding it to third_party/libgcc and linking that into x86 and arm.

To enable adding race checks, moves doing_detach inside the synchall and adds started_detach for the few checks that need pre-synch querying.

Removes the dynamo_thread_init_during_process_exit flag that was added in 45dd931 for #2075, as the UNIX uninit_thread_count solution from #2600 solves that problem on its own.

Fixes #2601