Task block detector deadlocks with exception throwing #298
Comments
Not sure how we can fix this. Does that lock need to be recursive? |
On Tue, Jun 06, 2017 at 04:13:51AM -0700, Avi Kivity wrote:
> Not sure how we can fix this. Does that lock need to be recursive?
After looking into it some more, the lock is indeed recursive, but it is
not signal-safe.
…--
Gleb.
|
Just hit this again. |
Possible solution: hijack __cxa_throw, set a thread-local flag, call the original __cxa_throw, unset the flag. The signal handler can then just look at the flag and look the other way if it is set. |
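A minimal sketch of the flag idea, for illustration only. It does not interpose on the real __cxa_throw; a hypothetical wrapper `flagged_throw` stands in for the hijacked entry point, and `stall_signal_handler`, `throw_in_progress` and `demo_flagged_throw` are made-up names. Since a throw does not return to its caller, the sketch assumes the flag is cleared once the exception has been caught. The counters only exist to make the behaviour observable.

```cpp
#include <csignal>
#include <stdexcept>
#include <utility>

// Per-thread marker: non-zero while this thread is busy throwing.
// (Assumption: static TLS, so reading it from a signal handler is safe.)
thread_local volatile std::sig_atomic_t throw_in_progress = 0;

// Observable results of the handler's decision.
volatile std::sig_atomic_t backtraces_taken = 0;
volatile std::sig_atomic_t backtraces_skipped = 0;

// Stand-in for the task block detector's signal handler.
void stall_signal_handler(int) {
    if (throw_in_progress) {
        // The interrupted code may hold the unwinder's lock: look the other way.
        backtraces_skipped = backtraces_skipped + 1;
        return;
    }
    // A real handler would unwind the stack here.
    backtraces_taken = backtraces_taken + 1;
}

// Hypothetical stand-in for a hijacked __cxa_throw: mark the thread, then throw.
template <typename E>
[[noreturn]] void flagged_throw(E&& e) {
    throw_in_progress = 1;
    throw std::forward<E>(e);
}

// Returns true if a signal raised while the flag is set was skipped
// and a signal raised after the flag was cleared took a backtrace.
bool demo_flagged_throw() {
    const int skipped_before = backtraces_skipped;
    const int taken_before = backtraces_taken;
    std::signal(SIGUSR1, stall_signal_handler);
    try {
        flagged_throw(std::runtime_error("write timeout"));
    } catch (const std::runtime_error&) {
        std::raise(SIGUSR1);     // simulates the detector firing mid-throw
        throw_in_progress = 0;   // exception caught: clear the marker
    }
    std::raise(SIGUSR1);         // detector firing during normal execution
    return backtraces_skipped == skipped_before + 1 &&
           backtraces_taken == taken_before + 1;
}
```

Calling `demo_flagged_throw()` once should report that the first signal was ignored and the second one handled. The cost of this approach is that a stall that happens to coincide with a throw goes unreported.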
Well, 464f5e3 sort of fixes the bug since
there is no more lock to deadlock on. With gcc older than 7 there is still
a lock in _Unwind_Find_FDE, but it is taken and released immediately,
so it is much harder to hit the deadlock. But if you tried with gcc 7,
the bug should not exist there at all, so no wonder you could not hit
it.
…--
Gleb.
|
I suspected it, so I tried with gcc 6.3 (also to try on a bigger machine). |
I also hit this problem with scylla-2.0.0 on CentOS Linux release 7.2.1511. Many tasks hang because of a deadlock, which is caused by exceptions (write timeout). Thread 1 (Thread 0x7fb4a4c2f080 (LWP 48314)): |
@gleb-cloudius this is fixed. Let's close it. |
@avikivity this seems fixed and I don't have permissions to close seastar issues. |
Exception throwing takes the symbol table lock during stack unwinding. If many threads throw exceptions simultaneously, a thread may wait for the lock for a long time, at which point the task block detector will run, try to unwind the stack too, and deadlock as a result:
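The hazard can be sketched in a few lines, under simplifying assumptions: a plain `std::atomic_flag` stands in for the unwinder's lock (the real lock is different), and the handler only probes the lock instead of blocking on it, so the program reports the would-be deadlock rather than hanging. All names here are hypothetical.

```cpp
#include <atomic>
#include <csignal>

// Stand-in for the symbol table lock taken during stack unwinding.
std::atomic_flag symbol_table_lock = ATOMIC_FLAG_INIT;

// Set by the handler when it finds the lock already held.
volatile std::sig_atomic_t would_deadlock = 0;

// Stand-in for the task block detector: it wants the same lock to unwind.
void detector_signal_handler(int) {
    if (symbol_table_lock.test_and_set(std::memory_order_acquire)) {
        // The lock is held by the code this handler interrupted, which cannot
        // resume until the handler returns. A blocking acquire would never finish.
        would_deadlock = 1;
        return;
    }
    // Lock was free: "unwind", then release it.
    symbol_table_lock.clear(std::memory_order_release);
}

// Returns true if the handler observed the lock held by the interrupted code.
bool demo_signal_during_unwind() {
    would_deadlock = 0;
    std::signal(SIGUSR2, detector_signal_handler);

    // The "throwing" code takes the lock, as the unwinder would.
    while (symbol_table_lock.test_and_set(std::memory_order_acquire)) {
    }
    std::raise(SIGUSR2);  // detector fires while the lock is held
    symbol_table_lock.clear(std::memory_order_release);

    return would_deadlock == 1;
}
```

With a blocking acquire in the handler instead of the probe, the thread would be stuck inside its own signal handler, which is the shape of the hang described above.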