enableWorkerThreads : crash upon SIGINT #419
Looks like worker threads are currently completely broken. For a simple application, just enabling this functionality makes the server hang after completing the initial TCP handshake, with no data written.
I've set up an Arch VM and can't reproduce it - both Ctrl+C and making requests work fine. It also works on Windows and Ubuntu. |
I can reliably reproduce it on both a native desktop installation and a VirtualBox instance, with both 0.7.18 and master. dmd / dub are installed from the repository. What was your VM setup?
Hm ok, did you try with vibe.d master? There have been a couple of fixes since then. I'll try 0.7.18 to be sure it's nothing system specific. |
Oh sorry, overlooked that. So that's strange then. I'm testing on a fresh system: Up-to-date Arch Linux x86-64, DMD 2.064.2, vibe.d master, DUB master, everything "dub upgrade"d - running on VMware. Tried with your example code, as well as the "bench-http-server" example. |
Did you install dmd using |
The one from the zip file. I guess VMware vs. Virtual Box shouldn't make a difference on this level, but who knows. |
Smells bad. Will retest on binary from .zip shortly. |
Phew, luckily it still crashes :) |
Ok, will report after more detailed investigation. |
Looks like a corrupted fiber switch: all core dumps have garbage in registers like this:
Will need to do a very detailed investigation of event loop / fiber code to proceed so this is likely to be put on pause for now =/ |
libevent2.d:427
Should be
Unfortunately it can't, because the class is GC managed. This means that any other GC references (including the mutex in particular) might already be finalized before the destructor is called. Also, locking a mutex in a destructor is generally useless, because any competing thread will access the already destructed (and possibly freed) object after the mutex is unlocked, still resulting in a crash. The higher-level code needs to make sure that no one accesses the object anymore when its destructor is called. In this particular case (the call stack above) it also doesn't make sense for the destructor to be called, because the …
I'll set up the VM with multiple cores; thinking about it, that seems to be the most likely cause of the differences.
Still can't get it, even with multiple cores. I'll push a fix for the (un)register methods. |
… from a foreign thread. The GC calls finalizers in whatever thread the collection happens, which can cause unregisterObject() to be executed in a different thread than where the driver and the object live(d). See also #419.
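To illustrate the problem described in that commit message, here is a minimal sketch of one possible guard: the object remembers the thread it was created in and refuses to touch thread-bound state from any other thread. `Registry`, `registerObject` and this `unregisterObject` are hypothetical stand-ins written for this example, not the vibe.d driver code, and the actual fix may work differently.

```d
import core.thread : Thread;

/// Hypothetical stand-in for a per-thread driver that tracks registered objects.
final class Registry
{
    private Thread m_owner; // thread in which the registry was created
    private size_t m_count; // number of currently registered objects

    this() { m_owner = Thread.getThis(); }

    void registerObject()
    {
        assert(Thread.getThis() is m_owner, "must be called from the owner thread");
        m_count++;
    }

    /// Returns false and changes nothing when called from a foreign thread,
    /// e.g. from a finalizer that the GC happens to run in another thread.
    bool unregisterObject()
    {
        if (Thread.getThis() !is m_owner)
            return false;
        m_count--;
        return true;
    }

    size_t count() const { return m_count; }
}

void main()
{
    auto reg = new Registry;
    reg.registerObject();

    // Simulate a call arriving from a thread that does not own the registry.
    bool foreignResult = true;
    auto t = new Thread({ foreignResult = reg.unregisterObject(); });
    t.start();
    t.join();

    assert(!foreignResult);         // the foreign call was rejected
    assert(reg.count == 1);         // the state is untouched
    assert(reg.unregisterObject()); // the owner thread may unregister
    assert(reg.count == 0);
}
```

The guard only avoids touching state from the wrong thread; the owner thread still has to clean up the skipped entry itself at some point.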
Using master with that fix, I am still reliably getting the same backtrace with a crash on … Any suggestions?
I've understood the issue now and will prepare a fix. Windows and Linux work very differently. Windows in particular currently "sidesteps" the issue, because the break signal arrives in a special thread where …
On Linux, several issues are working together. First, the default hasher in …
So basically, the thread shutdown procedure needs to be adjusted to work from any thread and to use a …
I have tested the fix. Good stuff: it does not crash. Bad stuff: it hangs forever upon exit :) Though AFAIR you mentioned this as an inherent libevent2 issue; is that still true?
Though it seems to ignore even …
That sucks! I need to somehow be able to reproduce that. For me it shuts down in a perfectly clean way... Could there be any setting that influences how things are handled for your OS installation? Otherwise, you could try to put debug log statements on (almost) each line of |
BTW, |
Pretty much default Arch x64 virtualbox instance here, 1 CPU setup. Only additional packages installed from repos are
I didn't check |
Wait a minute. Using |
…ler for all threads to shut down. Also replaces the implicitly "__gshared" s_core.m_eventLoopRunning by a thread local s_eventLoopRunning. This sometimes caused threads to not exit when they should. See #419.
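As an illustration of the distinction mentioned in that commit message: in D, a plain module-level variable gets one copy per thread, while a `__gshared` variable is a single copy shared by the whole process. The variable names below are made up for this sketch and are not the vibe.d ones.

```d
import core.thread : Thread;
import std.stdio : writeln;

bool tlsRunning;              // module-level variables are thread-local by default
__gshared bool sharedRunning; // a single copy visible to every thread

/// Sets both flags in the calling thread and reports what a fresh thread sees.
bool[2] seenByNewThread()
{
    tlsRunning = true;
    sharedRunning = true;

    bool[2] seen;
    auto t = new Thread({
        seen[0] = tlsRunning;    // the new thread has its own copy, still false
        seen[1] = sharedRunning; // same memory as in the parent, so true
    });
    t.start();
    t.join();
    return seen;
}

void main()
{
    auto seen = seenByNewThread();
    writeln("thread-local seen: ", seen[0], ", __gshared seen: ", seen[1]);
}
```

A "running" flag that lives in a shared object is therefore seen by all threads at once, which matches the symptom described above where threads did not exit when they should.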
There have still been two issues (one shared variable which was thought to be thread local and the signal handler was taking too long to finish). The code is now also written to unify handling across worker threads and normal threads, which makes it easier to understand. In the current state everything works reliably for me using either "kill" or Ctrl+C. |
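For readers wondering what a short-running signal handler can look like, here is a rough sketch under my own assumptions, not the code from the commit: the handler only sets an atomic flag and returns, and each thread notices the flag on its own. `exitRequested`, `onSignal` and `shutdownViaSignal` are hypothetical names, and `raise(SIGINT)` stands in for pressing Ctrl+C.

```d
import core.atomic : atomicLoad, atomicStore;
import core.stdc.signal : SIGINT, raise, signal;
import core.thread : Thread;
import core.time : msecs;

shared bool exitRequested; // hypothetical flag, not a vibe.d symbol

/// Does the bare minimum: records the request and returns immediately.
extern (C) void onSignal(int) nothrow @nogc
{
    atomicStore(exitRequested, true);
}

/// Starts a worker, delivers SIGINT, and reports whether the worker exited.
bool shutdownViaSignal()
{
    atomicStore(exitRequested, false);
    signal(SIGINT, &onSignal);

    bool workerExited;
    auto worker = new Thread({
        // Stand-in for an event loop that polls the exit flag.
        while (!atomicLoad(exitRequested))
            Thread.sleep(10.msecs);
        workerExited = true;
    });
    worker.start();

    raise(SIGINT); // the handler runs in the calling thread and returns at once
    worker.join(); // waiting happens here, outside the handler
    return workerExited;
}

void main()
{
    assert(shutdownViaSignal());
}
```

The point of the design is that nothing in the handler blocks or waits for other threads; joining them is left to normal code.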
Yay, works for me! You can call yourself "telepathy debugging master" now :) |
P.S. This has also fixed the issue with hanging requests when worker threads / distribution are enabled.
That's some purely awesome multi-thread handling with atomics. I'm always impressed by it because it's so modern |
Phew! :) Okay, now it's time for me to finally take a closer look at the session topics! |
Using a trivial listener app:
Crashes after hitting Ctrl+C (Arch Linux x64).
Backtrace: