Crash handler hangs in Docker container (linux) #269
Comments
Interesting. I’ve never encountered this before. It sounds like your app is doing something specific, rather than this being a Docker issue, but I haven’t verified the Docker behavior yet. The reason I think so is that all pull requests have had their unit tests run on a Docker platform for a couple of years now, and a hanging loop at exit has never happened... I can’t say with 100% certainty that a fatal crash has happened during these runs. I will certainly try to replicate it. See the examples (unfortunately I haven’t added this to the documentation yet) for g3::setFatalPreLoggingHook() |
One important note if you try to reproduce it: the application must be PID 1. This is the PID that has special signal handling semantics in the kernel (usually taken by |
It seems that the special PID1 behavior is a Linux thing and not specific to Docker: https://hackernoon.com/my-process-became-pid-1-and-now-signals-behave-strangely-b05c52cc551c Option 1: declare your own signal handler for the signal emitted with exit() |
Option 4: possibly custom exit handling can be added with the std::atexit |
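To make option 4 concrete, here is a minimal sketch of how a std::atexit hook behaves. It is not g3log code: the names on_exit_hook, hook_runs and g_pipe_write_fd are made up for this illustration, and the pipe is only there so the parent can observe whether the hook ran. It shows that a hook registered with std::atexit runs when the process ends through exit(), and is skipped when it ends through _Exit().

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

namespace {
int g_pipe_write_fd = -1;  // hypothetical channel, used only by this demo

// Exit hook: report that it ran by writing one byte to the pipe.
void on_exit_hook() {
    const char mark = 'x';
    ssize_t ignored = write(g_pipe_write_fd, &mark, 1);
    (void)ignored;
}
}  // namespace

// Forks a child that registers the hook and then terminates through
// exit() or _Exit(). Returns true if the hook ran in the child.
bool hook_runs(bool use_quick_exit) {
    int fds[2];
    if (pipe(fds) != 0) return false;
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (pid == 0) {
        close(fds[0]);
        g_pipe_write_fd = fds[1];
        std::atexit(on_exit_hook);
        if (use_quick_exit) std::_Exit(0);  // skips atexit hooks
        std::exit(0);                       // runs atexit hooks
    }
    close(fds[1]);
    char buf = 0;
    // A successful 1-byte read means the hook ran; EOF means it was skipped.
    const bool ran = read(fds[0], &buf, 1) == 1;
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    return ran;
}
```

Calling hook_runs(false) exercises the exit() path and hook_runs(true) the _Exit() path. Whether an atexit hook is the right place for PID1 exit handling depends on the application; this only illustrates when such a hook is invoked.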
I don’t see any action needed on the G3log side regarding this. Although it is new to me, this is expected Linux behavior if you run the process as PID1. Please see options 1-4 and let me know how it goes. I’ll keep this open for a while longer |
Regarding the options:
Anyway, if you don't see this as an issue, it's ok. I just wanted to let you know. |
I do see it as an issue, but for PID1 processes only. To reiterate how the fatal handling currently works: once G3log has successfully shut down sinks etc., the original fatal handler is restored, if there was one, and the fatal signal is re-emitted. I.e., if you are using G3log on a non-PID1 system, the behavior should be as close to normal “fatal signal” handling as possible. The default signal exit function does:
The signal from the The reasoning was that if the Easy solution for you:
It’s the same solution you would have to implement if G3log wasn’t there
I’m not against 2. However, I want G3log to rely on the std library and cross-platform system calls as much as possible. So what other Does |
My understanding of PID1 type processes is that they are very special beasts and just exiting it rarely makes any sense. It makes more sense to reboot or to shutdown the whole system. I.e. if you have decided to use PID1 then it’s good coding and design practice to have your own defaults to deal with fatal exits/signals. This is probably (opt1 in my latest reply) the path to go instead of relying on G3log to handle the PID1 exit logic |
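A minimal sketch of what such an application-side default could look like, assuming POSIX. This is not part of g3log: terminate_on_signal, install_exit_handler and exit_code_after_signal are hypothetical names, and the 128 + signal exit code is just the usual shell convention. The idea is that the application installs its own handler for the fatal signal, and that handler ends the process with the async-signal-safe _exit().

```cpp
#include <cassert>
#include <cstdlib>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Handler: terminate immediately. _exit() is async-signal-safe,
// so it may be called from inside a signal handler.
extern "C" void terminate_on_signal(int signal_number) {
    _exit(128 + signal_number);
}

// Hypothetical helper: install the handler above for one signal.
bool install_exit_handler(int signal_number) {
    struct sigaction action {};
    action.sa_handler = terminate_on_signal;
    sigemptyset(&action.sa_mask);
    action.sa_flags = 0;
    return sigaction(signal_number, &action, nullptr) == 0;
}

// Demo: a forked child installs the handler and signals itself.
// Returns the child's exit code, or -1 if it did not exit normally.
int exit_code_after_signal(int signal_number) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        if (!install_exit_handler(signal_number)) _exit(1);
        kill(getpid(), signal_number);
        _exit(2);  // not expected: the handler should already have exited
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In a real application only install_exit_handler would be needed; the fork in exit_code_after_signal exists so the effect can be observed from a parent process. How this interacts with G3log depends on the handler being the one in place when the fatal signal is re-emitted, which I have not verified here.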
As for PID1, it's not special any more: containerized applications run as PID1 by default (PID namespace), and this configuration is recommended by Docker. I'm not insisting on fixing this in g3log, though I've spent some time figuring out why my application hangs in a container and operates fine otherwise. Having this thread here for others running into this, with all the information, is fine for me. |
Added information for recommended PID1 fatal handling in the API documentation. Ref: 01be7d4 |
Hi Kjell,
Applications using g3log hang (do not exit) when they encounter a crash (SIGFPE, SIGSEGV, etc...) while running in a Docker container.
Here's my understanding of the crash handling mechanism (angle brackets indicate threads)
kill() then calls exit()
When the app runs on a normal host (non-container), execution stops at kill(), and the default handler eventually terminates the app. In a container the app runs as PID1, and different signal handling rules apply. The default action for signals sent by kill() is to ignore them (I didn't find the exact specification for this in the Linux documentation, only blog posts). So execution passes kill() and hangs in exit(). I don't know exactly why exit() hangs, but I suspect it has something to do with the pending signal handler on the main thread.
For me, replacing exit() with _exit() solves the problem. _exit() is similar to exit(), but it does not call atexit() callbacks (not a problem for abnormal termination), and does not flush the stdio streams. In our application streams are flushed in the sink. So _exit() works for me, but I don't know about other platforms g3log supports. (_exit() is POSIX, but there's an equivalent C99 call: _Exit().)
What do you think?
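Here is a rough sketch of the change proposed above, assuming POSIX. It is not g3log's actual source: reraise_or_quick_exit and run_in_child are names I made up, and blocking the signal in the child is only a stand-in for the PID1 case where the re-sent signal does not terminate the process. The helper restores the default disposition, re-sends the signal, and falls back to _Exit() if execution continues past kill().

```cpp
#include <cassert>
#include <cstdlib>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: re-send the signal with the default disposition and,
// if the process is still alive afterwards, terminate without running
// atexit() callbacks or flushing stdio.
[[noreturn]] void reraise_or_quick_exit(int signal_number) {
    signal(signal_number, SIG_DFL);
    kill(getpid(), signal_number);
    // Reached only when the signal did not terminate the process.
    std::_Exit(128 + signal_number);
}

// Demo: run the helper in a forked child and return the raw wait status.
// With block_signal set, the signal stays pending and the fallback is used.
int run_in_child(int signal_number, bool block_signal) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        if (block_signal) {
            sigset_t set;
            sigemptyset(&set);
            sigaddset(&set, signal_number);
            sigprocmask(SIG_BLOCK, &set, nullptr);
        }
        reraise_or_quick_exit(signal_number);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return status;
}
```

In the normal case the parent sees the child killed by the signal (WIFSIGNALED), so the usual "fatal signal" semantics are preserved. When the signal does not take effect, the parent sees a normal exit with code 128 + signal_number instead of a hang.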