Stop incorrect "Process exited with status 1" puma errors from occurring on SIGTERM #1026
Comments
For more information, see bundle config documentation |
Per "bundle config --help", I believe we can just set the environment variable BUNDLE_DISABLE_EXEC_LOAD - the environment variables take precedence, and then we don't have to worry about race conditions setting a global variable if multiple procs start at the same time. The documentation isn't clear about the expected value of the environment variable; it may not matter. Let's start with "true" as the value, that's clear. |
Disable exec-load in Procfile to eliminate a false puma error at runtime on Heroku.

This is a tiny but non-obvious change, so this commit message provides lots of detail.

Every day we typically get incorrect error reports of the following form:

> heroku/web.1: Process exited with status 1

This is caused because Heroku cycles the webserver (puma) by sending SIGTERM, but when puma exits it incorrectly returns an error code of 1 ("error occurred"). That is simply wrong; a process that exits because it was specifically asked to do so (via SIGTERM) has not experienced an error. Instead, exiting on a request to do so, without any other error, is correct behavior and should return a status code of 0. However, because puma incorrectly reports a non-zero error code, lots of other logs whine about a significant problem (even though no problem has occurred).

At this time there seems to be some disagreement on whether the problem is really from puma or bundler. See puma issue 1438 and bundler issue 6090 for more. In a grand sense we don't care, we just don't want false positives.

One solution (noted in the issues above) would be to make "bundle exec" use Kernel.exec instead of Kernel.load. This essentially disables a minor optimization that bundler usually uses. This is slightly slower on startup (it replaces a whole OS process, instead of just reloading from Ruby), but we rarely start web processes so it would have practically no effect in the real world.

Per "bundle config --help", we do this just by setting the environment variable BUNDLE_DISABLE_EXEC_LOAD. The environment variables take precedence, and then we don't have to worry about race conditions setting a global variable if multiple procs start at the same time.

This commit uses the "env" command; technically we probably don't need to, but this ensures that regardless of shell (or straight invocation of exec) we get our environment variable set.

Signed-off-by: David A. Wheeler <[email protected]>
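As a small illustration of the "env" mechanism described in the commit message (this is not the project's actual Procfile entry), the sketch below shows that `env NAME=value command` hands the variable to the invoked command without relying on any shell-specific assignment syntax. The `sh -c` child is a stand-in chosen purely for illustration; the value "true" follows the suggestion earlier in this thread.

```shell
# Stand-in for the real server command: the child prints the variable
# exactly as it sees it in its own environment.
child_value=$(env BUNDLE_DISABLE_EXEC_LOAD=true sh -c 'printf %s "$BUNDLE_DISABLE_EXEC_LOAD"')

echo "child sees BUNDLE_DISABLE_EXEC_LOAD=$child_value"
```

Because `env` sets the variable only for the process it launches, nothing global is written, which matches the point above about avoiding races when several processes start at once.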
OK.
--
Dan Kohn <[email protected]>
Executive Director, Cloud Native Computing Foundation https://www.cncf.io
+1-415-233-1000 https://www.dankohn.com
On Sat, Feb 3, 2018 at 10:24 PM, David A. Wheeler ***@***.***> wrote:
Per "bundle config --help", I believe we can just set the environment
variable BUNDLE_DISABLE_EXEC_LOAD - the environment variables take
precedence, and then we don't have to worry about race conditions setting a
global variable if multiple procs start at the same time.
The documentation isn't clear about the expected value of the environment
variable; it may not matter. Let's start with "true" as the value, that's
clear.
|
Great! I've labelled this issue as "bug". It's not really a bug in our program; it's a workaround for a bug in a program we depend on. That said, the effects manifest as a bug in our program, and in the end, we're responsible for the impact of the components we depend on. |
I think we've resolved it. With the environment variable change, here are example log entries we see:
That compares to these older messages, with the "failed to load" messages from bundler:
|
Strictly speaking, the process is now exiting with exit code 143, not 0. But that's a widely expected value for "received and processed SIGTERM normally" (128 plus the signal number, 15), and the rest of the system seems to handle it well. Our problem was that we were getting exit code 1, which is widely interpreted as "things went badly". We'll keep monitoring, but it seems okay now. |
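For readers wondering where 143 comes from, here is a minimal sketch of the general shell convention, not of puma itself: when a child is terminated by a signal, the parent shell reports 128 plus the signal number, and SIGTERM is signal 15. The self-terminating `sh -c` child is only a stand-in for a server process.

```shell
# Start a child shell that sends SIGTERM to itself, and capture the status
# the parent shell reports for it.
status=0
sh -c 'kill -TERM $$' || status=$?

# Shells report death-by-signal as 128 + signal number; SIGTERM is 15.
expected=$((128 + 15))

echo "reported status: $status (expected $expected)"
```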
Reopening. Heroku sees the exit code 143, and from its viewpoint that's non-zero and thus STILL an error. We'll have to insert a little shim that captures error code 143, and turns it into error code 0. |
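A rough sketch of what such a shim could look like follows. It is only an assumption about the general shape, not necessarily what was eventually implemented; the function name `run_treating_sigterm_as_success` is hypothetical, and the sketch does not deal with forwarding signals to the child.

```shell
# Hypothetical helper: run the given command, and report success if it
# ended with status 143 (terminated via SIGTERM). Any other status is
# passed through unchanged.
run_treating_sigterm_as_success() {
  status=0
  "$@" || status=$?
  if [ "$status" -eq 143 ]; then
    return 0
  fi
  return "$status"
}

# Example: a stand-in command that exits with 143 is reported as success.
rc=0
run_treating_sigterm_as_success sh -c 'exit 143' || rc=$?
echo "shim returned: $rc"
```

Genuine failures still surface: a command that exits with status 1 makes the helper return 1 as well, so only the SIGTERM case is masked.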
#1048 is intended to solve it. |
Completed. It's switched to error code 143, and the rest of the system now knows that 143 isn't a problem. |
Every day we typically get incorrect error reports of the following form:
This happens because Heroku cycles the webserver (puma) by sending SIGTERM, but when puma exits it incorrectly returns an error code of 1 ("error occurred"). That is simply wrong; a process that exits because it was specifically asked to do so (via SIGTERM) has not experienced an error. Instead, exiting on a request to do so, without any other error, is correct behavior and should return a status code of 0. However, because puma incorrectly reports a non-zero error code, lots of other logs whine about a significant problem (even though no problem has occurred). Here are more details showing this happening:
We are running the current version of puma (3.11.2). See puma issue 1438 and bundler issue 6090 for more.
Note that we run puma using this Procfile:
One solution (noted in the issues above) would be to make "bundle exec" use Kernel.exec instead of Kernel.load. This essentially disables a minor optimization that bundler usually uses. This is slightly slower on startup (it replaces a whole OS process, instead of just reloading from Ruby), but we rarely start processes so it would have practically no effect in the real world. Here's how to force that:
Other solutions are welcome.