mtail keeps on running if producer crashes when reading from stdin #331
Comments
That sounds like a good idea.
I wonder if there are any weird effects by doing so though, does it need
another flag to turn on that behaviour? I think it doesn't, if there are no
other log files open to read and the FD closes then it seems there's
nothing it can do afterwards.
In the interim if you're using bash can you turn on pipefail mode to make
the shell kill mtail?
…On Wed, 29 Jul 2020, 19:22 Emanuele Rocca, ***@***.***> wrote:
Hi!
Wikimedia uses mtail for various purposes, including exposing varnish
statistics. To do that, we've got a very simple shell script called
varnishmtail
<https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/varnish/files/varnishmtail>.
As you can see, the script boils down to varnishncsa | mtail -logs
/dev/stdin. We used to run mtail with -logfds 0 and then moved to -logs
/dev/stdin when -logfds was removed (see this comment on mtail issue #3
<#3 (comment)>).
Now, there's a problem we've discovered recently. If the producer dies
(varnishncsa in this case) mtail keeps on running normally, hence the
varnishmtail script keeps on running, and the systemd unit responsible
for the whole thing does not notice anything. However, for all purposes the
system is at that point not functioning, given that stats aren't updated
any longer. See a more detailed description on our bug tracking system
<https://phabricator.wikimedia.org/T259020>.
I think that, in the special case of when mtail is reading from stdin,
receiving EOF should make the process exit. Thoughts?
I also don't think another flag is needed.
That is the first thing I thought of doing too! However, pipefail is about the exit status of the pipeline, not about making any command stop. If mtail keeps on running, the shell waits for it whether pipefail is enabled or not.
As a workaround for T259020, run the varnishncsa | mtail pipeline in the background, checking in a while loop that varnishncsa is still up. This is an evidently baroque hack, needed until mtail is fixed to do the right thing when receiving EOF on stdin: google/mtail#331
Bug: T259020
Change-Id: I6a118e659b3fe7298a7a19ca3268924e8719638b
As of HEAD right now I think that I want to remove the need for using If you can try out in the meantime that would be nice, but not necessary.
Please do this. Because it looks Thanks!
There's a bit to untangle here. The tailer can easily shut itself down once there are no more log patterns to watch; i.e. when stdin is closed, it is removed from the pattern list. The VM runtime can shut down once the tailer has closed the lines channel. The hard part is getting the exporter and HTTP server to close once the VM is done. There are some TODOs I left around indicating I should do better with the server shutdown code, and they give me some hints.
The `Exporter` and `HTTPServer` are not connected by channels to the `Runtime`, only indirectly through mutating effects on the Store. As such we can't rely on the `Exporter` to shut down automatically when the `Runtime` shuts down. Here instead we remove the `WaitGroup` and allow an `Exporter` to be shut down explicitly after the `WaitGroup` is `Done`. Issue: #331
Well, it turns out it's not so easy to shut down the tailer because it is also looking for new log patterns, so it needs to know that there are no more logstreams, and that there will never be any more logstreams. I think I have figured out a way to make that happen; it needs a bit more refactoring in the pattern globbing code.
The work of globbing a single pattern is separate from iterating over all patterns, setting us up to start a goroutine per pattern instead of an iterator. Issue: #331
We would like to use this in per-pattern goroutines. Issue: #331
Instead of collecting sockets and tailing them after, just tail them straight away. This makes the "early exit" check for "no things" broken, so replace that with a check in the final shutdown handler that there were no active globs. This will be replaced in the next few changes once migrated to goroutines and we can use only the `WaitGroup` for counting globs, too, but in the meantime this is more correct than the previous code. Issue: #331
Hi!
Wikimedia uses mtail for various purposes, including exposing varnish statistics. To do that, we've got a very simple shell script called varnishmtail. As you can see, the script boils down to `varnishncsa | mtail -logs /dev/stdin`. We used to run mtail with `-logfds 0` and then moved to `-logs /dev/stdin` when `-logfds` was removed (see this comment on mtail issue #3).
Now, there's a problem we've discovered recently. If the producer dies (varnishncsa in this case) mtail keeps on running normally, hence the `varnishmtail` script keeps on running, and the systemd unit responsible for the whole thing does not notice anything. However, for all purposes the system is at that point not functioning, given that stats aren't updated any longer. See a more detailed description on our bug tracking system.
I think that, in the special case of when mtail is reading from stdin, receiving `EOF` should make the process exit. Thoughts?