A recent panic in Filebeat was tracked down to a race condition where a Beats pipeline client was closed while it was still waiting on a response from the queue for an event's publication. The most common cause of closed pipeline clients is a Harvester shutting down.
One of the implications of this panic, though, is that there are cases where a client is shut down while it still has pending events that haven't entered the queue (possibly because the underlying file is deleted or renamed while some of its data is still being processed). We can prevent this from crashing Filebeat, but the "correct" behavior then is to drop any events that haven't yet entered the queue when the client is closed, which is probably not the intended behavior for callers of this API like the Filebeat harvesters.
It's not clear how common this issue is. Anecdotally, users who hit the panic saw it a few times a day; however, only one rare branch of the cancellation path leads to an actual panic, so the number of pending events silently dropped on cancellation may be considerably higher than the panic reports suggest.
This issue would arise most often with a full or mostly-full queue, and with inputs that go through many pipeline clients, such as a Filestream input with a lot of file churn, or a Kubernetes autodiscover input with a lot of ephemeral pods.
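To make the race concrete, here is a minimal, self-contained Go sketch of the shape of the problem. The `client`, `queue`, `Publish`, and `Close` names are simplified stand-ins for the libbeat pipeline types, not the actual implementation: the point is just that a publish blocked on a full queue loses its event when the client is closed underneath it.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// event stands in for a Beats pipeline event; the real type lives in
// libbeat and is more elaborate.
type event struct{ id int }

// client is a hypothetical, simplified stand-in for a Beats pipeline
// client: Publish blocks until the bounded queue accepts the event, and
// Close cancels any publish still waiting (e.g. a harvester shutting
// down because its file was deleted or renamed).
type client struct {
	queue     chan event    // bounded queue; when full, Publish blocks
	done      chan struct{} // closed by Close
	closeOnce sync.Once
}

// Publish waits for the queue to accept the event. If the client is
// closed first, the pending event is silently dropped -- the behavior
// the issue describes.
func (c *client) Publish(e event) bool {
	select {
	case c.queue <- e:
		return true
	case <-c.done:
		// Client closed while the event was still pending: drop it.
		return false
	}
}

func (c *client) Close() {
	c.closeOnce.Do(func() { close(c.done) })
}

func main() {
	c := &client{
		queue: make(chan event, 1), // tiny queue to force blocking
		done:  make(chan struct{}),
	}
	c.queue <- event{id: 0} // pre-fill: the queue is now full

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		if !c.Publish(event{id: 1}) {
			fmt.Println("event 1 dropped: client closed while waiting on the queue")
		}
	}()

	// Simulate the harvester shutting down while the publish above is
	// still blocked on the full queue.
	time.Sleep(10 * time.Millisecond)
	c.Close()
	wg.Wait()
}
```

Selecting on a `done` channel is what keeps the blocked publish from panicking on a closed channel, but it also means the only safe outcome at that point is to drop the event, which is exactly the trade-off described above.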
Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!
We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!