Generalization of reader pipeline in Beats #16137
Labels: discuss, Team:Services
Goal
Filebeat has various readers that are strictly tied to the log input. However, users often ask to use these readers (usually multiline and syslog) from other inputs. The goal is to generalize these readers so more inputs can use them.

Current state
The reader pipeline consists of the following readers:
- readfile.EncoderReader: reads an input with the configured encoding
- readfile.LineReader: reads a line from a file
- readfile.LimitReader: truncates the message if it is too long
- readfile.StripNewLine: removes the configured newline characters from the end of the message
- readfile.TimeoutReader: flushes the message if the configured time has elapsed
- multiline.Reader: creates a single message from multiple ones based on patterns and timeout
- readjson.DockerJSON: parses JSON events from Docker
- readjson.JSONReader: parses arbitrary JSON events

All of the readers above implement the reader.Reader interface. All readers except for readfile.EncoderReader read a reader.Message from an underlying reader.Reader.
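
The wrapping pattern described here can be sketched in a few lines of Go. This is only an illustration: Message and Reader below are simplified stand-ins assumed for the example, not libbeat's actual definitions, and sliceReader, stripNewline, limitReader, and drain are hypothetical names that merely mimic the spirit of readfile.StripNewLine and readfile.LimitReader.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// Message is a simplified stand-in for reader.Message (assumption: the real
// type carries more metadata than the raw content).
type Message struct {
	Content []byte
}

// Reader is an assumed single-method shape for reader.Reader.
type Reader interface {
	Next() (Message, error)
}

// sliceReader is a hypothetical source that emits pre-split lines.
type sliceReader struct {
	lines []string
	pos   int
}

func (s *sliceReader) Next() (Message, error) {
	if s.pos >= len(s.lines) {
		return Message{}, io.EOF
	}
	line := s.lines[s.pos]
	s.pos++
	return Message{Content: []byte(line)}, nil
}

// stripNewline removes trailing newline characters from each message.
type stripNewline struct{ in Reader }

func (s *stripNewline) Next() (Message, error) {
	msg, err := s.in.Next()
	if err != nil {
		return msg, err
	}
	msg.Content = bytes.TrimRight(msg.Content, "\r\n")
	return msg, nil
}

// limitReader truncates messages that are longer than maxBytes.
type limitReader struct {
	in       Reader
	maxBytes int
}

func (l *limitReader) Next() (Message, error) {
	msg, err := l.in.Next()
	if err != nil {
		return msg, err
	}
	if len(msg.Content) > l.maxBytes {
		msg.Content = msg.Content[:l.maxBytes]
	}
	return msg, nil
}

// drain pulls messages from any Reader until io.EOF.
func drain(r Reader) ([]string, error) {
	var out []string
	for {
		msg, err := r.Next()
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, string(msg.Content))
	}
}

func main() {
	src := &sliceReader{lines: []string{"short\n", "a much longer line\n"}}
	// Each stage only knows about the Reader it wraps, not about the source.
	pipeline := &limitReader{in: &stripNewline{in: src}, maxBytes: 6}
	lines, err := drain(pipeline)
	fmt.Println(lines, err)
}
```

Because every stage depends only on the Reader it wraps, swapping sliceReader for a different source would leave the rest of the chain untouched, which is the property the generalization aims for.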

These readers are only exposed and used in the log input, plus DockerJSON in the container input. However, they could be reused in more inputs (e.g. journal, tcp, etc.). For example, someone may need to read multiline messages from a systemd journal.

Right now, adding this feature to more inputs is not straightforward because the reader pipeline is too tightly coupled with the log input. The readers are part of libbeat, but they can only be used from Filebeat. To provide a similar experience for more Beats, the pipeline has to be generalized.

Proposal
The "source" parts of inputs have to be decoupled. The TCP and UDP inputs were created like this: the concrete data sources are part of the inputsource package. More sources can be extracted from inputs to provide flexibility:
- log: reads from a log file
- journal: reads a journal

Each inputsource should implement the io.Reader or reader.Reader interface in order to plug into the pipeline.

Progress
So far, the filestream input is the only one that leverages the same reader.Reader functionality. While adopting the readers, a few things have been sorted out. In the interim, a new parser interface was introduced so the existing reader.Reader structures can function as before in the log input. The new interface already has improved JSON handling. It is not exposed yet.

Next steps
- reader.Reader or beat.Event
- reader.Message to a beat.Event
- reader.Reader and inputs to support e.g. journals
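
To make the proposal and the reader.Message to beat.Event item more concrete, here is a rough sketch of two decoupled sources feeding the same generic loop. Everything in it is an assumption made for illustration: Message, Reader, and Event are simplified stand-ins rather than the real libbeat types, and lineSource, journalSource, journalEntry, toEvent, and collect are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
	"time"
)

// Message and Reader are simplified stand-ins for reader.Message and
// reader.Reader (assumed shapes, not the libbeat definitions).
type Message struct {
	Ts      time.Time
	Content []byte
}

type Reader interface {
	Next() (Message, error)
}

// Event is a simplified stand-in for beat.Event.
type Event struct {
	Timestamp time.Time
	Fields    map[string]interface{}
}

// lineSource adapts any io.Reader (a file, a TCP connection, ...) to Reader.
type lineSource struct {
	scanner *bufio.Scanner
}

func newLineSource(r io.Reader) *lineSource {
	return &lineSource{scanner: bufio.NewScanner(r)}
}

func (l *lineSource) Next() (Message, error) {
	if !l.scanner.Scan() {
		if err := l.scanner.Err(); err != nil {
			return Message{}, err
		}
		return Message{}, io.EOF
	}
	// Copy the bytes: the scanner may reuse its buffer on the next Scan call.
	content := append([]byte(nil), l.scanner.Bytes()...)
	return Message{Ts: time.Now(), Content: content}, nil
}

// journalEntry and journalSource model a hypothetical in-memory journal that
// implements Reader directly instead of going through io.Reader.
type journalEntry struct {
	Ts   time.Time
	Text string
}

type journalSource struct {
	entries []journalEntry
}

func (j *journalSource) Next() (Message, error) {
	if len(j.entries) == 0 {
		return Message{}, io.EOF
	}
	e := j.entries[0]
	j.entries = j.entries[1:]
	return Message{Ts: e.Ts, Content: []byte(e.Text)}, nil
}

// toEvent converts a Message into an Event with a single "message" field.
func toEvent(msg Message) Event {
	return Event{
		Timestamp: msg.Ts,
		Fields:    map[string]interface{}{"message": string(msg.Content)},
	}
}

// collect is source-agnostic: it only depends on the Reader interface.
func collect(r Reader) ([]Event, error) {
	var events []Event
	for {
		msg, err := r.Next()
		if err == io.EOF {
			return events, nil
		}
		if err != nil {
			return events, err
		}
		events = append(events, toEvent(msg))
	}
}

func main() {
	fromStream, _ := collect(newLineSource(strings.NewReader("first\nsecond\n")))
	journal := &journalSource{entries: []journalEntry{{Ts: time.Unix(0, 0), Text: "unit started"}}}
	fromJournal, _ := collect(journal)
	fmt.Println(len(fromStream), len(fromJournal))
}
```

In this sketch a stream-based source and a journal-like source share collect unchanged, so a reader such as multiline could in principle be wrapped around either one without knowing which input produced the messages.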