
Multiline feature with built-in CRI parser does not separate streams #4387

Open
mdraijer opened this issue Dec 2, 2021 · 32 comments
Labels
long-term Long term issues (exempted by stale bots)

Comments

@mdraijer

mdraijer commented Dec 2, 2021

Bug Report

Describe the bug

When a container application produces log messages that the container runtime splits into multiple parts, and messages are written to stdout and stderr at roughly the same time, Fluent Bit may fail to concatenate those parts back together.

To Reproduce

  • Example log message:
2021-11-18T14:47:34.730832241+00:00 stdout F some short log message A
2021-11-18T14:47:34.730832241+00:00 stdout P some longer log message B part one
2021-11-18T14:47:34.730908438+00:00 stdout F end of the longer message B
2021-11-18T14:47:34.730908438+00:00 stdout P another longer log message  C part one
2021-11-18T14:47:34.730939216+00:00 stdout P message C continued in part two
2021-11-18T14:47:34.735454549+00:00 stderr F some log message D in stderr part one
2021-11-18T14:47:34.735482854+00:00 stdout F message C still continued after a line from a different stream
2021-11-18T14:47:34.735482854+00:00 stderr F and the end of the stderr message D
  • Steps to reproduce the problem:

Create an application that writes a couple of long lines to stdout and stderr, and run it under a CRI-O runtime (e.g. an OpenShift Kubernetes cluster).
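
For illustration, a minimal generator along these lines could look like the sketch below (Lua, matching the filter example later in this thread; the line length and loop count are assumptions chosen so that the runtime's 8 KB/16 KB line limit splits the output into P/F parts):

-- Sketch of a reproduction app: write long lines to stdout and stderr at
-- roughly the same time so the container runtime has to split them.
local long_line = string.rep("x", 20000)  -- assumption: longer than the runtime split limit

for i = 1, 10 do
  io.stdout:write(long_line .. " stdout message " .. i .. "\n")
  io.stdout:flush()
  io.stderr:write(long_line .. " stderr message " .. i .. "\n")
end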

Expected behavior

The output (in our case Elastic) shows each log message complete as a separate document.

Your Environment

  • Version used: container image fluent/fluent-bit:1.8.9
  • Configuration: relevant part:
      [INPUT]
          Name tail
          Path /var/log/containers/*.log
          Multiline.Parser cri
    
  • Environment name and version: Kubernetes 1.20 / Openshift 4.7.37
  • Server type and version:
  • Operating System and version: RHEL 47.84, CRI-O 1.20.5-7
  • Filters and plugins:

Additional context

  • Lines carry a flag in the third field: F for a single-line message or for the concluding line of a multi-line message; P for every part other than the final part of a multi-line message.
  • Log messages from different streams (stdout, stderr) can be mixed up (examples C and D).
  • Log messages can be in JSON and we also apply the JSON parser as filter.
  • Messages that Fluent Bit does not recombine are hard to find in Elastic, and if they are formatted as JSON the individual parts obviously cannot be parsed as JSON, so we are missing a lot of information in Elastic for those lines.

Note that this is not about the multiline filter (see e.g. #4309), nor about the old multiline parser (see Tail - Multiline).
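
For reference, a pipeline along the lines described above would look roughly like this (a sketch; the parser filter stanza is an assumption about how the JSON parsing mentioned above could be applied, using the stock json parser):

[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Multiline.Parser  cri

[FILTER]
    Name          parser
    Match         *
    Key_Name      log
    Parser        json
    Reserve_Data  On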

@PettitWesley
Contributor

@edsiper IMO, this is a pretty major miss in the cri parser.

@mdraijer
Author

@edsiper @PettitWesley any idea when a fix can be implemented?

@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

@github-actions github-actions bot added the Stale label Mar 26, 2022
@mdraijer
Author

I take it that the "pretty major miss" is still open?

@PettitWesley PettitWesley added long-term Long term issues (exempted by stale bots) and removed Stale labels Mar 28, 2022
@PettitWesley
Contributor

This definitely needs to be fixed @agup006 @edsiper; this is a fundamental thing users expect Fluent Bit to just get right.

@agup006
Member

agup006 commented Mar 28, 2022

I thought we had added partial message support in the initial version of the cri parser. Adding @lecaros: would you be able to help us schedule this with the other multiline work?

@PettitWesley
Contributor

@agup006 @lecaros partial message support for CRI log files exists, but the bug reported here is that lines from stdout and stderr are not properly separated.

@PettitWesley
Contributor

I want to note that fixing this is separate from the work that I did in the multiline filter to add buffered mode and partial_message support.

The work here is net new and different, though the use case/problem statement is similar: split messages from a container runtime need to be rejoined intelligently based on context (such as whether they come from stdout or stderr). My work was aimed at folks who mainly use the Fluentd Docker log driver to send logs to Fluent Bit; this issue is about CRI container log files, which is a similar but distinct case.

@tarruda

tarruda commented Sep 21, 2022

This can be easily accomplished with a Lua filter:

local buffer = {
  stdout = {
    text = {}
  },
  stderr = {
    text = {}
  }
}


function cb_filter(tag, ts, record)
  local log = record.log
  if type(log) ~= 'string' then
    -- no "log" field; keep the record unchanged
    return 0
  end

  local _, _, date, stream, record_type, line = log:find('^(%S+)%s+(%S+)%s+([PF])%s+(.+)$')
  if not stream then
    -- line is not in CRI format; keep the record unchanged
    return 0
  end

  table.insert(buffer[stream].text, line)
  if #(buffer[stream].text) == 1 then
    -- store the date extracted from the first line
    buffer[stream].date = date
  end

  if record_type == 'P' then
    -- not finished
    return -1
  end

  -- flush the buffered message
  record = {
    date = buffer[stream].date,
    stream = stream,
    log = table.concat(buffer[stream].text, '\n')
  }
  buffer[stream].text = {}

  return 1, ts, record
end
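
To wire the script into the pipeline it can be registered as a Lua filter (a sketch; the script file name is an assumption):

[FILTER]
    Name    lua
    Match   *
    script  cri_multiline.lua
    call    cb_filter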

Tested on the fluent-bit Lua playground. Sample input:

{"log":"2021-11-18T14:47:34.730832241+00:00 stdout F some short log message A"}
{"log":"2021-11-18T14:47:34.730832241+00:00 stdout P some longer log message B part one"}
{"log":"2021-11-18T14:47:34.730908438+00:00 stdout F end of the longer message B"}
{"log":"2021-11-18T14:47:34.730908438+00:00 stdout P another longer log message  C part one"}
{"log":"2021-11-18T14:47:34.730939216+00:00 stdout P message C continued in part two"}
{"log":"2021-11-18T14:47:34.735454549+00:00 stderr F some log message D in stderr part one"}
{"log":"2021-11-18T14:47:34.735482854+00:00 stdout F message C still continued after a line from a different stream"}
{"log":"2021-11-18T14:47:34.735482854+00:00 stderr F and the end of the stderr message D"}

Output:

[0,{"date":"2021-11-18T14:47:34.730832241+00:00","stream":"stdout","log":"some short log message A"}]
[0,{"date":"2021-11-18T14:47:34.730832241+00:00","stream":"stdout","log":"some longer log message B part one\nend of the longer message B"}]
[0,{"date":"2021-11-18T14:47:34.735454549+00:00","stream":"stderr","log":"some log message D in stderr part one"}]
[0,{"date":"2021-11-18T14:47:34.730908438+00:00","stream":"stdout","log":"another longer log message  C part one\nmessage C continued in part two\nmessage C still continued after a line from a different stream"}]
[0,{"date":"2021-11-18T14:47:34.735482854+00:00","stream":"stderr","log":"and the end of the stderr message D"}]

Note that this example does not consider multiple files. If grouping streams per file is required, then per file/tag buffers would need to be used, as sketched below.
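
A sketch of that extension, keying the buffers by both tag and stream (the helper name and the pass-through behaviour for non-matching records are assumptions):

-- Sketch: per-tag, per-stream buffers so that multiple files do not share state.
local buffers = {}

local function get_buffer(tag, stream)
  local key = tag .. "/" .. stream
  if not buffers[key] then
    buffers[key] = { text = {} }
  end
  return buffers[key]
end

function cb_filter(tag, ts, record)
  local _, _, date, stream, record_type, line = record.log:find('^(%S+)%s+(%S+)%s+([PF])%s+(.+)$')
  if not stream then
    return 0  -- not a CRI-formatted line; keep the record unchanged
  end

  local buf = get_buffer(tag, stream)
  table.insert(buf.text, line)
  if #buf.text == 1 then
    buf.date = date  -- remember the date of the first part
  end

  if record_type == 'P' then
    return -1  -- still accumulating parts for this tag/stream
  end

  record = {
    date = buf.date,
    stream = stream,
    log = table.concat(buf.text, '\n')
  }
  buf.text = {}
  return 1, ts, record
end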

@mmerickel

mmerickel commented Dec 21, 2022

Has anyone actually seen containerd or cri-o output interleave partials like in the OP's example? From things like https://github.com/containerd/containerd/blob/78cd9d3b6b4c1a8d977a9ee695328f4c51a304d0/pkg/cri/io/logger_test.go#L121 I'm led to believe that they output the full stream in multiple lines before moving on to lines from the other stream, so you should be able to assume all partials in a row are part of the same stream.

This is counter to the original design spec, but I would say that spec is just broken, because it shows partials from stdout being joined with a full line from stderr for "log entry 2", which makes no sense.

tl;dr it feels like this isn't actually a bug if the multiline logic is combining consecutive lines from the same file based on P/F tags?

@frankhetterich

We are experiencing a similar issue, but I'm not sure if it's really the same.
Our applications log in JSON to stdout and we are harvesting the logs with the tail plugin:

[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*_default_*.log
    multiline.parser  docker, cri
    DB                /var/log/fluentbit_kube.db
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   Off
    Refresh_Interval  10

Now it happens that K8s rotates the log file and splits the JSON log line at an arbitrary point:
2023-01-27T13:40:28.92620435Z stdout P {"Timestamp":"2023-01-27T13:40:28.9222232+00:00","Level":"Information","MessageTemplate":"Request finished in {ElapsedMilliseconds}ms {StatusCode} {ContentType}","RenderedMessage":"Request finished in 0ms 200 null","Properties":{"ElapsedMilliseconds":0,"St

The log then continues in a new File:
2023-01-27T13:40:28.92735586Z stdout F atusCode":200,"ContentType":null,"EventId":{"Id":2,"Name":"RequestFinished"}

Our expectation is that Fluent Bit will put these two lines back together and send them to Elasticsearch as one line, like this (manually assembled) example:

2023-01-27T13:40:28.92620435Z stdout P {"Timestamp":"2023-01-27T13:40:28.9222232+00:00","Level":"Information","MessageTemplate":"Request finished in {ElapsedMilliseconds}ms {StatusCode} {ContentType}","RenderedMessage":"Request finished in 0ms 200 null","Properties":{"ElapsedMilliseconds":0,"StatusCode":200,"ContentType":null,"EventId":{"Id":2,"Name":"RequestFinished"}

Is this expectation correct? Or is the issue perhaps related to the K8s log rotation?

regards,
Frank

@ryan65

ryan65 commented May 8, 2023

Hi,
I too have a problem with multiline logs from different streams of the same container getting mixed together. Occasionally a stderr log line (debug log) takes over part of a multiline from the stdout stream (info). It doesn't happen most of the time, but it is fairly consistent once in a while.
Is there any workaround?

@leonardo-albertovich
Collaborator

Is this happening in fluent-bit 2.1 or 2.0?

@ryan65

ryan65 commented May 8, 2023

Actually I'm working with 1.9.7, but I didn't see any fix for this so I assume it is still happening.
Just to clarify, this only happens with different streams; the stderr log sometimes takes up the last line or lines of a previous stdout multi-line log.
I use my own multiline parser state machine, not a predefined one.
I was wondering if there is a workaround, for example somehow running multiple parser instances, one for each stream, so that they don't collide.

@leonardo-albertovich
Collaborator

The thing is that there supposedly should be separate multiline component instances, which is why I'm asking about 2.0 or 2.1.

Since 1.9 and 1.8 are no longer supported, the best approach to get this fixed would be to reproduce it in either 2.0 or 2.1 so we can take a look.

@ryan65

ryan65 commented May 9, 2023

Hi Leonardo,
Just FYI:
I see there is some kind of break in my fluent-bit 2.1.1 / fluentd integration (my fluentd is 1.15.2, but I also tried 1.16 with the same results).
When upgrading to 2.1.1 I get error messages in the fluentd logs that seem related to the fluent timestamp:
local-lrc-fluentd-log-collector-1 | 2023-05-09 05:50:47 +0000 [error]: #0 unexpected error on reading data host="172.28.0.3" port=40886 error_class=NoMethodError error="undefined method to_i' for [2023-05-09 05:50:46.662332400 +0000, {}]:Array\n\n time = Fluent::EventTime.now if time.nil? || time.to_i == 0 # to_i == 0` for empty EventTime\n ^^^^^\nDid you mean? to_s\n to_a\n to_h"

It seems that something is wrong with the internal fluent-bit metadata containing the timestamp, or with something else related to metadata.
HOWEVER, I see this is fixed in 2.1.2, but I didn't see any mention of it in the release notes.

Anyway, regarding the mixed multilines between streams:
I will deploy 2.1.2, check whether the issue still happens, and update later in the day.
Thanks.

@leonardo-albertovich
Collaborator

It seems that it could have been missed by the person who wrote the notes. Yes, there was an issue in versions 2.1.0 and 2.1.1: the default value of the retain_metadata_in_forward_mode option in the forward output plugin was not set to false, which meant fluent-bit was serializing records in the new format (which includes metadata).

In fluent-bit 2.1.2 that was amended to retain backwards compatibility with fluentd, older fluent-bit versions and compatible systems. In turn, this means that when a user wants to interconnect two fluent-bit 2.1+ instances using the forward output plugin, they need to explicitly set retain_metadata_in_forward_mode to true in order to retain any existing metadata (only relevant for OpenTelemetry-ingested data at the moment).
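
In configuration terms that means something like the following on the sending instance (a sketch; host and port are placeholders, and the option should only be enabled when the receiver is also fluent-bit 2.1+):

[OUTPUT]
    Name                             forward
    Match                            *
    Host                             receiver.example.com
    Port                             24224
    retain_metadata_in_forward_mode  true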

I hope this makes things a bit clearer.

@ryan65

ryan65 commented May 9, 2023

Hi Leonardo,
So I tested with 2.1.2 and the issue still happens.
Below are logs from the fluent-bit output, to demonstrate the issue.
I wrote a test that prints several logs every 10 seconds (mixed log levels), one of which is a multiline call stack.
If you go to timestamp 1683629046.556653225 you will see the multiline log in stdout, and the last part of the call stack, "at process.processTimers (node:internal/timers:512:7", is missing.
Now if you go to the log at 1683629046.556703940, which was printed with log level error (stderr stream), the end of the missing call stack is actually appended to that log.
Important to note:

  1. This does not happen most of the time; usually everything is OK as expected. You can check all the other logs below which have the word MultiLine and you will see the whole call stack is there.
  2. This definitely happens only when we have logs in different streams (debug, info, error). I never saw this issue when all the logs were printed to the same stream, stdout (info).

2023-05-09T13:43:55+03:00 [0] lrc.saas.srl-worker: [[1683629033.552235588, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:43:53.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||&&&&&&&&&&&&&&&& Same Info 2|| ", "kuberemoved"=>{}
2023-05-09T13:43:57+03:00 [0] lrc.saas.srl-worker: [[1683629036.555438955, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:43:56.555Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||ZZZZZZZZZZZZ Just a single line|| ", "kuberemoved"=>{}
2023-05-09T13:43:57+03:00 [1] lrc.saas.srl-worker: [[1683629036.555537074, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:43:56.555Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||ZZZZZZZZZZZZZ MultiLine Error: test stack
2023-05-09T13:43:57+03:00 at Timeout.different [as _onTimeout] (/opt/testexec.server/hub/app.js:180:54)
2023-05-09T13:43:57+03:00 at listOnTimeout (node:internal/timers:569:17)
2023-05-09T13:43:57+03:00 at process.processTimers (node:internal/timers:512:7)|| ", "kuberemoved"=>{}
2023-05-09T13:43:57+03:00 [2] lrc.saas.srl-worker: [[1683629036.555557797, {}], {"stream"=>"stderr", "logtag"=>"F", "log"=>"2023-05-09T10:43:56.555Z - debug: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||YYYYYYYYYYYYYYYYYYY Debug line|| ", "kuberemoved"=>{}
2023-05-09T13:43:57+03:00 [3] lrc.saas.srl-worker: [[1683629036.555608716, {}], {"stream"=>"stderr", "logtag"=>"F", "log"=>"2023-05-09T10:43:56.555Z - error: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||YYYYYYYYYYYYYYYYYYY Error line|| ", "kuberemoved"=>{}
2023-05-09T13:44:04+03:00 [0] lrc.saas.srl-worker: [[1683629043.552716663, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:03.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||############# Just a single line|| ", "kuberemoved"=>{}
2023-05-09T13:44:04+03:00 [1] lrc.saas.srl-worker: [[1683629043.552794859, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:03.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||############# MultiLine Error: test stack
2023-05-09T13:44:04+03:00 at Timeout._onTimeout (/opt/testexec.server/hub/app.js:188:54)
2023-05-09T13:44:04+03:00 at listOnTimeout (node:internal/timers:569:17)
2023-05-09T13:44:04+03:00 at process.processTimers (node:internal/timers:512:7)|| ", "kuberemoved"=>{}
2023-05-09T13:44:04+03:00 [2] lrc.saas.srl-worker: [[1683629043.552830072, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:03.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||$$$$$$$$$$$$$$$$$$$$ Samine Info|| ", "kuberemoved"=>{}
2023-05-09T13:44:06+03:00 [0] lrc.saas.srl-worker: [[1683629043.552833405, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:03.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||&&&&&&&&&&&&&&&& Same Info 2|| ", "kuberemoved"=>{}
2023-05-09T13:44:07+03:00 [0] lrc.saas.srl-worker: [[1683629046.556515642, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:06.556Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||ZZZZZZZZZZZZ Just a single line|| ", "kuberemoved"=>{}
2023-05-09T13:44:07+03:00 [1] lrc.saas.srl-worker: [[1683629046.556653225, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:06.556Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||ZZZZZZZZZZZZZ MultiLine Error: test stack
2023-05-09T13:44:07+03:00 at Timeout.different [as _onTimeout] (/opt/testexec.server/hub/app.js:180:54)
2023-05-09T13:44:07+03:00 at listOnTimeout (node:internal/timers:569:17)", "kuberemoved"=>{}
2023-05-09T13:44:07+03:00 [2] lrc.saas.srl-worker: [[1683629046.556678249, {}], {"stream"=>"stderr", "logtag"=>"F", "log"=>"2023-05-09T10:44:06.556Z - debug: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||YYYYYYYYYYYYYYYYYYY Debug line|| ", "kuberemoved"=>{}
2023-05-09T13:44:07+03:00 [3] lrc.saas.srl-worker: [[1683629046.556703940, {}], {"stream"=>"stderr", "logtag"=>"F", "log"=>"2023-05-09T10:44:06.556Z - error: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||YYYYYYYYYYYYYYYYYYY Error line||
2023-05-09T13:44:07+03:00 at process.processTimers (node:internal/timers:512:7)|| ", "kuberemoved"=>{}
2023-05-09T13:44:14+03:00 [0] lrc.saas.srl-worker: [[1683629053.551961925, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:13.551Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||############# Just a single line|| ", "kuberemoved"=>{}
2023-05-09T13:44:14+03:00 [1] lrc.saas.srl-worker: [[1683629053.552114257, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:13.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||############# MultiLine Error: test stack
2023-05-09T13:44:14+03:00 at Timeout._onTimeout (/opt/testexec.server/hub/app.js:188:54)
2023-05-09T13:44:14+03:00 at listOnTimeout (node:internal/timers:569:17)
2023-05-09T13:44:14+03:00 at process.processTimers (node:internal/timers:512:7)|| ", "kuberemoved"=>{}
2023-05-09T13:44:14+03:00 [2] lrc.saas.srl-worker: [[1683629053.552150889, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:13.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||$$$$$$$$$$$$$$$$$$$$ Samine Info|| ", "kuberemoved"=>{}
2023-05-09T13:44:16+03:00 [0] lrc.saas.srl-worker: [[1683629053.552154410, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:13.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||&&&&&&&&&&&&&&&& Same Info 2|| ", "kuberemoved"=>{}
2023-05-09T13:44:17+03:00 [0] lrc.saas.srl-worker: [[1683629056.554114944, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:16.554Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||ZZZZZZZZZZZZ Just a single line|| ", "kuberemoved"=>{}
2023-05-09T13:44:17+03:00 [1] lrc.saas.srl-worker: [[1683629056.554279655, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:16.554Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||ZZZZZZZZZZZZZ MultiLine Error: test stack
2023-05-09T13:44:17+03:00 at Timeout.different [as _onTimeout] (/opt/testexec.server/hub/app.js:180:54)
2023-05-09T13:44:17+03:00 at listOnTimeout (node:internal/timers:569:17)
2023-05-09T13:44:17+03:00 at process.processTimers (node:internal/timers:512:7)|| ", "kuberemoved"=>{}
2023-05-09T13:44:17+03:00 [2] lrc.saas.srl-worker: [[1683629056.554320923, {}], {"stream"=>"stderr", "logtag"=>"F", "log"=>"2023-05-09T10:44:16.554Z - debug: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||YYYYYYYYYYYYYYYYYYY Debug line|| ", "kuberemoved"=>{}
2023-05-09T13:44:18+03:00 [0] lrc.saas.srl-worker: [[1683629056.554330939, {}], {"stream"=>"stderr", "logtag"=>"F", "log"=>"2023-05-09T10:44:16.554Z - error: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||YYYYYYYYYYYYYYYYYYY Error line|| ", "kuberemoved"=>{}
2023-05-09T13:44:24+03:00 [0] lrc.saas.srl-worker: [[1683629063.552755280, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:23.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||############# Just a single line|| ", "kuberemoved"=>{}
2023-05-09T13:44:24+03:00 [1] lrc.saas.srl-worker: [[1683629063.552883385, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:23.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||############# MultiLine Error: test stack
2023-05-09T13:44:24+03:00 at Timeout._onTimeout (/opt/testexec.server/hub/app.js:188:54)
2023-05-09T13:44:24+03:00 at listOnTimeout (node:internal/timers:569:17)
2023-05-09T13:44:24+03:00 at process.processTimers (node:internal/timers:512:7)|| ", "kuberemoved"=>{}
2023-05-09T13:44:24+03:00 [2] lrc.saas.srl-worker: [[1683629063.552934892, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:23.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||$$$$$$$$$$$$$$$$$$$$ Samine Info|| ", "kuberemoved"=>{}
2023-05-09T13:44:25+03:00 [0] lrc.saas.srl-worker: [[1683629063.552939775, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:23.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||&&&&&&&&&&&&&&&& Same Info 2|| ", "kuberemoved"=>{}
2023-05-09T13:44:27+03:00 [0] lrc.saas.srl-worker: [[1683629066.554403061, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:26.554Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||ZZZZZZZZZZZZ Just a single line|| ", "kuberemoved"=>{}
2023-05-09T13:44:27+03:00 [1] lrc.saas.srl-worker: [[1683629066.554525319, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:26.554Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||ZZZZZZZZZZZZZ MultiLine Error: test stack
2023-05-09T13:44:27+03:00 at Timeout.different [as _onTimeout] (/opt/testexec.server/hub/app.js:180:54)
2023-05-09T13:44:27+03:00 at listOnTimeout (node:internal/timers:569:17)
2023-05-09T13:44:27+03:00 at process.processTimers (node:internal/timers:512:7)|| ", "kuberemoved"=>{}
2023-05-09T13:44:27+03:00 [2] lrc.saas.srl-worker: [[1683629066.554554719, {}], {"stream"=>"stderr", "logtag"=>"F", "log"=>"2023-05-09T10:44:26.554Z - debug: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||YYYYYYYYYYYYYYYYYYY Debug line|| ", "kuberemoved"=>{}
2023-05-09T13:44:27+03:00 [3] lrc.saas.srl-worker: [[1683629066.554586062, {}], {"stream"=>"stderr", "logtag"=>"F", "log"=>"2023-05-09T10:44:26.554Z - error: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||YYYYYYYYYYYYYYYYYYY Error line|| ", "kuberemoved"=>{}
2023-05-09T13:44:33+03:00 [0] lrc.saas.srl-worker: [[1683629073.071104064, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:33.071Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||RabbitmqMessageTracker : 0 messages are now being handled.|| ", "kuberemoved"=>{}
2023-05-09T13:44:34+03:00 [0] lrc.saas.srl-worker: [[1683629073.552509143, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:33.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||############# Just a single line|| ", "kuberemoved"=>{}
2023-05-09T13:44:34+03:00 [1] lrc.saas.srl-worker: [[1683629073.552636752, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:33.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||############# MultiLine Error: test stack
2023-05-09T13:44:34+03:00 at Timeout._onTimeout (/opt/testexec.server/hub/app.js:188:54)
2023-05-09T13:44:34+03:00 at listOnTimeout (node:internal/timers:569:17)
2023-05-09T13:44:34+03:00 at process.processTimers (node:internal/timers:512:7)|| ", "kuberemoved"=>{}
2023-05-09T13:44:34+03:00 [2] lrc.saas.srl-worker: [[1683629073.552674041, {}], {"stream"=>"stdout", "logtag"=>"F", "log"=>"2023-05-09T10:44:33.552Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||$$$$$$$$$$$$$$$$$$$$ Samine Info|| ", "kuberemoved"=>{}

@ryan65

ryan65 commented May 9, 2023

(quoting @leonardo-albertovich's explanation above about retain_metadata_in_forward_mode in 2.1.0/2.1.1 and the change in 2.1.2)

Yes, thanks. Just out of curiosity, do you know when fluentd will be upgraded to support the metadata?

@ryan65

ryan65 commented May 9, 2023

(quoting the 2.1.2 test results and output logs from the earlier comment above)

BTW, just to clarify: I use my own custom multiline parser, which assumes that a log line beginning with a space should be appended to the previous log, since our logs never start with a space.
[MULTILINE_PARSER]
    name multiline-lrc-saas
    type regex
    rule "start_state" "/^[^[\s\]]]/" "cont"
    rule "cont" "/^[\s]]/" "cont"

@leonardo-albertovich
Collaborator

I don't know when / if fluentd will be updated to support metadata. We submitted a forward protocol enhancement proposal but I'm not the one in charge of it and I don't have any information about it at the moment.

Do you think it'd be possible for you to put together a reproduction case we can use to reliably reproduce the issue and work on it?

Ideally it would be a self contained case with the required configuration files, scripts used to produce the input, sample input files (if any) and instructions.

I'm sure that would speed up the process tremendously and would be highly appreciated.

Thank you very much.

@ryan65

ryan65 commented May 9, 2023

Hi Leonardo
What reproduction are you referring to: the mixed logs, or the issue with fluentd and metadata compatibility?
Thanks.

@leonardo-albertovich
Collaborator

The mixed logs.

@ryan65

ryan65 commented May 9, 2023

Hi @leonardo-albertovich
After you asked for log files I actually found something very interesting.
I looked at the container logs themselves and saw that the problem already begins in the container log file itself.
See timestamp 13:46:16.770528105Z where the multiline is written: in between the call stack there is a stderr log,
and then one line after it the end of the call stack is printed.
Questions:

  1. Not sure, but we recently moved from the docker runtime to the containerd runtime and I'm not sure if this is the cause. It might have happened in the past as well and just gone unnoticed; it could also be an issue with the logging library itself (winston, nodejs).
  2. In a case like the one below, would you say that fluent-bit should be able to handle it (e.g. separate multiline accumulation per stream), or is this the expected behaviour of fluent-bit, in which case the problem lies elsewhere, in the ordering of the logs in the container files?

Thanks.

2023-05-09T13:46:13.769279129Z stdout F 2023-05-09T13:46:13.769Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||$$$$$$$$$$$$$$$$$$$$ Samine Info||
2023-05-09T13:46:13.769282954Z stdout F 2023-05-09T13:46:13.769Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||&&&&&&&&&&&&&&&& Same Info 2||
2023-05-09T13:46:16.770435149Z stdout F 2023-05-09T13:46:16.770Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||ZZZZZZZZZZZZ Just a single line||
2023-05-09T13:46:16.770528105Z stdout F 2023-05-09T13:46:16.770Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||ZZZZZZZZZZZZZ MultiLine Error: test stack
2023-05-09T13:46:16.77053802Z stdout F at Timeout.different [as _onTimeout] (/opt/testexec.server/hub/app.js:180:54)
2023-05-09T13:46:16.770544984Z stdout F at listOnTimeout (node:internal/timers:569:17)
2023-05-09T13:46:16.77056145Z stderr F 2023-05-09T13:46:16.770Z - debug: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||YYYYYYYYYYYYYYYYYYY Debug line||
2023-05-09T13:46:16.77056665Z stdout F at process.processTimers (node:internal/timers:512:7)||

2023-05-09T13:46:16.770603504Z stderr F 2023-05-09T13:46:16.770Z - error: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||YYYYYYYYYYYYYYYYYYY Error line||
2023-05-09T13:46:23.769253346Z stdout F 2023-05-09T13:46:23.769Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||############# Just a single line||
2023-05-09T13:46:23.769371769Z stdout F 2023-05-09T13:46:23.769Z - info: EnvName=pcoe-aws|TenantId=hub-system|PID=1|ModuleId=worker|JobId=0||############# MultiLine Error: test stack
2023-05-

@mdraijer
Author

@ryan65 @leonardo-albertovich
Yes, the very interesting thing in the previous comment is exactly what I described in the issue itself; see the top of this page.

Both Docker runtime and CRI-O show this behaviour, although Docker cuts off at 16k and CRI-O at 8k, so you will notice it sooner / more often with CRI-O.

The whole point is that Fluent-Bit should recognize the stream and perform the concatenation logic only for lines from the same stream.

Simple as that.

@ryan65

ryan65 commented May 10, 2023

Hi @mdraijer,
Oh, this takes me by surprise; I wasn't actually sure that this is expected of fluent-bit. In that case this seems to be a bug in the fluent-bit mechanism.
@leonardo-albertovich can you please confirm?

@leonardo-albertovich
Collaborator

From what I can see in the code (and with my limited knowledge of the multiline parser system) the stream segregation feature should be there, but I need to take a deeper look into it to be able to make a proper assessment.

I'll assign this issue to myself and I'll take some time to reproduce it as time permits (probably next week) to wrap it up as soon as possible.

At first I thought it could be an issue with this regular expression having an excessively greedy time pattern, but that would alter the results in a different way, so I doubt that's the case.

@leonardo-albertovich leonardo-albertovich self-assigned this May 10, 2023
@ryan65

ryan65 commented May 10, 2023

@leonardo-albertovich,
Great, thanks very much.

@leonardo-albertovich
Collaborator

Quick update: @Syn3rman will work on this issue next week.

@Syn3rman
Contributor

I've opened a PR for it. Could you give it a try, @ryan65 or @mdraijer, and check whether it fixes the issue? Thanks :)

@ryan65

ryan65 commented May 28, 2023

Hi, I'm not sure how to try it out. Is there a Docker image version tag available to download?
Thanks.

@mdraijer
Author

@Syn3rman I see two merged PRs in this thread. Does that mean that your fix (from May 2023, sorry for the late response) is already in some release? If yes, from which version is it included?
