Logs with json and escaped " are not parsed #615
I'm having the same issue at the moment. If anyone can assist, it'd be great.
I've been trying all the Decode_Field_As combinations I can imagine, without success. I need to find a way to keep the `"` escaped inside the message field.
@rgomesf would you please paste (in formatted mode) your docker parser content that comes from your configmap?
Found the problem in your config: in your docker parser you are processing a key named `log`, but in your log example the content is inside a key called `message`. Just put something like this in your docker parser:
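A minimal sketch of that shape, assuming the standard JSON-format docker parser with the decoder pointed at the `message` key (the exact snippet was not preserved above):

```
[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On
    # Unescape the field that actually carries the payload in this case
    Decode_Field_As escaped message
```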
Hi @edsiper. I already tried with that config and it also didn't work. I tried two variants: `Decode_Field_As escaped json` and `Decode_Field_As escaped message`.
1. `Decode_Field_As escaped json` means: unescape the field named `json`.
2. `Decode_Field_As escaped message` means: unescape the field named `message`.

Also, there is something that needs to be clarified: if your original message contains a `"` byte, by the JSON spec it needs to be escaped, so it becomes `\"`. So if your message was packed inside a JSON map (likely by Docker), a message like `test "message"` becomes `{"log": "test \"message\""}` in JSON. The decoders in Fluent Bit allow you to avoid double escaping when processing the text messages, but when sending the same message to Elasticsearch or Kibana, by the JSON spec it needs to be escaped again; otherwise it's an invalid JSON message and will not be accepted. In short: if you see double-escaped content in Elasticsearch/Kibana, something needs to be fixed; a single escape is per the spec.
The exact same logs are parsed correctly with Fluent Bit 0.11.11. I see `\\\"` in the Kibana JSON view.
When using in_tail, docker json-logs, the kubernetes filter, and outputting to Splunk, with applications logging JSON to stdout, it seems we got everything right by not having any
@edsiper is there anything else I can provide to help find a fix for this? Right now I've reverted to version 0.11.11.
Dealing with the same urgent problem here. Would be glad to help with testing or anything else.
We're having the same issue... If a container writes the following to stdout, it is parsed correctly by Elasticsearch.
However, if the same log line contains
We're using the docker parser with the log field escaped.
In essence:
I would expect the property containing the escaped quotes to be left as-is, but it seems that the backslashes are removed when the log field is decoded as escaped, resulting in invalid JSON.
What's the proper way of configuring fluent-bit to handle this? We're using fluent-bit 0.13.2.
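The original log samples were not preserved above; as a hypothetical illustration of the symptom, assuming Docker's json-file driver wraps each stdout line in a `log` key:

```
# App writes a plain JSON line to stdout; this parses fine end to end:
{"level":"info","msg":"hello"}
# Docker wraps it as:
{"log":"{\"level\":\"info\",\"msg\":\"hello\"}\n","stream":"stdout"}

# App writes a value containing quotes, which the app itself escapes:
{"level":"info","msg":"got \"quoted\" value"}
# Docker wraps it, escaping every backslash and quote once more:
{"log":"{\"level\":\"info\",\"msg\":\"got \\\"quoted\\\" value\"}\n","stream":"stdout"}
```

If the decoder then strips the backslashes from the inner value, the payload is left with bare quotes inside a JSON string, which is exactly the invalid-JSON outcome described above.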
I'm also facing a similar challenge and would like to know the correct approach. Steps to reproduce are as follows. Run a container that outputs the following:
This can be achieved by running the command:
I then have a docker Kubernetes input set up as follows:
This is parsed by the docker parser, configured as:
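Neither snippet survived in this thread. A minimal sketch of a setup of this shape, assuming the `tail` input over the container log files plus a JSON docker parser (paths, tags, and names here are illustrative):

```
# fluent-bit.conf
[INPUT]
    Name    tail
    Tag     kube.*
    Path    /var/log/containers/*.log
    Parser  docker

# parsers.conf
[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On
    Decode_Field_As escaped log
```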
The JSON payload is then sent to logz.io as:
This causes the field to end up in Elasticsearch/logz.io as an incorrectly indexed JSON field:
I have spent many days pulling my hair out trying to work this out; can someone suggest the right configuration to have these JSON messages correctly indexed in Elasticsearch?
I have the same issue here, if UTF-8 chars are inside the JSON, which is the
@meggarr your assessment is exactly what I'm seeing, and it appears to be a limitation per the fluent-bit docs. I am having the same problem of escaped JSON in the log field, which I can't parse as JSON because it's escaped, and when I use `do_next` after parsing, the JSON object is not parsed. I have a ticket in #691 which is a specific representation of my use case. I've included below an excerpt from the docs suggesting that you can only use multiple decoders on a field that is unstructured raw text. Perhaps this is a feature request and not a bug?

Optional Actions: by default, if a decoder fails to decode the field or you want to try the next decoder, it is possible to define an optional action. Available actions are:
Note that actions are affected by some restrictions: on `Decode_Field_As`, if it succeeded, another decoder of the same type on the same field can be applied only if the data continues to be an unstructured message (raw text).
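For reference, the decoder documentation of that era illustrates the chaining with a parser along these lines (a sketch: unescape first with `do_next`, then try a structured JSON decode of the same field):

```
[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On
    # If unescaping succeeds, continue with the next decoder on the same field
    Decode_Field_As escaped_utf8 log do_next
    # Then attempt to turn the unescaped text into structured JSON
    Decode_Field_As json log
```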
Maybe this is a bug. What I am thinking is that parsing it as JSON should only care about double quotes, but apparently it transforms UTF-8 chars as well; e.g. in my case, after JSON decoding,
After several minutes of investigation in the code, I think the UTF-8 decoding part is good, except that it reads
The UTF-8 decoder will read
In the UTF-8 decoder, it tries to read escaped chars for general purposes. Escaping a newline is valid in JSON encoding; if a UTF-8 decoder is chained before a JSON decoder, the JSON decoder will fail, as the escaped newline has already been turned into a real newline by the UTF-8 decoder. This fixes the issue in #615. Signed-off-by: Richard Meng <[email protected]>
Signed-off-by: Eduardo Silva <[email protected]>
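As a hypothetical illustration of the failure mode that commit describes (the actual samples are not in the thread):

```
# The log field carries a JSON-escaped newline (two bytes: backslash, n):
{"log":"{\"msg\":\"line1\\nline2\"}"}

# Step 1, UTF-8/escape decoder on "log": the two-byte sequence \n becomes a
# literal newline byte, so the field now holds:
#   {"msg":"line1
#   line2"}
# Step 2, JSON decoder on the same field: a raw control character inside a
# JSON string is invalid, so the chained JSON decode fails.
```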
Hi @edsiper, I cannot see the Message and severity keys as indexable in Elasticsearch.
@HarishHothi would you please share your full Fluent Bit configuration? (And are you running in Kubernetes?)
Hi @edsiper, below is my full Fluent Bit configuration. Yes, we are running Fluent Bit in an on-premises Kubernetes cluster.
Hi @edsiper, is there any update?
I did a PR and built 0.14.2 with the fix cherry-picked. Could you test whether this works for you?
Hi @ese, I still see the same problem. My sample log:
@HarishHothi This is the config I am using:
@HarishHothi The quote must be double-backslash escaped in the logs; otherwise it will be treated as a JSON token, which is wrong. E.g.:
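A hypothetical example of the difference (the original sample was not preserved):

```
# Correct: a quote from the application's own JSON arrives double-backslash
# escaped inside the raw docker json-file line, so it survives one decode:
{"log":"{\"msg\":\"a \\\"quoted\\\" word\"}\n"}
# After decoding "log" once, the inner payload is still valid JSON:
#   {"msg":"a \"quoted\" word"}

# Wrong: with only a single escape, one decode of "log" leaves bare quotes
# inside the inner string, so the payload is no longer valid JSON:
{"log":"{\"msg\":\"a \"quoted\" word\"}\n"}
#   {"msg":"a "quoted" word"}
```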
Is this a related issue: helm/charts#10424?
For what it's worth, we ran into the same issue in our Kubernetes setup with the fluent-bit Helm chart using fluentbit 1.0.4, and ended up solving it with the following:
Just commenting here as well, as we were also facing the same issue where
When it went through the docker parser/decoder of fluent-bit with the following config:
it was not showing up as valid JSON in Splunk, hence Splunk could not extract these lovely JSON fields. Once we changed the decoder to use
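The comment above is cut off; as an assumption only (not the commenter's verified change), a decoder swap of the kind discussed elsewhere in this thread would look like:

```
# Before: unescape only, which can mangle payloads with escaped quotes
Decode_Field_As escaped log

# After: unescape once, then parse the result as structured JSON
Decode_Field_As escaped_utf8 log do_next
Decode_Field_As json log
```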
@Pluies we are having the same issues with fluent-bit on Helm (escaping `"`). Can you show us how you did it in values.yaml?
For those using Helm, I found this to be a working configuration in values.yaml:
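The snippet itself did not survive above. As a sketch only, assuming a chart version whose values.yaml accepts extra raw parser lines through an `extraEntries` field (check your chart's values schema; the parser name and field names here are assumptions):

```
parsers:
  enabled: true
  json:
    - name: docker_escaped
      timeKey: time
      timeFormat: "%Y-%m-%dT%H:%M:%S.%L"
      timeKeep: "On"
      # Raw lines appended to the generated [PARSER] section (assumed field)
      extraEntries: |
        Decode_Field_As escaped_utf8 log do_next
        Decode_Field_As json log
```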
I appreciate this is an already-closed issue, and I have previously commented with the workaround I used of using

When using

Anyone else seen this? Will raise a new issue anyway.
FYI: please check the following comment on #1278:
Issue already fixed, ref: #1278 (comment)
A lot of fixes related to structured JSON logs landed: fluent/fluent-bit#615, fluent/fluent-bit#1278, https://fluentbit.io/announcements/v1.2.0/
Hi.
I'm using a fluent-bit 0.13.2 daemonset with the following configuration:
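The configuration block did not make it into this thread; a minimal sketch of a daemonset configuration of this shape (paths, the kubernetes filter, and the output are illustrative assumptions):

```
[SERVICE]
    Parsers_File  parsers.conf

[INPUT]
    Name    tail
    Tag     kube.*
    Path    /var/log/containers/*.log
    Parser  docker

[FILTER]
    Name    kubernetes
    Match   kube.*

[OUTPUT]
    Name    es
    Match   *
    Host    elasticsearch
    Port    9200
```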
On logs containing `"`, like this one, fluent-bit doesn't parse the log.
It seems related to the `"abc "` inside the `message` field value. I was using the docker parser. I also tried changing the parser to:
but the result is the same.
I was using version 0.11 and it was working OK with this kind of log.