-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fluent-plugin: Improve escaping in key_value format #2434
fluent-plugin: Improve escaping in key_value format #2434
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2434 +/- ##
==========================================
+ Coverage 61.55% 61.65% +0.09%
==========================================
Files 160 160
Lines 13612 13612
==========================================
+ Hits 8379 8392 +13
+ Misses 4607 4592 -15
- Partials 626 628 +2
|
The key_value format does not seem to be well specified. This change assumes "logfmt" as the format and escapes values accordingly: The code itself is lifted from here: https://github.com/csquared/node-logfmt/blob/e279d43cde019acfcca0abe99ea6b7ce8327e1f7/lib/stringify.js Grafana seems to handle values escaped in this manner much better than before. e.g.: query="{program="loki"}") was shown as (IIRC) query="{program= and is now displayed correctly. As an added bonus only values that require quoting will actually be quoted, improving readability and derived fields regexes.
c5f3c48
to
e164396
Compare
@cyriltovena ping. I've added a comment to the line in question. |
Thanks @andsens! Cyril is on leave for a bit and we haven't had a chance to circle back around to this PR. I'm less familiar with the fluentd plugin, is there much risk in this PR to breaking anyone's existing installations? |
hmm, I doubt it. I have only tested this in a fluentd->loki->grafana setup. The main issue this change fixes is Grafana being able to actually display values with quotes and spaces in them. It seems unlikely that anyone would have processing between fluentd->loki or loki->Grafana. p.s.: The fix is not complete, e.g. Grafana still has issues showing keys with dots in them. I'll need to examine which issues are on the loki side and which ones on the Grafana side before I do anything about that though. |
ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use the same setup with fluentd
-> loki
and this showed as immediate problelm with nested json and generally logfmt inside logfmt which is very common with a layers of logging collection that is Kubernetes :)
# Escape double quotes and backslashes by prefixing them with a backslash | ||
v = v.to_s.gsub(%r{(["\\])}, '\\\\\1') | ||
if v.include?(' ') || v.include?('=') | ||
formatted_labels.push(%(#{k}="#{v}")) | ||
else | ||
formatted_labels.push(%(#{k}=#{v})) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked some ruby sources for the best way to do this and I came up with this solution instead:
# Escape double quotes and backslashes by prefixing them with a backslash | |
v = v.to_s.gsub(%r{(["\\])}, '\\\\\1') | |
if v.include?(' ') || v.include?('=') | |
formatted_labels.push(%(#{k}="#{v}")) | |
else | |
formatted_labels.push(%(#{k}=#{v})) | |
end | |
# This regular expression will validate true if the string doesn't need escaping. | |
v = v.dump unless v.to_s =~ /\A[\w\.\-\+\%\_\,\:\;\/]*\z/i | |
formatted_labels.push(%(#{k}=#{v})) |
Source: https://github.com/polyfox/moon-logfmt/blob/master/lib/moon-logfmt/utils.rb#L25 and https://ruby-doc.org/core-2.5.1/String.html#method-i-dump
@kirooshu so in this case ruby's internal string escaping is used? |
Yeah, I agree this is why I approved your PR as is. I tried it first and it does work as intended, I just felt like this should be a solved problem like you said and looked for more canonical way. To be honest, fluentd should have some methods for key/value format and I know there are many plugins for key/value and we should probably do what they do already to stay consistent within the fluentd ecosystem. |
For us, if we cannot have this merged, we will need to just "fork" the output by putting it in our version control with these same change. Can we expect to have this fix in the next version release or should we go with the forking for the time being? |
@kirooshu I think the loki team has quite a bit on their plate, so you should probably just go ahead and build the plugin yourself. I'm not currently at my work computer, so I don't have the commands to build the plugin itself, but my Dockerfile for fluentd itself is tracked, so maybe this can save you some time :-)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The
key_value
format does not seem to be well specified.This change assumes "logfmt" as the format and escapes values accordingly.
The code itself is lifted from here:
https://github.com/csquared/node-logfmt/blob/e279d43cde019acfcca0abe99ea6b7ce8327e1f7/lib/stringify.js
Grafana seems to handle values escaped in this manner much better than before.
e.g.:
query="{program="loki"}")
was shown as (IIRC)query="{program=
and is now displayed correctly.As an added bonus only values that require quoting will actually be quoted, improving readability and derived field regexes.