Add raw log message to log handler #7207
@tsg I'm not sure whether we should enable this by default or only enable it in, for example, all our modules.
Current default for ignore_above is 1024. This is too short for some command line entries on Windows. This increases it to 2048. Closes elastic#8076
CHANGELOG.asciidoc
Outdated
@@ -178,6 +178,8 @@ https://github.com/elastic/beats/compare/v6.2.3...master[Check the HEAD diff]
- Correctly join partial log lines when using `docker` input. {pull}6967[6967]
- Add support for TLS with client authentication to the TCP input {pull}7056[7056]
- Converted part of pipeline from treafik/access metricSet to dissect to improve efficeny. {pull}7209[7209]
- Add support for TLS with client authentication to the TCP input. {pull}7056[7056]
This entry does not belong here.
CHANGELOG.asciidoc
Outdated
@@ -178,6 +178,8 @@ https://github.com/elastic/beats/compare/v6.2.3...master[Check the HEAD diff]
- Correctly join partial log lines when using `docker` input. {pull}6967[6967]
- Add support for TLS with client authentication to the TCP input {pull}7056[7056]
- Converted part of pipeline from treafik/access metricSet to dissect to improve efficeny. {pull}7209[7209]
- Add support for TLS with client authentication to the TCP input. {pull}7056[7056]
- Add log.message to each log event. {pull}[]
Please, add the number of the PR.
We agreed that this feature is going to be disabled by default, because it has a big impact on users. If it were enabled silently on update, a user's whole installation could break: the size of each event roughly doubles, so they could run out of storage. At this point, enabling it must be a conscious decision by the user. That way we ensure they can prepare for and acknowledge the increased storage requirements.
If we decide later to enable it by default, we can do it in 7.x.
I've just looked at the issue that references it. I still believe that, at the input level, the raw message should not be added to the event, but I am fine with adding it to the modules. I am keeping my review as "request changes" because the PR currently enables keeping the raw message for the log input, not just for modules.
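The opt-in behaviour requested above could look roughly like the sketch below. It is an illustration under assumptions, not the Filebeat implementation: the `inputConfig` type, its `KeepRaw` field and the `buildEvent` helper are hypothetical names, and only the `log.message` key is taken from the changelog entry in this PR.

```go
package main

import "fmt"

// inputConfig is a hypothetical stand-in for an input's settings.
// KeepRaw is false by default (the Go zero value), so event size
// does not change for users who upgrade without touching their config.
type inputConfig struct {
	KeepRaw bool
}

// buildEvent returns the fields for one log line. The raw copy is
// attached only when the user has explicitly opted in.
func buildEvent(cfg inputConfig, processed, raw string) map[string]interface{} {
	event := map[string]interface{}{"message": processed}
	if cfg.KeepRaw {
		event["log.message"] = raw
	}
	return event
}

func main() {
	byDefault := buildEvent(inputConfig{}, "hello", "hello\n")
	optedIn := buildEvent(inputConfig{KeepRaw: true}, "hello", "hello\n")
	fmt.Println(len(byDefault), len(optedIn)) // prints "1 2"
}
```

Relying on the zero value means a missing setting and an explicit `false` behave the same, which matches the goal of not surprising existing installations.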
filebeat/harvester/reader/limit.go
Outdated
@@ -18,5 +18,9 @@ func (p *Limit) Next() (Message, error) {
if len(message.Content) > p.maxBytes {
message.Content = message.Content[:p.maxBytes]
}

if len(message.Raw) > p.maxBytes {
message.Raw = message.Raw[:p.maxBytes]
I am on the fence about this. The raw message should not be truncated, because truncation is a form of processing. But truncating it could protect users from injecting overly long messages into their pipelines.
I would go with truncating the raw message.
I would actually go with your first proposal and not truncate it. At the same time I would not index `log.original`.
Can you elaborate on what you mean by "too long message to the pipeline"?
I mean a message so large that it could disrupt the output of Filebeat.
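The two positions in this thread (truncate the raw message, or leave it untouched) can be compared with a small sketch. It assumes a simplified `Message` with only the two fields touched by the diff and a hypothetical `truncateRaw` switch that is not part of this PR; the real `Limit` reader wraps another reader and is not reproduced here.

```go
package main

import "fmt"

// Message mirrors only the two fields touched by the diff above.
type Message struct {
	Content []byte
	Raw     []byte
}

// limit is a simplified stand-in for the Limit reader.
// truncateRaw is a hypothetical switch added for illustration.
type limit struct {
	maxBytes    int
	truncateRaw bool
}

// apply caps Content at maxBytes and caps Raw only when asked to.
func (l limit) apply(m Message) Message {
	if len(m.Content) > l.maxBytes {
		m.Content = m.Content[:l.maxBytes]
	}
	if l.truncateRaw && len(m.Raw) > l.maxBytes {
		m.Raw = m.Raw[:l.maxBytes]
	}
	return m
}

func main() {
	msg := Message{Content: []byte("0123456789"), Raw: []byte("0123456789\n")}
	kept := limit{maxBytes: 4}.apply(msg)
	cut := limit{maxBytes: 4, truncateRaw: true}.apply(msg)
	fmt.Println(len(kept.Content), len(kept.Raw), len(cut.Raw)) // prints "4 11 4"
}
```

Note that slicing a byte slice at a fixed offset can cut a multi-byte UTF-8 character in half, which is one more sense in which truncation counts as processing.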
By default filebeat adds log.message to each event which contains the raw unprocessed message. This can be disabled by setting `raw_message` to false to save disk space. The message in `log.message` contains a full multiline message, is encoded and the new line at the end is stripped.
It would be interesting to see what the actual storage impact of enabling it would be. My hope is that the impact is smaller than a straight doubling: because the same data exists twice, it "could" compress well.
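A quick way to get a feel for the compression question is to gzip one event with and without the duplicated field. This is only a rough sketch: the sample log line and the JSON layout are made up, and gzip on a single small event is an assumption chosen for convenience, not a model of how the storage backend actually compresses data.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// gzipSize returns the gzip-compressed size of data in bytes.
func gzipSize(data []byte) int {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	// Writes to a bytes.Buffer do not fail, so errors are ignored here.
	_, _ = zw.Write(data)
	_ = zw.Close()
	return buf.Len()
}

func main() {
	// Made-up access log line used as both the processed and the raw message.
	line := `127.0.0.1 - - [10/Oct/2018:13:55:36 +0000] GET /index.html HTTP/1.1 200 2326`
	single := []byte(`{"message":"` + line + `"}`)
	doubled := []byte(`{"message":"` + line + `","log.message":"` + line + `"}`)

	fmt.Println("uncompressed:", len(single), len(doubled))
	fmt.Println("gzip:        ", gzipSize(single), gzipSize(doubled))
}
```

The second copy of the line falls inside the compressor's match window, so it is encoded mostly as a back-reference. Real measurements would have to be taken against an actual index, since the processed and raw messages are often not byte-identical.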
@kvch @ph This change currently only applies to the log input. Should we do something similar for the other inputs?
I think we should. In case of |
Closing in favor of #8448