Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix filebeat system module timezone parsing #13308

Merged
merged 3 commits into from
Aug 27, 2019
Merged

Fix filebeat system module timezone parsing #13308

merged 3 commits into from
Aug 27, 2019

Conversation

pragkent
Copy link
Contributor

@pragkent pragkent commented Aug 22, 2019

Closes #13306

@pragkent pragkent requested a review from a team as a code owner August 22, 2019 00:54
@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

1 similar comment
@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@jsoriano
Copy link
Member

ok to test

@jsoriano jsoriano added bug Filebeat Filebeat module needs_backport PR is waiting to be backported to other branches. v7.3.2 v7.4.0 labels Aug 22, 2019
Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pragkent thanks for reporting the issue and opening this PR to fix it. I think that you are actually right and we are incorrectly handling timezones, at least since #12253 (7.3.0).

I wonder if other modules are affected too after #12311.

Could you also add a changelog entry mentioning the fix?

@pragkent
Copy link
Contributor Author

Sure, I'll handle this later

@pragkent
Copy link
Contributor Author

And maybe I can try to fix this issue in other modules by the way.

@pragkent pragkent requested a review from a team as a code owner August 23, 2019 04:41
@ghost
Copy link

ghost commented Aug 23, 2019

Hi @pragkent, we have found your signature in our records, but it seems like you have signed with a different e-mail than the one used in your Git commit. Can you please add both of these e-mails into your Github profile (they can be hidden), so we can match your e-mails to your Github profile?

@@ -43,7 +43,8 @@
{
"date": {
"if": "ctx.event.timezone != null",
"field": "@timestamp",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pragkent Thanks for looking at these modules too, but let's focus first on the system module. For the elasticsearch one I think we will also need to change the formats, because it is actually like ISO8601, but without timezone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsoriano yes, you are correct. we need to define a new format, or the timezone is wrong. I'll fix it later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I grabbed some elasticsearch 7.3 logs from our server, it seems that ISO8601 format was used (timezone included). So, use ISO8601 format for parsing is actually right.

And logs from elasticsearch 6.x does not have timezone part, as you said.

So I added a format for parsing elasticsearch 6.x logs.

Could you please double-check that?

Part of the logs is listed as follow:

{"type": "deprecation", "timestamp": "2019-08-20T05:08:27,756+0000", "level": "WARN", "component": "o.e.d.c.s.Settings", "cluster.name": "hulk", "node.name": "hulk-es-22x9pqqc4q",  "message": "[discovery.zen.hosts_provider] setting was deprecated in Elasticsearch and will be removed in a future release! See the breaking changes documentation for the next major version."  }

{"type": "server", "timestamp": "2019-08-22T20:52:56,131+0000", "level": "INFO", "component": "o.e.m.j.JvmGcMonitorService", "cluster.name": "hulk", "node.name": "hulk-es-22x9pqqc4q", "cluster.uuid": "3pYlYTCPT4iLMqHmeq7Yew", "node.id": "UFfAvAUkTIuOlUbMgT9x0A",  "message": "[gc][229011] overhead, spent [293ms] collecting in the last [1s]"  }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I think we could remove the ISO8601 format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or do something like what I proposed for ISO8601 in the system module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think we should not remove ISO8601 for elasticsearch module. For example, if we have elasticsearch 7.3, and have event.timezone set, then the parsing will failed without this ISO8601 format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying and this seems correct in 7.3 for JSON logs, but not for plain text logs. We may need to move date parsing to the specific pipeline (there are three pipelines, a common one, and another two for the specific needings of plain text and json formats).
Also I think that the format for timestamps without timezone should be yyyy-MM-dd'T'HH:mm:ss,SSS, as they contain milliseconds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I just changed the format to yyyy-MM-dd'T'HH:mm:ss,SSS. When the format is yyyy-MM-dd'T'HH:mm:ss, the date processor will always choose ISO8601 format for parsing, because the time in log files contain milliseconds part.

Is there any other reasons which caused plain text logs parsing failure that I missed out?

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in Kafka and nginx modules LGTM, let's not add more modules to this PR.
I have added a couple of comments for elasticsearch and system modules.

@@ -43,7 +43,8 @@
{
"date": {
"if": "ctx.event.timezone != null",
"field": "@timestamp",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I think we could remove the ISO8601 format.

@pragkent
Copy link
Contributor Author

Changes in Kafka and nginx modules LGTM, let's not add more modules to this PR.
I have added a couple of comments for elasticsearch and system modules.

I just reverted changes to nginx and kafka modules.

@andresrc andresrc added Team:Integrations Label for the Integrations team [zube]: In Review labels Aug 26, 2019
Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be working for the system module, but I am not so sure about the elasticsearch one. What do you think about focusing in this PR on the system module and fix the other modules in other PRs?

@pragkent
Copy link
Contributor Author

It seems to be working for the system module, but I am not so sure about the elasticsearch one. What do you think about focusing in this PR on the system module and fix the other modules in other PRs?

OK, I'll submit other prs for other modules

@pragkent
Copy link
Contributor Author

I've updated the commits to only include changes about system module.

@pragkent
Copy link
Contributor Author

@jsoriano Just created a few PRs:

#13367 elasticsearch
#13368 kafka
#13369 nginx

jsoriano pushed a commit that referenced this pull request Aug 29, 2019
jsoriano pushed a commit to jsoriano/beats that referenced this pull request Aug 29, 2019
jsoriano pushed a commit to jsoriano/beats that referenced this pull request Aug 29, 2019
jsoriano added a commit that referenced this pull request Aug 29, 2019
jsoriano added a commit that referenced this pull request Aug 29, 2019
Continuation of #13308

(cherry picked from commit fed669d)

Co-authored-by: Kent Wang <[email protected]>
@ztyzbb
Copy link

ztyzbb commented Sep 30, 2019

Found a similar date processor usage in logstash module.
filebeat/module/logstash/log/ingest/pipeline-plain.json
Does it need a fix too?

@jsoriano
Copy link
Member

jsoriano commented Oct 2, 2019

@ztyzbb yes, I can confirm that this issue also happens in logstash module when parsing plain text files. I will open a PR to fix this.

jsoriano added a commit that referenced this pull request Oct 3, 2019
Use the same date format that we are using now for Elasticsearch
timestamps without timezone, that is the same format used for
timestamps in Logstash plain text logs.

Similar to #13308, related to #13877.
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
… parsing (elastic#13358)

Fix timezone handling in system module when non-UTC timezones
are used.

Fix elastic#13306

(cherry picked from commit af371de)

Co-authored-by: Kent Wang <[email protected]>
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
)

Continuation of elastic#13308

(cherry picked from commit a433d5e)

Co-authored-by: Kent Wang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Filebeat 7.3 system module timezone parsing error
5 participants