[Filebeat] Fix RFC5424 date parsing in system/syslog in pipeline (#12529)

The date format used by the system/syslog pipeline has a couple of
issues:

- Since Elasticsearch switched to Java time from Joda time, it won't
  correctly parse timezone offsets that use a colon to separate hours
  and minutes.

- RFC5424 allows for 1 to 6 digits for the fractional second. This
  updates the pipeline to also accept 3 decimals (milliseconds).
adriansr authored Jun 14, 2019
1 parent 28c01be commit d30b4dc
Showing 5 changed files with 58 additions and 2 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
@@ -119,6 +119,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
 - Require client_auth by default when ssl is enabled for tcp input {pull}12333[12333]
 - Require certificate authorities, certificate file, and key when SSL is enabled for the TCP input. {pull}12355[12355]
 - Load correct pipelines when system module is configured in modules.d. {pull}12340[12340]
+- Fix timezone offset parsing in system/syslog. {pull}12529[12529]

 *Heartbeat*

5 changes: 4 additions & 1 deletion filebeat/module/system/syslog/ingest/pipeline.json
@@ -34,7 +34,10 @@
       "formats": [
         "MMM d HH:mm:ss",
         "MMM dd HH:mm:ss",
-        "yyyy-MM-dd'T'HH:mm:ss.SSSSSSZZ"
+        "yyyy-MM-dd'T'HH:mm:ss.SSSZZ",
+        "yyyy-MM-dd'T'HH:mm:ss.SSSSSSZZ",
+        "yyyy-MM-dd'T'HH:mm:ss.SSSXXX",
+        "yyyy-MM-dd'T'HH:mm:ss.SSSSSSXXX"
       ],
       "ignore_failure": true
     }
2 changes: 2 additions & 0 deletions filebeat/module/system/syslog/test/tz-offset.log
@@ -0,0 +1,2 @@
1986-04-26T01:23:45.101+0400 rmbkmonitor04 shutdown[2649]: shutting down for system halt
1986-04-26T01:23:45.388424+04:00 rmbkmonitor04 thermald: constraint_0_power_limit_uw exceeded.
31 changes: 31 additions & 0 deletions filebeat/module/system/syslog/test/tz-offset.log-expected.json
@@ -0,0 +1,31 @@
[
{
"@timestamp": "1986-04-25T21:23:45.101Z",
"event.dataset": "system.syslog",
"event.module": "system",
"event.timezone": "+00:00",
"fileset.name": "syslog",
"host.hostname": "rmbkmonitor04",
"input.type": "log",
"log.file.path": "tz-offset.log",
"log.offset": 0,
"message": "shutting down for system halt",
"process.name": "shutdown",
"process.pid": 2649,
"service.type": "system"
},
{
"@timestamp": "1986-04-25T21:23:45.388Z",
"event.dataset": "system.syslog",
"event.module": "system",
"event.timezone": "+00:00",
"fileset.name": "syslog",
"host.hostname": "rmbkmonitor04",
"input.type": "log",
"log.file.path": "tz-offset.log",
"log.offset": 89,
"message": "constraint_0_power_limit_uw exceeded.",
"process.name": "thermald",
"service.type": "system"
}
]
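
As a cross-check of the expected `@timestamp` values above, here is a minimal Python sketch (not part of this commit) that parses the two lines from tz-offset.log and normalizes them to UTC with millisecond precision. It assumes Python 3.7 or later, where `%z` accepts offsets both with and without a colon. `to_utc_millis` is a hypothetical helper name, and the parsing relies on Python's `strptime`, not on the Elasticsearch date processor.

```python
from datetime import datetime, timezone


def to_utc_millis(ts):
    """Parse an RFC5424-style timestamp and render it in UTC, truncated to ms.

    Hypothetical helper for illustration only. %f accepts 1 to 6 fractional
    digits; %z accepts "+0400" as well as "+04:00" on Python 3.7+.
    """
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f%z")
    utc = parsed.astimezone(timezone.utc)
    # Truncate (not round) microseconds to milliseconds.
    millis = utc.microsecond // 1000
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + "%03dZ" % millis


# The two timestamps from tz-offset.log.
print(to_utc_millis("1986-04-26T01:23:45.101+0400"))
print(to_utc_millis("1986-04-26T01:23:45.388424+04:00"))
```

Both inputs carry a +04:00 offset, so the UTC result falls on the previous calendar day, which is consistent with the golden data.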
21 changes: 20 additions & 1 deletion filebeat/tests/system/test_modules.py
@@ -220,13 +220,32 @@ def clean_keys(obj):
     # ECS versions change for any ECS release, large or small
     ecs_key = ["ecs.version"]

+    # Keep source log filename for exceptions
+    filename = None
+    if "log.file.path" in obj:
+        filename = os.path.basename(obj["log.file.path"]).lower()
+
     for key in host_keys + time_keys + other_keys + ecs_key:
         delete_key(obj, key)

     # Remove timestamp for comparison where timestamp is not part of the log line
-    if (obj["event.dataset"] in ["icinga.startup", "redis.log", "haproxy.log", "system.auth", "system.syslog"]):
+    if (obj["event.dataset"] in ["icinga.startup", "redis.log", "haproxy.log", "system.auth"]):
         delete_key(obj, "@timestamp")

+    # HACK: This keeps @timestamp for the tz-offset.log in system.syslog.
+    #
+    # This can't be done for all syslog logs because most of them lack the year
+    # in their timestamp, so Elasticsearch will set it to the current year and
+    # that will cause the tests to fail every new year.
+    #
+    # The log.file.path key needs to be kept so that it is stored in the golden
+    # data, to prevent @timestamp to be removed from it before comparison.
+    if obj["event.dataset"] == "system.syslog":
+        if filename == "tz-offset.log":
+            obj["log.file.path"] = filename
+        else:
+            delete_key(obj, "@timestamp")
+

 def delete_key(obj, key):
     if key in obj:
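
The effect of the system.syslog exception in `clean_keys` can be illustrated with a small standalone sketch. This is not the test harness itself: `strip_unstable_timestamp` is a hypothetical name, the body of `delete_key` is a guess at the truncated function above, and the sketch assumes that `log.file.path` is among the keys removed by the generic cleanup loop.

```python
import os


def delete_key(obj, key):
    # Assumed body; the diff above shows only the first two lines.
    if key in obj:
        del obj[key]


def strip_unstable_timestamp(obj):
    """Hypothetical reduction of the syslog branch of clean_keys."""
    # Remember the source file name before the generic cleanup drops the key.
    filename = None
    if "log.file.path" in obj:
        filename = os.path.basename(obj["log.file.path"]).lower()

    # Assumption: the generic cleanup loop removes log.file.path.
    delete_key(obj, "log.file.path")

    if obj["event.dataset"] == "system.syslog":
        if filename == "tz-offset.log":
            # Keep @timestamp and restore the bare file name.
            obj["log.file.path"] = filename
        else:
            # Year-less syslog timestamps are unstable across years.
            delete_key(obj, "@timestamp")
    return obj


with_year = strip_unstable_timestamp({
    "event.dataset": "system.syslog",
    "log.file.path": "/tmp/test/tz-offset.log",
    "@timestamp": "1986-04-25T21:23:45.101Z",
})
without_year = strip_unstable_timestamp({
    "event.dataset": "system.syslog",
    "log.file.path": "/tmp/test/other.log",
    "@timestamp": "2019-06-14T00:00:00.000Z",
})
print(with_year)
print(without_year)
```

Only the event from tz-offset.log keeps `@timestamp` and a `log.file.path` entry; events from other syslog files lose both.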
