initial key=value parser #426

Merged
merged 3 commits into from
Sep 20, 2021

@jsirianni commented Sep 17, 2021

Description of Changes

This PR adds support for parsing key=value pairs.

  • Configurable delimiter, for example key=value, key/value, or key|value; defaults to =
  • Handles quoted values and stray spaces, for example key="some poorly formatted string " a=b c=d

The initial use case for this operator is parsing the key-value pairs found in the Common Event Format (CEF).
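
The parsing behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code added in this PR: the function names `splitPairs` and `parseKeyValue` are hypothetical, and the choice to trim quotes and surrounding whitespace from values is an assumption based on the sample output further down (where `name=" bob "` becomes `"bob"`).

```go
package main

import (
	"fmt"
	"strings"
)

// splitPairs splits a line on spaces, but keeps spaces that appear
// inside double quotes. Hypothetical helper, not taken from the PR.
func splitPairs(line string) []string {
	var pairs []string
	var cur strings.Builder
	inQuotes := false
	for _, r := range line {
		switch {
		case r == '"':
			inQuotes = !inQuotes
			cur.WriteRune(r)
		case r == ' ' && !inQuotes:
			if cur.Len() > 0 {
				pairs = append(pairs, cur.String())
				cur.Reset()
			}
		default:
			cur.WriteRune(r)
		}
	}
	if cur.Len() > 0 {
		pairs = append(pairs, cur.String())
	}
	return pairs
}

// parseKeyValue turns "k=v k2=v2" into a map. An empty delimiter
// falls back to "=" (the default named in the PR description).
func parseKeyValue(line, delimiter string) (map[string]string, error) {
	if delimiter == "" {
		delimiter = "="
	}
	out := make(map[string]string)
	for _, pair := range splitPairs(line) {
		parts := strings.SplitN(pair, delimiter, 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("expected %q to contain delimiter %q", pair, delimiter)
		}
		// Assumption: quotes and surrounding whitespace are stripped.
		key := strings.TrimSpace(strings.Trim(parts[0], `"`))
		value := strings.TrimSpace(strings.Trim(parts[1], `"`))
		out[key] = value
	}
	return out, nil
}

func main() {
	record, err := parseKeyValue(`name=" bob " age=28 job=welder`, "=")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(record["name"], record["age"], record["job"])
}
```

Splitting on the first delimiter only (`SplitN` with a limit of 2) lets a value contain the delimiter character itself; whether the operator in this PR behaves the same way is not stated in the description.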

config

pipeline:
- type: file_input
  include:
  - "./in.txt"
  start_at: beginning

- type: key_value_parser
  timestamp:
    parse_from: $record.t
    layout_type: epoch
    layout: s
  severity:
    parse_from: $record.sev

- type: stdout

in.txt

name=joe age=28 job=engineer t=1136214245 sev=info
name=" bob " age=28 job=welder t=1136214245 sev=warn
name=stanza age=1 job="software engineering" location="grand rapids michigan" t=1136214245 src="10.3.3.76" dst=172.217.0.10 protocol=udp sport=57112 dport=443 translated_src_ip=96.63.176.3 translated_port=57112 sev=trace

output

{
  "timestamp": "2006-01-02T10:04:05-05:00",
  "severity": 30,
  "severity_text": "info",
  "labels": {
    "file_name": "in"
  },
  "record": {
    "age": "28",
    "job": "engineer",
    "name": "joe"
  }
}
{
  "timestamp": "2006-01-02T10:04:05-05:00",
  "severity": 50,
  "severity_text": "warn",
  "labels": {
    "file_name": "in"
  },
  "record": {
    "age": "28",
    "job": "welder",
    "name": "bob"
  }
}
{
  "timestamp": "2006-01-02T10:04:05-05:00",
  "severity": 10,
  "severity_text": "trace",
  "labels": {
    "file_name": "in"
  },
  "record": {
    "age": "1",
    "dport": "443",
    "dst": "172.217.0.10",
    "job": "software engineering",
    "location": "grand rapids michigan",
    "name": "stanza",
    "protocol": "udp",
    "sport": "57112",
    "src": "10.3.3.76",
    "translated_port": "57112",
    "translated_src_ip": "96.63.176.3"
  }
}
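
In the output above, the `sev` field is promoted out of the record and mapped to a numeric severity: `info` becomes 30, `warn` becomes 50, and `trace` becomes 10. The lookup below is a small sketch of that mapping using only the three values visible in this sample; the names `observedSeverities` and `severityFor` are hypothetical, and the real severity parser supports more levels than are shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// observedSeverities holds only the mappings visible in the sample
// output of this PR; it is not a complete severity table.
var observedSeverities = map[string]int{
	"trace": 10,
	"info":  30,
	"warn":  50,
}

// severityFor looks up a severity label, ignoring case and surrounding
// whitespace. The second return value is false for unknown labels.
func severityFor(text string) (int, bool) {
	level, ok := observedSeverities[strings.ToLower(strings.TrimSpace(text))]
	return level, ok
}

func main() {
	for _, sev := range []string{"info", "warn", "trace"} {
		level, _ := severityFor(sev)
		fmt.Printf("%s -> %d\n", sev, level)
	}
}
```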

Please check that the PR fulfills these requirements

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • Add a changelog entry (for non-trivial bug fixes / features)
  • CI passes

@djaglowski

| Log Files | Logs / Second | CPU Avg (%) | CPU Avg Δ (%) | Memory Avg (MB) | Memory Avg Δ (MB) |
|---|---|---|---|---|---|
| 1 | 1000 | 1.4137672 | -0.017274141 | 127.14655 | -2.3678665 |
| 1 | 5000 | 5.086308 | +0.31032753 | 135.66434 | -2.6485596 |
| 1 | 10000 | 9.7413845 | -0.32771492 | 145.88887 | +1.3100739 |
| 1 | 50000 | 46.96581 | -0.29371262 | 173.97617 | +0.25013733 |
| 1 | 100000 | 90.48211 | -7.067627 | 219.91016 | -19.36827 |
| 10 | 100 | 1.8966132 | -0.10344708 | 134.32274 | -0.24609375 |
| 10 | 500 | 5.4655004 | -0.9140496 | 140.59456 | -1.2265625 |
| 10 | 1000 | 11.000338 | -0.05146122 | 146.39493 | +0.26481628 |
| 10 | 5000 | 58.415684 | -3.1143532 | 185.87352 | +4.69989 |
| 10 | 10000 | 101.398026 | +8.052956 | 213.86476 | -0.6551666 |

@codecov

codecov bot commented Sep 17, 2021

Codecov Report

Merging #426 (9d91d61) into master (f07878d) will increase coverage by 0.16%.
The diff coverage is 98.15%.

@@            Coverage Diff             @@
##           master     #426      +/-   ##
==========================================
+ Coverage   72.62%   72.77%   +0.16%     
==========================================
  Files         124      125       +1     
  Lines        8158     8212      +54     
==========================================
+ Hits         5924     5976      +52     
- Misses       1723     1726       +3     
+ Partials      511      510       -1     

| Impacted Files | Coverage Δ |
|---|---|
| operator/builtin/parser/keyvalue/keyvalue.go | 98.15% <98.15%> (ø) |
| operator/builtin/output/newrelic/newrelic.go | 73.55% <0.00%> (-0.83%) ⬇️ |


@djaglowski

| Log Files | Logs / Second | CPU Avg (%) | CPU Avg Δ (%) | Memory Avg (MB) | Memory Avg Δ (MB) |
|---|---|---|---|---|---|
| 1 | 1000 | 1.4482846 | +0.06894553 | 126.387794 | -1.1876373 |
| 1 | 5000 | 4.9483047 | -0.086241245 | 136.94356 | -4.0868835 |
| 1 | 10000 | 9.310534 | -0.60332394 | 145.54027 | +2.0475464 |
| 1 | 50000 | 50.173904 | +1.3969231 | 174.92712 | -1.8592377 |
| 1 | 100000 | 83.518326 | -14.03141 | 590.1215 | +350.84308 |
| 10 | 100 | 1.9482893 | -0.051771045 | 134.7302 | +0.1613617 |
| 10 | 500 | 6.086292 | -0.2932582 | 140.09308 | -1.7280426 |
| 10 | 1000 | 11.620922 | +0.36196995 | 146.17511 | +0.35398865 |
| 10 | 5000 | 53.484642 | -8.045395 | 180.42444 | -0.7491913 |
| 10 | 10000 | 100.568634 | -2.1400604 | 217.264 | +8.881866 |

@jsirianni jsirianni marked this pull request as ready for review September 17, 2021 15:07