Feat: ECS compliant NAGIOSLOGLINE captures #276

kares · 2020-07-31T18:41:38Z

Here I have mostly struggled with getting the naming right - too much "state"/"type" use in Nagios terminology.
Have decided to name "state status" as status (previously nagios_state) and state type as state_type (legacy name nagios_statetype)
The structure is rather flat to mirror Nagios (introducing state: { type: "..." } would be an option but the terminology would not follow that well). It's a bit of a compromise, sample capture:

p.s. Some of the names were incorrect those have been fixed and specs should clearly reflect the change.

[1427925600] CURRENT HOST STATE: nagioshost;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 2.24 ms

  "nagios_epoch"=>"1427925600", 
  "nagios_type"=>"CURRENT HOST STATE", 
  "nagios_hostname"=>"nagioshost", 
  "nagios_state"=>"UP", 
  "nagios_statetype"=>"HARD", 
  "nagios_statecode"=>"1", # incorrect -> simply an attempt counter
  "nagios_message"=>"PING OK - Packet loss = 0%, RTA = 2.24 ms",

  "timestamp"=>"1427925600", 
  "nagios"=>{
    "log"=>{
      "type"=>"CURRENT HOST STATE", 
      "status"=>"UP", 
      "state_type"=>"HARD",
      "attempt"=>1, 
    }
  }, 
  "host"=>{"hostname"=>"nagioshost"},
  "message"=>"PING OK - Packet loss = 0%, RTA = 2.24 ms" ...

NOTE: now that I looked at ^^^ again I am going to revisit naming once again as the trio feels ackward:

      "type"=>"CURRENT HOST STATE", 
      "status"=>"UP", 
      "state_type"=>"HARD",

... maybe rename status -> state and potentially come up with a better name for the type

kares · 2020-07-31T19:02:03Z

Should be good for an (ECS) naming review, if anyone finds time to do so 🙇 // @webmat @ebeahan

ebeahan · 2020-08-05T21:57:16Z

Would there be an opportunity to leverage the service.* field set at all? For example, could [nagios][log][service] be mapped as [service][name]?

@webmat might have some better guidance on whether if we've ever advised using service.state for something like Nagios UP/DOWN states.

kares · 2020-08-06T06:52:56Z

Would there be an opportunity to leverage the service.* field set at all? For example, could [nagios][log][service] be mapped as [service][name]?

that should do thanks.
have tried to keep everything under nagios.* since there's usually 2 types of the same kind of log lines e.g.
CURRENT SERVICE STATE: localhost;Current Load;OK;HARD;1;OK - load average: 2.54, 1.58, 0.86
CURRENT HOST STATE: s-host08;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 1.58 ms

service names are often nagios "internal" e.g. service.name => "Current Load" in this case ...

@webmat might have some better guidance on whether if we've ever advised using service.state for something like Nagios UP/DOWN states.

thing to consider is whether we would use different state labels, which might be confusing, for the 2 mentioned log lines.
(there's no service.name for the CURRENT HOST STATE log entry, are we okay with using service.state there?)

webmat · 2020-08-06T14:08:49Z

Yes, I think service.name and service.state would both fit well for external checks. E.g. ssh or varnish from the tests:

{ "service": { "name": "Varnish Backend Connections", "state": "CRITICAL" }

👍

Since these groks don't come with a pipeline, the host checks are a bit tricky though. I think "PING [hostname]" or equivalent could be an appropriate service name in this case. But you can't fill that only with a grok. But I would use service.state for the status check nonetheless. I think a user these groks will be able to fill service.name in their pipeline to get the desired consistency.

webmat · 2020-08-06T14:12:11Z

Everything else makes sense to me 👍

for Nagios' host/service state status reports

yaauie

👍 to just about everything here. Yet again, the field names are really well thought-through, and the additional spec coverage is super useful.

I noticed one weirdness with a spec that anticipates the captured message being appended onto the existing message, and would appreciate confirmation that my reasoning is sound. We may benefit from the plain and lines codecs being able to capture directly to log.original in their own ECS modes.

yaauie · 2020-11-10T01:47:10Z

spec/patterns/nagios_spec.rb

  end

  it "generates the nagios_message field" do
-    expect(grok).to include("nagios_message" => "PING OK - Packet loss = 0%, RTA = 2.24 ms")
+    if ecs_compatibility?
+      expect(grok).to include("message" => [message, "PING OK - Packet loss = 0%, RTA = 2.24 ms"])


this artifact feels a little weird to me. I assume this is because we are matching on [message] so a capture appends to the existing message?

Should we encourage people to move the inbound message to log.original before running grok as a general best practice? I assume that would solve this weirdness.

Yes, this is a general pattern we're following here when a message field is to be replaced due ECS.
This is present on other patterns as well that do 'message' replacement".

Not sure at what point to "fix" this, was mostly thinking grok would eventually run with overwrite => [ "message" ]. Since this is fairly common pattern, we could adopt overwrite => [ "message" ] as a default (in ECS mode).

Although I do not like we would have different defaults for a plugin depending on ecs_compatibility (but there's already a precedence for that in the ES output plugin).
I was thinking the .original copy could happen "earlier" (on event creation) when using an ECS factory ...

Yeah copying or moving the original payload in the beginning is a good idea. Note however that ECS unfortunately has 2 such fields, log.original and event.original. At the next major we're likely to remove log.original, so I would recommend using event.original instead. Issue for deprecation/removal is here elastic/ecs#841

since we have Event.new("message" => message) and "match" => [ "message" => .. ] hard-coded in spec helper, and also due testing in legacy mode, this seems sufficient for now moving forward.

have added the concern to our meta-issue's post TODO points, as when it gets addressed at a higher level we could get back here and refactor the helpers.

kares added 2 commits July 31, 2020 20:31

Feat: ecs compatibility for nagios patterns

f6c4241

Test: pend potential grok type-casting bug

4d9ba6b

kares mentioned this pull request Aug 4, 2020

Implement ECS-Compatibility Mode #278

Closed

21 tasks

kares added 2 commits August 10, 2020 20:19

Refactor: to using service.name ECS field

0a3f14b

Refactor: to using service.state ECS field

1113345

for Nagios' host/service state status reports

kares requested a review from yaauie August 10, 2020 18:24

yaauie reviewed Nov 10, 2020

View reviewed changes

kares merged commit beeb8b6 into logstash-plugins:ecs-wip Dec 21, 2020

kares changed the title ~~Feat: make NAGIOSLOGLINE captures ECS compliant~~ Feat: ECS compliant NAGIOSLOGLINE captures Dec 21, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feat: ECS compliant NAGIOSLOGLINE captures #276

Feat: ECS compliant NAGIOSLOGLINE captures #276

kares commented Jul 31, 2020 •

edited

Loading

kares commented Jul 31, 2020

ebeahan commented Aug 5, 2020

kares commented Aug 6, 2020

webmat commented Aug 6, 2020

webmat commented Aug 6, 2020

yaauie left a comment

yaauie Nov 10, 2020

yaauie Nov 10, 2020

kares Nov 10, 2020 •

edited

Loading

webmat Nov 10, 2020

kares Nov 20, 2020

Feat: ECS compliant NAGIOSLOGLINE captures #276

Feat: ECS compliant NAGIOSLOGLINE captures #276

Conversation

kares commented Jul 31, 2020 • edited Loading

kares commented Jul 31, 2020

ebeahan commented Aug 5, 2020

kares commented Aug 6, 2020

webmat commented Aug 6, 2020

webmat commented Aug 6, 2020

yaauie left a comment

Choose a reason for hiding this comment

yaauie Nov 10, 2020

Choose a reason for hiding this comment

yaauie Nov 10, 2020

Choose a reason for hiding this comment

kares Nov 10, 2020 • edited Loading

Choose a reason for hiding this comment

webmat Nov 10, 2020

Choose a reason for hiding this comment

kares Nov 20, 2020

Choose a reason for hiding this comment

kares commented Jul 31, 2020 •

edited

Loading

kares Nov 10, 2020 •

edited

Loading