[filebeat] add 8.x kibana logs ingest pipeline #31286
Conversation
      name: '{< IngestPipeline "pipeline-7" >}'
  - pipeline:
      if: 'ctx.containsKey("json") && ctx.json.containsKey("ecs") && ctx.json.ecs.containsKey("version")'
      name: '{< IngestPipeline "pipeline-ecs" >}'
Is it possible for these 3 conditions to be in a mixed state, causing both if statements to miss so we fall through to no pipeline?
A mixed state of these conditions would mean an issue with the ingested logs, and a fallthrough is likely the best option. Now I'm more surprised that we don't have a plaintext ingest pipeline, considering Kibana can be configured to log with pattern output: https://www.elastic.co/guide/en/kibana/current/logging-settings.html
This is basically ctx.json?.ecs?.version, but I think for some reason that doesn't work. Maybe because we're doing containsKey? I can't remember offhand why I did it this way in the elasticsearch module pipeline. It was kind of a scramble, so memories are blurry.
So it looks like we could use this here:
- pipeline:
if: 'ctx?.json?.ecs?.version == null'
name: '{< IngestPipeline "pipeline-7" >}'
- pipeline:
if: 'ctx?.json?.ecs?.version != null'
name: '{< IngestPipeline "pipeline-ecs" >}'
That's probably more idiomatic anyway. The pattern @klacabane copied from me was probably not my greatest moment 😛
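The routing behavior of the null-safe version can be sketched outside of Painless. This is a hypothetical Python analogue (the `ecs_version` and `route` helpers are illustration only; a `doc` dict stands in for the ingest `ctx`):

```python
def ecs_version(doc):
    # Null-safe walk, like ctx?.json?.ecs?.version in Painless:
    # any missing or non-object level yields None instead of an error.
    node = doc
    for key in ("json", "ecs", "version"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node

def route(doc):
    # The two pipeline conditions test the same expression against
    # == null and != null, so they are complementary: every document
    # matches exactly one pipeline and nothing falls through.
    return "pipeline-ecs" if ecs_version(doc) is not None else "pipeline-7"
```

This also answers the mixed-state question above: because both conditions evaluate one expression, a partially populated document still lands in exactly one branch.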
Generally LGTM. We could merge before having log samples and test expectations, but I think we'd be doing our future selves a disservice. So hitting the "request change" button to get those added.
      field: event.created
  - script:
      lang: painless
      inline: 'ctx.json.keySet().each (key -> ctx[key] = ctx.json.get(key))'
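For reference, what the Painless one-liner does can be simulated in Python (a hypothetical `unwrap_json` helper; `doc` stands in for `ctx`):

```python
def unwrap_json(doc):
    # Copy every key under "json" to the document root, preserving
    # nested objects and value types, like
    # ctx.json.keySet().each(key -> ctx[key] = ctx.json.get(key)).
    for key in list(doc.get("json", {})):
        doc[key] = doc["json"][key]
    # Note: the script itself leaves the original "json" field in
    # place; dropping it would be a separate remove processor.
    return doc
```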
I was surprised to see this rather than https://www.elastic.co/guide/en/elasticsearch/reference/master/json-processor.html - wondering what's special about the kibana module that causes these to land like this rather than as a message JSON string, like they land in the elasticsearch module.
Guess I'll give it a go and see if I can work out what's happening.
Ah I see. It's because of json.keys_under_root: false.
I tried doing something like this, but the fields end up as just strings rather than actual values:
- foreach:
field: json
processor:
set:
field: "{{_ingest._key}}"
value: _ingest._value
Maybe Painless is the only option to merge objects in an ingest pipeline. Would be nice to have an ingest pipeline SME weigh in. Maybe @jbaiera can recommend one?
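A rough Python analogue of why the foreach/set approach loses types, assuming the templated value goes through Mustache-style string rendering (hypothetical `foreach_set` helper, illustration only):

```python
def foreach_set(doc):
    # A set processor with a templated value renders it as text, so
    # numbers and nested objects arrive as strings rather than as
    # real values; the Painless script avoids this by assigning the
    # objects directly.
    return {key: str(value) for key, value in doc.get("json", {}).items()}
```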
Sorry for the late response! Yeah, I don't think there is an easier way to merge two object fields together in ingest at the moment. You could use the append processor, but I think you'd need to know all the field names ahead of time.
No worries @jbaiera! Thanks for the cross-check.
Maybe interesting food for thought, but the lack of such a thing makes me wonder if we should stick to JSON parsing on the ES side (since the https://www.elastic.co/guide/en/elasticsearch/reference/current/json-processor.html can merge/expand objects).
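For comparison, the json processor linked above operates on a JSON string field, which is why it can expand objects. A hypothetical Python analogue (the `add_to_root`-style merge reflects the merge/expand behavior mentioned; parameter names are illustrative):

```python
import json

def json_processor(doc, field="message", add_to_root=False):
    # Parse a JSON *string* field into real objects and values.
    parsed = json.loads(doc[field])
    if add_to_root:
        # Merge the parsed keys into the document root.
        doc.update(parsed)
    else:
        # Otherwise replace the string field with the parsed object.
        doc[field] = parsed
    return doc
```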
I tried adding some tests to this, but I think we have another issue: it'll end up inserting all the HTTP headers as keys, which presents a risk of field explosion. I tried adding this to fields.yml, but it doesn't do what I hoped it would. Still poking at it.
Same issue here. I ended up removing the headers, but I'm sure we're missing out on valuable information. I'll try copying them back under
@klacabane yeah, at least in moving them I think we can provide the mapping. Would be great if we could use https://www.elastic.co/guide/en/elasticsearch/reference/current/flattened.html to get value indexing without field explosion. I see it in use for
@klacabane lemme know if you're all set for a review (or maybe dismiss/re-request to let me know)
This all looks good to me. Had a few comments about maybe better approaches, but nothing I'd say needs to block the merge.
* add routing pipeline to 7 or ecs
* simplify ecs pipeline
* flatten headers
* kibana 8.x logs integration test
* shorter condition

(cherry picked from commit 47777ec)

Conflicts:
  filebeat/docs/fields.asciidoc
  filebeat/module/kibana/log/ingest/pipeline.yml

Co-authored-by: Kevin Lacabane <[email protected]>
Summary
Closes #31216
Added a separate ingest pipeline to handle Kibana 8.x log formats. The pipeline triggers when the ecs.version field exists, and otherwise falls back to the 7.x pipeline. The pipeline reproduces what is done for the Elasticsearch 8.x logs by unwrapping all the fields under json. to the root of the document, since the format is ECS compatible. Due to potential mapping explosion with the request/response headers, these fields are copied from http.request.headers to kibana.log.meta.req.headers, where we have control over the field definition. Here we define them as flattened.
Testing
There is a remaining issue with the right field in logs, which is handled in "Failure to index Kibana log entries with field right" #31576.