Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[filebeat] add 8.x kibana logs ingest pipeline #31286

Merged
merged 5 commits into from
May 11, 2022

Conversation

klacabane
Copy link
Contributor

@klacabane klacabane commented Apr 13, 2022

Summary

Closes #31216

Added a separate ingest pipeline to handle kibana 8.x log formats. The pipeline triggers when the ecs.version field exists and otherwise falls back to the 7.x pipeline.

The pipeline reproduces what is done for the elasticsearch 8.x logs by unwrapping all the fields under json. to the root of the document since the format is ecs compatible. Due to potential mapping explosion with the request/response headers, these fields are copied from http.request.headers to kibana.log.meta.req.headers where we have control over the field definition. Here we define them as flattened

Testing

  • Add logging configuration to kibana.yml
logging.root.level: debug

logging:
  appenders:
    file:
      type: file
      fileName: ./kibana.log
      layout:
        type: json
  root:
    appenders: [file]
  • Enable kibana filebeat module
filebeat.modules:
  - module: kibana
    log:
      enabled: true
      var.paths:
        - ../../kibana/kibana.log

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Apr 13, 2022
@elasticmachine
Copy link
Collaborator

elasticmachine commented Apr 13, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-05-11T09:27:35.989+0000

  • Duration: 67 min 25 sec

Test stats 🧪

Test Results
Failed 0
Passed 6236
Skipped 732
Total 6968

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@klacabane klacabane added backport-v8.2.0 Automated backport with mergify backport-7.17 Automated backport to the 7.17 branch with mergify Team:Infra Monitoring UI - DEPRECATED Infrastructure Monitoring UI team - DEPRECATED - Use Team:Monitoring Module:kibana Kibana Beats modules labels Apr 20, 2022
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Apr 20, 2022
@klacabane klacabane added the bug label Apr 20, 2022
@klacabane klacabane marked this pull request as ready for review April 20, 2022 14:53
@klacabane klacabane requested a review from a team as a code owner April 20, 2022 14:53
@klacabane klacabane requested a review from matschaffer April 20, 2022 14:53
@klacabane klacabane changed the title add routing pipeline to 7 or ecs [filebeat] add 8.x kibana logs ingest pipeline Apr 20, 2022
name: '{< IngestPipeline "pipeline-7" >}'
- pipeline:
if: 'ctx.containsKey("json") && ctx.json.containsKey("ecs") && ctx.json.ecs.containsKey("version")'
name: '{< IngestPipeline "pipeline-ecs" >}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for these 3 conditions to be in mixed state, causing these two if statements to both miss and we fall through to no pipeline?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mixed state of these condition would mean an issue with the logs ingested and a fallthrough is likely to be the best option. Now I'm more surprised that we don't have a plaintext ingest pipeline considering kibana can be configured to log with pattern output https://www.elastic.co/guide/en/kibana/current/logging-settings.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically ctx.json?.ecs?.version? but I think for some reason that doesn't work. Maybe cause we're doing containsKey? Can't remember offhand why I did it this way in the elasticsearch module pipeline. It was kind of a scramble so memories are blurry.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So looks like we could use this here.

- pipeline:
    if: 'ctx?.json?.ecs?.version == null'
    name: '{< IngestPipeline "pipeline-7" >}'
- pipeline:
    if: 'ctx?.json?.ecs?.version != null'
    name: '{< IngestPipeline "pipeline-ecs" >}'

That's probably more idiomatic anyway. The pattern @klacabane copied from me was probably not my greatest moment 😛

Copy link
Contributor

@matschaffer matschaffer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM. We could merge before having log samples and test expectations, but I think we'd be doing our future selves a disservice. So hitting the "request change" button to get those added.

field: event.created
- script:
lang: painless
inline: 'ctx.json.keySet().each (key -> ctx[key] = ctx.json.get(key))'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was surprised to see this rather than https://www.elastic.co/guide/en/elasticsearch/reference/master/json-processor.html - wondering what's special about the kibana module that causes them to land like this rather than as a message JSON string like they land in the elasticsearch module.

Guess I'll give it a go and see if I can work out what's happening.

Copy link
Contributor

@matschaffer matschaffer May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. It's because of

json.keys_under_root: false

I tried doing something like this but the fields end up as just strings rather than actual values.

- foreach:
    field: json
    processor:
      set:
        field: "{{_ingest._key}}"
        value: _ingest._value

Maybe painless is the only option to merge objects in an ingest pipeline. Would be nice to have an ingest pipeline SME weigh-in. Maybe @jbaiera can recommend one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for late response! Yeah, I don't think there is an easier way to merge two object fields together in ingest at the moment. You could use the append processor, but I think you'd need to know all the field names ahead of time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries @jbaiera ! thanks for the cross check.

Maybe interesting food for thought, but the lack of such a thing makes me wonder if we should stick to json parsing at the ES side (since the https://www.elastic.co/guide/en/elasticsearch/reference/current/json-processor.html can merge/expand objects).

@matschaffer
Copy link
Contributor

I tried adding some tests to this, but I think we have another issue in that it'll end up inserting all the http headers as keys which presents a risk of field explosion.

Screen_Shot_2022-04-21_at_15_16_43

I tried adding this to fields.yml, but it doesn't do what I hoped it would. Still poking at it.

    - name: http.request.headers
      type: object
      object_type: keyword

@klacabane
Copy link
Contributor Author

Same issue here, I ended up removing the headers but I'm sure we're missing out on valuable information. I'll try copying them back under kibana.meta.log.req to see if they're dynamically mapped there as well

@matschaffer
Copy link
Contributor

@klacabane yeah, at least in moving them I think we can provide the mapping. Would be great if we could use https://www.elastic.co/guide/en/elasticsearch/reference/current/flattened.html to get value indexing without field explosion.

I see it in use for

so maybe we can use it here as well.

@klacabane klacabane requested a review from a team as a code owner May 9, 2022 12:31
@klacabane klacabane requested review from rdner and fearful-symmetry and removed request for a team May 9, 2022 12:31
@matschaffer
Copy link
Contributor

@klacabane lemme know if you're all set for a review (or maybe dismis/re-request to let me know)

@klacabane klacabane requested a review from matschaffer May 10, 2022 10:10
Copy link
Contributor

@matschaffer matschaffer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good to me. Had a few comments about maybe better approaches, but nothing I'd say needs to block the merge.

field: event.created
- script:
lang: painless
inline: 'ctx.json.keySet().each (key -> ctx[key] = ctx.json.get(key))'
Copy link
Contributor

@matschaffer matschaffer May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. It's because of

json.keys_under_root: false

I tried doing something like this but the fields end up as just strings rather than actual values.

- foreach:
    field: json
    processor:
      set:
        field: "{{_ingest._key}}"
        value: _ingest._value

Maybe painless is the only option to merge objects in an ingest pipeline. Would be nice to have an ingest pipeline SME weigh-in. Maybe @jbaiera can recommend one?

name: '{< IngestPipeline "pipeline-7" >}'
- pipeline:
if: 'ctx.containsKey("json") && ctx.json.containsKey("ecs") && ctx.json.ecs.containsKey("version")'
name: '{< IngestPipeline "pipeline-ecs" >}'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So looks like we could use this here.

- pipeline:
    if: 'ctx?.json?.ecs?.version == null'
    name: '{< IngestPipeline "pipeline-7" >}'
- pipeline:
    if: 'ctx?.json?.ecs?.version != null'
    name: '{< IngestPipeline "pipeline-ecs" >}'

That's probably more idiomatic anyway. The pattern @klacabane copied from me was probably not my greatest moment 😛

@klacabane klacabane merged commit 47777ec into elastic:main May 11, 2022
mergify bot pushed a commit that referenced this pull request May 11, 2022
* add routing pipeline to 7 or ecs

* simplify ecs pipeline

* flatten headers

* kibana 8.x logs integration test

* shorter condition

(cherry picked from commit 47777ec)

# Conflicts:
#	filebeat/docs/fields.asciidoc
#	filebeat/module/kibana/log/ingest/pipeline.yml
@klacabane klacabane removed the backport-7.17 Automated backport to the 7.17 branch with mergify label May 11, 2022
mergify bot pushed a commit that referenced this pull request May 11, 2022
* add routing pipeline to 7 or ecs

* simplify ecs pipeline

* flatten headers

* kibana 8.x logs integration test

* shorter condition

(cherry picked from commit 47777ec)
klacabane added a commit that referenced this pull request May 11, 2022
* add routing pipeline to 7 or ecs

* simplify ecs pipeline

* flatten headers

* kibana 8.x logs integration test

* shorter condition

(cherry picked from commit 47777ec)

Co-authored-by: Kevin Lacabane <[email protected]>
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
* add routing pipeline to 7 or ecs

* simplify ecs pipeline

* flatten headers

* kibana 8.x logs integration test

* shorter condition
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-v8.2.0 Automated backport with mergify bug Module:kibana Kibana Beats modules Team:Infra Monitoring UI - DEPRECATED Infrastructure Monitoring UI team - DEPRECATED - Use Team:Monitoring
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Filebeat's Kibana module doesn't support new 8.0 logging format
5 participants