Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

run processors, defined in the block overriding the input, before the module. #26862

Closed
wants to merge 1 commit into from

Conversation

mjmbischoff
Copy link
Contributor

@mjmbischoff mjmbischoff commented Jul 13, 2021

What does this PR do?

Changing the order of processors when defining processors on the (overriding) input block, to be run before the module.

Why is it important?

When overriding the input on a module https://www.elastic.co/guide/en/beats/filebeat/7.13/advanced-settings.html .
https://www.elastic.co/guide/en/beats/filebeat/7.13/defining-processors.html#where-valid describes how it's a valid location but is ambiguous about the order, other then that it runs after the input. This change makes the processors defined there, run right after the input, thus before the module. This is useful in the case where input is overiden as the structure might have changed due to using a different input/transport.

For example, when adding a buffer to ingest pipeline, kafka is often introduced. Currently even if json is posted to kafka it is read back as json (escaped) string into the message field. Due to the modules having a log input with https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-log.html#filebeat-input-log-config-json setup this breaks / disallows kafka being used. As it iether expects json fields as root or json parsed and placed under the json field.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

  • Discuss breaking change
  • Discuss adding processor at module level to reintroduce old behaviour

How to test this PR locally

Related issues

Use cases

As one of the reasons to introduce a buffer in the ingest pipeline is to allow quick draining of the logsources / reduce backpressure. parsing and processing is best done after the buffer. A beat / elastic agent -> kafka -> filebeat -> es is a preferred setup where logstash is not needed.

In other causes where a different intermediate is introduced this change will also help to smooth out any structural changes introduced by going through that intermediate. (for example syslog server or logging to s3 bucket.)

…rriding) input block, to be run before the module.
@mjmbischoff mjmbischoff added bug enhancement breaking change release-note:breaking The content should be included as a breaking change labels Jul 13, 2021
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jul 13, 2021
@mjmbischoff
Copy link
Contributor Author

mjmbischoff commented Jul 13, 2021

Note that this PR reduces the need for #26833 as we could, after this is merged, use https://www.elastic.co/guide/en/beats/filebeat/current/decode-json-fields.html processor. #26833 is more efficient, but pointing out how this PR introduces flexibility.

Onboarding log sources is still a time consuming matter, this removes some barriers.

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2021-07-13T11:07:37.960+0000

  • Duration: 96 min 9 sec

  • Commit: a798efa

Test stats 🧪

Test Results
Failed 0
Passed 14867
Skipped 2312
Total 17179

Trends 🧪

Image of Build Times

Image of Tests

💚 Flaky test report

Tests succeeded.

Expand to view the summary

Test stats 🧪

Test Results
Failed 0
Passed 14867
Skipped 2312
Total 17179

Copy link
Contributor

@kvch kvch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as #27154 (review)

@mjmbischoff mjmbischoff added the Team:Elastic-Agent Label for the Agent team label Aug 23, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/agent (Team:Agent)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Aug 23, 2021
@botelastic
Copy link

botelastic bot commented Sep 22, 2021

Hi!
We just realized that we haven't looked into this PR in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it in as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1.
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Sep 22, 2021
@mjmbischoff
Copy link
Contributor Author

Yes still relevant, as it allows us to be compatible with a lot more products / log sources / setups. Allowing us to correct for any small things introduced by these(99% compatible, additional transport step introducing some form of wrapping).

@botelastic botelastic bot removed the Stalled label Sep 22, 2021
@mergify
Copy link
Contributor

mergify bot commented Sep 22, 2021

This pull request does not have a backport label. Could you fix it @mjmbischoff? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v./d./d./d is the label to automatically backport to the 7./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip Skip notification from the automated backport with mergify label Sep 22, 2021
@botelastic
Copy link

botelastic bot commented Oct 22, 2021

Hi!
We just realized that we haven't looked into this PR in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it in as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1.
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Oct 22, 2021
@botelastic
Copy link

botelastic bot commented Nov 21, 2021

Hi!
This PR has been stale for a while and we're going to close it as part of our cleanup procedure.
We appreciate your contribution and would like to apologize if we have not been able to review it, due to the current heavy load of the team.
Feel free to re-open this PR if you think it should stay open and is worth rebasing.
Thank you for your contribution!

@botelastic botelastic bot closed this Nov 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-skip Skip notification from the automated backport with mergify breaking change bug enhancement release-note:breaking The content should be included as a breaking change Stalled Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants