run processors, defined in the block overriding the input, before the module. #26862
Note that this PR reduces the need for #26833: once it is merged, we could use the decode_json_fields processor (https://www.elastic.co/guide/en/beats/filebeat/current/decode-json-fields.html) instead. #26833 is more efficient, but this PR adds flexibility. Onboarding log sources is still time-consuming, and this removes some barriers.
Same as #27154 (review)
Yes, still relevant, as it allows us to be compatible with a lot more products / log sources / setups. It lets us correct for any small differences these introduce (for example, a source that is 99% compatible, or an additional transport step that introduces some form of wrapping).
What does this PR do?
Changes the order of processors defined on the (overriding) input block so that they run before the module.
Why is it important?
When overriding the input on a module (https://www.elastic.co/guide/en/beats/filebeat/7.13/advanced-settings.html), processors can be defined on the overriding input block. https://www.elastic.co/guide/en/beats/filebeat/7.13/defining-processors.html#where-valid describes this as a valid location but is ambiguous about the order, other than that the processors run after the input. This change makes the processors defined there run right after the input, and thus before the module. This is useful when the input is overridden, as the structure of the event might have changed due to using a different input/transport.
For example, when adding a buffer to an ingest pipeline, Kafka is often introduced. Currently, even if JSON is posted to Kafka, it is read back as an (escaped) JSON string in the message field. Because the modules use a log input with the json settings (https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-log.html#filebeat-input-log-config-json), this breaks or rules out using Kafka: the module expects either the JSON fields at the root or the parsed JSON placed under the json field.
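The effect of the ordering can be illustrated with a small, self-contained Python sketch. It is not Filebeat code: `kafka_input`, `decode_json_message`, and `module_stage` are hypothetical stand-ins for an input that stores the payload as a string in `message`, a JSON-decoding processor defined on the input block, and a module step that expects parsed fields under `json`.

```python
import json


def kafka_input(raw_payload: str) -> dict:
    """Hypothetical input: stores the raw payload as a string in `message`."""
    return {"message": raw_payload}


def decode_json_message(event: dict) -> dict:
    """Hypothetical input-level processor: parse `message` into `json`."""
    decoded = dict(event)
    decoded["json"] = json.loads(decoded.pop("message"))
    return decoded


def module_stage(event: dict) -> dict:
    """Hypothetical module step that expects parsed fields under `json`."""
    if "json" not in event:
        raise KeyError("module expected parsed fields under 'json'")
    out = dict(event)
    out["http_status"] = out["json"]["status"]
    return out


def run(event: dict, stages: list) -> dict:
    """Apply each stage to the event in order."""
    for stage in stages:
        event = stage(event)
    return event


payload = json.dumps({"status": 200, "path": "/index.html"})

# Input-level processor first, then the module: the module sees parsed fields.
result = run(kafka_input(payload), [decode_json_message, module_stage])

# Module first: it only sees the escaped string in `message` and fails.
try:
    run(kafka_input(payload), [module_stage, decode_json_message])
    module_first_ok = True
except KeyError:
    module_first_ok = False
```

In this toy model the same two stages either succeed or fail depending only on their order, which is the situation the PR addresses for processors defined on the overriding input block.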
Checklist
I have commented my code, particularly in hard-to-understand areas
I have made corresponding change to the default configuration files
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Author's Checklist
How to test this PR locally
Related issues
Use cases
One of the reasons to introduce a buffer in the ingest pipeline is to allow quick draining of the log sources and reduce backpressure, so parsing and processing are best done after the buffer. A Beat / Elastic Agent -> Kafka -> Filebeat -> ES chain is a preferred setup where Logstash is not needed.
In other cases where a different intermediate is introduced, this change also helps smooth out any structural changes introduced by going through that intermediate (for example, a syslog server or logging to an S3 bucket).