Setup data streams and send events to them #28450
Left a few comments, but I think we should quickly sync on this before moving forward. The question from my side is: when do we switch to using data streams by default? Any Beat in 8.0 should use data streams by default, and the option to use legacy should no longer exist. In 7.16, it could be opt-in. But in that case, all 7.16 Beats in use have to be configured to write to data streams. We should not have part of the 7.16 data in data streams and part in indices, as it complicates things. An easy way out here is to keep indices as the default in 7.16 and have 8.0 go to data streams automatically.
It seems we already had an option in place for component, index and legacy templates. What is the difference between index and component templates in these config options? And as we already had the option available, why are so many changes to the code needed?
I probably misunderstood you last time we discussed it. I thought we agreed that from 7.16, Beats should send events into data streams, but let users opt-out if they want to. Also, that is how I interpreted the following line in the original issue: "Moving to data streams already in 7.x will help improve robustness in resource setup and management."
From a testing point of view, it makes our life easier. But I would still move to data streams earlier because it brings more value to our users.
Agreed, that's what I had implemented. Or did I miss something?
Index templates are regular templates, as you would expect. Component templates can be used in other index templates. I added the option to load Beats mappings/templates as component templates, so users can add more fields and create their own index templates by including the Beats component templates and their custom component templates. This option is for advanced users who know how to manage mappings, what fields they need, etc.
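To make the distinction concrete, here is a minimal sketch of an index template body that composes component templates, in the shape accepted by Elasticsearch's composable index template API (`index_patterns`, `composed_of`, `priority`). The helper `buildIndexTemplate`, the pattern, and the component template names are hypothetical and only for illustration; they are not taken from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// indexTemplate holds the subset of a composable index template body
// that is relevant here. Field order determines the JSON key order.
type indexTemplate struct {
	IndexPatterns []string `json:"index_patterns"`
	ComposedOf    []string `json:"composed_of"`
	Priority      int      `json:"priority"`
}

// buildIndexTemplate (hypothetical helper) returns the JSON body of an
// index template that pulls its mappings and settings from the listed
// component templates.
func buildIndexTemplate(pattern string, components ...string) ([]byte, error) {
	return json.Marshal(indexTemplate{
		IndexPatterns: []string{pattern},
		ComposedOf:    components,
		Priority:      200, // arbitrary value chosen for the example
	})
}

func main() {
	// "beats-mappings" and "my-custom-fields" are made-up component
	// template names: one loaded by a Beat, one maintained by the user.
	body, err := buildIndexTemplate("mybeat-*", "beats-mappings", "my-custom-fields")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The user-owned index template stays small: it only lists the component templates it is built from, so custom fields live in their own component template.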
The option I added a new
Furthermore, there is also some refactoring in the loading code to avoid copy-pasting the same code in template loading.
…lastic#28538) ## What does this PR do? From now on Beats is going to load the new index templates to Elasticsearch 7.x and 8.x. The PR consists of a few test fixes, and I had found an issue in templates, that I will port to master. ## Why is it important? This way we can be forward compatible with Elasticsearch 8.x where legacy templates are no longer available.
LGTM. Let's get it in. We can still do a few follow-up PRs with minor cleanup.
Awesome to see how much code and with it complexity this removed!
  }

  if templateComponent.load && !ilmComponent.load && ilmComponent.enabled {
      return false, "Loading template with ILM settings whithout loading ILM " +
-         "policy and alias can lead to issues and is not recommended. " +
+         "policy can lead to issues and is not recommended. " +
As a follow-up: in this log message and the one above, we could likely be more specific about which config should be checked, to make it easier for users to find it.
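A rough sketch of what such a more specific message could look like is below. The `component` struct and `verifyTemplateWithILM` are simplified stand-ins, not the code in this PR, and the settings named in the message (`setup.template.enabled`, `setup.ilm.enabled`) are an assumption about which options a user would want to check.

```go
package main

import "fmt"

// component is a simplified stand-in for the per-feature state used by
// the setup verification: whether the feature is enabled in the config
// and whether this run is going to load it.
type component struct {
	enabled bool
	load    bool
}

// verifyTemplateWithILM (hypothetical) reproduces the condition from the
// diff above and names the settings to inspect in the warning text.
func verifyTemplateWithILM(template, ilm component) (bool, string) {
	if template.load && !ilm.load && ilm.enabled {
		return false, "Loading template with ILM settings without loading the ILM " +
			"policy can lead to issues and is not recommended. " +
			"Check setup.template.enabled and setup.ilm.enabled, " + // assumed settings
			"or load the ILM policy in the same setup run."
	}
	return true, ""
}

func main() {
	ok, warn := verifyTemplateWithILM(
		component{enabled: true, load: true},  // template is loaded
		component{enabled: true, load: false}, // ILM enabled but not loaded
	)
	fmt.Println(ok, warn)
}
```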
            MapStr: common.MapStr(tmpl),
        }
    }
    templates, _ := response.GetValue("index_templates")
Should the errors be checked just to be safe?
## What does this PR do?
We are introducing data streams to Beats. It means that all Beats are going to send events to data streams instead of indices regardless of ES version. Do not confuse it with the data stream naming convention we use in integrations. Naming does not change in Beats, only the underlying data storage method in Elasticsearch.
The name of the data stream is going to be `{beatname}-{version}` and the index pattern is `{beatname}-{version}`.
With this change, the option `setup.template.type` no longer makes sense. Hence, it is removed completely from 8.x. If you are loading JSON index templates by specifying a file in `setup.template.json.path`, make sure you move from the legacy format to composable index templates.
Beats no longer load an alias to Elasticsearch, instead all data can be reached through the data stream.
One of the limitations is that only create operations are supported in data streams. Thus, there is no way to use e.g. "index" or "delete" operation types when sending events to ES.
## Why is it important?
Simplify loading ILM, templates, and use the specialized data streams for output events.
(cherry picked from commit 405c342)
Co-authored-by: Noémi Ványi
Hi @kvch, we attempted a Filebeat installation on the latest 8.0 snapshot build available after the merges on the above ticket. Build details:
Observations:
Result: Filebeat indices showed up on the Discover tab and NO data was available under the Data Streams tab.
Scenario 2: Attempted to install Filebeat on Linux with the 'setup.ilm.enabled' property set to true [uncommented in the filebeat.reference.yml file] and the system module enabled.
Result: Again, Filebeat indices showed up on the Discover tab and NO data was available under the Data Streams tab.
Scenario 3: When we ran the Filebeat setup command in Scenario 1 and Scenario 2, we observed the messages below. So we set the 'setup.ilm.overwrite' property to true and re-ran the setup and filebeat -e commands. However, the observation remained the same. Further, we would like to know more about the messages shown after running the setup command.
So, could you please let us know if we are missing anything? Thanks
Adoption of data streams in Beats has nothing to do with data streams in Fleet. The data streams here are just special indices for time-series data managed by Elasticsearch. Thus, you should not see data streams in Fleet, but events on the Discover tab. Based on your tests and screenshots, it seems to me that everything is working well.
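One way to confirm that events land in a data stream rather than a plain index is to look at the response of Elasticsearch's get data stream API (`GET /_data_stream/<name>`). The sketch below only parses such a response; the helper `backingIndices` and the sample JSON are hand-written assumptions for illustration, not output captured from the tests discussed here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dataStreamsResponse covers the fields of the get data stream API
// response that this example needs.
type dataStreamsResponse struct {
	DataStreams []struct {
		Name    string `json:"name"`
		Indices []struct {
			IndexName string `json:"index_name"`
		} `json:"indices"`
	} `json:"data_streams"`
}

// backingIndices (hypothetical helper) reports whether the named data
// stream appears in the response body and lists its backing indices.
func backingIndices(body []byte, name string) ([]string, bool, error) {
	var resp dataStreamsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, false, fmt.Errorf("decoding data stream response: %w", err)
	}
	for _, ds := range resp.DataStreams {
		if ds.Name != name {
			continue
		}
		names := make([]string, 0, len(ds.Indices))
		for _, idx := range ds.Indices {
			names = append(names, idx.IndexName)
		}
		return names, true, nil
	}
	return nil, false, nil
}

func main() {
	// Hand-written sample; a real response carries more fields.
	sample := []byte(`{"data_streams":[{"name":"filebeat-8.0.0",` +
		`"indices":[{"index_name":".ds-filebeat-8.0.0-000001"}]}]}`)
	indices, found, err := backingIndices(sample, "filebeat-8.0.0")
	fmt.Println(indices, found, err)
}
```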
Hi @kvch, thanks for the feedback. However, we are still confused about the implemented changes. The ticket summary mentions:
Where can we validate this information? Please clarify which of the proposed changes are now in effect. Thanks
I am sorry if I was not clear. The ideal way to check is not necessarily whether the expected names show up. I am rather interested in whether the dashboards still work. For example, you could enable the system module, send a few logs, and check if the system dashboards can display the data. Could you please test this for me?
Hi @kvch, we have retested the Filebeat installation on the 8.0 snapshot and below are our observations: Please let us know if we are still missing anything. Thanks
Awesome! This is what I wanted to see. Thanks.
As of Beats 8.0, data streams are the default output to Elasticsearch. Line 245 needs change to {y} and the phrasing in Line 248 should reflect that all three options can take advantage of data streams, easier ILM, and the data stream naming scheme. Reference: elastic/beats#28450 linked in https://www.elastic.co/guide/en/beats/libbeat/8.0/release-notes-8.0.0.html
What does this PR do?
We are introducing data streams to Beats. It means that all Beats are going to send events to data streams instead of indices regardless of ES version. Do not confuse it with the data stream naming convention we use in integrations. Naming does not change in Beats, only the underlying data storage method in Elasticsearch.
The name of the data stream is going to be `{beatname}-{version}` and the index pattern is `{beatname}-{version}`.
With this change, the option `setup.template.type` no longer makes sense. Hence, it is removed completely from 8.x. If you are loading JSON index templates by specifying a file in `setup.template.json.path`, make sure you move from the legacy format to composable index templates.
Beats no longer load an alias to Elasticsearch, instead all data can be reached through the data stream.
One of the limitations is that only create operations are supported in data streams. Thus, there is no way to use e.g. "index" or "delete" operation types when sending events to ES.
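As an illustration of that limitation, the sketch below builds a bulk request body in which every event is sent with the `create` action, targeting a data stream named `{beatname}-{version}`. The helpers `dataStreamName` and `bulkCreateBody` are hypothetical and are not the Beats output code; real events also need an `@timestamp` field to be accepted by a data stream.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// dataStreamName (hypothetical) follows the {beatname}-{version} scheme
// described above.
func dataStreamName(beat, version string) string {
	return beat + "-" + version
}

// bulkCreateBody builds an NDJSON bulk body that uses only the "create"
// action, the only operation type a data stream accepts for new events.
func bulkCreateBody(dataStream string, events []map[string]interface{}) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode writes one JSON value plus a newline
	for _, ev := range events {
		action := map[string]interface{}{
			"create": map[string]string{"_index": dataStream},
		}
		if err := enc.Encode(action); err != nil {
			return nil, fmt.Errorf("encoding action line: %w", err)
		}
		if err := enc.Encode(ev); err != nil {
			return nil, fmt.Errorf("encoding event: %w", err)
		}
	}
	return buf.Bytes(), nil
}

func main() {
	events := []map[string]interface{}{
		{"@timestamp": "2021-11-01T00:00:00Z", "message": "hello"},
	}
	body, err := bulkCreateBody(dataStreamName("filebeat", "8.0.0"), events)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body))
}
```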
Why is it important?
Simplify loading ILM, templates, and use the specialized data streams for output events.
Checklist
`CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`
Related issues
Closes #25018
Requires #28671