Moving Away from Shared Component Templates Already? #91370

Open
MakoWish opened this issue Nov 7, 2022 · 11 comments

@MakoWish

MakoWish commented Nov 7, 2022

I thought the idea of Component Templates was fantastic. Managing dozens (or hundreds) of disparate _template's in the past was a royal pain. If there was a change to, let's say, the host fields, I would have to go through and modify every single _template that used the host fields. Composing _index_template's from _component_template's really simplified that. All I had to do was update the host component template, and all Index Templates using it would be updated automatically.
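To make the composition concrete, here is a minimal Python sketch of the request bodies involved. The template names (`host`, `source`, `cisco`, `iis`), the index patterns, and the field types are illustrative assumptions rather than anything from a real deployment. The bodies follow the general shape accepted by `PUT _component_template/<name>` and `PUT _index_template/<name>`, and nothing is sent to a cluster.

```python
# Illustrative request bodies only; names, patterns, and field types are assumptions.

# Body for: PUT _component_template/host
host_component = {
    "template": {
        "mappings": {
            "properties": {
                "host": {
                    "properties": {
                        "name": {"type": "keyword"},
                        "ip": {"type": "ip"},
                    }
                }
            }
        }
    }
}

# Bodies for: PUT _index_template/<name>
index_templates = {
    "cisco": {"index_patterns": ["cisco-*"], "composed_of": ["host", "source"]},
    "iis": {"index_patterns": ["iis-*"], "composed_of": ["host"]},
}


def templates_using(component, templates):
    """Return the sorted names of index templates listing `component` in composed_of."""
    return sorted(
        name
        for name, body in templates.items()
        if component in body.get("composed_of", [])
    )


print(templates_using("host", index_templates))
print(templates_using("source", index_templates))
```

Editing the single `host` body changes what every index template that references it resolves to. Note that templates are applied when an index is created, so existing indices keep their current mappings.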

I am now trying in our DEV environment to migrate away from indices and start using Elastic Agent and Data Streams, but Elastic has once again gone back to defining fields separately in every single Index Template. For instance, if you look at the Index Template logs-system.security, instead of being composed of host, process, source, destination, etc. Component Templates, it is composed of logs-system.security@package and logs-system.security@custom, each of which explicitly defines all of its fields. This runs completely counter to the idea of using Component Templates to begin with.

Why was this decision made, and can we make a hard push to get back to the original idea behind Component Templates? We use several custom fields throughout many data sources, like host.bios.*, user.target.*, and many others, and this sudden move away from the new Component Templates is going to make my life a nightmare.

Eric

@elasticsearchmachine added the needs:triage label Nov 7, 2022
@leandrojmp
Contributor

I had the same impression when I started planning the migration from normal indices and custom pipelines to the Elastic Agent Integrations.

The integrations help you add new data without having to create pipelines to parse the messages and everything else, but they also make administering and managing this data much harder and more confusing.

For example, there is only one lifecycle policy for all the integrations. If you have a low-volume integration and a very high-volume integration, they will use the same lifecycle policy. Of course you can change it, but you would need to edit tens of templates to achieve that.

This was one of the many reasons that made us drop the adoption of Elastic Agent and use it just for some simple things.

@MakoWish
Author

MakoWish commented Nov 8, 2022

Yeah, after about three months of playing around with it, we came to the conclusion just yesterday that Elastic Agent and Data Streams are just not ready for mainstream yet. We will be reverting our Control Group devices back to Beats agents. This is unfortunate, as I was really excited about being able to centrally manage agent deployments. Beats and indices with Component Templates are easy. Elastic Agent with convoluted Data Streams, along with the impossible-to-manage Index Templates and Ingest Pipelines, are not something I want to deal with.

Eric

@elasticsearchmachine removed the needs:triage label Nov 9, 2022
@ruflin
Contributor

ruflin commented Nov 29, 2022

Part of the reason component templates were developed is the data stream naming scheme. It was not possible to combine templates the way we wanted without having "accidental inheritance". Elastic Agent and Fleet fully rely on component templates and these are not going away; rather, we are starting to use them more and more.

We need to split up 2 things:

  • The feature of component templates which we seem to all agree is great. It is here to stay
  • How component templates are used to construct templates

Fleet has a specific way of managing component templates, and for the integrations it installs, it follows the internal conventions. Most of these component templates are not meant for users to modify, but unfortunately Elasticsearch has no way to mark templates as "managed". These templates have an extension point, which is defined as @custom. This is all Fleet specific.

> Why was this decision made, and can we make a hard push to get back to the original idea behind Component Templates?

@MakoWish You can use component templates in the way that works best for you; it is not that we are moving away from component templates in any way. For example, there is currently an ongoing discussion around ECS about offering it as a component template. Should it be one ECS component template, one per prefix, or one for core and one for extended? There are pros and cons for each.

> For example, there is only one lifecycle policy for all the integrations. If you have a low-volume integration and a very high-volume integration, they will use the same lifecycle policy. Of course you can change it, but you would need to edit tens of templates to achieve that.

@leandrojmp Unfortunately you are right. The way it is set up is not ideal and we are working on improving/fixing it. What has changed is that, because you now have multiple data streams for the different datasets, you can at least have different ILM policies for the different data. Component templates will help us solve this problem. Previously in Beats, you had one big index with all the data inside if you didn't specify your own indexing strategy.

@MakoWish
Author

@ruflin

> You can use component templates in the way that works best for you

No, not really. The Index Templates for Elastic Agent Integrations are managed, so if I were to change them to use the previous Component Templates we have relied on for a while now, they will just be overwritten on the next Integration update. I already tried.

> it is not that we are moving away from component templates in any way.

But you have. Instead of an Index Template using shared Component Templates as I understand they were meant to be used like this:

...
  "composed_of": [
    "host",
    "client",
    "source",
    "destination",
   ...
  ]

You no longer use those shared Component Templates and have created completely disparate Component Templates like this:

...
  "composed_of": [
    "integration@package",
    "integration@custom"
  ]

IMHO, this completely breaks the entire idea behind shared Component Templates.
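The difference between the two layouts can be illustrated with a small Python model. This is a simplified sketch, not Elasticsearch's actual implementation: it assumes component templates are merged in `composed_of` order with later entries taking precedence, and the mapping contents of the hypothetical `source`, `integration@package`, and `integration@custom` templates are made up for the example.

```python
import copy


def deep_merge(base, override):
    """Recursively merge `override` into a copy of `base`; later values win."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_properties(composed_of, components):
    """Merge the mapping properties of each named component, in order."""
    resolved = {}
    for name in composed_of:
        resolved = deep_merge(resolved, components[name])
    return resolved


# Hypothetical component templates, reduced to their mapping properties.
components = {
    "source": {
        "source": {
            "properties": {
                "ip": {"type": "ip"},
                "geo": {"properties": {"location": {"type": "geo_point"}}},
            }
        }
    },
    "integration@package": {"source": {"properties": {"ip": {"type": "ip"}}}},
    "integration@custom": {},
}

shared = resolve_properties(["source"], components)
packaged = resolve_properties(["integration@package", "integration@custom"], components)

print("shared layout maps source.geo:", "geo" in shared["source"]["properties"])
print("package layout maps source.geo:", "geo" in packaged["source"]["properties"])
```

In this model, a field that exists only in the shared `source` template is absent from the per-integration result unless someone adds it to that integration's `@custom` template.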

> Previously in Beats, you had one big index with all the data inside if you didn't specify your own indexing strategy.

We are using our own indexing strategy, but Elastic Agent would force us away from that, and we would have to re-architect pretty much everything we have done over the past four years. For instance, we broke each Filebeat and Metricbeat module out into its own indices, so instead of everything going to filebeat-*, we have Filebeat's Cisco modules writing to our cisco-* indices; Metricbeat's IIS module writes to iis-*; Filebeat's Threat Intel module writes to threatintel-*; and so on. Now Elastic Agent wants to write everything to logs-* without any apparent way to change that. Every dashboard, visualization, saved search, security detection rule, Watcher... everything would need to be rebuilt to use Elastic Agent.

@ruflin
Contributor

ruflin commented Nov 29, 2022

> No, not really. The Index Templates for Elastic Agent Integrations are managed, so if I were to change them to use the previous Component Templates we have relied on for a while now, they will just be overwritten on the next Integration update. I already tried.

If you are using the integration packages, yes, you are forced to use the template structure we have put in place and use @custom. This is by design so we can upgrade packages without breaking your setup. If you are using your own data streams, you have complete freedom to put together index templates and component templates the way you want. What do you mean by "previous component templates" in this context?

> Instead of an Index Template using shared Component Templates as I understand they were meant to be used

What you describe is a valid use case for component templates. But this doesn't mean the way we use component templates in Fleet / Elastic Agent is invalid. I can't remember that we ever used component templates for Fleet / Agent the way you describe above. There are reasons for the way we use them, which relate to allowing users to overwrite settings or mappings up to a certain point; that would not be possible without component templates. As described before, there is a good chance that in some parts of the integrations we will also start to use component templates as reusable parts for ECS, which I hope you will like, as the top-level objects you described sound a lot like ECS.

The data stream naming scheme is inspired exactly by what you did with your indices and many others. Elastic Agent forces you to use the data stream naming scheme but it does not force you in any way to use component templates in the way we do for integrations. You can specify your components, your index templates etc. as long as you use logs-*-*. So in your scenario, this would likely be logs-cisco-default, logs-threatintel-default etc. Happy to chat more about migration from Beats to Elastic Agent but I don't think this is the right place as it is not related to component templates.
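As a rough illustration of the `logs-*-*` pattern, the sketch below treats a data stream name as three hyphen-separated parts, commonly called type, dataset, and namespace. The helper names are hypothetical, and the parser assumes that neither the dataset nor the namespace contains a hyphen.

```python
from typing import NamedTuple


class DataStreamName(NamedTuple):
    """The three parts of a `<type>-<dataset>-<namespace>` data stream name."""

    type: str
    dataset: str
    namespace: str

    def __str__(self):
        return f"{self.type}-{self.dataset}-{self.namespace}"


def parse_data_stream(name):
    """Split a name into its three parts, rejecting anything with another shape."""
    parts = name.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a <type>-<dataset>-<namespace> name: {name!r}")
    return DataStreamName(*parts)


# Names mentioned in this thread.
print(DataStreamName("logs", "cisco", "default"))
print(parse_data_stream("logs-system.security-default"))
```

Under this scheme, a per-module index such as `cisco-*` would correspond to a dataset, for example `logs-cisco-default`.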

@MakoWish
Author

> If you are using the integration packages, yes, you are forced to use the template structure we have put in place and use @custom.

That is the main issue, right there. We previously had a single component template for each ECS (or custom) set of fields. If there was a change to one of these Component Templates, it was automatically applied to every Index Template that uses it. Now if we make a change to the user set of fields, for instance, we will have hundreds of @custom Component Templates to update.

Just as an arbitrary example, we enrich quite a lot of our events, regardless of the source, to add more information about users related to the events. It does not matter if the event comes from our firewall data, anti-virus, Beats agents, or any other source, we enrich them all with custom ECS-compliant fields such as user.department, user.description, and user.manager. We can no longer just update the single shared user Component Template. We would now need to update every single @custom template to include those custom ECS-compliant fields.
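One way to picture the resulting workload is a script that stamps the same mappings into every `@custom` component template. The sketch below only builds the request paths and bodies. The `keyword` field types and the list of data streams are assumptions for illustration, and the `<type>-<dataset>@custom` naming follows the `logs-system.security@custom` example mentioned earlier in this thread.

```python
import copy
import json


def custom_template_requests(data_streams, properties):
    """Map each `<type>-<dataset>@custom` template path to the same mappings body."""
    body = {"template": {"mappings": {"properties": properties}}}
    return {
        f"/_component_template/{stream}@custom": copy.deepcopy(body)
        for stream in data_streams
    }


# Custom user fields from the example above; the keyword type is an assumption.
user_properties = {
    "user": {
        "properties": {
            "department": {"type": "keyword"},
            "description": {"type": "keyword"},
            "manager": {"type": "keyword"},
        }
    }
}

# Assumed list of integration data streams that would need the extra fields.
data_streams = [
    "logs-system.auth",
    "logs-system.security",
    "logs-windows.sysmon_operational",
]

bodies_by_path = custom_template_requests(data_streams, user_properties)
for path in bodies_by_path:
    print("PUT", path)
print(json.dumps(bodies_by_path["/_component_template/logs-system.security@custom"], indent=2))
```

The number of requests grows with the number of data streams, whereas a shared `user` component template would need a single update.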

> What do you mean by "previous component templates" in this context?

I only mean the Component Templates that are shared by other Index Templates such as host, user, source, destination, etc. as opposed to this new @custom idea.

@leandrojmp
Contributor

The main issue, in my opinion, is that if you need to make any custom change to any integration, be it a custom mapping for a custom field or a custom ingest pipeline, you will have so much work that in the end you will avoid using the integrations at all.

For example, I had a recent issue on Discuss while trying to add a custom ingest pipeline to an integration in order to add a custom field, source.ip, an ECS one.

Following the documentation, I saw that I just needed to create an ingest pipeline named logs-integration.dataset@custom. This worked, but then the field source.ip was not present in the mapping for this dataset and I got a conflict message from Kibana, because another dataset in the same integration had a different mapping. So to add a simple custom field I need to edit a custom ingest pipeline and a custom template, and if I want to add a custom field to the whole integration I would need to edit at least 5 custom ingest pipelines and 5 custom component templates.

It would be better if the integrations had a simple way to add component templates and ingest pipelines to all of their datasets, without the need to edit so many files.

But I agree that this is not the right place to chat about it as this is not an issue with component templates, but with the integrations.

@MakoWish
Author

MakoWish commented Nov 29, 2022

Here is a perfect example of the issue at hand. I went to create a new Data View for the Indices and Data Streams winlogbeat*,logs-system*,logs-windows*, and there is a conflict across the Component Templates with the Data Streams.

All my winlogbeat-* indices have source.geo.location mapped as geo_point, as they are using our shared source Component Template. The Data Streams logs-system.auth-default and logs-system.security-default also have source.geo.location properly set as geo_point. Unfortunately, the @package Component Template for logs-windows.sysmon_operational only has mappings for source.port, source.domain, and source.ip fields, so source.geo.location was incorrectly created as object. This could have been avoided if the Index Templates used shared Component Templates.

To my understanding, there is no way to reindex a Data Stream's backing indices, so even if I add the source mappings to @custom, there is no way for me to reindex the affected indices to correct the conflict.

EDIT:

Also just found metrics-linux.socket is missing the source.geo.location and source.ip mappings as well.

And system.process.cpu.system.time.ms should be long, but it has different mappings across data streams as well:

Type: date
.ds-metrics-elastic_agent.filebeat-default-2022.11.07-000001, .ds-metrics-elastic_agent.fleet_server-default-2022.11.17-000001, .ds-metrics-elastic_agent.metricbeat-default-2022.11.07-000001

Type: long
.ds-metrics-elastic_agent.elastic_agent-default-2022.11.07-000001
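Conflicts like these can be found mechanically. The sketch below is a minimal Python example that groups indices by the type they assign to each field and reports fields with more than one type. The input is a hand-written dictionary built from the listing above, not the output of any Elasticsearch API, and the `@timestamp` entries are an added assumption to show a field that does not conflict.

```python
from collections import defaultdict


def find_mapping_conflicts(index_mappings):
    """Return {field: {type: [indices]}} for fields mapped with more than one type."""
    by_field = defaultdict(lambda: defaultdict(list))
    for index, fields in index_mappings.items():
        for field, field_type in fields.items():
            by_field[field][field_type].append(index)
    return {
        field: {field_type: sorted(indices) for field_type, indices in types.items()}
        for field, types in by_field.items()
        if len(types) > 1
    }


FIELD = "system.process.cpu.system.time.ms"

# Field types per backing index, copied from the listing above.
index_mappings = {
    ".ds-metrics-elastic_agent.filebeat-default-2022.11.07-000001": {FIELD: "date"},
    ".ds-metrics-elastic_agent.fleet_server-default-2022.11.17-000001": {FIELD: "date"},
    ".ds-metrics-elastic_agent.metricbeat-default-2022.11.07-000001": {FIELD: "date"},
    ".ds-metrics-elastic_agent.elastic_agent-default-2022.11.07-000001": {FIELD: "long"},
}

# Assumption for illustration: every index maps @timestamp consistently.
for fields in index_mappings.values():
    fields["@timestamp"] = "date"

conflicts = find_mapping_conflicts(index_mappings)
for field, types in conflicts.items():
    for field_type, indices in types.items():
        print(f"{field}: {field_type} in {len(indices)} index(es)")
```

Only fields with at least two distinct types are reported, so consistently mapped fields such as the assumed `@timestamp` are left out.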

@MakoWish changed the title from "Moving Away from Component Templates Already?" to "Moving Away from Shared Component Templates Already?" Nov 29, 2022
@ruflin
Contributor

ruflin commented Nov 30, 2022

@leandrojmp @MakoWish The points you are bringing up are valid and we should find ways to accommodate this kind of usage. I think we initially got distracted by the title "Moving Away from Shared Component Templates Already?" but the way I read it now, it is much more about how integrations templates / ingest pipelines can be extended with your own component templates / ingest pipelines.

To continue the conversation, I suggest we take this to the Kibana repository as Fleet is part of Kibana and is the tool that installs the templates. @leandrojmp Any chance you could put the details you have in #91370 (comment) into a GitHub issue under https://github.com/elastic/kibana and reference it here? Then @MakoWish can join in with his details.

@MakoWish For the "wrong" mappings, this ECS discussion should help with it eventually: #85692 The discussion was stale for a bit but it will continue shortly.

@MakoWish
Author

> I think we initially got distracted by the title "Moving Away from Shared Component Templates Already?" but the way I read it now, it is much more about how integrations templates / ingest pipelines can be extended with your own component templates / ingest pipelines.

No, it is about the use of shared component templates throughout to ensure all data sources have the same mappings. If you want to retain the @custom Component Template idea to allow users to add their own mappings, that is fine, but the @package Component Template idea is where the flaw lies.

@leandrojmp
Contributor

@ruflin I will create a new issue in the Kibana repository with a feature request to make it easier to use custom templates and ingest pipelines with integrations.


5 participants