Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fleet] Import settings component templates from Elasticsearch #163141

Closed
felixbarny opened this issue Aug 4, 2023 · 24 comments · Fixed by #163731
Closed

[Fleet] Import settings component templates from Elasticsearch #163141

felixbarny opened this issue Aug 4, 2023 · 24 comments · Fixed by #163731
Assignees
Labels
Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@felixbarny
Copy link
Member

Currently, fleet is creating its own settings that are decoupled from any components that ship with Elasticsearch by default.

index: {
...(isILMPolicyDisabled
? {}
: {
// ILM Policy must be added here, for now point to the default global ILM policy name
lifecycle: {
name: ilmPolicy ? ilmPolicy : type,
},
}),
// What should be our default for the compression?
codec: 'best_compression',
// setting `ignore_malformed` only for data_stream for logs
...(type === 'logs'
? {
mapping: {
ignore_malformed: true,
},
}
: {}),
// All the default fields which should be queried have to be added here.
// So far we add all keyword and text fields here if there are any, otherwise
// this setting is skipped.
...(defaultFieldNames.length
? {
query: {
default_field: defaultFieldNames,
},
}
: {}),
},

These settings are largely similar to the settings component templates that Elasticsearch ships with. For example: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/core/template-resources/src/main/resources/logs-settings.json

This makes it difficult to get a unified behavior for Fleet-managed data streams and data streams that are created via the index templates that are built into Elasticsearch.

As there's no single source of truth, changes have to be made both in Fleet and Elasticsearch. For example #157184 and elastic/elasticsearch#95329. We're planning to make similar changes to set subobjects: false and dynamic: until_limit that need to be implemented both in Elasticsearch and in Fleet.

Also, we want to have different settings for TSDB metric data streams compared to non-TSDB metric data streams. The PR elastic/elasticsearch#97602 creates a dedicated settings component template for TSDB data streams. While we could replicate the same in Fleet, I think it's preferable to have a single source of truth for these.

One challenge with that approach is that when changing these files in Elasticsearch, we need to consider that this has consequences for all integrations that are managed by Fleet. We could solve this by adding a codeowners entry for these settings files to request a review from the ingest team and/or from solution teams that are building integrations.

@felixbarny felixbarny added the Team:Fleet Team label for Observability Data Collection Fleet team label Aug 4, 2023
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

@felixbarny
Copy link
Member Author

Hey @joshdover what do you think about this proposal? Any concerns? Do you think this is something we could contribute from the observability team?

@felixbarny
Copy link
Member Author

Now that elastic/elasticsearch#97602 is merged, we should import that component template for TSDB data streams.

This will bring an important performance improvement of using index.codec: default instead of best_compression for TSDB.

@joshdover joshdover self-assigned this Aug 11, 2023
@felixbarny
Copy link
Member Author

@dakrone one challenge that we've identified is whether Fleet can actually rely on these assets to be always available. For example, users could set stack.templates.enabled=false which prevents the templates from being installed.

@joshdover
Copy link
Contributor

Plan is to keep the lifecycle.name setting and query.default_field setting that is currently populated by Fleet, and remove codec and mapping.ignore_malformed (only applied to logs right now). Everything else will be delegated to ES's component templates for the defaults.

For example, users could set stack.templates.enabled=false which prevents the templates from being installed.

I was hoping that disabling this setting already broke Fleet due to the built in ILM policies not being present, but this seems to be tolerant to non-existent ILM policies. Only way we can handle this scenario in a non-breaking way would be to populate the codec and mapping.ignore_malformed when the stack templates are disabled, which makes the code ugly, but we can do it.

Another question what should we do for traces data streams that doesn't have an component template for settings in ES? Supply the same defaults we do today? (the only one that would apply is codec: best_compression)

@felixbarny
Copy link
Member Author

We could also use allow_missing for these. That could also be a solution for the traces data stream. That way, we don't need to change anything in Fleet when we introduce a traces@settings component template.

@joshdover
Copy link
Contributor

I thought about that but it will change the codec and may be considered breaking?

@felixbarny
Copy link
Member Author

I guess strictly speaking yes but I'd assume that most users that set stack.templates.enabled=false don't use integrations or the data stream naming scheme. Based on the discussion on the issue that introduced that flag (elastic/elasticsearch#62835) this was added to not break indices with names that conflict with the data stream naming scheme (such as logs-data-things-2020-09-23).

While a change in codec can be unexpected and lead to higher disk usage, it can be mitigated easily (by manually creating that component template) and usually doesn't immediately break stuff.

@joshdover
Copy link
Contributor

We could also use allow_missing for these. That could also be a solution for the traces data stream. That way, we don't need to change anything in Fleet when we introduce a traces@settings component template.

FYI the implementation in the draft PR will use trace-settings until the naming convention is switched over for all templates.

@joshdover
Copy link
Contributor

I guess strictly speaking yes but I'd assume that most users that set stack.templates.enabled=false don't use integrations or the data stream naming scheme. Based on the discussion on the issue that introduced that flag (elastic/elasticsearch#62835) this was added to not break indices with names that conflict with the data stream naming scheme (such as logs-data-things-2020-09-23).

While a change in codec can be unexpected and lead to higher disk usage, it can be mitigated easily (by manually creating that component template) and usually doesn't immediately break stuff.

Thanks for digging up the history. I agree this helps improve the situation, though I'm not sure if we can rely on users only using this option only for that reason. Even though the mitigation is straightforward, it may not be obvious why this started happening in the first place.

All this said, these kind of change seems in line with the changes we've been making to index templates recently. I just don't want to put undue pain on users here - see the recent breakages caused by TSDB rollout.

@dakrone
Copy link
Member

dakrone commented Aug 11, 2023

@dakrone one challenge that we've identified is whether Fleet can actually rely on these assets to be always available. For example, users could set stack.templates.enabled=false which prevents the templates from being installed.

What sort of mitigation do we have if the user has the stack templates disabled? Should Fleet refuse to install further templates/integrations if it wants to rely on the stack-installed ones?

@felixbarny
Copy link
Member Author

Another option would be to change the semantics of stack.templates.enabled slightly and only always install component templates. The flag would then only control whether the index templates are installed. That seems to cater for both needs: We want to be able to rely on some fundamental component templates to be always available so that we can use them in Fleet and in other places (such as elastic/elasticsearch#97546). Users that don't want to install the index templates as it conflicts with their index names, can disable the installation of the index templates.

There's one slight risk though: users that have set stack.templates.enabled=false might have created component templates with the same name as the stack component templates. However, that risk should be mitigated with elastic/elasticsearch#98535.

@joshdover
Copy link
Contributor

+1 on that idea, @felixbarny. @dakrone does that make sense to you?

@dakrone
Copy link
Member

dakrone commented Aug 21, 2023

Users that don't want to install the index templates as it conflicts with their index names

It's not only that, in the past the setting was introduced because a user had a pre-existing template with the same ID, and the stack ones would overwrite it on upgrade.

@felixbarny
Copy link
Member Author

@dakrone would you agree that the new naming convention (elastic/elasticsearch#98535) mitigates that issue?

@dakrone
Copy link
Member

dakrone commented Aug 21, 2023

@felixbarny we discussed this a bit today in our meeting and discussed a potential path forward. We'd be able to do this for the newly renamed component templates, and not for the older names. We'll also need to document the naming scheme for these templates.

@felixbarny
Copy link
Member Author

We'd be able to do this for the newly renamed component templates, and not for the older names.

Yeah, that's what I had in mind.

We'll also need to document the naming scheme for these templates.

++ makes sense

Seems like we have a path forward 🙂

@joshdover
Copy link
Contributor

I think this means we shouldn't move forward with the Fleet change until we have the new template names implemented in elastic/elasticsearch#98535. Any estimate on this change?

@eyalkoren
Copy link
Contributor

I think this means we shouldn't move forward with the Fleet change until we have the new template names implemented in elastic/elasticsearch#98535. Any estimate on this change?

Yes, it's a dependency. No estimate on this change yet, but it will probably include the creation of a new stack index template registry as part of it. I will add a comment in elastic/elasticsearch#98535 with what we think will be included. Feel free to provide feedback

@eyalkoren
Copy link
Contributor

eyalkoren commented Aug 22, 2023

@ruflin
Copy link
Member

ruflin commented Aug 29, 2023

If I remember correctly when we introduced stack.templates.enabled=false we had a discussion to remove this flag in 8.0 but we never did. The second part discussed was that basically when this flag is used, Fleet is unusable. It seems it is still mostly usable but any standalone Elastic Agent shipping data would have a very bad experience. The idea back then was to disable "Fleet / Integrations" if this flag is set to true but also never did it. I still think we should follow up on this. Fleet and integrations should show a very huge warning when the flag is used.

In addition, I suggest we deprecate the flag. Users using it will get a deprecation warning and hopefully that encourages them to move away from the flag. If and when we remove the flag completely is a separate discussion. In general, any user using Observability or Security should never use the flag.

@felixbarny
Copy link
Member Author

@joshdover would Fleet roll over the data stream if there are changes in the stack component templates?

@joshdover
Copy link
Contributor

@joshdover would Fleet roll over the data stream if there are changes in the stack component templates?

No this is not currently handled by Fleet. Fleet will only attempt to rollover data streams when integrations are modified. I don't think Fleet should handle this and it's better handled by Elasticsearch itself.

@ruflin
Copy link
Member

ruflin commented Oct 31, 2023

it's better handled by Elasticsearch itself.

Agree: elastic/elasticsearch#75031

joshdover added a commit that referenced this issue Nov 2, 2023
## Summary

Fixes #163141
Fixes #160288

Blocked by:
- elastic/elasticsearch#98535

This switches where integrations installed by EPM get their default
index settings from to use the [source-of-truth component templates
supplied by
Elasticsearch](https://github.com/elastic/elasticsearch/tree/main/x-pack/plugin/core/template-resources/src/main/resources).
This will help ensure that data streams configured by EPM always get the
same defaults as data streams the user creates using the default
`logs-*-*` and `metrics-*-*` templates. For now, no default mappings are
sourced from Elasticsearch.

As part of this change the template format version was incremented to
force EPM to reinstall all templates and rollover data streams on the
Stack upgrade to the version including this change.

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <[email protected]>
delanni pushed a commit to delanni/kibana that referenced this issue Nov 6, 2023
…63731)

## Summary

Fixes elastic#163141
Fixes elastic#160288

Blocked by:
- elastic/elasticsearch#98535

This switches where integrations installed by EPM get their default
index settings from to use the [source-of-truth component templates
supplied by
Elasticsearch](https://github.com/elastic/elasticsearch/tree/main/x-pack/plugin/core/template-resources/src/main/resources).
This will help ensure that data streams configured by EPM always get the
same defaults as data streams the user creates using the default
`logs-*-*` and `metrics-*-*` templates. For now, no default mappings are
sourced from Elasticsearch.

As part of this change the template format version was incremented to
force EPM to reinstall all templates and rollover data streams on the
Stack upgrade to the version including this change.

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Fleet Team label for Observability Data Collection Fleet team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants