[Fleet] Add namespace-specific index and component templates #121118

joshdover · 2021-12-13T18:00:12Z

Fleet and Elastic Agent users need a mechanism to customize how their data is being ingested, mapped, and stored that will be preserved across Stack and integration package upgrades. This issue outlines how we plan to structure index and component templates to support user customizations to Fleet-managed data streams, on a namespace-level of granularity.

Scope

The goal is to provide a future-proof naming scheme and template structure that will allow users to add the following customizations to Fleet-managed data streams:

Mappings (additive and non-additive)
ILM policy
Number of replicas, primaries, and routing shards
Refresh interval
Other general index settings (query.*, etc.)

This scheme does not allow customizing:

Ingest pipelines
- Elasticsearch does not support an arbitrary number of ingest node pipelines per data stream, so the only existing way to customize ingest pipelines is to modify the one installed by Fleet, and that change will not be preserved across package upgrades. See "Specify multiple ingest pipelines for a data stream" (elasticsearch#61185).
- If Elasticsearch were to add a default_pipelines setting which is an array of pipelines, it’s likely that this customization scheme would be compatible.

Design

Existing scheme (as of 8.2)

The existing scheme that we use in Fleet today installs a single index template for each dataset in a package that matches data streams for all namespaces. It has the following properties:

name: <type>-<dataset>
matches: <type>-<dataset>-*
priority: 200
Component templates (highest to lowest precedence):
- .fleet_agent_id_verification-1
  - final pipeline & mappings for agent_id verification (can optionally be disabled in kibana.yml)
- .fleet_globals-1
  - global settings and mappings applied to every data stream (eg. event.ingested)
- <type>-<dataset>@custom
  - user-defined customizations (settings and/or mappings - for all namespaces)
- <type>-<dataset>@package
  - package-defined mappings and settings

New proposed scheme

In order to preserve user customizations across upgrades, it’s important that we store their overrides in a separate component template that Fleet can copy over to new versions of the package’s index template. In this updated scheme, we will add an additional index template that is namespace-specific and of higher priority than the base template:

name: <type>-<dataset>-<namespace>
matches: <type>-<dataset>-<namespace>
priority: 250
Components (highest to lowest precedence):
- .fleet_agent_id_verification-1
  - final pipeline & mappings for agent_id verification (can optionally be disabled in kibana.yml)
- .fleet_globals-1
  - global settings and mappings applied to every data stream (eg. event.ingested)
- <type>-<dataset>-<namespace>@custom
  - namespace-specific user-defined customizations
- <type>-<dataset>@custom
  - user-defined customizations (settings and/or mappings - for all namespaces)
- <type>-<dataset>@package
  - package-defined mappings and settings
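
As a rough illustration of the proposed layout, the sketch below builds a request body for the namespace-specific index template. The function name is hypothetical and this is not Fleet's implementation. Note that Elasticsearch merges `composed_of` entries in order, with later entries overriding earlier ones, so the "highest to lowest precedence" list above is reversed here. The `data_stream` object is an assumption based on how index templates for data streams are normally declared.

```python
def namespace_index_template(ds_type: str, dataset: str, namespace: str) -> dict:
    """Build a request body for the proposed namespace-specific index template."""
    base = f"{ds_type}-{dataset}"
    return {
        "index_patterns": [f"{base}-{namespace}"],
        "priority": 250,  # higher than the base template's 200
        "data_stream": {},  # assumption: the template targets data streams
        # Lowest precedence first; later component templates override earlier ones.
        "composed_of": [
            f"{base}@package",
            f"{base}@custom",
            f"{base}-{namespace}@custom",
            ".fleet_globals-1",
            ".fleet_agent_id_verification-1",
        ],
    }


template = namespace_index_template("logs", "nginx.access", "foo")
print(template["index_patterns"])   # ['logs-nginx.access-foo']
print(template["composed_of"][2])   # logs-nginx.access-foo@custom
```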

During package upgrades, Fleet would preserve the contents of both the ‘global’ custom template (<type>-<dataset>@custom) and the namespace-specific ones (<type>-<dataset>-<namespace>@custom) while replacing all of the other templates (including the index template). This would allow the user’s customizations to be preserved and to override any package-specific settings and mappings.
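
A minimal sketch of that upgrade rule, assuming templates are tracked as a plain name-to-body mapping (the function and the sample bodies are illustrative only): anything ending in `@custom` is carried over untouched, and everything else comes from the new package version.

```python
def plan_upgrade(installed: dict, new_package: dict) -> dict:
    """Return the template set after an upgrade.

    Templates whose name ends in "@custom" are preserved from the installed
    set; all other templates are replaced by the new package's versions.
    """
    preserved = {n: b for n, b in installed.items() if n.endswith("@custom")}
    replaced = {n: b for n, b in new_package.items() if not n.endswith("@custom")}
    return {**replaced, **preserved}


installed = {
    "logs-nginx.access@package": {"version": 1},
    "logs-nginx.access@custom": {"settings": {"number_of_replicas": 2}},
    "logs-nginx.access-foo@custom": {"settings": {"refresh_interval": "30s"}},
}
new_package = {
    "logs-nginx.access@package": {"version": 2},
    "logs-nginx.access@custom": {},  # an empty default must not clobber user edits
}
after = plan_upgrade(installed, new_package)
```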

Like the ‘global’ custom template we offer today, we would allow users to directly edit the namespace-specific templates with arbitrary settings and mappings in order to override those supplied by the package. We would also use the template to store customizations that we plan to support directly in the UI (eg. setting the ILM policy).

We will not remove the base index template we install today that matches a wildcard namespace (<type>-<dataset>-*) because Elastic Agent standalone requires this template to be installed.
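
The two templates can coexist because Elasticsearch applies the matching index template with the highest priority. The sketch below imitates that selection with shell-style pattern matching; the template list and the function name are illustrative, not a real API.

```python
from fnmatch import fnmatchcase

TEMPLATES = [
    {"name": "logs-nginx.access", "pattern": "logs-nginx.access-*", "priority": 200},
    {"name": "logs-nginx.access-foo", "pattern": "logs-nginx.access-foo", "priority": 250},
]


def winning_template(data_stream: str, templates=TEMPLATES):
    """Return the name of the highest-priority template matching the data stream."""
    matches = [t for t in templates if fnmatchcase(data_stream, t["pattern"])]
    if not matches:
        return None
    return max(matches, key=lambda t: t["priority"])["name"]


print(winning_template("logs-nginx.access-foo"))      # logs-nginx.access-foo
print(winning_template("logs-nginx.access-default"))  # logs-nginx.access
```

A standalone agent writing to a namespace without its own template still falls back to the wildcard base template.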

Changing a namespace for an existing integration policy

If a user edits an existing integration policy to point to a new namespace, we can offer them the option to copy over any customizations from the previous namespace’s <type>-<dataset>-<namespace>@custom template. We would not delete the old templates since this could affect the existing data streams and indices or any standalone agents ingesting data into this namespace.
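
One possible shape for the copy step, again over a plain name-to-body mapping. The helper is hypothetical, and refusing to overwrite an existing target template is an assumption; the issue leaves that behavior open. The old template is deliberately left in place.

```python
import copy


def copy_namespace_customizations(templates: dict, base: str, old_ns: str, new_ns: str) -> bool:
    """Copy <base>-<old_ns>@custom to <base>-<new_ns>@custom.

    Returns True if a copy was made. The source template is never deleted,
    and an existing target is never overwritten (assumption).
    """
    source = templates.get(f"{base}-{old_ns}@custom")
    target_name = f"{base}-{new_ns}@custom"
    if source is None or target_name in templates:
        return False
    templates[target_name] = copy.deepcopy(source)
    return True


store = {"logs-nginx.access-foo@custom": {"settings": {"number_of_replicas": 2}}}
copied = copy_namespace_customizations(store, "logs-nginx.access", "foo", "bar")
```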

As a separate enhancement, we could offer a ‘cleanup’ UI either in Fleet or Stack Management that shows index templates that are not currently in use.

Customize API

In order to facilitate automated usage of this scheme, we should provide a high-level package customization API in Kibana Fleet that allows admins to make customizations without worrying about the low-level details of how the templates are configured, whether or not a data stream needs to be rolled over, or how to apply the setting changes retroactively to backing indices. The main use case for this is standalone Agent usage. This may also be used to power in-app features for making customizations (eg. setting the ILM policy).

# Write custom settings and mappings to all namespaces
# Writes to `<type>-<dataset>@custom` templates
PUT /api/fleet/epm/nginx/customize
{
  "settings": { … },
  "mappings": { … }
}

# Add or update a namespace for an integration, creates the namespace-specific templates
# Write custom settings and mappings to namespace
# Writes to `<type>-<dataset>-<namespace>@custom` templates
PUT /api/fleet/epm/nginx/customize/namespace/foo
{
  "settings": { … },
  "mappings": { … }
}

# Removes a namespace, deleting namespace-specific templates
# Does not delete data indices or data streams
DELETE /api/fleet/epm/nginx/customize/namespace/foo
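
For automation, a caller might assemble these requests along the following lines. This only builds the method, path, and body for the endpoints proposed above; the endpoints do not exist yet, and the helper name and the ILM policy name are hypothetical.

```python
import json


def customize_request(package, settings=None, mappings=None, namespace=None):
    """Build (method, path, body) for the proposed customize endpoints."""
    path = f"/api/fleet/epm/{package}/customize"
    if namespace is not None:
        # Namespace-specific variant writes to <type>-<dataset>-<namespace>@custom
        path += f"/namespace/{namespace}"
    body = {"settings": settings or {}, "mappings": mappings or {}}
    return "PUT", path, json.dumps(body)


method, path, body = customize_request(
    "nginx",
    settings={"index.lifecycle.name": "logs-30d"},  # hypothetical policy name
    namespace="foo",
)
print(method, path)  # PUT /api/fleet/epm/nginx/customize/namespace/foo
```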

All of the other APIs should also create these namespaces automatically. For example, if an integration policy is added for the nginx package on the bar namespace, the POST /api/fleet/package_policies API should also create the appropriate namespace templates if they don't already exist.

There are additional use cases for this API outside of index templates; for example, there have been other requests for namespace-specific transforms. We should design this API to accommodate future use cases easily.

Upgrade considerations

For packages that were installed before this scheme was introduced, Fleet should automatically add the appropriate namespace-specific index and component templates in order to facilitate a consistent experience for end-users. See #121099

For upgrades where any @custom components already exist, they should be retained and not removed so that they are still present once the new package version is installed. This means existing templates should also not get overwritten.

Open questions

When should namespace-specific templates be deleted when using the product?
- If we're going to support a generic API that doesn't require integration or agent policies to point to namespaces, then I don't think we can do any automated cleanup; otherwise we could delete configuration that is in use by a standalone agent.
There are separate @custom component templates for each data stream in an integration package, while the API design proposed here would apply to the entire integration. This can present problems if a user manually edits a single component template so that the data streams are no longer in sync: the source of truth becomes ambiguous. How would we solve this?
- Have a single, managed component template that is used for customizations that apply to the entire integration. Leave the @custom templates unmanaged and never edit them. (@joshdover votes for this one)
- Store the customizations set on this API in a Kibana Saved Object and use this as the source of truth. Manual user edits to @custom templates would then be merged in after settings from this SO. This would allow manual additions and modifications to @custom templates to be preserved, however deletes would be lost.
How should namespace renames work? If a user renames the namespace field on an integration policy or agent policy, should we attempt to copy any customizations on the previous namespace when creating the new namespace? If not or if the new namespace already exists, should we warn the user that settings/mappings are going to change for this data?
How do we handle when a new dataset is added for an existing package? Should we keep a copy of any custom settings/mappings in a Saved Object and automatically apply them to all datasets during package upgrades?
Should the management APIs allow changes to mappings? If so, when and how would the user expect these to take effect? For example, would a rollover be automatic?
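
To make the trade-off in the Saved Object option concrete, here is a small sketch of merging manual @custom edits on top of the settings stored in the SO. The merge function is illustrative, not a proposed implementation. A manual modification survives the merge, but a manual deletion does not, because the deleted key is still present in the SO.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


saved_object = {"settings": {"number_of_replicas": 1, "refresh_interval": "30s"}}
# The user changed the replica count and removed refresh_interval by hand.
manual_edit = {"settings": {"number_of_replicas": 2}}

effective = deep_merge(saved_object, manual_edit)
print(effective)  # {'settings': {'number_of_replicas': 2, 'refresh_interval': '30s'}}
```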


elasticmachine · 2021-12-13T18:00:14Z

Pinging @elastic/fleet (Team:Fleet)

dominiqueclarke · 2022-03-14T14:14:01Z

Synthetics Use Case

All monitors added in Monitor Management use data streams to write results back to ES. There is a separate data stream for each monitor type (ICMP, HTTP or TCP), with browser monitors being further split by the data sets we store (network, screenshot etc.).

In addition, the namespace defined when setting up the monitor (which is "default" unless the user changes it) is appended to the name of the data stream.

All monitor configuration is stored as Kibana saved objects, outside of the Fleet integration policy ecosystem. More information on UI Monitor Management and the Synthetics Service: https://docs.google.com/presentation/d/18eT7xgyqJ5TG5srOeHUN8bFrY-5MBZIpp3TIEcTpI5M/edit#slide=id.g1055e0afed4_0_548

As a result, Fleet will have no way of knowing when a new non-default namespace monitor is created.

To enable the generation of namespaced component and index templates, Synthetics will require a way to generate the appropriate component and index templates on the fly whenever a non-default namespace monitor is created.

If Fleet could expose the logic for creating these templates on its plugin contract, Synthetics could hook into and reuse the logic from this implementation for UI Monitor Management and the Synthetics Service.

cc: @joshdover @andrewvc

mostlyjason · 2022-03-14T19:13:45Z

Thanks @dominiqueclarke. Adding a link to the Uptime issue, elastic/uptime#453, which explains that the need isn't just to create namespaces dynamically, but also to configure custom ILM policies for each. This will require us to set up index and component templates for each namespace so that users can set custom ILM policies.

threatangler-jp · 2022-09-19T01:06:10Z

Checking in on the status of this. What version do you expect it to be included in? And is there a workaround available? Our specific need is to apply different ILM policies to different sets of agents. Thank you!

joshdover · 2022-09-19T11:31:27Z

Hi @threatangler-jp 👋, thanks for the question. We do not have a public ETA on this, however we have published documentation on how to do this manually for the exact use case you have here: https://www.elastic.co/guide/en/fleet/8.4/data-streams-ilm-tutorial.html

threatangler-jp · 2022-09-19T13:18:53Z

Thank you @joshdover. We are on v8.3.3 and this workaround appears to only be available on v8.4. Let me know if I am mistaken on that factor.

We will eventually upgrade but upgrading is a heavy lift and so that will take some time.

joshdover · 2022-09-19T14:29:16Z

@threatangler-jp Looks like we moved the doc from 8.3 to 8.4. It's supported on 8.3 though, here's the 8.3 doc: https://www.elastic.co/guide/en/fleet/8.3/data-streams.html#data-streams-ilm-tutorial

threatangler-jp · 2022-09-19T15:05:12Z

@joshdover great news! Thank you :)

felixbarny · 2023-05-10T08:17:50Z

In general, I'm excited about this but I'm a bit worried that this is too tightly coupled to Fleet. Ideally, these customization extension points should also be available for data streams that aren't managed by Fleet. Maybe we can find a way to integrate this more tightly into Elasticsearch itself.

cc @dakrone @jbaiera @eyalkoren

This need is getting more important because of the reroute processor. Users can route data to dynamic data streams that aren't set up via Fleet. For example:

- reroute:
    dataset: "{{service.name}}"

Fleet can't know all values of service.name upfront to create dedicated index templates for each individual service.name with all of the extension points. That's one of the reasons we're making the built-in logs-*-* index template more potent.

One of the ways we could do that is to still rely on the @custom component templates as proposed in the issue and add them to the logs-*-* index template.

We'd need to add placeholders into the component template name, though, for example logs-{{data_stream.dataset}}@custom. Another option would be to store these customizations in another entity, maybe in the data stream itself rather than the index template. Or we could have an option for component templates to inject themselves into index templates and data streams. But that sounds similar to legacy index templates, where it's hard to determine the effective settings when multiple templates are merged together.
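
To show what resolving such a placeholder could look like, here is a small sketch. It is not an existing Elasticsearch feature. It assumes data stream names follow the <type>-<dataset>-<namespace> scheme with no hyphen inside the dataset, and the function name is made up for illustration.

```python
def custom_template_name(data_stream: str,
                         pattern: str = "logs-{{data_stream.dataset}}@custom") -> str:
    """Resolve data_stream.* placeholders in a component template name."""
    # Raises ValueError if the name does not have three hyphen-separated parts.
    ds_type, dataset, namespace = data_stream.split("-", 2)
    values = {
        "data_stream.type": ds_type,
        "data_stream.dataset": dataset,
        "data_stream.namespace": namespace,
    }
    for key, value in values.items():
        pattern = pattern.replace("{{" + key + "}}", value)
    return pattern


print(custom_template_name("logs-checkout-default"))  # logs-checkout@custom
```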

felixbarny · 2023-07-13T16:42:44Z

I've created an Elasticsearch issue for this: elastic/elasticsearch#97664. I'd like to propose closing this issue in favor of the Elasticsearch issue as I think this feature shouldn't be exclusive to Fleet. More on the reasoning about that in the issue.

joshdover · 2023-07-14T15:03:45Z

+1 on moving this to Elasticsearch, though we will still need to do work in Fleet to add these additional component template names to our index templates, which is largely the same work as before.

joshdover · 2023-07-14T15:04:14Z

Oh sorry, yes we can close this one but we still need #149484

joshdover added the Team:Fleet label Dec 13, 2021

joshdover added the enhancement label Dec 13, 2021

joshdover mentioned this issue Dec 14, 2021

[Integrations] Add a link to ILM policies in the integration policy editor #108554

Open

3 tasks

jen-huang added the v8.2.0 label Jan 19, 2022

joshdover mentioned this issue Feb 3, 2022

[Fleet] Move data stream mappings from index template to component template. #121184

Closed

jen-huang removed the v8.2.0 label Feb 14, 2022

jen-huang mentioned this issue Feb 14, 2022

[Spike] Investigate separate Index lifecycle policies for each datastream elastic/uptime#453

Closed

mostlyjason mentioned this issue Feb 15, 2022

[Request] Document how to customize index templates from packages elastic/observability-docs#1578

Closed

joshdover mentioned this issue Mar 4, 2022

[Fleet] Changes in package install format should be applied on Stack upgrades #121099

Closed

5 tasks

jguay mentioned this issue Mar 7, 2022

Support for custom fields in fleet integrations elastic/elastic-agent#138

Closed

paulb-elastic mentioned this issue Mar 21, 2022

Separate Index lifecycle policies for each dataset elastic/uptime#462

Closed

kfirpeled mentioned this issue Mar 22, 2022

add privileges for kIbana_system user to serve cloud security posture… elastic/elasticsearch#84941

Merged

This was referenced Mar 29, 2022

[Fleet] Analyzer in index template settings are not working #128209

Closed

[Request] Fleet integrations index template structure elastic/ingest-docs#111

Open

joshdover mentioned this issue Apr 26, 2022

Add transform to spec elastic/package-spec#307

Merged

2 tasks

This was referenced Jun 1, 2022

[Fleet] Add support for input type packages #133296

Closed

[Fleet] Add support for custom ingest pipeline to integrations #133740

Closed

joshdover mentioned this issue Dec 15, 2022

[Fleet] Honor index_mode: time_series setting during package installation #146804

Closed

ruflin mentioned this issue Dec 15, 2022

Allow to easily add custom pipelines and templates per integration, currently it is done per dataset. #146792

Closed

joshdover mentioned this issue Dec 23, 2022

Remove event.duration and event.ingested from metric events elastic/integrations#4894

Open

kpollich mentioned this issue Dec 29, 2022

[Draft] Play around with index_mode in Fleet #147684

Closed

joshdover mentioned this issue Jan 25, 2023

[Fleet] Add support for customizing integration data streams at more levels of granularity #149484

Open

joshdover mentioned this issue May 10, 2023

[Fleet] Support for document-based routing via ingest pipelines #151898

Closed

felixbarny mentioned this issue Jul 13, 2023

Allow customizing managed data streams at different levels of granularity elastic/elasticsearch#97664

Open

joshdover closed this as completed Jul 14, 2023

joshdover closed this as not planned Jul 14, 2023
