[Fleet] Add namespace-specific index and component templates #121118

Closed
1 of 5 tasks
Tracked by #133740
joshdover opened this issue Dec 13, 2021 · 12 comments
Labels: enhancement, Team:Fleet

Comments

@joshdover
Contributor

joshdover commented Dec 13, 2021

Fleet and Elastic Agent users need a mechanism to customize how their data is ingested, mapped, and stored, and those customizations must be preserved across Stack and integration package upgrades. This issue outlines how we plan to structure index and component templates to support user customizations to Fleet-managed data streams at namespace-level granularity.

Scope

The goal is to provide a future-proof naming scheme and template structure that allows users to add the following customizations to Fleet-managed data streams:

  • Mappings (additive and non-additive)
  • ILM policy
  • Number of replicas, primaries, and routing shards
  • Refresh interval
  • Other general index settings (query.*, etc.)

This scheme does not allow customizing:

  • Ingest pipelines
    • Elasticsearch does not support an arbitrary number of ingest node pipelines per data stream, so the only existing way to customize ingest pipelines is to modify the one installed by Fleet, and that change is not preserved across package upgrades. See elasticsearch#61185 (Specify multiple ingest pipelines for a data stream).
    • If Elasticsearch were to add a default_pipelines setting that accepts an array of pipelines, this customization scheme would likely be compatible with it.

Design

Existing scheme (as of 8.2)

The existing scheme that we use in Fleet today installs a single index template for each dataset in a package that matches data streams for all namespaces. It has the following properties:

  • name: <type>-<dataset>
  • matches: <type>-<dataset>-*
  • priority: 200
  • Component templates (highest to lowest precedence):
    • .fleet_agent_id_verification-1
      • final pipeline & mappings for agent_id verification (can optionally be disabled in kibana.yml)
    • .fleet_globals-1
      • global settings and mappings applied to every data stream (eg. event.ingested)
    • <type>-<dataset>@custom
      • user-defined customizations (settings and/or mappings - for all namespaces)
    • <type>-<dataset>@package
      • package-defined mappings and settings

New proposed scheme

In order to preserve user customizations across upgrades, it’s important that we store their overrides in a separate component template that Fleet can copy over to new versions of the package’s index template. In this updated scheme, we will add an additional index template that is namespace-specific and of higher priority than the base template:

  • name: <type>-<dataset>-<namespace>
  • matches: <type>-<dataset>-<namespace>
  • priority: 250
  • Components (highest to lowest precedence):
    • .fleet_agent_id_verification-1
      • final pipeline & mappings for agent_id verification (can optionally be disabled in kibana.yml)
    • .fleet_globals-1
      • global settings and mappings applied to every data stream (eg. event.ingested)
    • <type>-<dataset>-<namespace>@custom
      • namespace-specific user-defined customizations
    • <type>-<dataset>@custom
      • user-defined customizations (settings and/or mappings - for all namespaces)
    • <type>-<dataset>@package
      • package-defined mappings and settings

During package upgrades, Fleet would preserve the contents of both the ‘global’ custom template (<type>-<dataset>@custom) and the namespace-specific ones (<type>-<dataset>-<namespace>@custom) while replacing all of the other templates (including the index template). This would allow the user’s customizations to be preserved and to override any package-specific settings and mappings.
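
As a rough illustration of the proposed scheme, the sketch below builds the body of the namespace-specific index template. The function name and the returned dict layout are hypothetical and are not Fleet's implementation. One detail worth noting: the lists in this issue run from highest to lowest precedence, while Elasticsearch merges `composed_of` entries in the order given, with later entries overriding earlier ones, so the sketch reverses the list.

```python
from typing import Any


def build_namespace_index_template(ds_type: str, dataset: str, namespace: str) -> dict[str, Any]:
    """Build the proposed namespace-specific index template (illustrative only)."""
    base = f"{ds_type}-{dataset}"
    name = f"{base}-{namespace}"

    # Highest to lowest precedence, exactly as listed in the proposal.
    by_precedence = [
        ".fleet_agent_id_verification-1",
        ".fleet_globals-1",
        f"{name}@custom",   # namespace-specific user customizations
        f"{base}@custom",   # user customizations for all namespaces
        f"{base}@package",  # package-defined mappings and settings
    ]

    return {
        "name": name,
        "body": {
            "index_patterns": [name],
            # Higher than the base template's 200, so this template wins for its namespace.
            "priority": 250,
            "data_stream": {},
            # Later entries override earlier ones, so the lowest precedence goes first.
            "composed_of": list(reversed(by_precedence)),
        },
    }


template = build_namespace_index_template("logs", "nginx.access", "prod")
```

Here `logs`, `nginx.access`, and `prod` are example values; the resulting template is named `logs-nginx.access-prod` and matches only that data stream.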

Like the ‘global’ custom template we offer today, we would allow users to directly edit the namespace-specific templates with arbitrary settings and mappings in order to override those supplied by the package. We would also use the template to store customizations that we plan to support directly in the UI (eg. setting the ILM policy).

We will not remove the base index template we install today that matches a wildcard namespace (<type>-<dataset>-*) because Elastic Agent standalone requires this template to be installed.

Changing a namespace for an existing integration policy

If a user edits an existing integration policy to point to a new namespace, we can offer them the option to copy over any customizations from the previous namespace’s <type>-<dataset>-<namespace>@custom template. We would not delete the old templates since this could affect the existing data streams and indices or any standalone agents ingesting data into this namespace.
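
The copy-on-namespace-change behavior described above could look roughly like the following. This is a sketch over an in-memory dict standing in for the component template store; the function name and the choice to skip the copy when the target already exists are assumptions, since the issue leaves that case as an open question.

```python
import copy
from typing import Any


def copy_namespace_customizations(
    templates: dict[str, dict[str, Any]],
    ds_type: str,
    dataset: str,
    old_namespace: str,
    new_namespace: str,
) -> bool:
    """Copy the old namespace's @custom template to the new namespace.

    Returns True if a copy was made. The old template is never deleted,
    because existing data streams or standalone agents may still rely on it.
    """
    old_name = f"{ds_type}-{dataset}-{old_namespace}@custom"
    new_name = f"{ds_type}-{dataset}-{new_namespace}@custom"

    # Nothing to copy, or the target already has its own customizations.
    if old_name not in templates or new_name in templates:
        return False

    # Deep copy so later edits to one namespace do not leak into the other.
    templates[new_name] = copy.deepcopy(templates[old_name])
    return True


store = {
    "logs-nginx.access-staging@custom": {
        "settings": {"index.lifecycle.name": "short-retention"}  # example policy name
    }
}
copied = copy_namespace_customizations(store, "logs", "nginx.access", "staging", "prod")
```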

As a separate enhancement, we could offer a ‘cleanup’ UI either in Fleet or Stack Management that shows index templates that are not currently in use.

Customize API

To facilitate automated usage of this scheme, we should provide a high-level package customization API in Kibana Fleet that allows admins to make customizations without worrying about the low-level details: how the templates are configured, whether a data stream needs to be rolled over, or how to apply setting changes retroactively to backing indices. The main use case is standalone Agent usage. The API may also be used to power in-app features for making customizations (eg. setting the ILM policy).

# Write custom settings and mappings to all namespaces
# Writes to `<type>-<dataset>@custom` templates
PUT /api/fleet/epm/nginx/customize
{
  "settings": { … },
  "mappings": { … }
}

# Add or update a namespace for an integration, creates the namespace-specific templates
# Write custom settings and mappings to namespace
# Writes to `<type>-<dataset>-<namespace>@custom` templates
PUT /api/fleet/epm/nginx/customize/namespace/foo
{
  "settings": { … },
  "mappings": { … }
}

# Removes a namespace, deleting namespace-specific templates
# Does not delete data indices or data streams
DELETE /api/fleet/epm/nginx/customize/namespace/foo

All of the other APIs should also create these namespaces automatically. For example, if an integration policy is added for the nginx package on the bar namespace, the POST /api/fleet/package_policies API should also create the appropriate namespace templates if they don't already exist.
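
To make the shape of the proposed requests concrete, here is a small sketch that assembles the method, path, and JSON body for the two PUT variants. The endpoints are the ones proposed in this issue and are not an existing Kibana API; the helper name, the example ILM policy name, and the example mapping are hypothetical.

```python
import json
from typing import Any, Optional


def customize_request(
    package: str,
    settings: dict[str, Any],
    mappings: dict[str, Any],
    namespace: Optional[str] = None,
) -> tuple[str, str, str]:
    """Return (method, path, body) for the proposed customize API."""
    path = f"/api/fleet/epm/{package}/customize"
    if namespace is not None:
        # Namespace-specific variant: targets <type>-<dataset>-<namespace>@custom.
        path += f"/namespace/{namespace}"
    # json.dumps guarantees a valid JSON body (no trailing commas).
    body = json.dumps({"settings": settings, "mappings": mappings})
    return "PUT", path, body


# All namespaces of the nginx package.
all_ns = customize_request("nginx", {"index.lifecycle.name": "my-policy"}, {})

# Only the "foo" namespace.
foo_ns = customize_request(
    "nginx",
    {"index.lifecycle.name": "my-policy"},
    {"properties": {"team": {"type": "keyword"}}},
    namespace="foo",
)
```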

There are additional use cases for this API outside of index templates; for example, there have been other requests for namespace-specific transforms. We should design this API to accommodate future use cases easily.

Upgrade considerations

For packages that were installed before this scheme was introduced, Fleet should automatically add the appropriate namespace-specific index and component templates in order to facilitate a consistent experience for end-users. See #121099

For upgrades where any @custom components already exist, they should be retained and not removed so that they are still present once the new package version is installed. This means existing templates should also not get overwritten.
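
The preserve-versus-replace rule can be stated compactly: anything whose name ends in `@custom` is kept, everything else is reinstalled from the new package version. The sketch below assumes that the `@custom` suffix is the only marker needed to tell the two groups apart; the function name is hypothetical.

```python
def plan_package_upgrade(installed_templates: list[str]) -> tuple[list[str], list[str]]:
    """Split installed template names into (preserve, replace) for an upgrade."""
    preserve = [name for name in installed_templates if name.endswith("@custom")]
    replace = [name for name in installed_templates if not name.endswith("@custom")]
    return preserve, replace


preserve, replace = plan_package_upgrade(
    [
        "logs-nginx.access",               # base index template
        "logs-nginx.access-prod",          # namespace-specific index template
        "logs-nginx.access@package",
        "logs-nginx.access@custom",
        "logs-nginx.access-prod@custom",
    ]
)
```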

Open questions

  • When, during normal use of the product, should namespace-specific templates be deleted?
    • If we're going to support a generic API that doesn't require integration or agent policies to point to namespaces, then I don't think we can do any automated cleanup; otherwise we could delete configuration that is in use by a standalone agent.
  • There are separate @custom component templates for each data stream in an integration package, while the API design proposed here would apply to the entire integration. This can present problems if a user manually edits a single component template so that the data streams are no longer in sync; for example, the source of truth becomes ambiguous. How would we solve this?
    • Have a single, managed component template that is used for customizations that apply to the entire integration. Leave the @custom templates unmanaged and never edit them. (@joshdover votes for this one)
    • Store the customizations set on this API in a Kibana Saved Object and use this as the source of truth. Manual user edits to @custom templates would then be merged in after settings from this SO. This would allow manual additions and modifications to @custom templates to be preserved; however, deletions would be lost.
  • How should namespace renames work? If a user renames the namespace field on an integration policy or agent policy, should we attempt to copy any customizations on the previous namespace when creating the new namespace? If not or if the new namespace already exists, should we warn the user that settings/mappings are going to change for this data?
  • How do we handle when a new dataset is added for an existing package? Should we keep a copy of any custom settings/mappings in a Saved Object and automatically apply them to all datasets during package upgrades?
  • Should the management APIs allow changes to mappings? If so, when and how would the user expect these to take effect, e.g. would a rollover be automatic?
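
The Saved Object option in the second open question depends on merge semantics, so a small sketch may help show why deletions get lost. It assumes a plain recursive dict merge in which the manually edited @custom content is applied on top of the Saved Object content; the actual merge strategy is not specified in this issue, and the setting values are examples.

```python
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Customizations recorded through the API and stored in the Saved Object.
saved_object = {"index": {"refresh_interval": "30s", "number_of_replicas": 2}}

# The @custom template after manual edits: refresh_interval was changed,
# codec was added, and number_of_replicas was deleted by the user.
manual_custom = {"index": {"refresh_interval": "5s", "codec": "best_compression"}}

effective = deep_merge(saved_object, manual_custom)
# The modification and the addition survive, but number_of_replicas comes back
# from the Saved Object, so the manual deletion is lost.
```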

@joshdover added the Team:Fleet label Dec 13, 2021
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@dominiqueclarke
Contributor

Synthetics Use Case

All monitors added in Monitor Management use data streams to write results back to ES. There is a separate data stream for each monitor type (ICMP, HTTP, or TCP), with browser monitors further split by the data sets we store (network, screenshot, etc.).

In addition, the namespace defined when setting up the monitor (which is default unless changed) is appended to the name of the data stream.

All monitor configuration is stored as Kibana saved objects, outside of the Fleet integration policy ecosystem. More information on UI Monitor Management and the Synthetics Service: https://docs.google.com/presentation/d/18eT7xgyqJ5TG5srOeHUN8bFrY-5MBZIpp3TIEcTpI5M/edit#slide=id.g1055e0afed4_0_548

As a result, Fleet will have no way of knowing when a new non-default namespace monitor is created.

To enable the generation of namespaced component and index templates, Synthetics will require a way to generate the appropriate component and index templates on the fly whenever a non-default namespace monitor is created.

If Fleet could expose the logic for creating these templates on its plugin contract, Synthetics could hook into and reuse the logic from this implementation for UI Monitor Management and the Synthetics Service.

cc: @joshdover @andrewvc

@mostlyjason
Contributor

Thanks @dominiqueclarke. Adding a link to the Uptime issue, which explains that the need isn't just to create namespaces dynamically but also to configure custom ILM policies for each: elastic/uptime#453. This will require us to set up index and component templates for each namespace so that users can set custom ILM policies.

@threatangler-jp

Checking in on the status of this. What version do you expect it to be included in? And is there a workaround available? Our specific need is to apply different ILM policies to different sets of agents. Thank you!

@joshdover
Contributor Author

Hi @threatangler-jp 👋, thanks for the question. We do not have a public ETA on this, however we have published documentation on how to do this manually for the exact use case you have here: https://www.elastic.co/guide/en/fleet/8.4/data-streams-ilm-tutorial.html

@threatangler-jp

Thank you @joshdover. We are on v8.3.3 and this workaround appears to only be available on v8.4. Let me know if I am mistaken on that factor.

We will eventually upgrade but upgrading is a heavy lift and so that will take some time.

@joshdover
Contributor Author

@threatangler-jp Looks like we moved the doc from 8.3 to 8.4. It's supported on 8.3 though, here's the 8.3 doc: https://www.elastic.co/guide/en/fleet/8.3/data-streams.html#data-streams-ilm-tutorial

@threatangler-jp

@joshdover great news! Thank you :)

@felixbarny
Member

In general, I'm excited about this, but I'm a bit worried that this is too tightly coupled to Fleet. Ideally, these customization extension points should also be available for data streams that aren't managed by Fleet. Maybe we can find a way to integrate this more tightly into Elasticsearch itself.

cc @dakrone @jbaiera @eyalkoren

This need is getting more important because of the reroute processor. Users can route data to dynamic data streams that aren't set up via Fleet. For example:

- reroute:
    dataset: "{{service.name}}"

Fleet can't know all values of service.name upfront to create dedicated index templates for each individual service.name with all of the extension points. That's one of the reasons we're making the built-in logs-*-* index template more potent.

One of the ways we could do that is to still rely on the @custom component templates as proposed in the issue and add them to the logs-*-* index template.

We'd need to add placeholders into the component template name, though, for example logs-{{data_stream.dataset}}@custom. Another option would be to store these customizations in another entity, maybe in the data stream itself rather than the index template. Or we could have an option for component templates to inject themselves into index templates and data streams. But that sounds similar to legacy index templates, where it's hard to determine the effective settings when multiple templates are merged together.
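
For illustration, resolving such a placeholder from a data stream name might look like the sketch below. Elasticsearch does not support these placeholders today; the function name and placeholder syntax handling are hypothetical, and the parsing assumes the `<type>-<dataset>-<namespace>` naming scheme in which the dataset itself contains no hyphen.

```python
def resolve_custom_template_name(pattern: str, data_stream: str) -> str:
    """Substitute data_stream.* placeholders in a component template name pattern."""
    # Assumes exactly <type>-<dataset>-<namespace>; raises ValueError otherwise.
    ds_type, dataset, namespace = data_stream.split("-", 2)
    fields = {
        "data_stream.type": ds_type,
        "data_stream.dataset": dataset,
        "data_stream.namespace": namespace,
    }
    for field, value in fields.items():
        pattern = pattern.replace("{{" + field + "}}", value)
    return pattern


# A document rerouted by service.name "checkout" into the default namespace.
name = resolve_custom_template_name(
    "logs-{{data_stream.dataset}}@custom", "logs-checkout-default"
)
```

With this input the pattern resolves to `logs-checkout@custom`, which is the per-dataset extension point a user would create.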

@felixbarny
Member

I've created an Elasticsearch issue for this: elastic/elasticsearch#97664. I'd like to propose closing this issue in favor of the Elasticsearch issue as I think this feature shouldn't be exclusive to Fleet. More on the reasoning about that in the issue.

@joshdover
Contributor Author

+1 on moving this to Elasticsearch, though we will still need to do work in Fleet to add these additional component template names to our index templates, which is largely the same work as before.

@joshdover
Contributor Author

Oh sorry, yes we can close this one but we still need #149484

@joshdover closed this as not planned Jul 14, 2023