Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Alerting] Preconfigured alert history index connector #94909

Merged
merged 62 commits into from
Apr 8, 2021

Conversation

ymao1
Copy link
Contributor

@ymao1 ymao1 commented Mar 18, 2021

Resolves #90412

Summary

This PR introduces the idea of a pre-configured alert history Index connector, where we control the mappings and the basic shape of the data being written.

  • Added a config for xpack.actions.preconfiguredAlertHistoryEsIndex, which defaults to false
  • When this config is true, a preconfigured connector is added on action plugin setup and an index template matching the alert-history-* index pattern is created on action plugin start
  • When creating a new rule, the user can select the preconfigured index and see the shape of the data that will be indexed. They will also be able to change the index that is used (although all indices will need to start with the alert-history- prefix in order to be matched by the index template

Screen Shot 2021-03-24 at 3 14 59 PM

  • Example of document that is indexed:
{
	"@timestamp": "2021-03-18T13:18:05.115Z",
	"tags": ["example"],
	"rule": {
		"id": "5cc41cc0-87ec-11eb-ab83-59658ff7789b",
		"name": "test",
		"params": {
			"example__always-firing": {
				"instances": 1
			}
		},
		"space": "default",
		"type": "example.always-firing"
	},
	"alert": {
		"id": "f2402fcb-e257-4ef0-b7d2-9add6c6ac948",
		"actionGroup": "small",
		"actionGroupName": "Small t-shirt"
	},
	"event": {
		"kind": "alert"
	}
}

where context and params are namespaced with the rule type id to avoid mapping conflicts if user is interested in adding more mappings.

Checklist

Delete any items that are not applicable to this PR.

@pmuellr
Copy link
Member

pmuellr commented Mar 18, 2021

If config is set to true, a preconfigured ES Index connector is registered on plugin setup

This points to the big problem with "preconfigured alerts". They really need to be "owned" by a user, so that they'll be run with the right privileges. In this case, we'd have to write to the index as kibana system user, since there is no other user in the picture at this point, but at the same time the data would need to be read by customers, and there's some problems with that (I think). Generally kibana system user doesn't read/write to customer indices, and users don't read/write to indices owned by the kibana system user (directly).

User will need to have a role that gives access to this alert history index

Ya, and kibana system user will need access as well.

Preconfigured schema looks like this:

Smells like ECS! We should look to see how much of this actually maps to ECS ... this is what I use as my cheat-sheet for ECS fields, but there's more doc in that repo (and asciidoc versions also in the usual docs site): https://github.com/elastic/ecs/blob/master/generated/csv/fields.csv

I considered creating an index for every rule type id instead (alert-history-example__always-firing-* for example) but thought that was moving into the "alerts as data" territory.

index-per-rule or index-per-producer would let us apply some role access to the indices, to keep folks with security but not o11y from seeing o11y alert activity. We also need to consider the Kibana space security access bits, which might end up being another part of the index name (SIEM does this today).

Is it too confusing that a user with a role that doesn't include access to alert-history-* can create a rule with actions that throw errors on indexing? This is already something that can happen if a user uses a connector that someone else has made (and they don't have access to the index). But it feels somehow worse because it's a preconfigured index.

If kibana system user is writing this index, then we won't have that problem :-)

We could look into how Stack Monitoring does this - I believe they require some kind of user gesture - maybe even just visiting a page, to set up their alerts, so they can associate the alerts with a user.

If we proceed with this approach, should we look into auto generating the mappings/schemas like the event log? Or is that overkill for this?

I think we could start with hand-rolled mappings/schemas. The properties in params and state are presumably not mapped, and we'd use the flattened type for these?


As a side note, there has been some recent chatter about us providing Kibana Index Patterns for these kind of indices as well - indices Kibana is generating but expecting customers to use in Discover / Lens. Presumably we'd create these at the same time we create the alerts.

@ymao1
Copy link
Contributor Author

ymao1 commented Mar 18, 2021

@pmuellr Thanks for your feedback!

To me, there is a (small?) difference between this and preconfigured alerts, because here, it's still the responsibility of a user to create an alert that uses this special preconfigured connector. Therefore the action that uses the connector would call ES with the user's role. That's why I chose to not bootstrap the actual index on plugin start, just the index template and ILM. I was thinking of it more similarly to the metricbeat and heartbeat indices, where there are index templates for them if and when the customer/user starts writing to the indices. Maybe that's not a good way to think of it though and we should just consider this a kibana system owned index?

@pmuellr
Copy link
Member

pmuellr commented Mar 18, 2021

Oof, you're right - I was triggered by "preconfigured alert" as part of the phrasing, so we should call this something else :-) . "Alert state indexing action" or something? Names, amirite?

So ya, the index will have been written by the user who created the alert itself.

Mentions of ILM also make me wonder if we should consider data streams here. It handles some/all of the ILM aspects, and there are some other benefits like I believe you can optimally do time-range searches since the datastream knows the date ranges at the index level (don't have to search through indices not in the time-range). There are likely some gotchas as well.

I have a feeling that creating the associated resources with the index (datastream, template, ILM, privileges, etc) is going to get complicated, and wondering if we could get by - first pass anyway - with just doc'ing how to do that. That would actually provide some user-level flexibility - they'd pick if they wanted to use a datastream, or set up ILM, or maybe living the lazy template/mapping-free lifestyle is actually ok for them. And may actually work out fine! :-)

The doc for the new config could point to instructions on setting up the various associated resources.

I super-interested in how useful this data will be to customers. I guess we already showed the value with @gmmorris's prototype?

I just noticed state and params are set as enabled: false in the current code, which I think makes this less useful to customers, since that's presumably the data they want to be accessing. And it's likely numeric, so even flattened won't be useful. Runtime fields might work, but then we'd need to describe the shape of the params and state to the customers, or expect them to figure it out themselves with Discover. Maybe the new runtime fields work in the stack makes this less of a problem?

I'm assuming there is interesting stuff in state for customers to make use of, but ... is there? For example, I don't think we actually keep much interesting stuff in the alert or instance state for the index threshold alert. We could, like writing out the current values we're calculating to test against the threshold. Which is slightly weird, in that we'd have to start storing state we don't need for the alert, but just for this action to make use of.

@gmmorris
Copy link
Contributor

gmmorris commented Mar 22, 2021

Thanks for pushing this draft @ymao1 !

I love the idea of a drag-and-drop Index connector that just indexes the alerts into an index, but I do fear that the moving pieces here could make it hard for users to figure out what's going on.

A couple of questions:

Would this support multiple rule types?
I haven't looked at the code yet, but I would expect the different field available in different alerts would cause a mapping clash (unless all dynamic field are enabled:false which has other downsides).
If this is the case, could this be mitigated by generating an index per rule? 🤔
This would mean the index is created by each rule owner, so the index always requires the creator's permissions... my one fear would be what if another user updates the rule 😬

Is this worth the effort given the Alerts-as-data initiative?
This approach is new to me (though I love it!), so I assume some other conversations have taken place which I've missed, but I feel obliged to ask - if we expect Alert-as-data in 7.x, are we better off delivering a smaller feature (that requires less long term maintenance) more like the original POC?

I'll be the first to say my POC was flawed in many ways, but it gave a simple and unambiguous one way to do one specific thing users have asked for. I could argue it both ways tbh, so wondering what @arisonl thinks about this approach.

@ymao1
Copy link
Contributor Author

ymao1 commented Apr 7, 2021

@gchaps @pmuellr Have moved the docs for this to the Index connector page and added some experimental warnings and a note about ILM

Preview here: https://kibana_94909.docs-preview.app.elstc.co/guide/en/kibana/master/index-action-type.html#preconfigured-connector-alert-history

Put a link to that from the preconfigured connector page as well: https://kibana_94909.docs-preview.app.elstc.co/guide/en/kibana/master/pre-configured-connectors.html#build-in-preconfigured-connectors

@ymao1 ymao1 requested review from pmuellr and gchaps April 7, 2021 18:32
Copy link
Member

@pmuellr pmuellr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@gchaps gchaps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


[float]
[[preconfigured-connector-alert-history]]
=== Alert history preconfigured {es} index connector
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This title is a little hard to read. Is the word "preconfigured" needed?

Alert history {es} index connector

@ymao1 ymao1 dismissed mdefazio’s stale review April 8, 2021 20:04

Review comments have been addressed

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
triggersActionsUi 364 365 +1

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
triggersActionsUi 1.5MB 1.5MB +4.2KB
uptime 948.5KB 948.5KB +19.0B
total +4.2KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
core 376.9KB 377.0KB +156.0B
triggersActionsUi 100.3KB 105.0KB +4.7KB
total +4.9KB

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @ymao1

@ymao1 ymao1 added the auto-backport Deprecated - use backport:version if exact versions are needed label Apr 8, 2021
@ymao1 ymao1 merged commit 71ed148 into elastic:master Apr 8, 2021
kibanamachine added a commit to kibanamachine/kibana that referenced this pull request Apr 8, 2021
* Adding preconfigured alert history index

* Adding functions to build alert history document

* Adding functions to build alert history document

* Moving index template creation to plugin start

* Adding unit tests

* Adding unit tests

* Adding unit tests

* Simplifying

* Revert "Merge branch 'master' of https://github.com/elastic/kibana into alerting/default-es-index-schema"

This reverts commit 957c333, reversing
changes made to 4b1b787.

* Reverting some changes

* Reverting some changes

* Adding index override

* Updating UI with index override

* Only allow indexOverride for preconfigured alert history connector

* Handling preconfigured connector id clashes

* Cleanup

* UI unit tests

* Fixing default schema shown in UI

* Fixing functional tests

* Adding functional test

* Fixing functional tests

* Adding docs and link to docs

* Adding config to docker allowlist

* Fixing wrong typescript operator

* Changing default for config to false

* Cleanup

* Adding note about index privileges to docs

* Fixing i18n

* PR fixes

* PR fixes

* PR fixes

* PR fixes - wording

* PR fixes

* Fixing unit and functional tests

* Fixing types check

* ES -> Elasticsearch

* Moving files

* Adding kibana- to beginning of prefix

* Namespacing alert data within schema with kibana

* Fix i18n

* Updating docs

* Fixing unit tests

* Fixing doc links

* Fixing types check

* PR fixes

Co-authored-by: Kibana Machine <[email protected]>
@kibanamachine
Copy link
Contributor

💚 Backport successful

Status Branch Result
7.x

This backport PR will be merged automatically after passing CI.

ymao1 added a commit that referenced this pull request Apr 9, 2021
…#96680)

* [Alerting] Preconfigured alert history index connector (#94909)

* Adding preconfigured alert history index

* Adding functions to build alert history document

* Adding functions to build alert history document

* Moving index template creation to plugin start

* Adding unit tests

* Adding unit tests

* Adding unit tests

* Simplifying

* Revert "Merge branch 'master' of https://github.com/elastic/kibana into alerting/default-es-index-schema"

This reverts commit 957c333, reversing
changes made to 4b1b787.

* Reverting some changes

* Reverting some changes

* Adding index override

* Updating UI with index override

* Only allow indexOverride for preconfigured alert history connector

* Handling preconfigured connector id clashes

* Cleanup

* UI unit tests

* Fixing default schema shown in UI

* Fixing functional tests

* Adding functional test

* Fixing functional tests

* Adding docs and link to docs

* Adding config to docker allowlist

* Fixing wrong typescript operator

* Changing default for config to false

* Cleanup

* Adding note about index privileges to docs

* Fixing i18n

* PR fixes

* PR fixes

* PR fixes

* PR fixes - wording

* PR fixes

* Fixing unit and functional tests

* Fixing types check

* ES -> Elasticsearch

* Moving files

* Adding kibana- to beginning of prefix

* Namespacing alert data within schema with kibana

* Fix i18n

* Updating docs

* Fixing unit tests

* Fixing doc links

* Fixing types check

* PR fixes

Co-authored-by: Kibana Machine <[email protected]>

* Fixing docs link

Co-authored-by: Kibana Machine <[email protected]>
phillipb added a commit to phillipb/kibana that referenced this pull request Apr 12, 2021
…to-metrics-tab

* 'master' of github.com:elastic/kibana: (44 commits)
  [Exploratory View]Additional metrics for kpi over time (elastic#96532)
  [Fleet] UI changes on hosted policy detail view (elastic#96337)
  Stacked line charts incorrectly shows one term as 100% (elastic#96203)
  [Fleet] Create enrollment API keys as current user (elastic#96464)
  [Lens] Make table and metric show on top Chart switcher (elastic#96601)
  skip flaky suite (elastic#96691)
  [Lens] Hide "Show more errors" once expanded (elastic#96605)
  [Discover] Unskip histogram hiding test (elastic#95759)
  skip flyout test, add linked issue elastic#96708
  skip copy_to_space_flyout_internal.test.tsx elastic#96708
  fix config validation (elastic#96502)
  Document telemetry fields for stack security features (elastic#96638)
  [Partial Results] Move inspector adapter integration into search source (elastic#96241)
  [RAC] Rule registry plugin (elastic#95903)
  [APM] Run precommit tasks sequentially (elastic#96551)
  [Maps] fix Kibana does not recognize a valid geo_shape index when attempting to create a Tracking Containment alert (elastic#96633)
  [Security Solution] [Cases] Small UI bugfixes (elastic#96511)
  [Actions UI] Changed PagerDuty action form UI to fill payload fields according to the API docs for Resolve and Acknowledge events. (elastic#96363)
  App Search: Result Component Updates (elastic#96184)
  [Alerting] Preconfigured alert history index connector (elastic#94909)
  ...
@ymao1 ymao1 deleted the alerting/default-es-index-schema branch May 19, 2021 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Deprecated - use backport:version if exact versions are needed Feature:Actions release_note:enhancement Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) Team:Uptime - DEPRECATED Synthetics & RUM sub-team of Application Observability v7.13.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

It is too cumbersome to index an active alert instance "as-is" into a document in Elasticsearch