[RAC] turn off observability alerts as data writing in a more granular way #119602

mgiota · 2021-11-24T12:47:38Z

Fixes #119217

The xpack.ruleRegistry.write.disabledRegistrationContexts flag was introduced to disable writing to individual observability alerts-as-data indices, instead of only switching all writing on or off at once.

The registration contexts we use are:

observability.logs
observability.metrics
observability.apm
observability.uptime
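
For illustration, the four contexts above can be treated as a closed set that a configured value is checked against. This is only a minimal sketch and not the plugin's actual config schema; `REGISTRATION_CONTEXTS`, `isRegistrationContext` and `parseDisabledContexts` are hypothetical names introduced here.

```typescript
// Hypothetical helper: these names are illustrative and are not taken from
// the rule_registry plugin.
const REGISTRATION_CONTEXTS = [
  'observability.logs',
  'observability.metrics',
  'observability.apm',
  'observability.uptime',
] as const;

type RegistrationContext = (typeof REGISTRATION_CONTEXTS)[number];

// Type guard: true only for one of the four contexts listed above.
function isRegistrationContext(value: string): value is RegistrationContext {
  return REGISTRATION_CONTEXTS.some((context) => context === value);
}

// Splits a configured list into recognised contexts and unrecognised entries,
// so that a typo in kibana.dev.yml can be spotted early.
function parseDisabledContexts(values: string[]): {
  known: RegistrationContext[];
  unknown: string[];
} {
  const known: RegistrationContext[] = [];
  const unknown: string[] = [];
  for (const value of values) {
    if (isRegistrationContext(value)) {
      known.push(value);
    } else {
      unknown.push(value);
    }
  }
  return { known, unknown };
}
```

Calling `parseDisabledContexts(['observability.logs'])` would put the value under `known`; anything outside the list ends up under `unknown`.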

How to test

1. In kibana.dev.yml, add the setting xpack.ruleRegistry.write.disabledRegistrationContexts: ['observability.logs'] (you could also try other values from the list above).
2. Delete the alerts indices. In a local setup, do this by restarting ES and Kibana. In a CCS setup, make sure you stop Kibana and delete the alerts indices, index templates and component templates. Here's an example of how you could reset your cluster:

   http DELETE 'https://YOUR_ENDPOINT/_index_template/.alerts*' &&
   http DELETE 'https://YOUR_ENDPOINT/_component_template/.alerts*' &&
   http DELETE 'https://YOUR_ENDPOINT/.kibana*,.internal*,.tasks*'

3. Restart Kibana.
4. Create a new log threshold rule with sensitive thresholds (if you specified another registration context in kibana.dev.yml, create a rule of that type instead).
5. Verify that writing to the specified registration context(s) is disabled. You could verify this under Kibana > Stack Management > Index Management.
   - Under the Indices tab, enable the Include hidden indices toggle, search for .internal.alerts and verify that nothing appears in the list.
   - Under the Index templates tab, enable View System templates, search for alerts and verify that nothing appears in the list.
   - Verify that the disabled registration contexts don't appear under Component templates.
6. You could also verify that disabling writing works by creating rules for the disabled registration contexts and making sure that no alerts appear in the Alerts table.
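
The manual check in Index Management could also be scripted against a list of resource names. The sketch below is a rough illustration, assuming that alerts-as-data index and template names embed the registration context (for example `.internal.alerts-observability.logs.alerts-default-000001`); that naming pattern and the `findUnexpectedResources` helper are assumptions, not something this PR defines.

```typescript
// Assumption: alerts-as-data resource names embed the registration context,
// e.g. ".internal.alerts-observability.logs.alerts-default-000001".
// `findUnexpectedResources` is a hypothetical helper, not part of Kibana.
function findUnexpectedResources(
  resourceNames: string[],
  disabledContexts: string[]
): string[] {
  // A resource is unexpected when its name refers to a disabled context.
  return resourceNames.filter((name) =>
    disabledContexts.some((context) => name.indexOf(`alerts-${context}.`) !== -1)
  );
}
```

An empty result for the names of all indices, index templates and component templates would match the "nothing appears in the list" expectation from the steps above.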


mgiota · 2021-11-30T07:15:37Z

@elasticmachine merge upstream

fkanout · 2021-11-29T16:06:55Z

x-pack/plugins/rule_registry/server/rule_data_plugin_service/rule_data_plugin_service.ts

@@ -195,7 +200,8 @@ export class RuleDataService implements IRuleDataService {
    return new RuleDataClient({
      indexInfo,
      resourceInstaller: this.resourceInstaller,
-      isWriteEnabled: this.isWriteEnabled(),
+      isWriteEnabled:
+        this.isWriteEnabled() && !this.isRegistrationContextDisabled(registrationContext),


Isn't it confusing to combine isWriteEnabled and disabledRegistrationContexts under the same key in RuleDataClient? It would be hard to determine why the value is true or false.

Besides, from a convention perspective, it seems that the relation here is 1:1 in the RuleDataClient

@fkanout Great question, and you are right. In the beginning I considered adding a separate key, but that would mean changing more places in the code (everywhere isWriteEnabled is called, I would also need to check disabledRegistrationContexts). The way I have it now is more centralized: I do the checks in one place.

@weltenwort is there a benefit of combining isWriteEnabled and disabledRegistrationContexts into one key vs splitting it into separate keys?

Maybe we can try to answer the opposite question: why would the RuleDataClient need to know about the reason for being disabled? In the spirit of keeping the coupling as loose as possible, it would make sense to keep the knowledge about a per-registration-context disablement feature out of the RuleDataClient if it isn't necessary.

So what would be the benefit of passing them separately?

@weltenwort Ok if we want to keep the coupling as loose as possible, it makes sense to keep the knowledge about per-registration-context disabling out of the RuleDataClient. @fkanout do you have any objections?

I think the knowledge that RuleDataClient already has via isWriteEnabled is similar to isRegistrationContextDisabled; the only difference is the granularity of that info (everything vs. selected things). Unless RuleDataClient shouldn't know about isWriteEnabled in the first place. That, I don't know.

My understanding is that once Alert-as-Data is adopted across our products, the general xpack.ruleRegistry.write.enabled flag will be deprecated, while xpack.ruleRegistry.write.disabledRegistrationContexts could stay for a long time. From that point of view, I would say they should be separated.

@fkanout What if we add even more granularity and disable writing per rule type id? Would we want to pass one more key, for example disabledRuleTypeIds? Does the RuleDataClient need to know this level of detail? I don't know what the correct answer is.

For now I would keep it as it is. We could discuss it further if you want and come up with the most appropriate solution.

IMHO, the only thing that RuleDataClient should know is whether isWriteEnabled is true or false. Putting all the complex logic in it makes it hard to read and understand.

I would implement all the logic in this.isWriteEnabled() method at line 118.
Otherwise, what is the reason to create such a method? this.options.isWriteEnabled could be used everywhere.

So, it would be:

public isWriteEnabled(registrationContext: string): boolean { return this.options.isWriteEnabled && !this.options.disabledRegistrationContexts.includes(registrationContext); }

@ersin-erdal Sounds good to me. Let me refactor and push the changes.

I would implement all the logic in this.isWriteEnabled() method at line 118.

It could be more readable, but it is still the same solution. We have a consensus about it, so let's merge it like that 👍🏻
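
The outcome of this thread can be sketched roughly as follows: the service combines the global flag with the per-context list in one method, and the client only ever receives the resulting boolean. This is a simplified stand-in, assuming the option names quoted in the comments above; `RuleDataServiceSketch`, `RuleDataClientSketch` and `createClient` are hypothetical and do not mirror the real classes or their constructors.

```typescript
// Illustrative sketch only: simplified stand-ins for the classes discussed
// in this thread, not the actual rule_registry implementation.
interface RuleDataServiceSketchOptions {
  isWriteEnabled: boolean;
  disabledRegistrationContexts: string[];
}

// The client knows only whether it may write, not why.
class RuleDataClientSketch {
  private readonly writeEnabled: boolean;

  constructor(options: { isWriteEnabled: boolean }) {
    this.writeEnabled = options.isWriteEnabled;
  }

  public isWriteEnabled(): boolean {
    return this.writeEnabled;
  }
}

class RuleDataServiceSketch {
  private readonly options: RuleDataServiceSketchOptions;

  constructor(options: RuleDataServiceSketchOptions) {
    this.options = options;
  }

  // Single place where the global flag and the per-context list are combined.
  public isWriteEnabled(registrationContext: string): boolean {
    const isContextDisabled = this.options.disabledRegistrationContexts.some(
      (context) => context === registrationContext
    );
    return this.options.isWriteEnabled && !isContextDisabled;
  }

  // Hypothetical factory: hands the already-combined boolean to the client.
  public createClient(registrationContext: string): RuleDataClientSketch {
    return new RuleDataClientSketch({
      isWriteEnabled: this.isWriteEnabled(registrationContext),
    });
  }
}
```

With this shape, adding another level of granularity later (such as the per rule type id idea raised above) would only change the service method, while the client's contract stays a single boolean.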

fkanout

LGTM

kibana-ci · 2021-12-01T14:31:35Z

💚 Build Succeeded

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id	before	after	diff
`ruleRegistry`	141	145	+4

Unknown metric groups

API count

id	before	after	diff
`ruleRegistry`	167	171	+4

History

💚 Build #9885 succeeded b406c61
💚 Build #9882 succeeded f04b4b9c6d5f83d614475c26919c9db0ac971661
💚 Build #9864 succeeded 9c3344e
💛 Build #9227 was flaky c4b9e72
💚 Build #9170 succeeded 59c3b6e
💔 Build #9011 failed 1cea679

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @mgiota

…r way (elastic#119602)

* [RAC] turn off writing to disabled alerts indices
* fix error
* fix errors
* do not install component templates for disabled registration contexts
* add resource installer unit tests
* refactoring: disable installing index level resources in the rule data plugin service and not in the resourceInstaller
* refactor based on review comments
* update comment for isWriteEnabled method

Co-authored-by: Kibana Machine <[email protected]>

kibanamachine · 2021-12-01T16:16:22Z

💚 Backport successful

Status	Branch	Result
✅	8.0

This backport PR will be merged automatically after passing CI.

…r way (#119602) (#120126)

* [RAC] turn off writing to disabled alerts indices
* fix error
* fix errors
* do not install component templates for disabled registration contexts
* add resource installer unit tests
* refactoring: disable installing index level resources in the rule data plugin service and not in the resourceInstaller
* refactor based on review comments
* update comment for isWriteEnabled method

Co-authored-by: Kibana Machine <[email protected]>
Co-authored-by: mgiota <[email protected]>

fkanout · 2021-12-01T18:18:00Z

I approved the PR, as it fulfills the ACs. However, I shared a couple of questions/scenarios with @mgiota. These could be edge cases or false positives. However, I will follow up here to share them and to get everyone's feedback/ thoughts.

Scenario:

1. The flag xpack.ruleRegistry.write.disabledRegistrationContexts is OFF (all contexts are allowed).
2. Kibana starts and the Alert-as-Data indices are initialized.
3. Create a rule, e.g. a log threshold rule.
4. Alerts are ingested.
5. Shut down Kibana.
6. Turn the flag xpack.ruleRegistry.write.disabledRegistrationContexts ON with observability.logs.
7. Restart Kibana.
8. Try to create a log threshold rule.

Questions ⁉️:
A. The Alert-as-Data Log indices are still there, and we can create a rule based on their field. Is that Ok?
B. What is going to happen if we carry on and create a rule? Will we have alerts?
C. Will the rule registry still update the alert status, as the check against the flag is done in the initiation phase?
D. What is the behavior of the life-cycle-executor after restarting Kibana? Does it have the latest status of a rule and its alerts?

I tried to make it as plain as I could and be careful with the terminologies. However, please feel free to ask if something is not clear.

jasonrhodes · 2021-12-01T19:18:23Z

Thanks @fkanout -- from my perspective, in your scenario, I would expect no writes at all to happen for any rule with "observability.logs" as its registration context after step 7.

Note: In all cases with this new flag it should work exactly the same as xpack.ruleRegistry.write.enabled: false works. If we determine that we need to change how the disabledRegistrationContexts flag works, we should be sure to change write.enabled to work the same way when all contexts are off, too.

For your questions:

A. The Alert-as-Data Log indices are still there, and we can create a rule based on their field. Is that Ok?

We shouldn't remove existing indices, so that makes sense. And rule creation should absolutely continue -- this scenario is probably most likely to occur for someone who is using the alerting framework to run rules against their data but doesn't want the alert documents to be written/updated for some reason.

B. What is going to happen if we carry on and create a rule? Will we have alerts?

The rule should execute as normal, and schedule actions as normal. It should not create any new alert documents or update any previously created alert documents.

C. Will the rule registry still update the alert status, as the check against the flag is done in the initiation phase?

Good question. My understanding is that this flag is checked on every write, but I may be mistaken. We should confirm, because I don't think we should continue to update alerts if this flag is off. However, as I mentioned above, if this is already not the case for the write.enabled: false scenario, we'd have to explore changing that as well and what impact that might have.

D. What is the behavior of the life-cycle-executor after restarting Kibana? Does it have the latest status of a rule and its alerts?

Not sure I follow this one. The task manager will start up again, find all rules, and start running them again. It will handle passing in the "previous run" state just as it usually does on subsequent executions, as if the restart didn't happen. We do a lookup of the existing alerts to update during the executor phase, so I think we would find any previous alerts and continue to update them, but I'm not sure if we treat the restart as a "resolved" state for any alerts that were active when the system was restarted. I imagine we don't, I'm not sure what we expect here.

fkanout · 2021-12-02T11:19:02Z

Thank you, @jasonrhodes!
For C: confirmed, the flag is checked on every write, i.e. if the flag is ON (the context is disabled), no alerts will be written.
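
A rough illustration of what "checked on every write" means in practice: the writer consults the enabled check each time it is asked to store an alert document, so neither new alerts nor updates to existing ones go through while writing is disabled. `AlertWriterSketch` and `AlertDocumentSketch` are hypothetical names, and passing the check in as a callback is an assumption made to keep the example small; the real code path is not shown in this PR conversation.

```typescript
// Hypothetical sketch: not the rule registry's actual write path.
interface AlertDocumentSketch {
  id: string;
  status: 'active' | 'recovered';
}

class AlertWriterSketch {
  // Stand-in for the alerts index: documents that were actually stored.
  public readonly written: AlertDocumentSketch[] = [];
  private readonly isWriteEnabled: () => boolean;

  constructor(isWriteEnabled: () => boolean) {
    this.isWriteEnabled = isWriteEnabled;
  }

  // Returns true when the document was stored and false when it was skipped.
  public write(doc: AlertDocumentSketch): boolean {
    // The check happens on each call, not once at construction time.
    if (!this.isWriteEnabled()) {
      return false;
    }
    this.written.push(doc);
    return true;
  }
}
```

In the scenario above, an alert written before the restart stays in the index, while the status update attempted after the context is disabled is skipped.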


mgiota force-pushed the 119217_turn_off_alerts_granular branch from 16fe39b to 8df538d Compare November 24, 2021 13:08

[RAC] turn off writing to disabled alerts indices

e0f87f8

mgiota force-pushed the 119217_turn_off_alerts_granular branch from 8df538d to e0f87f8 Compare November 24, 2021 13:12

mgiota added 2 commits November 24, 2021 14:29

fix error

1cea679

fix errors

59c3b6e

mgiota marked this pull request as ready for review November 25, 2021 07:32

mgiota self-assigned this Nov 25, 2021

mgiota marked this pull request as draft November 25, 2021 08:21

mgiota added 2 commits November 25, 2021 09:59

do not install component templates for disabled registration contexts

c4b9e72

add resource installer unit tests

9c3344e

mgiota force-pushed the 119217_turn_off_alerts_granular branch from 5a4db77 to 9c3344e Compare November 29, 2021 23:37

refactoring: disable installing index level resources in the rule dat…

5aa1d17

…a plugin service and not in the resourceInstaller

mgiota force-pushed the 119217_turn_off_alerts_granular branch from f04b4b9 to 5aa1d17 Compare November 30, 2021 07:10

mgiota marked this pull request as ready for review November 30, 2021 07:12

Merge branch 'main' into 119217_turn_off_alerts_granular

b406c61

fkanout reviewed Nov 30, 2021


fkanout approved these changes Dec 1, 2021


mgiota added 2 commits December 1, 2021 14:01

refactor based on review comments

0e72f09

update comment for isWriteEnabled method

8a77d8f

mgiota merged commit 5deb23b into elastic:main Dec 1, 2021

kibanamachine mentioned this pull request Dec 1, 2021

[8.0] [RAC] turn off observability alerts as data writing in a more granular way (#119602) #120126

Merged

mgiota mentioned this pull request Dec 3, 2021

[Observability RAC] document xpack.ruleRegistry.write.disabledRegistrationContexts flag elastic/observability-docs#1310

Closed

mgiota deleted the 119217_turn_off_alerts_granular branch January 4, 2022 10:29
