More alerting services telemetry #60315
Pinging @elastic/kibana-alerting-services (Team:Alerting Services)
Could also have some telemetry about how many users upgraded or started a trial because of alerting Gold+ features.
Yes, this would be a great metric to track. +1 ^^
@arisonl can you provide a filtered list of the above that we would like to start tracking?
A couple of thoughts on what we will want to measure from a product perspective (non-exhaustive and not final): Adoption:
Retention:
Usage:
Higher level questions:
There are a number of lower level items in the description of this issue and I would like to understand them better. If their purpose is to optimise the lower level facilities of the framework, I think that prioritisation is up to engineering.
Hey all, quick bump on this thread. I know the team is heads down on GA but do we have an idea of when telemetry on connector type might land?
@alexfrancoeur what depth of telemetry are you thinking about for connector types? I took a quick look and we should already be tracking the following today:
Hey Mike, thanks for clarifying. I think this level of granularity will work for now. I see the fields in our doc now, but they aren't mapped at the moment. I'll create an issue to do so. Appreciate the quick follow up!
@alexfrancoeur ah that may be why, the telemetry team may be having issues mapping these because they contain dynamic keys. If they don't have a workaround for this and need changes on the alerting side, let me know! cc @Bamieh
Hey @mikecote -- Just wanted to voice the need for more telemetry related to "trial license" and connectors. (As you know ;)) We'll be enabling the first location-based alert in 7.11* and it will require a Gold+ license. I am interested in knowing when customers upgrade their license to use the Tracking Alert. I am also interested in knowing which connectors (by count) are being used per Tracking Alert. *Tracking Alert is included in 7.10 as experimental and requires turning on a feature flag to enable.
Thanks @kmartastic, I've added an item to the issue description to capture your scenario. We will be reviewing these requirements next week 🙏
We do not encourage dynamically mapped fields since they are not supported on our cluster. Dynamic fields come with a lot of drawbacks. Please reach out to the telemetry team to discuss possible solutions before working on the requirements, to avoid sending more dynamic fields.
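To illustrate the dynamic-keys concern, here is a minimal sketch of one common workaround: flattening a map keyed by connector type into an array of fixed-shape entries, so that every field name is known ahead of time. The interfaces and the `toFixedShape` helper are hypothetical and are not taken from the Kibana codebase; this is not necessarily the solution the telemetry team would recommend.

```typescript
// Hypothetical shape: an object whose keys are connector type ids.
// Keys like ".slack" or ".email" are "dynamic" because they are not known in advance.
interface CountByType {
  [connectorTypeId: string]: number;
}

// Hypothetical fixed shape: only "type" and "count" ever need a mapping.
interface TypeCountEntry {
  type: string;
  count: number;
}

// Convert the dynamic-key object into an array of fixed-shape entries,
// sorted by type id so the output is deterministic.
function toFixedShape(countByType: CountByType): TypeCountEntry[] {
  return Object.keys(countByType)
    .map((type) => ({ type, count: countByType[type] }))
    .sort((a, b) => (a.type < b.type ? -1 : a.type > b.type ? 1 : 0));
}

// Usage: the values here are made up for illustration.
const flattened = toFixedShape({ '.slack': 3, '.email': 5 });
console.log(JSON.stringify(flattened));
```

The trade-off is that consumers must filter the array by `type` instead of reading a named field directly.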
I've updated the description to add some clarity for whoever picks this issue up. :)
Updated the description to include an item for keeping track of rules that exceed their timeout.
@YulNaumenko, thanks for driving this issue through 7.16 / 8.0 ❤️! After today's planning session and given we're past 8.0, we've done as much as we can regarding additional telemetry points, and we can now move this issue back into the backlog for now.
Closing as 8.0 has sufficient telemetry.
What is this issue about?
We have identified that there are some gaps in our telemetry and we would like to close those gaps.
The proposed list below is partial and most likely missing some important metrics, so please use them as inspiration rather than clear requirements.
To do:
Prioritisation:
There is probably a lot of telemetry we could add, so to keep this issue focused, here are the areas to address in order of priority:
Proposed missing telemetry
Below is a list of missing telemetry that we have identified.
We might not want to add them all, and there might be critical telemetry missing in this list - please use as inspiration for detailed research, rather than hard requirements.
For potential guardrails
Stability
What is the count by rule type of execution failures with an error reason of read? decrypt? unknown? license? execute?
7.x changes
status:failed? How many per cluster?
General usage of alerting / connectors / task manager
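For the stability question above about execution failures counted by rule type and error reason, a minimal in-memory sketch might look like the following. The `FailedExecution` record shape, the field names, and the `countFailures` helper are hypothetical; a real collector would more likely run an aggregation against stored data. The output uses fixed-shape entries rather than keys named after rule types, in line with the telemetry team's request to avoid dynamic fields.

```typescript
// Error reasons named in this issue; the union type itself is an illustrative assumption.
type ErrorReason = 'read' | 'decrypt' | 'execute' | 'unknown' | 'license';

// Hypothetical record for one failed rule execution.
interface FailedExecution {
  ruleTypeId: string;
  reason: ErrorReason;
}

// Fixed-shape output entry: no field name depends on the data.
interface FailureCount {
  ruleTypeId: string;
  reason: ErrorReason;
  count: number;
}

// Count failures per (rule type, error reason) pair.
// Entries are returned in the order each pair is first seen.
function countFailures(failures: FailedExecution[]): FailureCount[] {
  const counts = new Map<string, FailureCount>();
  for (const failure of failures) {
    const key = failure.ruleTypeId + '|' + failure.reason;
    const entry = counts.get(key);
    if (entry) {
      entry.count += 1;
    } else {
      counts.set(key, { ruleTypeId: failure.ruleTypeId, reason: failure.reason, count: 1 });
    }
  }
  return Array.from(counts.values());
}

// Usage: rule type ids below are made up for illustration.
const summary = countFailures([
  { ruleTypeId: 'example.rule-a', reason: 'execute' },
  { ruleTypeId: 'example.rule-a', reason: 'execute' },
  { ruleTypeId: 'example.rule-b', reason: 'decrypt' },
]);
console.log(JSON.stringify(summary));
```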