Update kubernetes apiserver metrics and dashboard #31973
Looks good! It's good to see such cleanups in the codebase!
CHANGELOG.next.asciidoc
@@ -61,6 +61,7 @@ https://github.com/elastic/beats/compare/v8.2.0\...main[Check the HEAD diff]
- make `system/filesystem` code sensitive to `hostfs` and migrate libraries to `elastic-agent-opts` {pull}31001[31001]
- Fix kubernetes module's internal cache expiration issue. This avoid metrics like `kubernetes.container.cpu.usage.limit.pct` from not being populated. {pull}31785[31785]
- add missing HealthyHostCount and UnHealthyHostCount for application ELB. {pull}31853[31853]
- update kubernetes apiserver metricset to not collect deprecated metrics and fix dashboard
PR number?
for _, event := range events {
	if ok, _ := event.HasKey("request.count"); ok {
That's a really nice cleanup! 🍻
That was ugly
@@ -42645,16 +42645,6 @@ Kubernetes API server metrics
*`kubernetes.apiserver.request.client`*::
Add a note in the PR's description that it's not a breaking change but a bug fix, to avoid future confusion.
* Update kubernetes apiserver metrics and dashboard
What does this PR do?
Update kubernetes apiserver metrics and dashboard
Why is it important?
As described in #31834, the apiserver metricset collects deprecated metrics from the apiserver Prometheus endpoint. These are:
http_request_duration_microseconds
http_request_size_bytes
http_response_size_bytes
http_requests_total
apiserver_request_latencies
etcd_object_counts
client
Some of those values can be taken instead from different prometheus fields:
http_request_duration_microseconds ---> None
http_response_size_bytes ---> None
http_requests_total ---> None
apiserver_request_latencies ---> apiserver_request_duration_seconds
http_request_size_bytes ---> None
etcd_object_counts ---> apiserver_storage_objects
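The mapping above could be written down as a small lookup table, for example when auditing queries that still use the old metric names. This is only an illustrative sketch: the `replacements` map and the `replacementFor` helper are hypothetical names and are not part of the metricset code in this PR.

```go
package main

import "fmt"

// replacements mirrors the mapping listed in this PR description.
// An empty string means the deprecated metric has no replacement ("None").
// Hypothetical table for illustration; not taken from the Beats codebase.
var replacements = map[string]string{
	"http_request_duration_microseconds": "",
	"http_response_size_bytes":           "",
	"http_requests_total":                "",
	"apiserver_request_latencies":        "apiserver_request_duration_seconds",
	"http_request_size_bytes":            "",
	"etcd_object_counts":                 "apiserver_storage_objects",
}

// replacementFor returns the replacement metric name and true when the
// deprecated metric has one; it returns "" and false otherwise.
func replacementFor(metric string) (string, bool) {
	r, ok := replacements[metric]
	if !ok || r == "" {
		return "", false
	}
	return r, true
}

func main() {
	for _, m := range []string{"etcd_object_counts", "http_requests_total"} {
		if r, ok := replacementFor(m); ok {
			fmt.Printf("%s -> %s\n", m, r)
		} else {
			fmt.Printf("%s -> no replacement\n", m)
		}
	}
}
```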
Also, apiserver_watch_events_sizes and apiserver_response_sizes are interesting metrics we were not collecting.
As part of this PR the following Elasticsearch fields have been dropped (they were null in recent Kubernetes versions, after 1.20):
kubernetes.apiserver.http.*
kubernetes.apiserver.request.latency.*
kubernetes.request.client
and new fields have been added:
kubernetes.apiserver.watch.events.kind
kubernetes.apiserver.watch.events.size.bytes.*
kubernetes.apiserver.response.size.bytes.*
Also, the OOTB dashboards were broken because they were using deprecated fields. They have been updated.
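Anyone maintaining custom dashboards may want to check whether their visualizations still reference a dropped field. A minimal sketch of such a check follows, assuming the dropped field patterns exactly as they are written in this description; `droppedPatterns` and `isDropped` are hypothetical names and do not exist in Beats.

```go
package main

import (
	"fmt"
	"strings"
)

// droppedPatterns lists the fields removed by this PR, as written in the
// PR description. A trailing ".*" stands for every field under that prefix.
var droppedPatterns = []string{
	"kubernetes.apiserver.http.*",
	"kubernetes.apiserver.request.latency.*",
	"kubernetes.request.client",
}

// isDropped reports whether field matches one of the dropped patterns.
func isDropped(field string) bool {
	for _, p := range droppedPatterns {
		if strings.HasSuffix(p, ".*") {
			// Keep the trailing dot so that only sub-fields match.
			if strings.HasPrefix(field, strings.TrimSuffix(p, "*")) {
				return true
			}
		} else if field == p {
			return true
		}
	}
	return false
}

func main() {
	for _, f := range []string{
		"kubernetes.apiserver.http.request.count",
		"kubernetes.apiserver.watch.events.kind",
	} {
		fmt.Printf("%s dropped=%v\n", f, isDropped(f))
	}
}
```

Prefix matching is enough here because every wildcard in the list sits at the end of the pattern.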
Checklist
My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding change to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Closes #31834 (Kubernetes Apiserver metricset collects deprecated metrics)
How to test this PR locally
Note
Bullet 3 of #31834 (comment), storing Prometheus histogram types as histograms in Elasticsearch, will be part of a follow-up PR, as the implementation needs further investigation. It requires a more generic solution and is not specific to apiserver.
Also, removing the following Elasticsearch fields as part of this PR should not be considered a breaking change but a bug fix, as those fields were no longer populated in ES.
kubernetes.apiserver.http.*
kubernetes.apiserver.request.latency.*
kubernetes.request.client