Reduce memory usage of Metricbeat prometheus collector #17004
Pinging @elastic/integrations-platforms (Team:Platforms)
Thanks for starting this. The families iterator and grouping before constructing the events should save a fair amount of memory, right? That would also allow us to forget about already-sent groups.
Yes, I think this would be the best option. Memory usage will still increase with the number of metrics, but at a much lower rate.
One thing to consider is moving to the same logic that Prometheus uses. They don't use
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi! We're labeling this issue as |
:+1:
Hi! We're labeling this issue as |
I am creating this issue as a brain dump to keep track of a problem that has recently come up in some conversations.
The Prometheus collector in Metricbeat can require a lot of memory to process some big Prometheus responses. In general this is not a problem, but it can become one in some cases, for example when collecting metrics from the federate API (which could possibly be worked around by #14983), or when collecting metrics from big Kubernetes clusters or other services with lots of resources.
From my observations, Metricbeat can take up to 20 times the memory of a Prometheus response to process it.
Prometheus response processing does the following:
At the moment of reporting, all the objects created to process a response are still in memory, and there can be a lot of them.
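One way to reduce how much is held until reporting time is to emit each metric as soon as it is parsed instead of accumulating every parsed object first. The sketch below illustrates the idea with only the Go standard library; `Metric` and `streamMetrics` are hypothetical names, and real parsing (labels, escaping, metric types) is done by the Prometheus client libraries, not by this simplified line splitter:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Metric is a minimal, hypothetical representation of one parsed sample.
type Metric struct {
	Name  string
	Value string
}

// streamMetrics reads a Prometheus text-format response line by line and
// hands each metric to emit as soon as it is parsed, so parsed objects do
// not all accumulate in memory until reporting time. It returns the number
// of metrics emitted. This is a sketch: labels, escaping and types are ignored.
func streamMetrics(response string, emit func(Metric)) int {
	scanner := bufio.NewScanner(strings.NewReader(response))
	count := 0
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		// Skip blank lines and "# HELP" / "# TYPE" comment lines.
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		emit(Metric{Name: fields[0], Value: fields[1]})
		count++
	}
	return count
}

func main() {
	response := "# HELP up Whether the target is up.\n" +
		"# TYPE up gauge\n" +
		"up 1\n" +
		"process_cpu_seconds_total 12.5\n"
	n := streamMetrics(response, func(m Metric) {
		// In the collector this would build an event immediately.
		fmt.Printf("event for %s = %s\n", m.Name, m.Value)
	})
	fmt.Println("metrics parsed:", n)
}
```

Because each sample is passed to the callback and then dropped, peak memory is bounded by one line plus whatever the callback retains, rather than by the whole response.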
Memory usage of point 1 could be reduced with stream-parsing: we could assign metrics to events as soon as we parse them, so we don't need to keep the intermediary objects in memory. I did a quick test of this and memory usage was reduced by about 20%.

The bulk of the memory usage is in the grouping of metrics per label (point 2). This is not so easy to solve, because we don't know if we can have metrics with the same labels in different parts of the file. Some possible approaches for this could be:
- Only group the families, but don't generate the events (family objects may take less memory than their equivalent maps); after grouping the families, generate and send each one of the events.
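A rough sketch of this last approach, assuming hypothetical `Sample`, `labelKey` and `sendGrouped` helpers (the real collector works on family objects from the Prometheus client libraries): group first, then generate and send each event one at a time, releasing every group as soon as it is sent instead of keeping all events alive until the end:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Sample is a hypothetical parsed metric together with its labels.
type Sample struct {
	Name   string
	Labels map[string]string
	Value  float64
}

// labelKey builds a deterministic key from a label set, so samples that
// share the same labels fall into the same group even if they appear in
// different parts of the response.
func labelKey(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+labels[k])
	}
	return strings.Join(parts, ",")
}

// sendGrouped groups samples per label set, then generates and sends the
// events one at a time, deleting each group right after it is sent so it
// can be garbage-collected. It returns the number of events sent.
func sendGrouped(samples []Sample, send func(key string, event map[string]float64)) int {
	groups := make(map[string]map[string]float64)
	for _, s := range samples {
		k := labelKey(s.Labels)
		if groups[k] == nil {
			groups[k] = make(map[string]float64)
		}
		groups[k][s.Name] = s.Value
	}
	sent := 0
	for k, event := range groups {
		send(k, event)
		delete(groups, k) // release the group as soon as it is sent
		sent++
	}
	return sent
}

func main() {
	samples := []Sample{
		{Name: "cpu", Labels: map[string]string{"pod": "a"}, Value: 0.5},
		{Name: "mem", Labels: map[string]string{"pod": "a"}, Value: 100},
		{Name: "cpu", Labels: map[string]string{"pod": "b"}, Value: 0.2},
	}
	n := sendGrouped(samples, func(key string, ev map[string]float64) {
		fmt.Println(key, "->", len(ev), "metrics")
	})
	fmt.Println("events sent:", n)
}
```

Only one fully built event exists at a time during the send loop, which is the memory saving the comment above is after; the grouping map itself still grows with the number of distinct label sets.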