[RFC] Performance metrics framework #6533
Is your feature request related to a problem? Please describe.
Yes. Tasks such as Source Peer Recovery, Target Peer Recovery, Snapshots, Merges, Shard Initialization and Ultrawarm migrations can consume significant system resources and impact search and indexing latency. The impact is hard to quantify with the metrics currently measured around them. The problem is not just a lack of metrics, but also the limitations of the existing measurement frameworks (Resource Tracking Framework, Stats API and Performance Analyzer) in measuring consumption at the desired granularity.
Describe the solution you'd like
The proposal uses concepts from OpenTelemetry such as Trace, Span, Context, Context Propagation, Event, Instrument and Meter.
This example demonstrates creating spans, emitting events and metering resource usage with OpenTelemetry for Source Peer Recovery in OpenSearch.
The intended framework is explained using the code reference from the OpenTelemetry Java manual -
Create Span, Set Attributes
Add Events
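To make the span and event concepts concrete, here is a minimal, dependency-free sketch. It is not the OpenTelemetry API and not the code from the manual: `SketchSpan`, `SpanSketchDemo`, the span names and the attribute keys (`index`, `shard`) are all hypothetical names chosen for illustration. It only shows the shape of the idea: a parent span for a recovery, a child span for one phase, attributes on the span, and events recorded inside it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a tracing span; NOT the OpenTelemetry API. */
final class SketchSpan implements AutoCloseable {
    final String name;
    final SketchSpan parent;                       // null for the root span
    final List<SketchSpan> children = new ArrayList<>();
    final Map<String, String> attributes = new LinkedHashMap<>();
    final List<String> events = new ArrayList<>(); // event names, in emission order
    final long startNanos = System.nanoTime();
    long endNanos = -1L;                           // stays -1 until close() is called

    SketchSpan(String name, SketchSpan parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);             // link the child into the trace tree
        }
    }

    SketchSpan setAttribute(String key, String value) {
        attributes.put(key, value);
        return this;
    }

    void addEvent(String eventName) {
        events.add(eventName);
    }

    @Override
    public void close() {
        endNanos = System.nanoTime();
    }
}

final class SpanSketchDemo {
    /** Builds a two-level trace for one (assumed) source peer recovery. */
    static SketchSpan recoverShard(String index, int shardId) {
        SketchSpan recovery = new SketchSpan("source-peer-recovery", null)
            .setAttribute("index", index)
            .setAttribute("shard", Integer.toString(shardId));
        try {
            recovery.addEvent("start");
            // The child span is closed automatically when the block exits.
            try (SketchSpan phase = new SketchSpan("phase1", recovery)) {
                phase.addEvent("files-sent");
            }
            recovery.addEvent("end");
        } finally {
            recovery.close();
        }
        return recovery;
    }
}
```

In a real integration the span would come from an OpenTelemetry `Tracer` and the context would be propagated across threads and nodes; the sketch leaves both out.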
Metering Usage
Just as the Resource Tracking Framework records CPU and memory allocation at the Task level, it can also serve as one of the Instruments to meter CPU usage, memory usage and thread contention at the thread level. The value of the meter can then be observed at the Start and End events of any operation.
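As a rough illustration of observing a meter at the Start and End of an operation, the sketch below reads per-thread counters from the JVM's `ThreadMXBean`. The class `ThreadUsageMeter` and its `Snapshot` type are hypothetical names, not part of the Resource Tracking Framework, and the blocked count is used here only as one possible proxy for thread contention. Counters the JVM does not support are reported as -1.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical per-thread meter that reads JVM counters for the calling thread. */
final class ThreadUsageMeter {

    /** One reading of the counters; a field is -1 when the JVM cannot supply it. */
    static final class Snapshot {
        final long cpuNanos;
        final long allocatedBytes;
        final long blockedCount;

        Snapshot(long cpuNanos, long allocatedBytes, long blockedCount) {
            this.cpuNanos = cpuNanos;
            this.allocatedBytes = allocatedBytes;
            this.blockedCount = blockedCount;
        }
    }

    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    /** Reads the current counters for the calling thread. */
    static Snapshot snapshot() {
        long id = Thread.currentThread().getId();
        long cpu = THREADS.isCurrentThreadCpuTimeSupported()
            ? THREADS.getCurrentThreadCpuTime()
            : -1L;
        long allocated = -1L;
        // Allocation counters live in a HotSpot-specific extension interface.
        if (THREADS instanceof com.sun.management.ThreadMXBean) {
            com.sun.management.ThreadMXBean ext = (com.sun.management.ThreadMXBean) THREADS;
            if (ext.isThreadAllocatedMemorySupported() && ext.isThreadAllocatedMemoryEnabled()) {
                allocated = ext.getThreadAllocatedBytes(id);
            }
        }
        ThreadInfo info = THREADS.getThreadInfo(id);
        long blocked = (info == null) ? -1L : info.getBlockedCount();
        return new Snapshot(cpu, allocated, blocked);
    }

    /** Usage between a Start and an End reading; -1 where either side is missing. */
    static Snapshot delta(Snapshot start, Snapshot end) {
        return new Snapshot(
            diff(start.cpuNanos, end.cpuNanos),
            diff(start.allocatedBytes, end.allocatedBytes),
            diff(start.blockedCount, end.blockedCount));
    }

    private static long diff(long start, long end) {
        return (start < 0 || end < 0) ? -1L : end - start;
    }
}
```

A caller would take one `snapshot()` when the Start event is emitted, another at the End event on the same thread, and attach `delta(start, end)` to the End event. Work that hops between threads would need a reading per thread, which this sketch does not attempt.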
A sample log entry would look like -
Decorating events with system level metrics using Performance Analyzer
Today, the Metric Processor of the Performance Analyzer (PA) agent, which runs as a separate process, parses the start and end events generated by the PA plugin in system memory and decorates the event metrics with system resource consumption. For example, the shard search/bulk requests are decorated later with OS/Node statistics here.
Similarly, the events generated above can be integrated with PA events and decorated with system resource metrics, which helps in understanding resource bottlenecks at the finer granularity of node->shard->operation->thread. Integration would require the following enhancements -
Once integrated, we will see these metrics associated with each of these events. Metric reference - https://opensearch.org/docs/latest/monitoring-your-cluster/pa/reference/
Decorating events with system level metrics may not be useful for everyone, and it requires the PA agent to be installed and enabled on the node.
Both event generation and event decoration would be opt-in features and disabled by default.
System Overhead
Event generation shouldn't add any overhead beyond what we already incur while recording the resource usage of tasks with the Resource Tracking Framework. This can be confirmed by benchmarking the prototype.
Event decoration with system level metrics should be done with care, and not at the level of individual http/shard requests. The number of background tasks such as Peer Recovery and snapshots is proportional to the rate of growth of data volume, the indexing policy or cluster events, and does not grow with the number of requests. In theory, therefore, the system overhead should be minimal and acceptable given the number of events these tasks generate. Also, event decoration is done outside the OpenSearch process, in the PA agent, and is optional, so regular OpenSearch users who don't use this feature will not see any impact.
If we see utility in this, I can start working on the prototype and benchmark the overhead it introduces. I'm thinking of doing it in 2 phases, since event generation and decoration are independent, and I will prioritize the former.
This should take care of use cases like - #4401
Describe alternatives you've considered
Resource Tracking Framework, Stats API, Performance Analyzer.
Utilities
Examples
Appendix
Source Peer Recovery Spans
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/PeerRecoverySourceService.java (Line 159 in 5989d01)
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/LocalStorePeerRecoverySourceHandler.java (Line 60 in 5989d01)
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/LocalStorePeerRecoverySourceHandler.java (Line 99 in 5989d01)
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/RecoverySourceHandler.java (Line 349 in 5989d01)
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/RecoverySourceHandler.java (Line 380 in 5989d01)
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/RecoverySourceHandler.java (Line 451 in 5989d01)
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/RecoverySourceHandler.java (Line 516 in 5989d01)
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/RecoverySourceHandler.java (Line 527 in 5989d01)
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/RecoverySourceHandler.java (Line 889 in 5989d01)
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/RecoverySourceHandler.java (Line 600 in 5989d01)
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/RecoverySourceHandler.java (Line 630 in 5989d01)
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/RecoverySourceHandler.java (Line 788 in 5989d01)
OpenSearch/server/src/main/java/org/opensearch/indices/recovery/RecoverySourceHandler.java (Line 824 in 5989d01)