Create token metrics only when they are available #1092

Open · wants to merge 2 commits into main

Conversation

@eero-t (Contributor) commented Dec 30, 2024

Description

This avoids generating useless token / request histogram metrics for services that use the Orchestrator class but never call its token processing functionality. Such dummy metrics can confuse telemetry users.

(It also helps in differentiating frontend megaservice metrics from backend megaservice ones, especially when multiple OPEA applications with wrapper microservices run in the same cluster.)

Issues

n/a.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Dependencies

n/a.

Tests

Manual testing with latest versions to verify that:

  • services processing tokens generate token histogram metrics
  • services not processing them produce only the pending requests gauge

codecov bot commented Dec 30, 2024

Codecov Report

Attention: Patch coverage is 96.42857% with 1 line in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
comps/cores/mega/orchestrator.py | 96.42% | 1 Missing ⚠️

Files with missing lines | Coverage Δ
comps/cores/mega/orchestrator.py | 91.36% <96.42%> (+0.40%) ⬆️

@eero-t (Contributor Author) commented Jan 3, 2025

@Spycsh Could you review this?

(And maybe also #1107.)

@eero-t (Contributor Author) commented Jan 3, 2025

Rebased to main.

@Spycsh (Member) commented Jan 6, 2025

> This avoids generating useless token / request histogram metrics for services that use the Orchestrator class but never call its token processing functionality. Such dummy metrics can confuse telemetry users.

Why will this happen? Metrics are only updated when calling self.metrics.pending_update-like methods in schedule, right? Those are all controllable code paths. So do you mean that there is something else that can update the metrics?

@eero-t (Contributor Author) commented Jan 7, 2025

> Why will this happen?

@Spycsh Because the Prometheus client starts providing metrics as soon as they have been created.

In the current code, all metrics are created when Orchestrator / OrchestratorMetrics is instantiated: https://github.com/opea-project/GenAIComps/blob/main/comps/cores/mega/orchestrator.py#L33

> Metrics are only updated when calling self.metrics.pending_update-like methods in schedule, right?

Those methods only update the metric values; they do not create the metrics. This PR delays Histogram metric creation until the first call of the update methods.
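For illustration, a minimal sketch of that delayed-creation idea using prometheus_client directly; the metric and method names here (token_update, first_token_latency, the "megaservice" prefix) are illustrative, not necessarily the exact ones used in orchestrator.py:

```python
import time

from prometheus_client import Histogram


class OrchestratorMetrics:
    # Sketch only: on the first token_update() call, create the Histogram
    # and swap the method, so later calls take the plain update path and
    # services that never process tokens never expose token metrics at all.
    def __init__(self, prefix: str = "megaservice") -> None:
        self._prefix = prefix
        self.token_update = self._token_create_and_update

    def _token_create_and_update(self, token_start: float) -> None:
        # First call: create the metric, then swap the method.
        self.first_token_latency = Histogram(
            f"{self._prefix}_first_token_latency",
            "First token latency (seconds)",
        )
        self.token_update = self._token_update
        self._token_update(token_start)

    def _token_update(self, token_start: float) -> None:
        self.first_token_latency.observe(time.time() - token_start)
```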

@eero-t (Contributor Author) commented Jan 7, 2025

I dropped the pending metric doc update & rebased to main. I'll do it in a separate PR where I fix additional issues I noticed, which require a pending requests metric type / name change.

@eero-t (Contributor Author) commented Jan 7, 2025

opea/dataprep-redis:latest does not seem to generate megaservice_* metrics anymore.

Has ServiceOrchestrator use been dropped from backend services?

@eero-t (Contributor Author) commented Jan 7, 2025

> I dropped the pending metric doc update & rebased to main. I'll do it in a separate PR where I fix additional issues I noticed, which require a pending requests metric type / name change.

Could not find any good fix for it, so I just filed a ticket on it: #1121

@Spycsh (Member) commented Jan 8, 2025

> This avoids generating useless token / request histogram metrics for services that use the Orchestrator class but never call its token processing functionality. Such dummy metrics can confuse telemetry users.
>
> (It also helps in differentiating frontend megaservice metrics from backend megaservice ones, especially when multiple OPEA applications with wrapper microservices run in the same cluster.)

OK, so what you mean is that the dummy metrics will show zeros after initialization and before the first request, and users should not see wrong request count values... But you think that k8s will scrape the metrics even when there are no requests, which is resource-consuming, so you decided to delay the initialization until there are requests. I agree with this approach.

@Spycsh (Member) commented Jan 8, 2025

> opea/dataprep-redis:latest does not seem to generate megaservice_* metrics anymore.
>
> Has ServiceOrchestrator use been dropped from backend services?

The dataprep microservice itself should not generate megaservice_* metrics. Only megaservices like opea/chatqna do.

@eero-t (Contributor Author) commented Jan 8, 2025

> OK, so what you mean is that the dummy metrics will show zeros after initialization and before the first request, and users should not see wrong request count values...

Technically the zero counts are not wrong, but the presence of token / LLM metrics is misleading for services that will never generate tokens (or use an LLM). That's the main reason for this PR.

> But you think that k8s will scrape the metrics even when there are no requests, which is resource-consuming, so you decided to delay the initialization until there are requests. I agree with this approach.

Visibility

All OPEA-originated services use HttpService, i.e. they provide HTTP access metrics [1]. To see those, serviceMonitors are installed for them when the monitoring option is enabled in the Helm charts. This means that any megaservice_* metrics they generate will also be visible to the user, e.g. in Grafana.

Perf

I doubt that skipping generation of the extra metrics has any noticeable perf impact on the service providing them (currently serviceMonitors are configured to poll them at a 5s interval), but every little bit can help.

Each Prometheus Histogram provides about a dozen different metrics, and in larger clusters the number of metrics needs to be reduced to keep telemetry stack resource usage & perf reasonable. Telemetry stack resource usage should be a significant concern only when there's a larger number of such pods, though.
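For reference, a small standalone snippet (with an example metric name) showing how a single prometheus_client Histogram expands into many time series:

```python
from prometheus_client import CollectorRegistry, Histogram, generate_latest

# Isolated registry so this example does not touch the default one.
registry = CollectorRegistry()
latency = Histogram("example_latency_seconds", "Example latency", registry=registry)
latency.observe(0.2)

# Prints one example_latency_seconds_bucket sample per default bucket,
# plus the _count, _sum and _created series.
print(generate_latest(registry).decode())
```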


[1] There's a large number of HTTP metrics, and some Python ones too. It would be good to have controls for limiting those in larger clusters, but I did not see any options for that in the prometheus_fastapi_instrumentator API.
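For context, a minimal sketch of the usual prometheus_fastapi_instrumentator wiring pattern that produces these HTTP metrics; the exact setup inside HttpService may differ:

```python
from fastapi import FastAPI
from prometheus_fastapi_instrumentator import Instrumentator

app = FastAPI()

# Adds per-request HTTP metrics and exposes them on /metrics,
# which is the endpoint the serviceMonitors scrape.
Instrumentator().instrument(app).expose(app)
```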

@eero-t (Contributor Author) commented Jan 10, 2025

@Spycsh from your comment in the bug #1121 (comment)

I realized that changing the method on first metric access is racy. It's possible that multiple threads end up in the create method before that method is changed to the update one, meaning that multiple identical metrics would be created, and Prometheus would barf on that.

=> I'll add a lock & check to handle that.
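A hedged sketch of what that lock & check could look like on top of the earlier delayed-creation sketch (names still illustrative):

```python
import threading
import time

from prometheus_client import Histogram


class OrchestratorMetrics:
    # Sketch only: the lock plus re-check ensures that only one request
    # handling thread creates the Histogram, even if several of them hit
    # the create path at the same time.
    _create_lock = threading.Lock()

    def __init__(self, prefix: str = "megaservice") -> None:
        self._prefix = prefix
        self.token_update = self._token_create_and_update

    def _token_create_and_update(self, token_start: float) -> None:
        with self._create_lock:
            # Another thread may already have created the metric and
            # swapped the method while we were waiting for the lock.
            if self.token_update == self._token_create_and_update:
                self.first_token_latency = Histogram(
                    f"{self._prefix}_first_token_latency",
                    "First token latency (seconds)",
                )
                self.token_update = self._token_update
        self._token_update(token_start)

    def _token_update(self, token_start: float) -> None:
        self.first_token_latency.observe(time.time() - token_start)
```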

Commits

This avoids generation of useless token/request histogram metrics
for services that use the Orchestrator class, but never call its token
processing functionality.

(Helps in differentiating frontend megaservice metrics from backend
megaservice ones, especially when multiple OPEA applications run in
the same cluster.)

Also change Orchestrator CI test workaround to use unique prefix for
each metric instance, instead of metrics being (singleton) class
variables.

Signed-off-by: Eero Tamminen <[email protected]>
As that could be called from multiple request handling threads.

Signed-off-by: Eero Tamminen <[email protected]>