[Bug]: Incorrect label count in prometheus metrics reporting #7611

micahjsmith · 2025-01-07T17:08:07Z

What happened?

Seeing an error in an async callback dealing with the Prometheus label count. This does not cause any completion requests to fail, but errors like this make me worried that the Prometheus metrics are not an accurate representation of system health and performance. This is a regression: in v1.54.1 this error was not present.

Relevant log output

Task exception was never retrieved
future: <Task finished name='Task-769554' coro=<ServiceLogging.async_service_failure_hook() done, defined at /usr/local/lib/python3.13/site-packages/litellm/_service_logger.py:207> exception=ValueError('Incorrect label count')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/site-packages/litellm/_service_logger.py", line 243, in async_service_failure_hook
    await self.prometheusServicesLogger.async_service_failure_hook(
    ...<2 lines>...
    )
  File "/usr/local/lib/python3.13/site-packages/litellm/integrations/prometheus_services.py", line 207, in async_service_failure_hook
    self.increment_counter(
    ~~~~~~~~~~~~~~~~~~~~~~^
        counter=obj,
        ^^^^^^^^^^^^
    ...<3 lines>...
        amount=1,  # LOG ERROR COUNT TO PROMETHEUS
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/local/lib/python3.13/site-packages/litellm/integrations/prometheus_services.py", line 131, in increment_counter
    counter.labels(labels, *additional_labels).inc(amount)
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/prometheus_client/metrics.py", line 199, in labels
    raise ValueError('Incorrect label count')
ValueError: Incorrect label count
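
For readers unfamiliar with this error: the traceback ends in a check that the number of positional label values passed to `labels(...)` matches the number of label names the metric was declared with. The sketch below mimics that check in plain Python so it runs without `prometheus_client` installed. `LabelledCounter`, the metric name, and the label values are hypothetical stand-ins, not LiteLLM's or `prometheus_client`'s actual code, and passing an extra value is only one way such a mismatch could arise.

```python
class LabelledCounter:
    """Hypothetical stand-in for a labelled counter (not the real prometheus_client class)."""

    def __init__(self, name, labelnames):
        self.name = name
        self.labelnames = tuple(labelnames)
        self.values = {}  # maps a tuple of label values to a running total

    def inc(self, *labelvalues, amount=1):
        # Mirror the check seen in the traceback: the number of positional
        # label values must equal the number of declared label names.
        if len(labelvalues) != len(self.labelnames):
            raise ValueError("Incorrect label count")
        key = tuple(str(v) for v in labelvalues)
        self.values[key] = self.values.get(key, 0) + amount
        return self.values[key]


# Hypothetical metric declared with a single label name.
counter = LabelledCounter("service_failed_requests", ["service"])

# One value for one declared label: accepted.
ok_total = counter.inc("redis")

# Two values for one declared label: rejected, as in the log above.
try:
    counter.inc("redis", "extra_value")
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)
```

Because the error is raised before the increment happens, a mismatched call records nothing, which is consistent with the concern that failure counts may be missing from the metrics.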



### Are you a ML Ops Team?

Yes

### What LiteLLM version are you on ?

v1.56.5

### Twitter / LinkedIn details

_No response_


krrishdholakia · 2025-01-07T23:13:24Z

able to repro

Fixes #7611

* fix(main.py): pass custom llm provider on litellm logging provider update
* fix(cost_calculator.py): don't append provider name to return model if existing llm provider
  Fixes BerriAI#7607
* fix(prometheus_services.py): fix prometheus system health error logging
  Fixes BerriAI#7611

micahjsmith added the bug Something isn't working label Jan 7, 2025

github-actions bot added the mlops user request label Jan 7, 2025

krrishdholakia self-assigned this Jan 7, 2025

krrishdholakia added a commit that referenced this issue Jan 7, 2025

fix(prometheus_services.py): fix prometheus system health error logging

550ecbf

Fixes #7611

krrishdholakia closed this as completed in 4e69711 Jan 8, 2025

krrishdholakia mentioned this issue Jan 8, 2025

Litellm dev 01 07 2025 p1 #7618

Merged
