Emitting metrics of the disabled worker #1221

Closed
ad-m-ss opened this issue Jun 29, 2022 · 2 comments

ad-m-ss commented Jun 29, 2022

Describe the bug

We run Celery on Kubernetes. After a deployment, workers that have been deleted are still reported as online in the metrics. We use FLOWER_PURGE_OFFLINE_WORKERS to clean up offline workers from the UI, but they persist in the metrics.

I can see where the metric should be updated:

    self.metrics.worker_online.labels(worker_name).set(0)

However, this action does not seem to be sufficient. The dashboard also uses application.update_workers to refresh data:

    if refresh:
        try:
            self.application.update_workers()
        except Exception as e:
            logger.exception('Failed to update workers: %s', e)

It then uses the last heartbeat to decide when a worker should be purged:

    if not last_heartbeat or timestamp - last_heartbeat > options.purge_offline_workers:
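To illustrate why setting the gauge to 0 may not be enough on its own, here is a minimal sketch using prometheus_client directly. It is not Flower's code: the metric name `demo_worker_online`, the label name `worker`, and the worker name are made up for the example. It only shows that a labelled series keeps being exported after `set(0)` and disappears once `remove()` is called for that label value.

```python
from prometheus_client import CollectorRegistry, Gauge, generate_latest

# Private registry so the example does not touch the global default one.
registry = CollectorRegistry()

# Hypothetical stand-in for a "worker online" gauge; names are assumptions.
worker_online = Gauge(
    "demo_worker_online",
    "Worker online status (1 = online, 0 = offline)",
    ["worker"],
    registry=registry,
)

worker_name = "celery@old-pod"  # hypothetical worker that was deleted

# The worker comes online, then is marked offline.
worker_online.labels(worker_name).set(1)
worker_online.labels(worker_name).set(0)

# The series is still part of the exposition output, now with value 0.
after_set_zero = generate_latest(registry).decode()

# Dropping the label value removes the series from the output entirely.
worker_online.remove(worker_name)
after_remove = generate_latest(registry).decode()

print(worker_name in after_set_zero)  # series still exported
print(worker_name in after_remove)    # series gone
```

If the purge path in the UI never reaches an equivalent of `remove()`, the series for a deleted worker would stay in the scrape output indefinitely, which would match the behaviour described here.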

To Reproduce
Steps to reproduce the behavior:

A detailed scenario has not been developed yet. Experiments will be developed after the initial discussion.

Expected behavior
Ideally:

  • emit "0" for a deleted worker for a specified period of time
  • remove the metric after a long period of worker inactivity.
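The two-stage behaviour above could be sketched as a small decision function. This is only an illustration of the proposal, not an existing Flower API: the function name `worker_metric_action` and both thresholds (`offline_after`, `remove_after`) are hypothetical. A worker with no recorded heartbeat is treated as removable, mirroring the `not last_heartbeat` check in the purge condition quoted earlier.

```python
from typing import Optional


def worker_metric_action(
    last_heartbeat: Optional[float],
    now: float,
    offline_after: float,
    remove_after: float,
) -> str:
    """Decide what to do with a worker's metric.

    Returns "online" (emit 1), "offline" (emit 0) or "remove"
    (drop the series). All times are in seconds; both thresholds
    are hypothetical settings, with remove_after >= offline_after.
    """
    if last_heartbeat is None:
        # No heartbeat ever seen: nothing worth keeping.
        return "remove"
    age = now - last_heartbeat
    if age > remove_after:
        return "remove"
    if age > offline_after:
        return "offline"
    return "online"


now = 1_000_000.0
# Heartbeat 10 s ago: still online.
print(worker_metric_action(now - 10, now, offline_after=60, remove_after=3600))
# Heartbeat 5 min ago: report 0 for a while.
print(worker_metric_action(now - 300, now, offline_after=60, remove_after=3600))
# Heartbeat 2 h ago: drop the metric.
print(worker_metric_action(now - 7200, now, offline_after=60, remove_after=3600))
```

Keeping the decision separate from the metrics client makes the policy easy to test; the "remove" outcome would then map onto whatever call drops the labelled series in the metrics library.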

We might also consider exposing the last heartbeat value for all workers as a metric.

Screenshots

The list of workers in the UI:

image

The list of metrics:

image

Please note that two additional workers are reported as online.

System information
Output of python -c 'from flower.utils import bugreport; print(bugreport())' command

$ python -c 'from flower.utils import bugreport; print(bugreport())'
flower   -> flower:1.0.0 tornado:6.1 humanize:3.10.0
software -> celery:5.1.2 (sun-harmonics) kombu:5.1.0 py:3.9.6
            billiard:3.6.4.0 redis:3.5.3
platform -> system:Linux arch:64bit
            kernel version:5.4.196-108.356.amzn2.x86_64 imp:CPython
loader   -> celery.loaders.app.AppLoader
settings -> transport:redis results:redis://**REDACTED**.cache.amazonaws.com:6379/0

deprecated_settings: None

ad-m-ss added the bug label Jun 29, 2022

drummerwolli commented

Probably the same as #1128?

ad-m-ss commented Sep 12, 2022

Yes, my bad. I missed that issue when searching for duplicates myself.

ad-m-ss closed this as not planned Sep 12, 2022