Fine performance metrics: Break down idle time on the Scheduler #7672

crusaderky · 2023-03-17T16:40:45Z

Part of Fine performance metrics meta-issue #7665
Blocked by Fine performance metrics: Store data on the Scheduler #7666
Blocked by Fine performance metrics: Break down idle time on the Worker #7671

With #7671 done, we know how much time workers spend idle because they are not receiving enough Compute messages from the scheduler.
This idle time can be further reclassified on the scheduler side, by adding negative corrections to Scheduler.cumulative_worker_metrics["execute", "n/a", "idle", "seconds"].
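
A minimal sketch of what such a negative correction could look like, using a plain dict in place of Scheduler.cumulative_worker_metrics. The helper `reclassify_idle`, the activity name `idle-no-tasks-on-cluster`, and the clamping behaviour are all assumptions made for illustration; only the `("execute", "n/a", "idle", "seconds")` key comes from this issue.

```python
from collections import defaultdict

# Key named in this issue; everything else below is hypothetical.
IDLE_KEY = ("execute", "n/a", "idle", "seconds")


def reclassify_idle(metrics, activity, seconds):
    """Move `seconds` from the generic idle bucket to a more specific one.

    The total across all keys is preserved: whatever is subtracted from
    idle (the negative correction) is added to the new activity.
    """
    # Assumption: never drive the idle bucket below zero.
    seconds = min(seconds, metrics[IDLE_KEY])
    metrics[IDLE_KEY] -= seconds
    metrics[("execute", "n/a", activity, "seconds")] += seconds
    return seconds


# Stand-in for the scheduler-side mapping of cumulative worker metrics.
metrics = defaultdict(float)
metrics[IDLE_KEY] = 10.0

# Hypothetical activity label for "nothing to run anywhere on the cluster".
reclassify_idle(metrics, "idle-no-tasks-on-cluster", 3.0)
```

Because every correction is paired with an equal positive entry, the sum of all metrics stays unchanged and only the attribution of the idle time moves.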

On the scheduler, we know for each worker:

- Time spent with tasks in processing state. The delta between this and the sum of worker metrics other than 'idle' captures, for example, time spent on imperfectly pipelined round trips (RTTs) between worker and scheduler; it should increase when distributed.scheduler.worker-saturation is too low.
- Time spent with not enough tasks in processing state on the worker, but with at least one task processing somewhere on the cluster, e.g. because the workload is not fully parallelisable.
- Time spent with zero tasks in processing state anywhere on the cluster, e.g. while waiting for the Client. This should include the initial decision time between the moment the scheduler receives update_graph and the moment it releases the event loop.
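
The three categories above can be sketched as a sweep over per-worker busy intervals. This is an illustrative simplification, not the scheduler's implementation: the function `break_down_time`, its input format, and the bucket names are hypothetical, and "not enough tasks" is approximated as "no tasks at all" on the worker rather than "fewer tasks than threads".

```python
def break_down_time(busy, start, stop):
    """Classify wall-clock time in [start, stop] for each worker.

    busy: dict mapping worker name -> list of (t0, t1) intervals during
          which that worker had at least one task in processing state.
    Returns a dict mapping worker name -> seconds per bucket.
    """
    # Every interval boundary (clamped to the window) splits the timeline
    # into segments within which the set of busy workers is constant.
    edges = {start, stop}
    for spans in busy.values():
        for t0, t1 in spans:
            edges.add(max(start, min(t0, stop)))
            edges.add(max(start, min(t1, stop)))
    edges = sorted(edges)

    out = {
        w: {"processing": 0.0, "others-busy": 0.0, "cluster-empty": 0.0}
        for w in busy
    }
    for a, b in zip(edges, edges[1:]):
        # Workers whose busy interval fully covers the segment [a, b].
        active = {
            w
            for w, spans in busy.items()
            if any(t0 <= a and b <= t1 for t0, t1 in spans)
        }
        for w in busy:
            if w in active:
                key = "processing"      # tasks processing on this worker
            elif active:
                key = "others-busy"     # idle here, busy elsewhere
            else:
                key = "cluster-empty"   # nothing processing anywhere
            out[w][key] += b - a
    return out


result = break_down_time({"a": [(0, 4)], "b": [(0, 2)]}, start=0, stop=6)
```

In this example worker "b" finishes early and waits while "a" is still busy, which lands in the second bucket; the last two seconds, when nothing runs anywhere, land in the third bucket for both workers.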


crusaderky added the diagnostics label Mar 17, 2023

This was referenced Mar 17, 2023

Fine performance metrics: client context manager #7667

Open

Fine performance metrics meta-issue #7665

Open

crusaderky mentioned this issue Mar 27, 2023

Fine performance metrics: Store data on the Scheduler #7666

Closed

This was referenced May 22, 2023

Worker crash causes computations to overlap #7825

Open

Fine performance metrics: apportion to Computations #7776

Closed
