A/B plot is misleading for p2p-disk memory usage and tick measures #1240

crusaderky · 2023-12-19T11:15:20Z

This A/B plot seems to highlight extremely severe regressions in memory usage for three p2p-disk tests:

The bar charts, however, tell us a completely different story:

While it is true that the test more than doubled its memory usage, it is double a negligible amount, which is still negligible.

Find a smart way to remove the false positive from the A/B test. We still want to measure RAM usage, in case something goes wrong and negligible use becomes non-negligible. Maybe decide that if both median measures are below a threshold (2 GiB/worker?) the bar should be suppressed.
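A minimal sketch of the threshold idea, assuming per-worker memory samples in bytes for the baseline and the candidate run. The function name `relative_change` and the `floor` parameter are hypothetical and not part of the existing A/B tooling; the 2 GiB default is only the tentative value suggested above.

```python
from statistics import median

GiB = 2**30


def relative_change(baseline, candidate, floor=2 * GiB):
    """Relative change between the medians of two sample sets.

    Returns None (meaning: suppress the bar) when both medians are
    below `floor`, so that "double of a negligible amount" is not
    reported as a regression. `floor` is an assumed, tunable threshold.
    """
    base_med = median(baseline)
    cand_med = median(candidate)
    if base_med < floor and cand_med < floor:
        return None
    if base_med == 0:
        # Baseline was zero but the candidate crossed the floor.
        return float("inf")
    return (cand_med - base_med) / base_med


# Both medians are far below 2 GiB/worker: the bar is suppressed.
small = relative_change([0.2 * GiB, 0.25 * GiB, 0.3 * GiB],
                        [0.5 * GiB, 0.6 * GiB, 0.7 * GiB])

# The candidate crosses the floor: the change is still reported.
large = relative_change([1 * GiB] * 3, [3 * GiB] * 3)
```

Because the bar is only suppressed when both medians are below the floor, a case where negligible use becomes non-negligible would still show up in the plot.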

The recently added measures for

- Avg CPU (scheduler)
- Max tick (worker)
- Max tick (scheduler)

suffer from the same problem. We clearly don't care if avg CPU changed, e.g., from 10% to 15%, or if the tick went up from 50 ms to 75 ms, but for the A/B plot they are 50% increases:

fjetter · 2023-12-19T12:51:44Z

I believe you are talking about two different problems.

The first one, about p2p-disk, concerns absolute changes that you do not consider meaningful. I agree that this is misleading, but I also think it's not bad, since every major difference should be double-checked.

Regarding the newly added metrics about tick duration, etc., this is rather a problem with our statistical evaluation. These quantities have a substantial measurement error that we are not accounting for. I never checked the math, but with such large variations there shouldn't be any signal.
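One possible way to account for that measurement error, sketched under assumptions: compare the difference of the means against the combined standard error of the two sample sets, and only report a change when it clearly exceeds the noise. The function name `is_significant` and the `n_sigma` cutoff are hypothetical; this is not how the A/B plot is currently computed.

```python
from statistics import mean, stdev


def is_significant(baseline, candidate, n_sigma=2.0):
    """True if the means differ by more than n_sigma combined standard errors.

    Each sample set needs at least two values, since stdev() is undefined
    for a single data point. `n_sigma` is an assumed, tunable cutoff.
    """
    se_base = stdev(baseline) / len(baseline) ** 0.5
    se_cand = stdev(candidate) / len(candidate) ** 0.5
    combined = (se_base**2 + se_cand**2) ** 0.5
    diff = abs(mean(candidate) - mean(baseline))
    if combined == 0:
        # No spread at all: any difference is a real difference.
        return diff > 0
    return diff > n_sigma * combined


# Illustrative max tick samples in ms with large run-to-run variation:
# the means differ (56 vs 82), but the spread swamps the difference.
noisy = is_significant([30, 50, 90, 40, 70], [45, 75, 120, 60, 110])

# The same kind of shift with tight measurements is a clear signal.
tight = is_significant([50, 51, 49, 50, 50], [75, 76, 74, 75, 75])
```

The sample values are made up for illustration only. With noisy measurements the first comparison is not flagged, while the second is.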

crusaderky mentioned this issue on Dec 19, 2023, in dask/distributed#8330, "Partial rechunks within P2P" (merged).
