Not long ago, a couple of PVF host tests were gated behind the `ci-only-tests` feature because they were found to be flaky when run locally. @bkchr argued that it was a bad idea (and it really is), but reviewers concluded the solution was acceptable for the moment.
Naturally, that was forgotten when building the monorepo. Right now, those tests are never run.
Of course, we could just put the feature back into the CI scripts and forget about it... Again. But the very fact that it happened has proven that the approach is really error-prone.
The tests in question are timeout-based. That is, the test must finish within some time frame, which proves that the work under test is executed with the expected parallelism (or, vice versa, without any parallelism where it is undesirable). The concrete examples are `ensure_parallel_execution` and `execute_queue_doesnt_stall_with_varying_executor_params`, but that's not an exhaustive list; other timeout-based tests are also present, we just have never seen them flake and thus haven't gated them.
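To illustrate the pattern, here is a minimal sketch of a timeout-based parallelism check. It is not the code of the actual tests: `job` and `run_two_in_parallel` are hypothetical stand-ins, and plain threads replace the PVF workers. It only shows where the load dependence comes from.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for one unit of work; the real tests drive PVF workers.
fn job(work: Duration) {
    thread::sleep(work);
}

/// Runs two jobs on separate threads and returns the total wall-clock time.
fn run_two_in_parallel(work: Duration) -> Duration {
    let start = Instant::now();
    let a = thread::spawn(move || job(work));
    let b = thread::spawn(move || job(work));
    a.join().unwrap();
    b.join().unwrap();
    start.elapsed()
}

fn main() {
    let work = Duration::from_millis(200);
    let elapsed = run_two_in_parallel(work);

    // Parallel execution should take roughly `work`, sequential execution
    // roughly `2 * work`. The threshold in between is an arbitrary choice:
    // on a loaded host, scheduling delays can push `elapsed` past it even
    // though the jobs did run in parallel.
    let threshold = work * 3 / 2;
    println!(
        "elapsed: {:?}, threshold: {:?}, looks parallel: {}",
        elapsed,
        threshold,
        elapsed < threshold
    );
}
```

The check infers parallelism from elapsed time alone, so any delay unrelated to the code under test counts against it.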
Under this issue, I'd like to collect ideas on how to improve them and make them less dependent on host load; ideally, they shouldn't be load-dependent at all.
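To make the goal more concrete, one possible direction is to have the jobs themselves prove that they overlap instead of measuring how long they take. The sketch below is only an assumption-laden illustration: `jobs_overlap` and the toy worker pool are hypothetical and do not mirror the real PVF host queue. Every job waits on a shared `Barrier`, which is released only when all jobs are in flight at the same time. A timeout is still there, but only as a generous safety net that bounds the failure case.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Barrier, Mutex};
use std::thread;
use std::time::Duration;

type Job = Box<dyn FnOnce() + Send>;

/// Returns true if a toy pool with `workers` threads had all `jobs` jobs
/// in flight simultaneously. Hypothetical helper, for illustration only.
fn jobs_overlap(workers: usize, jobs: usize, safety_net: Duration) -> bool {
    let barrier = Arc::new(Barrier::new(jobs));
    let (job_tx, job_rx) = mpsc::channel::<Job>();
    let job_rx = Arc::new(Mutex::new(job_rx));
    let (done_tx, done_rx) = mpsc::channel();

    // Toy worker pool: each worker pulls jobs from a shared queue.
    for _ in 0..workers {
        let job_rx = Arc::clone(&job_rx);
        thread::spawn(move || loop {
            // The lock guard is dropped at the end of this statement,
            // so the job itself runs without holding the lock.
            let next = job_rx.lock().unwrap().recv();
            match next {
                Ok(job) => job(),
                Err(_) => break, // queue closed, worker exits
            }
        });
    }

    for _ in 0..jobs {
        let barrier = Arc::clone(&barrier);
        let done_tx = done_tx.clone();
        job_tx
            .send(Box::new(move || {
                // Released only once all jobs have reached this point,
                // which requires them to run concurrently.
                barrier.wait();
                let _ = done_tx.send(());
            }))
            .unwrap();
    }
    drop(done_tx);

    // The timeout only bounds the failure case; a passing run does not
    // depend on how quickly the jobs finish, as long as they overlap.
    (0..jobs).all(|_| done_rx.recv_timeout(safety_net).is_ok())
}

fn main() {
    println!("2 workers: {}", jobs_overlap(2, 2, Duration::from_secs(5)));
    println!("1 worker: {}", jobs_overlap(1, 2, Duration::from_millis(200)));
}
```

With this shape, a slow host makes a passing test slower rather than failing it, unless the delay exceeds the safety net. A sequential executor never releases the barrier and fails regardless of load. Whether the real workers can be instrumented this way is an open question.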
Please speak out if you have any ideas.