Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eval broker: use write lock when reaping cancelable evals #16112

Merged
merged 1 commit into from
Feb 10, 2023

Conversation

tgross
Copy link
Member

@tgross tgross commented Feb 9, 2023

The eval broker's Cancelable method used by the cancelable eval reaper mutates
the slice of cancelable evals by removing a batch at a time from the slice. But
this method unsafely uses a read lock despite this mutation. Under normal
workloads this is likely to be safe but when the eval broker is under the heavy
load this feature is intended to fix, we're likely to have a race
condition. Switch this to a write lock, like the other locks that mutate the
eval broker state.

This changeset also adjusts the timeout to allow poorly-sized Actions runners
more time to schedule the appropriate goroutines. The test has also been updated
to use shoenig/test/wait so we can have sensible reporting of the results
rather than just a timeout error when things go wrong.

@tgross tgross added theme/testing Test related issues theme/flaky-tests labels Feb 9, 2023
@tgross tgross added this to the 1.5.0 milestone Feb 9, 2023
@tgross tgross added the backport/1.4.x backport to 1.4.x release line label Feb 9, 2023
@tgross tgross force-pushed the b-flaky-eval-broker-test branch from 1516fec to d85ee3c Compare February 9, 2023 16:38
@tgross tgross force-pushed the b-flaky-eval-broker-test branch from d85ee3c to e1b1df5 Compare February 9, 2023 17:01
@tgross tgross changed the title eval broker: extend timeout on integration test eval broker test: enable canceled eval reaper Feb 9, 2023
@tgross tgross force-pushed the b-flaky-eval-broker-test branch from e1b1df5 to 78aea89 Compare February 9, 2023 17:55
@tgross tgross force-pushed the b-flaky-eval-broker-test branch from 78aea89 to e462084 Compare February 9, 2023 18:17
@tgross tgross force-pushed the b-flaky-eval-broker-test branch from 8707739 to f57c841 Compare February 9, 2023 19:05
@tgross tgross self-assigned this Feb 9, 2023
The eval broker's `Cancelable` method used by the cancelable eval reaper mutates
the slice of cancelable evals by removing a batch at a time from the slice. But
this method unsafely uses a read lock despite this mutation. Under normal
workloads this is likely to be safe but when the eval broker is under the heavy
load this feature is intended to fix, we're likely to have a race
condition. Switch this to a write lock, like the other locks that mutate the
eval broker state.

This changeset also adjusts the timeout to allow poorly-sized Actions runners
more time to schedule the appropriate goroutines. The test has also been updated
to use `shoenig/test/wait` so we can have sensible reporting of the results
rather than just a timeout error when things go wrong.
@tgross tgross force-pushed the b-flaky-eval-broker-test branch from 2268e8c to d76273d Compare February 10, 2023 15:08
@tgross tgross changed the title eval broker test: enable canceled eval reaper eval broker: use write lock when reaping cancelable evals Feb 10, 2023
@tgross tgross marked this pull request as ready for review February 10, 2023 15:09
@tgross tgross requested review from angrycub and lgfa29 February 10, 2023 15:09
Copy link
Contributor

@lgfa29 lgfa29 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.4.x backport to 1.4.x release line theme/flaky-tests theme/testing Test related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants