Attempt to fix cancellation crash in repo fetching w/ worker thread
The stack trace in #21478 suggests that the second `workerFuture.get()` (line 183 before) is snagging on a `CancellationException`. Closer inspection indicates that the exception handling in this entire block of code is faulty, in two ways:

1. `workerFuture.get()` on line 169 is very unlikely to throw an `InterruptedException`, because this call happens after we've received a `DONE` from the signal queue, which is sent at the very end of the worker thread logic (in its own `finally` clause, actually).
2. The second call to `workerFuture.get()` on line 183 doesn't actually do anything, because `get()`-ing a cancelled future just throws a `CancellationException` immediately.

This CL attempts to fix these two errors. It now tries to handle interrupts where they're likely to happen, which is at the call to `state.signalQueue.take()`: this is where the Skyframe thread spends the most time blocked, and where a Ctrl-C from the user is most likely to land. We catch an `InterruptedException` here and interrupt the worker thread. To wait for the worker thread to finish, we uninterruptibly take from the signal queue instead of calling `workerFuture.get()`.

Additionally, we now correctly handle the worker thread being interrupted by something other than the host Skyframe thread (the memory pressure handler, in all likelihood), by simply retrying the fetch instead of crashing Bazel.

Fixes #21478 (maybe...?)

PiperOrigin-RevId: 613348046
Change-Id: I692fa750cb8873f1bd403f16764d1845410a29f1
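
For illustration, the pattern described above can be sketched in isolation. This is not the Bazel code: `WorkerSignalSketch`, `Signal`, `takeUninterruptibly`, and both demo methods are hypothetical names, and a sleeping task stands in for the fetch. The sketch only shows (a) that `get()` on a cancelled future throws `CancellationException` right away, and (b) a host thread that catches the interrupt at `take()`, interrupts the worker, and then waits for `DONE` uninterruptibly. The retry path for a worker interrupted by another party is not covered.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

public class WorkerSignalSketch {
  // Hypothetical signal type; the real code may carry more signals than DONE.
  enum Signal { DONE }

  /** Takes from the queue, swallowing interrupts and restoring the flag afterwards. */
  static <T> T takeUninterruptibly(BlockingQueue<T> queue) {
    boolean interrupted = false;
    try {
      while (true) {
        try {
          return queue.take();
        } catch (InterruptedException e) {
          interrupted = true; // remember the interrupt, keep waiting
        }
      }
    } finally {
      if (interrupted) {
        Thread.currentThread().interrupt();
      }
    }
  }

  /** Returns true if get() on a cancelled future throws CancellationException. */
  static boolean getOnCancelledFutureThrows() throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    CountDownLatch neverReleased = new CountDownLatch(1);
    Future<?> future = executor.submit(() -> {
      neverReleased.await(); // blocks until interrupted
      return null;
    });
    future.cancel(true);
    try {
      future.get(); // does not wait for the worker thread to exit
      return false;
    } catch (CancellationException e) {
      return true;
    } finally {
      executor.shutdownNow();
    }
  }

  /** Returns true if the interrupted host saw the worker finish via DONE. */
  static boolean interruptHostAndWaitForWorker() throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    BlockingQueue<Signal> signalQueue = new LinkedBlockingQueue<>();
    CountDownLatch workerStarted = new CountDownLatch(1);
    boolean[] workerSawInterrupt = new boolean[1];

    Future<?> workerFuture = executor.submit(() -> {
      try {
        workerStarted.countDown();
        Thread.sleep(60_000); // stands in for a long-running fetch
      } catch (InterruptedException e) {
        workerSawInterrupt[0] = true;
      } finally {
        signalQueue.add(Signal.DONE); // last thing the worker does
      }
    });

    workerStarted.await();
    Thread.currentThread().interrupt(); // simulate a Ctrl-C landing on the host thread
    boolean cleanedUp = false;
    try {
      signalQueue.take(); // where the host spends most of its time blocked
    } catch (InterruptedException e) {
      workerFuture.cancel(true); // interrupt the worker thread
      Signal last = takeUninterruptibly(signalQueue); // wait for the worker to really finish
      cleanedUp = last == Signal.DONE && workerSawInterrupt[0];
    } finally {
      executor.shutdownNow();
    }
    return cleanedUp;
  }

  public static void main(String[] args) throws Exception {
    System.out.println("get() on cancelled future throws: " + getOnCancelledFutureThrows());
    System.out.println("host waited for worker DONE: " + interruptHostAndWaitForWorker());
  }
}
```

Waiting on the queue rather than on the future matters here because the `DONE` signal is only sent from the worker's `finally` block, so receiving it means the worker has actually run its cleanup, whereas a cancelled future reports completion regardless of what the worker thread is still doing.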