fix: make sure to join all arbiters #3932
Closed
fixes #3925
It is a combination of two effects.
First, we are using RocksDB. It is implemented in C++ and makes use of global
data with destructors. Those destructors are run "after main". In practice,
"after main" means "after the main thread has exited". It is thus an error to
use a RocksDB instance after main, even if just to destroy it.
Second, we use actix, and it doesn't obey the structured concurrency
discipline out of the box. Specifically, with a test function such as
`some_text`, it very well might be the case that, after the function exits,
there are some leftover actix threads running!
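To make the "leftover thread" effect concrete, here is a minimal sketch that uses only `std::thread` rather than actix. The names `leaky_test_body` and `observe_leak` are hypothetical, and the channel exists only to keep the background thread alive deterministically. A real leaky test would simply drop the handle instead of returning it.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a test body: it starts background work and
/// returns without waiting for that work to finish.
fn leaky_test_body(finished: Arc<AtomicBool>, release: mpsc::Receiver<()>) -> JoinHandle<()> {
    thread::spawn(move || {
        // Block until the caller releases us, so the thread is guaranteed
        // to still be alive when `leaky_test_body` returns.
        let _ = release.recv();
        finished.store(true, Ordering::SeqCst);
    })
}

/// Returns (finished right after the body returned, finished after joining).
fn observe_leak() -> (bool, bool) {
    let finished = Arc::new(AtomicBool::new(false));
    let (tx, rx) = mpsc::channel();

    let handle = leaky_test_body(Arc::clone(&finished), rx);
    // The "test" function has returned, yet its thread is still running.
    let before = finished.load(Ordering::SeqCst);

    // Let the thread proceed, then wait for it explicitly.
    tx.send(()).expect("background thread hung up early");
    handle.join().expect("background thread panicked");
    let after = finished.load(Ordering::SeqCst);

    (before, after)
}
```

Calling `observe_leak()` yields `(false, true)`: the work is only known to be complete once the thread has been joined.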
In combination, this means that the main testing thread might exit first, while
some actix thread is also exiting and dropping RocksDB, which will then observe
destroyed global state.
This PR tries to manually wait for actix threads. There's an API to join
`Arbiter`, which we use. For `SyncArbiter`, the equivalent API seems to be
missing, so we just busy wait.
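The difference between joining and busy waiting can be sketched with plain `std` threads. This is not the code in this PR and does not use actix: `LiveGuard`, `spawn_detached_workers`, `busy_wait_until_idle`, and `join_and_wait_demo` are hypothetical names, and the counter stands in for whatever signal is available when a pool exposes no join handle.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Decrements the live-worker counter when dropped, that is, when the
/// worker closure finishes (including on panic).
struct LiveGuard(Arc<AtomicUsize>);

impl Drop for LiveGuard {
    fn drop(&mut self) {
        self.0.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Hypothetical pool that hands out no join handles.
fn spawn_detached_workers(n: usize, live: &Arc<AtomicUsize>) {
    for _ in 0..n {
        // Count the worker before spawning so the waiter never sees a
        // premature zero.
        live.fetch_add(1, Ordering::SeqCst);
        let guard = LiveGuard(Arc::clone(live));
        thread::spawn(move || {
            let _guard = guard;
            thread::sleep(Duration::from_millis(10));
        });
    }
}

/// Polls the counter until it reaches zero or the timeout expires.
/// Returns true if all workers finished in time.
fn busy_wait_until_idle(live: &AtomicUsize, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    while live.load(Ordering::SeqCst) != 0 {
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(Duration::from_millis(1));
    }
    true
}

fn join_and_wait_demo() -> bool {
    // A thread with a handle can be joined directly.
    let joined = thread::spawn(|| 7).join().expect("worker panicked");

    // Threads without handles can only be polled for.
    let live = Arc::new(AtomicUsize::new(0));
    spawn_detached_workers(4, &live);
    joined == 7 && busy_wait_until_idle(&live, Duration::from_secs(5))
}
```

Note that the polling variant is weaker than a real join: the counter reaching zero shows that each worker closure has finished, not that the OS thread has fully exited, which is one reason a proper join API is preferable.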
TODO:
- figure out why this surfaced only after the actix upgrade. The problem seems
  pre-existing.
- figure out a proper way to join the `SyncActor`.
- figure out patterns that make sure that no threads are leaked by construction
  (aka structured concurrency).
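For the last item, one pattern that rules out leaked threads by construction is the standard library's `std::thread::scope` (stable since Rust 1.63). The sketch below covers plain `std` threads only. Whether something equivalent can be layered over actix is exactly the open question, and `run_scoped_workers` is a hypothetical name.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Runs `n` workers inside a scope and returns how many of them completed.
fn run_scoped_workers(n: usize) -> usize {
    let completed = AtomicUsize::new(0);

    thread::scope(|s| {
        for _ in 0..n {
            // Scoped threads may borrow from the enclosing stack frame,
            // because they cannot outlive the scope.
            s.spawn(|| {
                completed.fetch_add(1, Ordering::SeqCst);
            });
        }
        // `thread::scope` joins every spawned thread before it returns.
    });

    // At this point no worker can still be running.
    completed.load(Ordering::SeqCst)
}
```

Because the scope joins all of its threads before returning, `run_scoped_workers(8)` returns `8`, and no thread started inside the function can still be running once it exits.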