Make PrimaryReplicaResyncer Fork to Generic Pool #69949
Conversation
Reading ops from the translog snapshot must not run on the transport thread. When sending more than one batch of ops the listener (and thus `run`) would be invoked on the transport thread for all but the first batch of ops. => Forking to the generic pool like we do for sending ops during recovery.
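For illustration, here is a minimal, self-contained Java sketch of the forking pattern the description refers to. It is not the actual PrimaryReplicaSyncer code and the class and method names are made up; in Elasticsearch itself the fork would go to the generic executor (roughly `threadPool.generic().execute(...)`), mirroring how ops are sent during recovery.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Stand-alone illustration of the pattern (not Elasticsearch code): the response
// callback for one batch fires on the "transport" thread, so reading the next
// batch is forked onto a generic executor instead of running inline.
public class ForkToGenericPoolSketch {

    private final ExecutorService genericPool = Executors.newCachedThreadPool();

    // Pretend this iterator is the translog snapshot: next() may block on disk I/O.
    void sendAllBatches(Iterator<List<String>> snapshot, Runnable onDone) {
        if (!snapshot.hasNext()) {
            onDone.run();
            return;
        }
        List<String> batch = snapshot.next();
        sendBatch(batch, response -> {
            // This callback runs on the transport thread; do NOT call snapshot.next() here.
            genericPool.execute(() -> sendAllBatches(snapshot, onDone)); // fork instead
        });
    }

    // Simulated network send whose callback is invoked on a transport thread.
    void sendBatch(List<String> batch, Consumer<String> responseListener) {
        responseListener.accept("ok"); // in reality completed later, on a transport thread
    }
}
```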
Pinging @elastic/es-distributed (Team:Distributed)
Thanks Armin, one question about cleanup
@@ -159,7 +160,8 @@ protected DiscoveryNode getDiscoveryNode(String id) {
    private volatile ReplicationTargets replicationTargets;

    private final PrimaryReplicaSyncer primaryReplicaSyncer = new PrimaryReplicaSyncer(
        new TaskManager(Settings.EMPTY, threadPool, Collections.emptySet()),
        new MockTransport().createTransportService(Settings.EMPTY, threadPool,
Do we need to shut this (at least the threadpool) down cleanly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just using the fake transport, so I don't think the service needs any shutting down here. The threadPool is torn down via the parent IndexShardTestCase -> I think we're good?
Oh right, we already had a threadpool; we're just using the transport service as a wrapper. Nvm.
LGTM
Thanks David!
We assert that the snapshot isn't closed on a transport thread, but we close it without forking off the transport thread in case of a failure. With this commit we fork on failure too. Relates elastic#69949. Closes elastic#70407.
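A hedged sketch of the follow-up change this commit describes, using made-up names rather than the real Elasticsearch classes: the failure path also hands the snapshot close to the generic pool instead of closing it inline on the transport thread.

```java
import java.io.Closeable;
import java.util.concurrent.ExecutorService;

// Illustrative only: the failure callback may run on a transport thread, and closing
// the translog snapshot there would trip the "must not close on a transport thread"
// assertion, so the close is forked to the generic pool as well.
final class ForkCloseOnFailureSketch {
    private final ExecutorService genericPool;
    private final Closeable snapshot; // stands in for the translog snapshot

    ForkCloseOnFailureSketch(ExecutorService genericPool, Closeable snapshot) {
        this.genericPool = genericPool;
        this.snapshot = snapshot;
    }

    void onFailure(Exception e) {
        // Do not call snapshot.close() inline: this method may be invoked on a transport thread.
        genericPool.execute(() -> {
            try {
                snapshot.close();
            } catch (Exception closeException) {
                e.addSuppressed(closeException);
            }
            // ... then propagate `e` to the resync listener
        });
    }
}
```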
We can have a race here where the closed check passes and then, concurrently with a shard close, we also try to fail the shard. Previously this was covered by the catch below the changed code, which would just ignore the already-closed exception, but with elastic#69949 we're now forking to the generic pool for this logic and thus have to handle the exception in the callback as well.
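A minimal sketch of the race handling described above, with hypothetical names (Elasticsearch uses its own shard-closed exception types): since the fail-shard logic now runs as a forked task, the already-closed case has to be caught inside that task rather than by the outer catch that previously covered it.

```java
import java.util.concurrent.ExecutorService;

// Illustrative only: the closed check can pass and the shard can then be closed
// concurrently before the forked task runs, so the forked callback must tolerate
// an "already closed" failure itself rather than relying on an enclosing catch block.
final class FailShardRaceSketch {

    static class ShardAlreadyClosedException extends RuntimeException { } // stand-in type

    private final ExecutorService genericPool;

    FailShardRaceSketch(ExecutorService genericPool) {
        this.genericPool = genericPool;
    }

    void maybeFailShard(Runnable failShard) {
        genericPool.execute(() -> {
            try {
                failShard.run(); // may race with a concurrent shard close
            } catch (ShardAlreadyClosedException e) {
                // The shard was closed between the check and this task running: nothing to do.
            }
        });
    }
}
```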