-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Re-fetch shard info of primary when new node joins #47035
Conversation
Pinging @elastic/es-distributed |
This was fixed in #47016. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I am not sure my review counts on this one 😃...
server/src/test/java/org/elasticsearch/gateway/ReplicaShardAllocatorIT.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some drive-by comments
server/src/test/java/org/elasticsearch/gateway/ReplicaShardAllocatorIT.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/gateway/AsyncShardFetchTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/gateway/ReplicaShardAllocatorIT.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM :) Thanks Nhat!
Might be best for David or Yannick to look over this as well though. I understand what's going on here just fine now, but might miss some implication of this change.
@DaveCTurner @ywelsch This PR blocks the work in #46959. It would be great if one of you can take a look. Thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I suggested a comment and a few small changes.
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/gateway/GatewayAllocator.java
Outdated
Show resolved
Hide resolved
…r.java Co-Authored-By: David Turner <[email protected]>
…r.java Co-Authored-By: David Turner <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks @dnhatn
Test failures are fixed in #47196. |
@henningandersen @original-brownbear @DaveCTurner Thank you for reviewing. |
Today, we don't clear the shard info of the primary shard when a new node joins; then we might risk of making replica allocation decisions based on the stale information of the primary. The serious problem is that we can cancel the current recovery which is more advanced than the copy on the new node due to the old info we have from the primary. With this change, we ensure the shard info from the primary is not older than any node when allocating replicas. Relates #46959 This work was done by Henning in #42518. Co-authored-by: Henning Andersen <[email protected]>
Today, we don't clear the shard info of the primary shard when a new node joins; then we might risk of making replica allocation decisions based on the stale information of the primary. The serious problem is that we can cancel the current recovery which is more advanced than the copy on the new node due to the old info we have from the primary.
With this change, we ensure the shard info from the primary is not older than any node when allocating replicas.
Relates #46959
This work was done by @henningandersen in #42518.
Co-authored-by: Henning Andersen [email protected]