-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CCR: Optimize indexing ops using seq_no on followers #34099
Conversation
This change introduces the indexing optimization using sequence numbers on the FollowingEngine. This optimization uses the max_seq_no_updates which is tracked on the primary of the leader, and replicated to replicas and followers.
Pinging @elastic/es-distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Left some nits. Are you planning to add retry tests in a follow up?
* Checks if the given operation has been processed in this engine or not. | ||
* @return true if the given operation was processed; otherwise false. | ||
*/ | ||
protected boolean containsOperation(Operation op) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: call this "hasBeenProcessedBefore"?
* 1. Indexing operations are processed concurrently in an engine. However, operations of the same docID are processed | ||
* one by one under the docID lock. | ||
* | ||
* 2. An engine itself can resolve correctly if an operation is delivered multiple times. However, if an operation is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this note is correct? we don't execute if we see we did before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you forget to push something? this statement is not correct. There is no notion of an "optimized op" (for replicas) just an op with a seq# about the MSU. Also "However, if an operation is optimized and delivered multiple times, it will be appended into Lucene more than once." reads weird. Maybe as simple as "Operations that are optimized using the MSU optimization may not be processed twice as this will create duplicates in lucene. To avoid it we check the local checkpoint tracker to see if an operation was already processed".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've applied your suggestion.
* 4. The following proves that docID(O) does not exist on a follower when operation O is applied if MSU(O) <= LCP < seqno(O): | ||
* | ||
* 4.1) If such operation O' with docID(O’) = docID(O), and LCP < seqno(O’), then MSU(O) >= MSU(O') because O' was | ||
* delivered to the follower before O. MUS(0') on the leader is at least seqno(O) or seqno(0') and both > LCP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0' (zero) -> O'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is MUS(0') on the leader is at least seqno(O) or seqno(0')
?
x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/index/engine/FollowingEngine.java
Show resolved
Hide resolved
.../plugin/ccr/src/test/java/org/elasticsearch/xpack/ccr/index/engine/FollowingEngineTests.java
Show resolved
Hide resolved
@bleskes Thanks for reviewing. I've addressed your comments. Would you please have another look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codes looks great. I left more comments.
x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/index/engine/FollowingEngine.java
Show resolved
Hide resolved
* 1. Indexing operations are processed concurrently in an engine. However, operations of the same docID are processed | ||
* one by one under the docID lock. | ||
* | ||
* 2. An engine itself can resolve correctly if an operation is delivered multiple times. However, if an operation is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you forget to push something? this statement is not correct. There is no notion of an "optimized op" (for replicas) just an op with a seq# about the MSU. Also "However, if an operation is optimized and delivered multiple times, it will be appended into Lucene more than once." reads weird. Maybe as simple as "Operations that are optimized using the MSU optimization may not be processed twice as this will create duplicates in lucene. To avoid it we check the local checkpoint tracker to see if an operation was already processed".
* | ||
* 4. The following proves that docID(O) does not exist on a follower when operation O is applied if MSU_r(O) <= LCP < seqno(O): | ||
* | ||
* 4.1) Given two operations O and O' with docID(O’) = docID(O) and seqno(O) < seqno(O’) then MSU_p(O') on the primary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you mean MSU_p(O')
must be at least seqno(O’)
(as O' is an update)
* 4. The following proves that docID(O) does not exist on a follower when operation O is applied if MSU_r(O) <= LCP < seqno(O): | ||
* | ||
* 4.1) Given two operations O and O' with docID(O’) = docID(O) and seqno(O) < seqno(O’) then MSU_p(O') on the primary | ||
* must be at least seqno(O). Moreover, the MSU_r on a follower >= min(seqno(O), seqno(O')) after these operations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moreover, the MSU_r on a follower >= min(seqno(O), seqno(O')) after these operations
I still don't follow this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I break it into two cases. I hope this is clear now.
.../plugin/ccr/src/test/java/org/elasticsearch/xpack/ccr/index/engine/FollowingEngineTests.java
Show resolved
Hide resolved
@bleskes I've addressed your comments. Could you please take another look? Thank you! |
Discussed with Boaz on another channel, we prefer to use the proof from @DaveCTurner and @ywelsch (thank you for making this 👍). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @dnhatn for the hard work.
* 2. Also MSU(O) <= MSU <= LCP < seqno(O) (we discard O if seqno(O) ≤ LCP) so the second invariant applies, | ||
* meaning that the O' was a delete. | ||
* <p> | ||
* Moreover, operations that are optimized using the MSU optimization will not be processed twice as this will create duplicates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a sentence - "Therefore, if MSU<= LCP < seqno(O) we know that O can safely be optimized with and added to lucene with addDocument. Moreover, operations"...
This change introduces the indexing optimization using sequence numbers in the FollowingEngine. This optimization uses the max_seq_no_updates which is tracked on the primary of the leader and replicated to replicas and followers. Relates #33656
Since #34099, the FollowingEngine will skip an operation which was already processed before. With that change, it should be okay to unmute testFollowIndexAndCloseNode.
Since #34099, the FollowingEngine will skip an operation which was already processed before. With that change, it should be okay to unmute testFollowIndexAndCloseNode.
This change introduces the indexing optimization using sequence numbers in the FollowingEngine. This optimization uses the max_seq_no_updates which is tracked on the primary of the leader and replicated to replicas and followers. Relates #33656
Since #34099, the FollowingEngine will skip an operation which was already processed before. With that change, it should be okay to unmute testFollowIndexAndCloseNode.
This PR enables the indexing optimization using sequence numbers on replicas. With this optimization, indexing on replicas should be faster and use less memory as it can forgo the version lookup when possible. This change also deactivates the append-only optimization on replicas. Relates #34099
This PR enables the indexing optimization using sequence numbers on replicas. With this optimization, indexing on replicas should be faster and use less memory as it can forgo the version lookup when possible. This change also deactivates the append-only optimization on replicas. Relates #34099
This change introduces the indexing optimization using sequence numbers
in the FollowingEngine. This optimization uses the max_seq_no_updates
which is tracked on the primary of the leader and replicated to replicas
and followers.
/cc @martijnvg @jasontedor