Allow engine to recover from translog upto a seqno #33032

dnhatn · 2018-08-21T17:37:58Z

This change allows an engine to recover from its local translog up to
the given seqno. The extended API can be used in these use cases:

1. When a replica starts following a new primary, it resets its index to
   the safe commit, then replays its local translog up to the current
   global checkpoint (see Reset replica engine before primary-replica resync #32867).
2. When a replica starts a peer-recovery, it can initialize the
   start_sequence_number to the persisted global checkpoint instead of the
   local checkpoint of the safe commit. A replica will then replay its
   local translog up to that global checkpoint before accepting remote
   translog from the primary. This will increase the chance of
   operation-based recovery. I will make this change in a follow-up.
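
To make the "replay up to a seqno" idea concrete, here is a minimal sketch of the filtering step. It is not the code from this PR: `SketchOp` and `RecoverUpToSketch` are hypothetical stand-ins, the bound is treated as inclusive (as requested in the review below), and the sketch deliberately does not assume that operations are stored in sequence-number order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a translog operation (not the Elasticsearch class).
final class SketchOp {
    final long seqNo;
    final String payload;

    SketchOp(long seqNo, String payload) {
        this.seqNo = seqNo;
        this.payload = payload;
    }
}

final class RecoverUpToSketch {
    /**
     * Replays every operation whose seqNo is <= recoverUpToSeqNo (inclusive bound)
     * and returns the sequence numbers that were applied, in translog order.
     */
    static List<Long> replay(List<SketchOp> translog, long recoverUpToSeqNo) {
        List<Long> applied = new ArrayList<>();
        for (SketchOp op : translog) {
            if (op.seqNo > recoverUpToSeqNo) {
                // Skip rather than stop: later entries may still fall within the bound.
                continue;
            }
            applied.add(op.seqNo); // a real engine would index or delete the document here
        }
        return applied;
    }
}
```

Passing `Long.MAX_VALUE` as the bound replays everything, which corresponds to the unbounded recovery that existed before this change.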

Relates #32867

/cc @bleskes

elasticmachine · 2018-08-21T17:37:59Z

Pinging @elastic/es-distributed

ywelsch

I've left a few comments

ywelsch · 2018-08-21T20:33:57Z

server/src/main/java/org/elasticsearch/index/engine/Engine.java

@@ -1623,10 +1623,10 @@ public long getLastWriteNanos() {
    public abstract int fillSeqNoGaps(long primaryTerm) throws IOException;

    /**
-     * Performs recovery from the transaction log.
+     * Performs recovery from the transaction log up to {@code recoverUpToSeqNo}.


can you add that it's an inclusive bound? (i.e. up to XYZ included)

ywelsch · 2018-08-21T20:39:48Z

server/src/main/java/org/elasticsearch/index/translog/Translog.java

        }
    }

-    public Snapshot newSnapshotFromGen(long minGeneration) throws IOException {
+    public Snapshot newSnapshotFromGen(TranslogGeneration fromGeneration, long upToSeqNo) throws IOException {


why did you change this to take a TranslogGeneration with the uuid instead of just the long minGeneration? It's not using that uuid anywhere here AFAICS.

If both parameters were longs, this method could be read as taking either a range of translog generations or a range of sequence numbers. I changed the first parameter to TranslogGeneration to avoid this ambiguity. I will revert this change if you don't like it.

I think it's good though; it's less likely to be misused.

ok, makes sense.
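
The argument in this thread can be illustrated with a small sketch. The names below (`GenerationSketch`, `SnapshotApiSketch`, and both methods) are hypothetical and only mimic the shape of the signatures under discussion; they are not the Translog API.

```java
// Hypothetical stand-in for a generation holder; only the fields needed for the example.
final class GenerationSketch {
    final String translogUUID;
    final long translogFileGeneration;

    GenerationSketch(String translogUUID, long translogFileGeneration) {
        this.translogUUID = translogUUID;
        this.translogFileGeneration = translogFileGeneration;
    }
}

final class SnapshotApiSketch {
    // Two longs: swapping the arguments still compiles and silently changes the meaning.
    static String fromTwoLongs(long minGeneration, long upToSeqNo) {
        return "generation>=" + minGeneration + ", seqNo<=" + upToSeqNo;
    }

    // Distinct parameter types: a bare sequence number cannot be passed as the generation.
    static String fromGeneration(GenerationSketch fromGeneration, long upToSeqNo) {
        return "generation>=" + fromGeneration.translogFileGeneration + ", seqNo<=" + upToSeqNo;
    }
}
```

With the first form, `fromTwoLongs(42L, 3L)` is accepted even if the caller meant generation 3 and seqno 42; with the second form the equivalent mistake is a compile error.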

ywelsch · 2018-08-21T20:46:15Z

server/src/main/java/org/elasticsearch/index/translog/Translog.java

+            return new Snapshot() {
+                int skippedOps = 0;
+                @Override
+                public int totalOperations() {


also override overriddenOperations and delegate to snapshot.overriddenOperations?

good catch!

ywelsch · 2018-08-21T21:13:37Z

server/src/main/java/org/elasticsearch/index/translog/Translog.java

+            if (upToSeqNo == Long.MAX_VALUE) {
+                return snapshot;
+            }
+            return new Snapshot() {


I think we should have a proper (top-level) class for this, supporting both min and max. min would be useful for newSnapshotFromMinSeqNo (see e.g. PrimaryReplicaResyncer, which still has to filter based on min = startingSeqNo, all of which could be accomplished through the Snapshot), and max would be useful for this one here (where it might also have a min).

We might even make the interface of this newSnapshot method purely sequence-number-based, where you can specify the range of operations to recover instead of the translog generation. That last part is not something I would change right away, but maybe something to look into later.

I first added a filter(Predicate<Operation>) method to Snapshot for this purpose, but it felt too broad, so I went with the filter class that you suggested.
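
A rough sketch of what such a top-level filtering snapshot could look like follows. Everything here is an assumption made for illustration: `SnapshotSketch` is a cut-down, hypothetical interface that yields bare sequence numbers, and the class names are invented. The sketch supports both a min and a max bound (both inclusive), counts filtered operations as skipped, and delegates the other counters, including overriddenOperations as suggested in the earlier comment.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical minimal snapshot interface; next() returns a seq no, or null when exhausted.
interface SnapshotSketch {
    int totalOperations();
    int skippedOperations();
    int overriddenOperations();
    Long next();
}

// Simple in-memory snapshot used only to drive the example.
final class ListSnapshotSketch implements SnapshotSketch {
    private final Iterator<Long> it;
    private final int total;

    ListSnapshotSketch(List<Long> seqNos) {
        this.it = seqNos.iterator();
        this.total = seqNos.size();
    }

    public int totalOperations() { return total; }
    public int skippedOperations() { return 0; }
    public int overriddenOperations() { return 0; }
    public Long next() { return it.hasNext() ? it.next() : null; }
}

// Wraps another snapshot and only returns operations with minSeqNo <= seqNo <= maxSeqNo.
final class SeqNoRangeSnapshotSketch implements SnapshotSketch {
    private final SnapshotSketch delegate;
    private final long minSeqNo;
    private final long maxSeqNo;
    private int filtered = 0;

    SeqNoRangeSnapshotSketch(SnapshotSketch delegate, long minSeqNo, long maxSeqNo) {
        this.delegate = delegate;
        this.minSeqNo = minSeqNo;
        this.maxSeqNo = maxSeqNo;
    }

    public int totalOperations() { return delegate.totalOperations(); }
    public int skippedOperations() { return filtered + delegate.skippedOperations(); }
    public int overriddenOperations() { return delegate.overriddenOperations(); }

    public Long next() {
        Long seqNo;
        while ((seqNo = delegate.next()) != null) {
            if (seqNo >= minSeqNo && seqNo <= maxSeqNo) {
                return seqNo;
            }
            filtered++; // outside the requested range: count it and keep scanning
        }
        return null;
    }
}
```

Compared with a general predicate, a range-only filter keeps the contract narrow: callers can only express "operations between these two sequence numbers", which is the case both the resync and the recover-up-to paths need.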

dnhatn · 2018-08-22T00:31:41Z

@ywelsch I've addressed your comments. Would you please have another look? Thank you!

s1monw

LGTM

ywelsch

LGTM

dnhatn · 2018-08-22T11:57:19Z

Thanks @ywelsch and @s1monw.

* 6.x: Allow engine to recover from translog upto a seqno (#33032) TEST: Skip assertSeqNos for closed shards (#33130) TEST: resync operation on replica should acquire shard permit (#33103) Add proxy support to RemoteClusterConnection (#33062) Build: Line up IDE detection logic Security index expands to a single replica (#33131) Suppress more tests HLRC: request/response homogeneity and JavaDoc improvements (#33133) [Rollup] Move toAggCap() methods out of rollup config objects (#32583) Muted all these tests due to #33128 Fix race condition in scheduler engine test

dnhatn added the >enhancement, v7.0.0, :Distributed Indexing/Engine, and v6.5.0 labels Aug 21, 2018

dnhatn requested review from s1monw and ywelsch August 21, 2018 17:37

dnhatn requested a review from jasontedor August 21, 2018 17:37

dnhatn mentioned this pull request Aug 21, 2018

Reset replica engine before primary-replica resync #32867

Closed

ywelsch suggested changes Aug 21, 2018

feedback

4ee1ebc

dnhatn requested a review from ywelsch August 22, 2018 00:31

s1monw approved these changes Aug 22, 2018

ywelsch approved these changes Aug 22, 2018

dnhatn merged commit 262d3c0 into elastic:master Aug 22, 2018

dnhatn deleted the recover-upto-seqno branch August 22, 2018 11:57

dnhatn added the backport pending label Aug 22, 2018

dnhatn removed the backport pending label Aug 26, 2018

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
