
Prevent CCR recovery from missing documents #38237

Merged (19 commits) on Feb 5, 2019

Conversation

Tim-Brooks (Contributor)

Currently the snapshot/restore process manually sets the global
checkpoint to the max sequence number from the restored segments. This
does not work for CCR, as it prevents documents that would otherwise be
recovered by the normal following operation from being recovered.

This commit fixes this issue by setting the initial global checkpoint to
the existing local checkpoint.
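
As a concrete illustration (the numbers below are made up for this example, not taken from the PR): suppose the restored commit's max sequence number is 100 but its local checkpoint is 90, i.e. some operations in 91..99 are missing from the commit. Bootstrapping the global checkpoint to 100 tells the follower it already has everything up to 100, so the missing operations are never fetched from the leader; bootstrapping it to 90 lets the normal following operation replay them. A minimal standalone sketch of that choice (hypothetical helper, not the actual Store/StoreRecovery code):

    final class GlobalCheckpointBootstrap {

        /**
         * Pick the global checkpoint to bootstrap a restored follower shard with.
         * Using the max sequence number would claim operations above the local
         * checkpoint are already present even though the commit has gaps there;
         * using the local checkpoint lets the follow task fetch them from the leader.
         */
        static long initialGlobalCheckpoint(long localCheckpoint, long maxSeqNo) {
            assert localCheckpoint <= maxSeqNo;
            return localCheckpoint; // previously: return maxSeqNo;
        }

        public static void main(String[] args) {
            // Example: commit's max seq no is 100, local checkpoint is 90.
            System.out.println(initialGlobalCheckpoint(90, 100)); // prints 90
        }
    }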

@Tim-Brooks Tim-Brooks added >bug v7.0.0 :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features v6.7.0 labels Feb 1, 2019
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed

@Tim-Brooks (Contributor, Author)

This is not ready for production. I pushed it up here so we can discuss.

This commit:

  1. Adds a test that will occasionally fail because the local checkpoint != the max sequence number. Currently the test is set to run 5 times to guarantee consistent failures. However, that will be removed eventually.

  2. Comments out store.bootstrapNewHistory(). This method manually sets the local checkpoint == the max sequence number. We need to discuss how to deal with this, as I think bootstrapNewHistory() probably offers some behavior we want to retain (even if the sequence number part does not work for CCR).

  3. Attempts to advance the global checkpoint to the local checkpoint after putting no-ops (for the normal non-CCR restore process). However, that cannot happen here, as the replication tracker is not yet in primary mode (replicationTracker.isPrimaryMode() is false) and that triggers an assertion. I'm not sure we need to advance the global checkpoint in StoreRecovery#restore at all, since when the replication tracker goes to primary mode it will advance the global checkpoint there (see the excerpt and the standalone sketch below). But this is something we should discuss.

    /**
     * Initializes the global checkpoint tracker in primary mode (see {@link #primaryMode}. Called on primary activation or promotion.
     */
    public synchronized void activatePrimaryMode(final long localCheckpoint) {
        assert invariant();
        assert primaryMode == false;
        assert checkpoints.get(shardAllocationId) != null && checkpoints.get(shardAllocationId).inSync &&
            checkpoints.get(shardAllocationId).localCheckpoint == SequenceNumbers.UNASSIGNED_SEQ_NO :
            "expected " + shardAllocationId + " to have initialized entry in " + checkpoints + " when activating primary";
        assert localCheckpoint >= SequenceNumbers.NO_OPS_PERFORMED;
        primaryMode = true;
        updateLocalCheckpoint(shardAllocationId, checkpoints.get(shardAllocationId), localCheckpoint);
        updateGlobalCheckpointOnPrimary();
        assert invariant();
    }
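
To make point 3 concrete, here is a small self-contained illustration (not Elasticsearch's LocalCheckpointTracker) of why indexing no-ops for the missing sequence numbers lets the local checkpoint catch up to the max sequence number, after which advancing the global checkpoint becomes safe:

    import java.util.BitSet;

    final class CheckpointGapDemo {

        /** The local checkpoint is the highest seq no up to which everything is completed. */
        static int localCheckpoint(BitSet completed) {
            int checkpoint = -1;
            while (completed.get(checkpoint + 1)) {
                checkpoint++;
            }
            return checkpoint;
        }

        public static void main(String[] args) {
            BitSet completed = new BitSet();
            // The restored commit contains seq nos 0..5 and 8..10; 6 and 7 are gaps.
            for (int seqNo : new int[] {0, 1, 2, 3, 4, 5, 8, 9, 10}) {
                completed.set(seqNo);
            }
            System.out.println(localCheckpoint(completed)); // 5: stuck below the gaps

            // Fill the gaps with no-ops, as the non-CCR restore path does.
            completed.set(6);
            completed.set(7);
            System.out.println(localCheckpoint(completed)); // 10 == max seq no
        }
    }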

@ywelsch (Contributor) commented Feb 4, 2019

@tbrooks8 I've fixed things up along the lines of how I think we should handle this. I've also added a unit test. The integration test testFollowIndexWithConcurrentMappingChanges sometimes fails, unrelated to the changes I've made here. It has some problems with concurrent mapping updates (it tries to update the same field both as text and as long).

While this fixes the problem here, I think it exposes another issue, namely that the primary will start off (i.e. be marked as started) with a history that contains gaps, i.e. local checkpoint != max sequence number. This can turn out to be problematic for replicas, because peer recovery only completes if all gaps are filled on the primary (see the call to cancellableThreads.execute(() -> shard.waitForOpsToComplete(endingSeqNo)); in RecoverySourceHandler). This means that if someone does a PUT FOLLOW with wait_for_active_shards > 1 and we restore an IndexCommit with gaps, then the wait condition will time out, as we only start the following (which will fill the gaps) once the wait condition has passed. I think we should have initiateFollowing first resume the following, and then execute the wait condition (see the sketch below). This problem should be verifiable by extending the existing test with a wait_for_active_shards > 1 condition.
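
A standalone sketch of the ordering change suggested above; the names resumeFollowing and waitForActiveShards are illustrative placeholders, not the actual TransportPutFollowAction code:

    import java.util.function.Consumer;

    final class InitiateFollowingOrder {

        /**
         * Resume the follow task before evaluating the wait_for_active_shards
         * condition: the follow task fills the sequence-number gaps on the primary,
         * which is what replica peer recovery (and thus the wait condition) needs.
         */
        static void initiateFollowing(Runnable resumeFollowing,
                                      Consumer<Runnable> waitForActiveShards,
                                      Runnable respond) {
            resumeFollowing.run();               // 1. start filling gaps from the leader
            waitForActiveShards.accept(respond); // 2. only then wait for active shards
        }

        public static void main(String[] args) {
            initiateFollowing(
                () -> System.out.println("follow task resumed"),
                onActive -> { System.out.println("wait_for_active_shards satisfied"); onActive.run(); },
                () -> System.out.println("PUT follow acknowledged"));
        }
    }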

@ywelsch ywelsch changed the title WIP: Prevent Ccr recovery from missing documents Prevent CCR recovery from missing documents Feb 4, 2019

@ywelsch ywelsch left a comment


LGTM

@Tim-Brooks (Contributor, Author)

@ywelsch your changes look good to me. There is a test failing (reproducibly). It looks like we assert that the sequence numbers match between Lucene and the translog. Did you mean for this to be the assertion for testRestoreShard:

closeShard(target, false);
closeShards(source);

instead of:

closeShard(source, false);
closeShards(target);

I assume we should not assert that the ops are the same for the target, since there will be some no-ops? I can update the PR; I just wanted to check whether that is what you intended.

@ywelsch (Contributor) commented Feb 5, 2019

No, I only wanted the source index to be leniently closed (because we forcefully inject a gap). The problem was that the test was not ensuring that the index is restored with soft deletes enabled when it was snapshotted with soft deletes enabled. I've fixed this now.
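
For reference, a hedged sketch of the kind of setting involved (not the exact test change, and it assumes the Elasticsearch Settings builder is on the classpath): when the source shard is snapshotted with soft deletes enabled, the restore target in the test needs the same index.soft_deletes.enabled value, otherwise the Lucene and translog operations cannot be expected to line up.

    import org.elasticsearch.common.settings.Settings;

    final class RestoreTargetSettingsSketch {
        // Carry the source's soft-deletes setting over to the restore target.
        static Settings restoreTargetSettings(boolean sourceUsesSoftDeletes) {
            return Settings.builder()
                .put("index.soft_deletes.enabled", sourceUsesSoftDeletes)
                .build();
        }
    }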

@ywelsch (Contributor) commented Feb 5, 2019

@elasticmachine run elasticsearch-ci/default-distro

@ywelsch (Contributor) commented Feb 5, 2019

@elasticmachine run elasticsearch-ci/2

@Tim-Brooks Tim-Brooks merged commit c2a8fe1 into elastic:master Feb 5, 2019
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Feb 5, 2019
* master: (23 commits)
  Lift retention lease expiration to index shard (elastic#38380)
  Make Ccr recovery file chunk size configurable (elastic#38370)
  Prevent CCR recovery from missing documents (elastic#38237)
  re-enables awaitsfixed datemath tests (elastic#38376)
  Types removal fix FullClusterRestartIT warnings (elastic#38445)
  Make sure to reject mappings with type _doc when include_type_name is false. (elastic#38270)
  Updates the grok patterns to be consistent with logstash (elastic#27181)
  Ignore type-removal warnings in XPackRestTestHelper (elastic#38431)
  testHlrcFromXContent() should respect assertToXContentEquivalence() (elastic#38232)
  add basic REST test for geohash_grid (elastic#37996)
  Remove DiscoveryPlugin#getDiscoveryTypes (elastic#38414)
  Fix the clock resolution to millis in GetWatchResponseTests (elastic#38405)
  Throw AssertionError when no master (elastic#38432)
  `if_seq_no` and `if_primary_term` parameters aren't wired correctly in REST Client's CRUD API (elastic#38411)
  Enable CronEvalToolTest.testEnsureDateIsShownInRootLocale (elastic#38394)
  Fix failures in BulkProcessorIT#testGlobalParametersAndBulkProcessor. (elastic#38129)
  SQL: Implement CURRENT_DATE (elastic#38175)
  Mute testReadRequestsReturnLatestMappingVersion (elastic#38438)
  [ML] Report index unavailable instead of waiting for lazy node (elastic#38423)
  Update Rollup Caps to allow unknown fields (elastic#38339)
  ...
Tim-Brooks added a commit to Tim-Brooks/elasticsearch that referenced this pull request Feb 5, 2019
Currently the snapshot/restore process manually sets the global
checkpoint to the max sequence number from the restored segments. This
does not work for CCR, as it prevents documents that would otherwise be
recovered by the normal following operation from being recovered.

This commit fixes this issue by setting the initial global checkpoint to
the existing local checkpoint.
@Tim-Brooks Tim-Brooks deleted the ccr_initial_global_checkpoint branch December 18, 2019 14:48
Labels
>bug :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features v6.7.0 v7.0.0-beta1
4 participants