Implement CCR bootstrap from remote #35975

Tim-Brooks · 2018-11-28T04:30:42Z

CCR Bootstrap from Remote

Pre-feature freeze

Post-feature freeze

cjcenizal · 2018-12-04T19:25:33Z

@tbrooks8 @sebelga and I were wondering if the API could provide any information about the bootstrapping process, which we could display in the UI? For example, whether or not bootstrapping is in progress, how many documents have been replicated, how many remain, and whether there have been any errors.

This is related to #35975. It implements a basic restore functionality for the CcrRepository. When the restore process is kicked off, it configures the new index as expected for a follower index. This means that the index has a different uuid, the version is not incremented, and the Ccr metadata is installed. When the restore shard method is called, an empty shard is initialized.
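As a rough illustration of the three properties listed above (new UUID, unchanged version, CCR metadata installed), here is a minimal Python sketch rather than the actual Java in `CcrRepository`. The dictionary layout and the key names (`custom`, `ccr`, `remote_cluster`, `leader_index_name`, `leader_index_uuid`) are assumptions made for the example.

```python
import uuid


def build_follower_metadata(leader, remote_cluster, follower_name):
    """Derive follower index metadata from a leader's metadata dict.

    All field names here are hypothetical; they only mirror the three
    properties described in the PR text.
    """
    return {
        "name": follower_name,
        # The follower gets its own UUID rather than reusing the leader's.
        "uuid": uuid.uuid4().hex,
        # The version is carried over as-is, not incremented.
        "version": leader["version"],
        "mappings": dict(leader["mappings"]),
        # CCR metadata records where the follower came from.
        "custom": {
            "ccr": {
                "remote_cluster": remote_cluster,
                "leader_index_name": leader["name"],
                "leader_index_uuid": leader["uuid"],
            }
        },
    }


leader = {"name": "logs", "uuid": "leader-uuid-1", "version": 7,
          "mappings": {"message": "text"}}
follower = build_follower_metadata(leader, "cluster-a", "logs-copy")
print(follower["uuid"] != leader["uuid"], follower["version"])
```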

jen-huang · 2018-12-19T20:27:56Z

@tbrooks8 Do you have any information regarding CJ's questions ☝️ about what kind of information we could display in the UI? We are trying to figure this out for the 6.7 UI work, so any preliminary info/docs about new APIs or changes to existing ones would be appreciated. Thank you!

Tim-Brooks · 2018-12-19T20:43:50Z

My work does not currently include any new external APIs. The recovery from remote is implemented as a normal recovery (through a repository). There are pre-existing APIs for recoveries: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-recovery.html.

If we need new APIs for information, that work still needs to be scoped out and those discussions should include people like @ywelsch and @jasontedor. My work primarily involves transferring segment files from the leader to the follower through our existing recovery infrastructure.
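For the UI question, one possible starting point is to summarize what the existing recovery API already returns. The sketch below is a hedged Python example, not part of this work: it assumes the response is keyed by index name with a `shards` list, and that each shard carries `id`, `type`, `stage`, and `index.size.total_in_bytes` / `recovered_in_bytes`. Those field names should be checked against the documentation linked above, and the sample values are made up.

```python
def summarize_recovery(response):
    """Flatten a recovery-style response into one progress row per shard."""
    rows = []
    for index_name, index_info in response.items():
        for shard in index_info.get("shards", []):
            size = shard.get("index", {}).get("size", {})
            total = size.get("total_in_bytes", 0)
            recovered = size.get("recovered_in_bytes", 0)
            # Treat "nothing to copy" as complete to avoid dividing by zero.
            percent = 100.0 if total == 0 else round(100.0 * recovered / total, 1)
            rows.append({
                "index": index_name,
                "shard": shard.get("id"),
                "type": shard.get("type"),
                "stage": shard.get("stage"),
                "in_progress": shard.get("stage") != "DONE",
                "percent_bytes": percent,
            })
    return rows


# Hand-written sample shaped like the assumed response; values are invented.
sample = {
    "follower-index": {
        "shards": [
            {"id": 0, "type": "SNAPSHOT", "stage": "INDEX",
             "index": {"size": {"total_in_bytes": 2000, "recovered_in_bytes": 500}}},
            {"id": 1, "type": "SNAPSHOT", "stage": "DONE",
             "index": {"size": {"total_in_bytes": 1000, "recovered_in_bytes": 1000}}},
        ]
    }
}

for row in summarize_recovery(sample):
    print(row)
```

A UI could poll the real endpoint and feed the parsed JSON into a function like this to show whether bootstrapping is still in progress.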

sebelga · 2018-12-20T07:59:05Z

@tbrooks8 thanks for clarifying. This sounds like something to add in the detail panel of Index Management under the Summary tab. What do you think @bmcconaghy @yaronp68 ?

This is related to #35975. When the shard restore process is complete, the index mappings need to be updated to ensure that the data in the restored files is compatible with the follower mappings. This commit implements a mapping update as the final step in a shard restore.

This is related to #35975. It implements a file based restore in the CcrRepository. The restore transfers files from the leader cluster to the follower cluster. It does not implement any advanced resiliency features at the moment. Any request failure will end the restore.
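The fail-fast behaviour described above can be sketched as follows. This is an illustrative Python model, not the Java implementation: `fetch_chunk`, `restore_file`, `restore_shard`, and `RestoreFailed` are hypothetical names, and the chunk size is arbitrary.

```python
class RestoreFailed(Exception):
    """Raised when any chunk request fails; the whole restore is abandoned."""


def restore_file(fetch_chunk, file_name, file_length, chunk_size=4):
    """Copy one file chunk by chunk using fetch_chunk(name, offset, size)."""
    data = bytearray()
    offset = 0
    while offset < file_length:
        size = min(chunk_size, file_length - offset)
        try:
            chunk = fetch_chunk(file_name, offset, size)
        except Exception as exc:
            # No retry and no resume: the first failed request ends the restore.
            raise RestoreFailed(f"{file_name} failed at offset {offset}") from exc
        if len(chunk) != size:
            raise RestoreFailed(f"{file_name} short read at offset {offset}")
        data.extend(chunk)
        offset += size
    return bytes(data)


def restore_shard(fetch_chunk, files):
    """files maps file name -> length in bytes; returns name -> contents."""
    return {name: restore_file(fetch_chunk, name, length)
            for name, length in files.items()}


# In-memory stand-in for the leader cluster's shard files.
leader_files = {"segments_1": b"0123456789", "_0.cfs": b"abcdef"}


def fetch_from_leader(file_name, offset, size):
    return leader_files[file_name][offset:offset + size]


restored = restore_shard(
    fetch_from_leader, {name: len(body) for name, body in leader_files.items()})
print(restored == leader_files)
```

More resilient behaviour (retries, resuming from an offset) would slot into the `except` branch.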

This is related to elastic#35975. This commit adds timeout functionality to the local session on a leader node. When a session is started, a timeout is scheduled using a repeating runnable. If the session is not accessed between two runs, the session is closed. When the session is closed, the repeating task is cancelled.
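A minimal sketch of this idle-timeout scheme, in Python and with hypothetical names (`SessionRegistry`, `run_timeout_check`): instead of a real scheduler, each call to `run_timeout_check` stands in for one run of the repeating task, which keeps the example deterministic. It also assumes a session that is never accessed is closed on the first run after it is opened.

```python
class SessionRegistry:
    """Tracks open sessions and whether each one's repeating check is scheduled."""

    def __init__(self):
        self._accessed = {}      # session id -> accessed since the last check?
        self._scheduled = set()  # ids whose repeating check is still scheduled

    def open_session(self, session_id):
        self._accessed[session_id] = False
        self._scheduled.add(session_id)

    def access(self, session_id):
        # Raises KeyError if the session has already been closed.
        if session_id not in self._accessed:
            raise KeyError(session_id)
        self._accessed[session_id] = True

    def run_timeout_check(self, session_id):
        """One run of the repeating task for this session."""
        if session_id not in self._scheduled:
            return  # task was cancelled when the session closed
        if self._accessed[session_id]:
            # Accessed since the previous run: keep it and reset the flag.
            self._accessed[session_id] = False
        else:
            self.close_session(session_id)

    def close_session(self, session_id):
        self._accessed.pop(session_id, None)
        # Closing the session also cancels its repeating check.
        self._scheduled.discard(session_id)

    def is_open(self, session_id):
        return session_id in self._accessed

    def is_scheduled(self, session_id):
        return session_id in self._scheduled


registry = SessionRegistry()
registry.open_session("s1")
registry.access("s1")
registry.run_timeout_check("s1")   # accessed since opening: stays open
print(registry.is_open("s1"))
registry.run_timeout_check("s1")   # no access between the two runs: closed
print(registry.is_open("s1"), registry.is_scheduled("s1"))
```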

This is related to elastic#35975. This commit implements rate limiting on the follower side using the `RateLimitingInputStream`.
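To show the general idea of throttling reads on the receiving side, here is a hedged Python sketch. It is not the `RateLimitingInputStream` class itself: `SimpleRateLimiter` and `RateLimitedReader` are hypothetical names, and the limiter is deliberately naive (it pauses for `bytes / rate` seconds and ignores how long the reads themselves took).

```python
import io
import time


class SimpleRateLimiter:
    """Naive limiter: pause long enough that bytes/second stays under the cap."""

    def __init__(self, bytes_per_second, min_pause_check_bytes):
        self.bytes_per_second = bytes_per_second
        self.min_pause_check_bytes = min_pause_check_bytes

    def pause_seconds(self, num_bytes):
        return num_bytes / self.bytes_per_second


class RateLimitedReader:
    """Wraps a binary stream and pauses after enough bytes have been read."""

    def __init__(self, stream, limiter, sleep=time.sleep, on_pause=None):
        self._stream = stream
        self._limiter = limiter
        self._sleep = sleep          # injectable so tests need not really sleep
        self._on_pause = on_pause    # optional listener, e.g. for throttle stats
        self._bytes_since_pause = 0

    def read(self, size=-1):
        chunk = self._stream.read(size)
        self._bytes_since_pause += len(chunk)
        # Only consult the limiter once enough bytes have accumulated.
        if chunk and self._bytes_since_pause >= self._limiter.min_pause_check_bytes:
            pause = self._limiter.pause_seconds(self._bytes_since_pause)
            self._bytes_since_pause = 0
            self._sleep(pause)
            if self._on_pause is not None:
                self._on_pause(pause)
        return chunk


pauses = []
reader = RateLimitedReader(
    io.BytesIO(b"0123456789"),
    SimpleRateLimiter(bytes_per_second=100, min_pause_check_bytes=4),
    sleep=lambda seconds: None,   # skip real sleeping in the demo
    on_pause=pauses.append,
)
received = b""
while True:
    chunk = reader.read(3)
    if not chunk:
        break
    received += chunk
print(received, len(pauses))
```

Checking the limiter only after a minimum number of bytes avoids paying the pause bookkeeping on every small read.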

…hard
* Changed the shard changes API to include special metadata in the exception being thrown, to indicate that the ops are no longer there.
* Changed ShardFollowNodeTask to handle this exception with special metadata and mark a shard as fallen behind its leader shard. The shard follow task will then abort its ongoing replication.

The code that does the restore from the CCR repository still needs to be added. This change should make that change a bit easier. Relates to elastic#35975
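The exception-metadata handshake described in this change can be modelled roughly like this. It is a Python sketch with invented names: `OPS_MISSING_KEY`, `ShardChangesError`, and `FollowTask` are hypothetical, and the real metadata key used by the shard changes API is not given in this thread.

```python
OPS_MISSING_KEY = "requested_operations_missing"  # hypothetical metadata key


class ShardChangesError(Exception):
    """Error from the leader that can carry extra metadata entries."""

    def __init__(self, message, metadata=None):
        super().__init__(message)
        self.metadata = dict(metadata or {})


class FollowTask:
    """Polls the leader for operations and tracks whether it has fallen behind."""

    def __init__(self, fetch_ops):
        self._fetch_ops = fetch_ops
        self.applied = []
        self.fallen_behind = False
        self.running = True

    def poll(self, from_seq_no):
        if not self.running:
            return
        try:
            ops = self._fetch_ops(from_seq_no)
        except ShardChangesError as err:
            if OPS_MISSING_KEY in err.metadata:
                # The leader no longer has the requested ops: mark the shard
                # as fallen behind and abort the ongoing replication.
                self.fallen_behind = True
                self.running = False
                return
            raise  # any other failure is not handled here
        self.applied.extend(ops)


def leader_fetch(from_seq_no):
    # Pretend the leader has already discarded everything below seq no 5.
    if from_seq_no < 5:
        raise ShardChangesError("ops no longer available", {OPS_MISSING_KEY: True})
    return [from_seq_no, from_seq_no + 1]


task = FollowTask(leader_fetch)
task.poll(5)
task.poll(2)
print(task.applied, task.fallen_behind, task.running)
```

Once a task is marked as fallen behind, a restore from the CCR repository would be the natural next step, which this sketch leaves out.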

This is related to elastic#35975. We do not want a slow master to fail a recovery from remote process due to a slow put mappings call. This commit increases the master node timeout on this call to 30 mins.

Relates elastic#35975 Relates elastic#39006

We simulate remote recovery in ShardFollowTaskReplicationTests by bootstrapping the follower with the safe commit of the leader. Relates #35975

This is related to #35975. It adds documentation on the remote recovery process. Additionally, it adds documentation about the various settings that can impact the process.

ywelsch · 2020-07-23T08:05:44Z

@tbrooks8 can you check what remains to be done here so that we can close this issue?

Tim-Brooks · 2021-10-06T14:09:40Z

Closing as all of the relevant tasks have been completed.

Tim-Brooks added the Meta, v7.0.0, :Distributed Indexing/CCR, and v6.6.0 labels Nov 28, 2018

Tim-Brooks self-assigned this Nov 28, 2018

Tim-Brooks changed the title ~~Implement CCR bootstrap from Remote~~ Implement CCR bootstrap from remote Nov 28, 2018

Tim-Brooks added the >enhancement label Nov 30, 2018

Tim-Brooks mentioned this issue Dec 6, 2018

Implement basic CcrRepository restore #36287

Merged

Tim-Brooks mentioned this issue Dec 12, 2018

Implement basic CcrRepository restore (#36287) #36551

Merged

Tim-Brooks added v6.7.0 and removed v6.6.0 labels Dec 18, 2018

Tim-Brooks mentioned this issue Dec 20, 2018

Update index mappings when ccr restore complete #36879

Merged

Tim-Brooks mentioned this issue Dec 20, 2018

Update index mappings when ccr restore complete (#36879) #36920

Merged

Tim-Brooks mentioned this issue Jan 4, 2019

Implement ccr file restore #37130

Merged

Tim-Brooks mentioned this issue Jan 14, 2019

Add local session timeouts to leader node #37438

Merged

Tim-Brooks added a commit to Tim-Brooks/elasticsearch that referenced this issue Jan 15, 2019

Implement follower rate limiting for file restore

adae8e8

This is related to elastic#35975. This commit implements rate limiting on the follower side using the `RateLimitingInputStream`.

Tim-Brooks mentioned this issue Jan 15, 2019

Implement follower rate limiting for file restore #37449

Merged

jasontedor added v8.0.0 and removed v7.0.0 labels Feb 6, 2019

danielmitterdorfer added v7.2.0 and removed v6.7.0 labels Feb 7, 2019

Tim-Brooks mentioned this issue Feb 8, 2019

Set update mappings mater node timeout to 30 min #38652

Merged

dnhatn added a commit to dnhatn/elasticsearch that referenced this issue Feb 17, 2019

Simulate remote recovery in ShardFollowTaskReplicationTests

8ae54ac

Relates elastic#35975 Relates elastic#39006

dnhatn mentioned this issue Feb 17, 2019

Add remote recovery to ShardFollowTaskReplicationTests #39007

Merged

dnhatn added a commit that referenced this issue Feb 18, 2019

Add remote recovery to ShardFollowTaskReplicationTests (#39007)

25503ef

We simulate remote recovery in ShardFollowTaskReplicationTests by bootstrapping the follower with the safe commit of the leader. Relates #35975

dnhatn added a commit that referenced this issue Feb 18, 2019

Add remote recovery to ShardFollowTaskReplicationTests (#39007)

2947ccf

We simulate remote recovery in ShardFollowTaskReplicationTests by bootstrapping the follower with the safe commit of the leader. Relates #35975

dnhatn added a commit that referenced this issue Feb 18, 2019

Add remote recovery to ShardFollowTaskReplicationTests (#39007)

62ed1aa

We simulate remote recovery in ShardFollowTaskReplicationTests by bootstrapping the follower with the safe commit of the leader. Relates #35975

dnhatn added a commit that referenced this issue Feb 18, 2019

Add remote recovery to ShardFollowTaskReplicationTests (#39007)

9dc8975

We simulate remote recovery in ShardFollowTaskReplicationTests by bootstrapping the follower with the safe commit of the leader. Relates #35975

dnhatn mentioned this issue Feb 19, 2019

Replay history of operations in remote recovery #39153

Closed

Tim-Brooks mentioned this issue Feb 27, 2019

Add documentation on remote recovery #39483

Merged

Tim-Brooks added a commit that referenced this issue Mar 5, 2019

Add documentation on remote recovery (#39483)

ee41b22

This is related to #35975. It adds documentation on the remote recovery process. Additionally, it adds documentation about the various settings that can impact the process.

Tim-Brooks added a commit that referenced this issue Mar 5, 2019

Add documentation on remote recovery (#39483)

ee7c019

This is related to #35975. It adds documentation on the remote recovery process. Additionally, it adds documentation about the various settings that can impact the process.

Tim-Brooks added a commit that referenced this issue Mar 5, 2019

Add documentation on remote recovery (#39483)

19fa399

This is related to #35975. It adds documentation on the remote recovery process. Additionally, it adds documentation about the various settings that can impact the process.

Tim-Brooks added a commit that referenced this issue Mar 5, 2019

Add documentation on remote recovery (#39483)

5e6953a

This is related to #35975. It adds documentation on the remote recovery process. Additionally, it adds documentation about the various settings that can impact the process.

Tim-Brooks mentioned this issue Mar 20, 2019

Expand following documentation in ccr overview #39936

Merged

jakelandis added v7.3.0 and removed v7.2.0 labels Jun 17, 2019

jpountz removed v7.3.0 v8.0.0 labels Jul 5, 2019

rjernst added the Team:Distributed (Obsolete) label May 4, 2020

Tim-Brooks closed this as completed Oct 6, 2021
