Prevent snapshot backed indices to be followed using CCR #70580

Merged
tlrx merged 5 commits into elastic:master on Mar 24, 2021

Member

@tlrx tlrx commented Mar 18, 2021

Today nothing prevents CCR's auto-follow patterns from picking up snapshot-backed indices on a remote cluster. This can lead to various errors on the follower cluster that are not obvious for a user to troubleshoot (e.g. multiple engine factories provided).

This pull request adds checks to CCR so that it fails faster when a user tries to follow an index that is backed by a snapshot, and provides a more obvious error message.
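
As a rough illustration of the kind of check described above, the sketch below rejects a follow request when the leader index's settings mark it as snapshot-backed. It assumes that a mounted searchable snapshot index carries the `index.store.type` setting with the value `snapshot`. The class `FollowValidator`, its methods, and the plain `Map` standing in for index settings are hypothetical and are not the code from this pull request; only the error message is taken from the diff under review.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not the actual CCR code: index settings are modelled as a plain Map.
final class FollowValidator {

    // Assumption: mounted searchable snapshot indices use the "snapshot" store type.
    static final String STORE_TYPE_SETTING = "index.store.type";
    static final String SNAPSHOT_STORE_TYPE = "snapshot";

    private FollowValidator() {}

    // Returns true when the settings describe an index backed by a snapshot.
    static boolean isSnapshotBacked(Map<String, String> indexSettings) {
        return SNAPSHOT_STORE_TYPE.equals(indexSettings.get(STORE_TYPE_SETTING));
    }

    // Fails fast with an explicit message instead of letting the follower hit an obscure error later.
    static void validateLeaderIndex(String indexName, Map<String, String> indexSettings) {
        if (isSnapshotBacked(indexSettings)) {
            throw new IllegalArgumentException(
                String.format(
                    Locale.ROOT,
                    "index to follow [%s] is a searchable snapshot index and cannot be used for cross-cluster replication purpose",
                    indexName
                )
            );
        }
    }
}
```

Calling `FollowValidator.validateLeaderIndex("mounted-logs", Map.of("index.store.type", "snapshot"))` throws, while the same call with a regular store type returns normally.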

@tlrx tlrx added >enhancement :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features v8.0.0 v7.13.0 v7.12.1 labels Mar 18, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Mar 18, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@tlrx
Member Author

tlrx commented Mar 23, 2021

The build failure is #70621; I'll merge master again to pick up the test muting.

@tlrx tlrx requested a review from henningandersen March 23, 2021 10:03
Contributor

@henningandersen henningandersen left a comment

LGTM.

"index to follow [%s] is a searchable snapshot index and cannot be used for cross-cluster replication purpose",
indexToFollow.getName()
);
LOGGER.warn(message);
Contributor

I wonder if this needs to be a warning? It seems like a documented behavior and therefore this could be just DEBUG?

I wonder if we should update docs for auto-follow to state that searchable snapshots cannot be followed and will be ignored by auto-follow patterns?

Member Author

> I wonder if this needs to be a warning? It seems like a documented behavior and therefore this could be just DEBUG?

I did not find any documentation related to this, hence the initial WARN level, but maybe you found something in the CCR docs? I pushed e8ab12a to move it to DEBUG.

> I wonder if we should update docs for auto-follow to state that searchable snapshots cannot be followed and will be ignored by auto-follow patterns?

I think we should, and I'd like to do a follow-up PR for this.

Contributor

I did not see any such info; I just thought it was more logical to simply ignore these than to warn, in particular because of the issues with using `*` patterns when auto-following, like #67686.

Member Author

Thanks, I agree.
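
The outcome of this thread, quietly skipping snapshot-backed indices that match an auto-follow pattern and logging at a debug level rather than warning, could look roughly like the sketch below. Everything in it is an assumption made for illustration: `AutoFollowFilter` and its methods are hypothetical, index settings are modelled as plain maps, the `index.store.type` value `snapshot` is assumed to identify mounted indices, and `java.util.logging` at `Level.FINE` stands in for a DEBUG logger.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch, not the actual auto-follow coordinator.
final class AutoFollowFilter {

    private static final Logger LOGGER = Logger.getLogger(AutoFollowFilter.class.getName());

    private AutoFollowFilter() {}

    // Converts a simple wildcard pattern such as "logs-*" into a regular expression.
    static Pattern toRegex(String wildcard) {
        String regex = Arrays.stream(wildcard.split("\\*", -1))
            .map(Pattern::quote)
            .collect(Collectors.joining(".*"));
        return Pattern.compile(regex);
    }

    // Returns the sorted names of leader indices that match the pattern and are not snapshot-backed.
    static List<String> indicesToFollow(String wildcard, Map<String, Map<String, String>> leaderIndices) {
        Pattern pattern = toRegex(wildcard);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : leaderIndices.entrySet()) {
            String name = entry.getKey();
            if (pattern.matcher(name).matches() == false) {
                continue;
            }
            // Assumption: the "snapshot" store type marks a mounted searchable snapshot index.
            if ("snapshot".equals(entry.getValue().get("index.store.type"))) {
                // Debug-level only: a broad pattern such as "*" may match many mounted indices.
                LOGGER.log(Level.FINE, "skipping searchable snapshot index [{0}]", name);
                continue;
            }
            result.add(name);
        }
        Collections.sort(result);
        return result;
    }
}
```

With a `*` pattern, a mounted index is simply left out of the result instead of producing one warning per poll.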

final String mountedIndex = "mounted-" + testPrefix;
if ("leader".equals(targetCluster)) {
final String repositoryPath = System.getProperty("tests.leader_cluster_repository_path") + '/' + testPrefix;
assertThat("Missing system property [tests.leader_cluster_repository_path]", repositoryPath, not(emptyOrNullString()));
Contributor

I think this assertion can never fail? Did you mean to check that the system property is set?

Member Author

Thanks for catching this!

Indeed, I wanted to check that the sysprop exists, but I later added a suffix, which made this assertion useless. Fixing the assertion caught a bug where the sysprop was not correctly passed everywhere, so I pushed a2a5fdf to fix all of that.

{
try (var leaderClient = buildLeaderClient()) {
final String repositoryPath = System.getProperty("tests.leader_cluster_repository_path") + '/' + testPrefix;
assertThat("Missing system property [tests.leader_cluster_repository_path]", repositoryPath, not(emptyOrNullString()));
Contributor

I think you meant to check the value of the system property and not the value of the concatenated string (which cannot be empty).

Member Author

I pushed a2a5fdf
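
The problem raised in these two threads is that the assertion ran on the concatenated string, which is never empty or null: a missing property simply yields a path starting with `null/`. A minimal sketch of the corrected ordering follows; `RepositoryPaths` and `resolve` are hypothetical names chosen for illustration, and this is not the code pushed in a2a5fdf.

```java
// Hypothetical helper illustrating the corrected check order.
final class RepositoryPaths {

    private RepositoryPaths() {}

    // Validates the raw system property first, then appends the per-test suffix.
    static String resolve(String propertyName, String testPrefix) {
        String basePath = System.getProperty(propertyName);
        // Check the raw value: once a suffix is appended, the string can no longer be empty or null.
        if (basePath == null || basePath.isEmpty()) {
            throw new AssertionError("Missing system property [" + propertyName + "]");
        }
        return basePath + '/' + testPrefix;
    }
}
```

In a test this might be called as `RepositoryPaths.resolve("tests.leader_cluster_repository_path", testPrefix)`, so that a property that was not passed to the test JVM fails immediately with a clear message.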

@tlrx tlrx merged commit efa6aea into elastic:master Mar 24, 2021
@tlrx tlrx deleted the ccr-with-searchable-snapshots branch March 24, 2021 09:58
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Mar 24, 2021
Today nothing prevents CCR's auto-follow patterns from picking
up snapshot-backed indices on a remote cluster. This can
lead to various errors on the follower cluster that are not
obvious for a user to troubleshoot (e.g. multiple engine
factories provided).

This commit adds checks to CCR so that it fails faster
when a user tries to follow an index that is backed by a
snapshot, and provides a more obvious error message.

Backport of elastic#70580
@tlrx
Member Author

tlrx commented Mar 24, 2021

Thanks Henning!

tlrx added a commit that referenced this pull request Mar 24, 2021
Today nothing prevents CCR's auto-follow patterns from picking
up snapshot-backed indices on a remote cluster. This can
lead to various errors on the follower cluster that are not
obvious for a user to troubleshoot (e.g. multiple engine
factories provided).

This commit adds checks to CCR so that it fails faster
when a user tries to follow an index that is backed by a
snapshot, and provides a more obvious error message.

Backport of #70580
tlrx added a commit that referenced this pull request Apr 6, 2021
…70863)

This commit adds a note to the CCR documentation stating that
auto-follow patterns should not match searchable snapshot indices.

Relates #70580 (comment)
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Apr 6, 2021
…lastic#70863)

This commit adds a note to the CCR documentation stating that
auto-follow patterns should not match searchable snapshot indices.

Relates elastic#70580 (comment)
tlrx added a commit that referenced this pull request Apr 6, 2021
…70863) (#71319)

This commit adds a note to the CCR documentation stating that
auto-follow patterns should not match searchable snapshot indices.

Relates #70580 (comment)
tlrx added a commit that referenced this pull request Apr 6, 2021
…70863) (#71320)

This commit adds a note to the CCR documentation stating that
auto-follow patterns should not match searchable snapshot indices.

Relates #70580 (comment)
tlrx added a commit that referenced this pull request Jun 28, 2021
The test AutoFollowIT.testAutoFollowSearchableSnapshotsFails was 
added in #70580 in order to test that mounted indices of a leader 
cluster are not auto-followed in a follower cluster using CCR.

This test sometimes fails because it expects 2 indices to be 
followed (the -regular and the -index indices) but not the mounted 
one. This looks wrong as the -index index is deleted soon after it 
is snapshotted, and this index only exists to create a snapshot that
can be later mounted as an index in the leader cluster.

This commit changes the test so that the -index index, the 
repository and the snapshot are created at the beginning of the 
test. Then the test creates the mounted index and the regular 
one and can now assert that only the regular one was
auto-followed.

Closes #74486
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jun 28, 2021
…4498)

The test AutoFollowIT.testAutoFollowSearchableSnapshotsFails was
added in elastic#70580 in order to test that mounted indices of a leader
cluster are not auto-followed in a follower cluster using CCR.

This test sometimes fails because it expects 2 indices to be
followed (the -regular and the -index indices) but not the mounted
one. This looks wrong as the -index index is deleted soon after it
is snapshotted, and this index only exists to create a snapshot that
can be later mounted as an index in the leader cluster.

This commit changes the test so that the -index index, the
repository and the snapshot are created at the beginning of the
test. Then the test creates the mounted index and the regular
one and can now assert that only the regular one was
auto-followed.

Closes elastic#74486
tlrx added a commit that referenced this pull request Jun 28, 2021
The test AutoFollowIT.testAutoFollowSearchableSnapshotsFails was
added in #70580 in order to test that mounted indices of a leader
cluster are not auto-followed in a follower cluster using CCR.

This test sometimes fails because it expects 2 indices to be
followed (the -regular and the -index indices) but not the mounted
one. This looks wrong as the -index index is deleted soon after it
is snapshotted, and this index only exists to create a snapshot that
can be later mounted as an index in the leader cluster.

This commit changes the test so that the -index index, the
repository and the snapshot are created at the beginning of the
test. Then the test creates the mounted index and the regular
one and can now assert that only the regular one was
auto-followed.

Backport of #74498
Closes #74486
Labels
:Distributed Indexing/CCR Issues around the Cross Cluster State Replication features >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.12.1 v7.13.0 v8.0.0-alpha1