[BUG] [Remote Store Migration] Remote index settings are not applied if replica count is decremented during an ongoing migration #14797

Closed
shourya035 opened this issue Jul 17, 2024 · 2 comments · Fixed by #14792
Labels: bug, Storage:Remote, untriaged

Comments

@shourya035
Member

Describe the bug

Today, we apply remote store index settings to an index migrating from docrep only when all of the following conditions are met:

  • All primary shards of the index are in STARTED state and are on remote nodes
  • No replica shard copies are in RELOCATING state
  • All STARTED replicas are on remote nodes
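
The three conditions above can be read as a single predicate over the shard copies of an index. The following is a minimal sketch of that reading, not the actual OpenSearch code: `RemoteMigrationCheck`, `ShardCopy`, `ShardState`, and `canApplyRemoteSettings` are hypothetical names introduced only for illustration.

```java
import java.util.List;

// Hypothetical model of the eligibility check; these names are not OpenSearch APIs.
public class RemoteMigrationCheck {
    enum ShardState { UNASSIGNED, INITIALIZING, STARTED, RELOCATING }

    // One shard copy of the index: its role, its state, and whether it sits on a remote store node.
    static final class ShardCopy {
        final boolean primary;
        final ShardState state;
        final boolean onRemoteNode;

        ShardCopy(boolean primary, ShardState state, boolean onRemoteNode) {
            this.primary = primary;
            this.state = state;
            this.onRemoteNode = onRemoteNode;
        }
    }

    // Returns true only when all three conditions from the issue hold.
    static boolean canApplyRemoteSettings(List<ShardCopy> copies) {
        boolean sawPrimary = false;
        for (ShardCopy copy : copies) {
            if (copy.primary) {
                sawPrimary = true;
                // Condition 1: every primary is STARTED and on a remote node.
                if (copy.state != ShardState.STARTED || !copy.onRemoteNode) {
                    return false;
                }
            } else {
                // Condition 2: no replica copy is RELOCATING.
                if (copy.state == ShardState.RELOCATING) {
                    return false;
                }
                // Condition 3: every STARTED replica is on a remote node.
                if (copy.state == ShardState.STARTED && !copy.onRemoteNode) {
                    return false;
                }
            }
        }
        // Assumption: an index with no primary copies listed is not eligible.
        return sawPrimary;
    }
}
```

Under this reading, an index whose only shard copy is a STARTED primary on a remote node (the state left behind after the replica count drops to 0) satisfies the predicate, so the settings would be applied as long as the check is actually evaluated.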

We use the existing IndexMetadataUpdater flow invoked by AllocationService to perform this update. This code path is only invoked if both of the following hold for the cluster:

  • Compatibility mode is MIXED
  • Migration direction is set to REMOTE_STORE
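
The gate described above is a conjunction of two cluster-level values. A minimal sketch, assuming hypothetical `CompatibilityMode` and `Direction` enums defined locally for illustration (they stand in for the real cluster settings and are not the OpenSearch classes):

```java
// Hypothetical stand-ins for the cluster-level migration settings.
public class MigrationGate {
    enum CompatibilityMode { STRICT, MIXED }
    enum Direction { NONE, REMOTE_STORE, DOCREP }

    // The settings update path runs only while the cluster is in MIXED mode
    // and the migration direction points at remote store.
    static boolean isMigratingToRemoteStore(CompatibilityMode mode, Direction direction) {
        return mode == CompatibilityMode.MIXED && direction == Direction.REMOTE_STORE;
    }
}
```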

We recently found a case in which the index settings are not applied. The sequence of events was:

  • The primary was moved over to remote nodes
  • The replica count was reduced from 1 to 0
  • The replica was never migrated to remote nodes (since the shard copy was deleted with the replica count reduction)

The code path for decreasing the replica count does not appear to go through the RoutingAllocation logic, so the IndexMetadataUpdater path was never executed.

Related component

Storage:Remote

To Reproduce

  • Create a docrep enabled cluster
  • Create an index with 1 primary and 1 replica shard
  • Introduce remote store enabled nodes in the cluster
  • Move over the primary shard copy from docrep nodes to remote store enabled nodes
  • Decrease the replica count to 0
  • Stop all the docrep enabled nodes and attempt to switch compatibility mode to STRICT

The compatibility mode switch fails with the following error:

can not switch to STRICT compatibility mode since all indices in the cluster does not have remote store based index settings

Expected behavior

Remote Store based index settings should be applied even when the replica count of an index is decreased during the migration process.


@ashking94
Member

@shourya035 If the replica count were to increase, would the new replica be created on a node with remote store attributes?

@shourya035
Member Author

@ashking94 Yes. We have that handled. If there is a replica count increase during migration, the new shard copy would be allocated to the remote store enabled nodes directly.

@github-project-automation github-project-automation bot moved this from 🆕 New to ✅ Done in Storage Project Board Jul 19, 2024