Prioritized Replica Recovery is reversed by date #13249

Closed
pickypg opened this issue Sep 1, 2015 · 5 comments
Labels
>bug :Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source.

Comments

@pickypg
Member

pickypg commented Sep 1, 2015

Prioritized allocation recovers indices in the order of index.priority > index.creation_date > index.name (reversed). However, I've found that when it falls back to index.creation_date (the default mechanism), replicas are recovered in the reverse of the expected order.
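
To make the expected ordering concrete, here is a minimal Python sketch of the comparison described above: higher index.priority first, then newer index.creation_date, then index name in reverse. It is not Elasticsearch code. The `IndexInfo` class, the helper names, the default priority of 1, and the creation timestamps are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexInfo:
    """Hypothetical stand-in for the index metadata relevant to ordering."""
    name: str
    creation_date: int  # epoch milliseconds (illustrative values below)
    priority: int = 1   # assumed default when index.priority is not set


def recovery_sort_key(index):
    # Compare by priority, then creation date, then name.
    return (index.priority, index.creation_date, index.name)


def expected_recovery_order(indices):
    # reverse=True puts the highest priority, newest, and
    # lexicographically last names first.
    return sorted(indices, key=recovery_sort_key, reverse=True)


indices = [
    IndexInfo(".marvel-2015.08.26", creation_date=1440547200000),
    IndexInfo(".marvel-2015.08.28", creation_date=1440720000000),
    IndexInfo(".marvel-2015.08.27", creation_date=1440633600000),
    IndexInfo(".marvel-2015.08.22", creation_date=1440201600000, priority=200),
]

ordered_names = [index.name for index in expected_recovery_order(indices)]
print(ordered_names)
```

With these sample values, the index with an explicit priority of 200 comes first and the remaining daily indices follow from newest to oldest.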

It's easy enough to reproduce with enough daily indices by manually deleting the replicas from one node, throttling the heck out of recovery, and speeding up monitoring:

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries" : 1,
    "indices.recovery.concurrent_streams" : 1,
    "indices.recovery.concurrent_small_file_streams" : 1,
    "indices.recovery.max_bytes_per_sec" : "1mb",
    "marvel.agent.interval" : "500ms"
  }
}

As I was watching it, I decided to take some screenshots:

  1. Reversed Priority 1 of 2
  2. Reversed Priority 2 of 2

This also does not appear to honor index.priority: I tried to use it as a workaround and it did not affect the recovery order at all, which makes me assume that it does not even come into play during replica recovery.

@pickypg pickypg added >bug :Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. labels Sep 1, 2015
@pickypg pickypg changed the title Prioritized Recovery is reversed by date Prioritized Replica Recovery is reversed by date Sep 1, 2015
@clintongormley
Contributor

@s1monw could you take a look at this please?

@s1monw
Contributor

s1monw commented Sep 1, 2015

I don't understand what you are testing here. I can't see the priorities you are giving, I don't see if the replicas were allocated before and if not there will be no ordering as far as I can tell. I don't see if primaries got allocated first and I wonder what you expected to see sorry it's unclear.

@nik9000
Member

nik9000 commented Sep 1, 2015

I don't understand what you are testing here. I can't see the priorities you are giving, I don't see if the replicas were allocated before and if not there will be no ordering as far as I can tell. I don't see if primaries got allocated first and I wonder what you expected to see sorry it's unclear.

I'm not familiar with the screenshot source but it looks like the indexes are recovering oldest to newest rather than newest to oldest. But I'm likely reading that wrong.

@pickypg
Member Author

pickypg commented Sep 1, 2015

@s1monw

I don't understand what you are testing here.

Replica recovery order with 2 nodes.

  1. I throttled recovery as shown.
  2. I took the second node offline.
  3. I deleted all of its .marvel-* indices from the offline node.
  4. I restarted the offline node and watched recovery.

I can't see the priorities you are giving.

I only set index.priority after seeing the images above. I picked arbitrary indices in the middle of the group and gave higher values for them individually (e.g., .marvel-2015.08.22 I gave the priority of 200). All of the creation dates are going to be roughly around midnight of the date of the index (no weirdness or cheating on creation of the indices).
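
For anyone following along, raising the priority of a single index can be sketched as an update to that index's settings. The Python snippet below only builds the request and does not send it. The `build_priority_request` helper and the `http://localhost:9200` address are assumptions for illustration; adjust them to your cluster.

```python
import json
import urllib.request


def build_priority_request(index_name, priority, base_url="http://localhost:9200"):
    """Build (but do not send) a settings update that sets index.priority.

    base_url is an assumed local node address, not taken from this issue.
    """
    body = json.dumps({"index.priority": priority}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index_name}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_priority_request(".marvel-2015.08.22", 200)
print(request.get_method(), request.full_url)
# To actually apply the setting, pass the request to urllib.request.urlopen.
```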

I don't see if primaries got allocated first

They did. Synced flushed replica shards (not shown) also got recovered before these replicas were recovered.

I wonder what you expected to see sorry it's unclear.

I expected to see what @nik9000 suggested: the newest to oldest recovery of the replicas. Basically, .marvel-2015.08.28's replica should be recovered before .marvel-2015.08.27's replica, which should be recovered before .marvel-2015.08.26's replica (and so on).

It seems like the replicas do not consider priority in their recovery order and the oldest indices are being recovered first.

@s1monw
Contributor

s1monw commented Sep 1, 2015

I deleted all of its .marvel-* indices from the offline node.

If you don't let the gateway allocator fetch any replicas to recover, it won't respect priorities and will leave the rest to the shard balancer. The balancer will do its own sorting at this point. This has never been implemented.

s1monw added a commit to s1monw/elasticsearch that referenced this issue Sep 8, 2015
Today we try to allocate primaries first and then replicas
but don't take the index creation date and priority into account
as we do in the GatewayAllocator.

Closes elastic#13249
s1monw added a commit that referenced this issue Sep 8, 2015
Today we try to allocate primaries first and then replicas
but don't take the index creation date and priority into account
as we do in the GatewayAllocator.

Closes #13249
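
The ordering the commit message describes (primaries before replicas, and within each group by index priority and creation date) can be sketched roughly as follows. This is an illustrative Python sketch, not the actual Java change. The `UnassignedShard` class, its fields, the name tiebreak, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnassignedShard:
    """Hypothetical stand-in for an unassigned shard and its index metadata."""
    index: str
    shard_id: int
    primary: bool
    index_priority: int
    index_creation_date: int  # epoch milliseconds (illustrative values below)


def order_for_allocation(shards):
    # First order by index: higher priority, then newer creation date,
    # then index name in reverse.
    by_index = sorted(
        shards,
        key=lambda s: (s.index_priority, s.index_creation_date, s.index),
        reverse=True,
    )
    # Python's sort is stable, so moving primaries to the front keeps the
    # index ordering intact within primaries and within replicas.
    return sorted(by_index, key=lambda s: not s.primary)


shards = [
    UnassignedShard(".marvel-2015.08.26", 0, False, 1, 1440547200000),
    UnassignedShard(".marvel-2015.08.28", 0, False, 1, 1440720000000),
    UnassignedShard(".marvel-2015.08.26", 0, True, 1, 1440547200000),
    UnassignedShard(".marvel-2015.08.27", 0, False, 1, 1440633600000),
]

allocation_order = [(s.index, s.primary) for s in order_for_allocation(shards)]
print(allocation_order)
```

In this sample, the single primary is handled first, followed by the replicas from the newest index to the oldest.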