Should search preference _primary take primary relocation target into account? #26335
Comments
I'm struggling to find a use case for `_primary` … If I recall correctly, … However, having …
@dakrone thanks for filling in the context and history. I should have done it in the first place.
I think this use case is better served by the … options.
Those work also, so +1 to removing the others if no one else objects. Perhaps the title of this issue should be changed to reflect the discussion?
From what I understand, we only need to keep the two preference options …
@liketic there are others that should still be kept; I believe the consensus was just to remove `_primary`, `_replica`, and their variants.
@dakrone Do we have a valid use case for …?
@bleskes other than them being nicer than having to determine the nodes themselves? I'm not sure. The only thing I could think of would be a very script-heavy update where you wanted to use …
The question is why would you like to do that? I would opt to remove those. It just confuses people to think that primaries are special (and we should remove cases where they are...)
That (the update scenario) was the only scenario I could think of
Sure, if you think the scenario I mentioned isn't valid
I'm not sure if we're going to be able to move away from update scripts running on the primary shard? At least definitely not in the short term.
So it's OK to remove …?
@dakrone OK. I'll add a discuss label to this so it will be picked up in fix it friday and we can make a final decision. I tend to say remove them all (we can fix the update issue by making it read from any replica, but I agree that's not happening soon). @liketic can you hold off for a few days? I appreciate the willingness to contribute.
@jasontedor, I just want to make sure that I understand the requirement correctly: we will deprecate the preferences `_primary*` and `_replica*` in 6.x and remove them in 7.x. Thank you.
@dnhatn Yes.
@jasontedor I am not sure about the recommendation to use `_primary` for reads in the optimistic concurrency control documentation; should it also be removed?
@dnhatn Not reading from the primary is okay in the case of optimistic concurrency control; if the read is from a stale value, the dependent write will be rejected anyway and the read will simply have to be retried. This is okay for a workflow using optimistic concurrency control. Therefore, I think we can simply remove this recommendation from the documentation.
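A minimal sketch of this read-then-retry pattern, assuming an Elasticsearch node at localhost:9200 and the seq_no-based concurrency parameters of more recent versions; the index name, document id, and counter field are hypothetical, not taken from the thread.

```python
# Optimistic concurrency control sketch: read from any copy, attempt a
# conditional write, and retry if the write is rejected as a conflict.
import requests

ES = "http://localhost:9200"

def increment_counter(index: str, doc_id: str, max_retries: int = 5) -> None:
    for _ in range(max_retries):
        # A stale read is acceptable: the conditional write below is
        # rejected if the document has since changed, and we just retry.
        doc = requests.get(f"{ES}/{index}/_doc/{doc_id}").json()
        source = doc["_source"]                    # assumes the document exists
        source["count"] = source.get("count", 0) + 1

        resp = requests.put(
            f"{ES}/{index}/_doc/{doc_id}",
            params={
                "if_seq_no": doc["_seq_no"],
                "if_primary_term": doc["_primary_term"],
            },
            json=source,
        )
        if resp.status_code == 409:                # version conflict: retry the read
            continue
        resp.raise_for_status()
        return
    raise RuntimeError("gave up after repeated version conflicts")

increment_counter("my-index", "1")
```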
The shard preferences `_primary` and `_replica` and their variants were useful for asynchronous replication. However, with the current implementation they are no longer useful and should be removed. Closes #26335
Tells v6 users that we are going to remove these options in v7. Relates #26335
Sorry, late to the conversation on this, but we use …
@aewhite you would be better off using a random string, which will stick to a random copy (see https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-preference.html). ES assumes primaries do just as much work as replicas when balancing the cluster. As noted about the …
I too am frustrated that this is being removed. We have lots of situations with very large indexes and very low query volume. When only one query is likely to be executing at a time, it is preferred to use the primary shards so that the index files can be cached into OS disk cache. Having the queries go back and forth between primary and replica simply means doubling the amount of RAM required for the same performance. The "random string" doesn't help at all for this use case and I'm not even sure why it was mentioned. We have low query volume (like, 1 QPS) so hot spots are not an issue. And we want the system to automatically fail over to the replica when the primary machine goes down and suffer some degraded performance until the machine is restored. Large index + low query volume is quite a common use case in our world, and it seems that you're forcing us to double our OS disk cache RAM requirement. Do you have recommendations for reducing hardware cost for this scenario?
It makes sense that you want to route repeated queries to the same set of shard copies each time to make best use of disk cache, but there's no need for those copies to be primaries.
I think you're misunderstanding what happens when using a custom `preference` string. Indeed, the random string should give you more even node usage than `_primary`.
I see. So you're saying to use the same preference value for all queries, and not a random unique one for every query. Okay - so that makes sense. And I presume that if the normal shard targeted by preference=X goes down, then the replica will be used? If so, then I completely agree that this is a fine solution. Thanks!
Yes.
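A minimal sketch of the fixed-preference approach just discussed, assuming an Elasticsearch node at localhost:9200; the index name, preference string, and query are hypothetical.

```python
# Every search carries the same preference string, so the coordinating node
# keeps routing it to the same shard copies (good for OS disk-cache locality).
# Per the exchange above, if a preferred copy becomes unavailable the search
# falls back to another copy of the shard.
import requests

ES = "http://localhost:9200"
PREFERENCE = "reporting-affinity"  # one fixed, opaque string reused by all queries

def search(text: str) -> dict:
    resp = requests.post(
        f"{ES}/my-index/_search",
        params={"preference": PREFERENCE},
        json={"query": {"match": {"body": text}}},
    )
    resp.raise_for_status()
    return resp.json()

# Repeated searches hit the same shard copies because the preference matches.
print(search("large index, low query volume")["hits"]["total"])
```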
Today it is unclear what guarantees are offered by the search preference feature, and we claim a guarantee that is stronger than what we really offer:

> A custom value will be used to guarantee that the same shards will be used for the same custom value.

This commit clarifies this documentation and explains more clearly why `_primary`, `_replica`, etc. are deprecated in `6.x` and removed in `master`. Relates #31929 #26335 #26791
When using the search preference `_primary` and executing a search on a cluster where the primary concerned is about to complete relocation, it's possible that the search for that primary will fail, as it will hit the primary relocation source just when the relocation is completed and the source shard is shut down.

My question is: should the search preference `_primary` take the primary relocation target into account? Put differently, should `OperationRouting.preferenceActiveShardIterator(...)` return the primary relocation target as a backup when specifying the `_primary` preference? Note that the default search preference always takes relocation targets into account (and puts them last, together with the "regular" initializing shards).
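For context, a minimal sketch of the kind of request affected, assuming a pre-7.0 cluster (where `_primary` still exists) reachable at localhost:9200; the index name is hypothetical.

```python
# A search pinned to primary shard copies only. If a primary completes
# relocation between shard resolution and execution, the request can hit the
# relocation source after it has shut down and fail, as described above.
import requests

resp = requests.post(
    "http://localhost:9200/my-index/_search",
    params={"preference": "_primary"},
    json={"query": {"match_all": {}}},
)
resp.raise_for_status()
print(resp.json()["hits"]["total"])
```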
Related question: should the search preference `_replica` also take replica relocation targets into account?

More background: there was a test failure of `SearchWhileCreatingIndexIT` which illustrates the issue:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+g1gc/4012/consoleFull