Improve shards evictions in searchable snapshot cache service #67517

Merged
merged 2 commits into from
Jan 14, 2021


tlrx
Member

@tlrx tlrx commented Jan 14, 2021

The searchable snapshots cache service is notified when the cache files
of a specific shard must be evicted. The notification usually happens on
a cluster state applier thread, which calls the
CacheService#markShardAsEvictedInCache method.

The markShardAsEvictedInCache method adds the shard to an internal set
of ShardEviction and submits the eviction of the shard to the generic
thread pool. Because nothing prevents the cache service (and persistent
cache service) from being closed before all shard evictions are
processed, invalidating a cache file can fail and trip an assertion
(as happened in many recent test failures: #66958, #66730).

This commit changes the CacheService so that it now waits for the evictions
of shards to complete before closing the cache and persistent cache services.
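
To illustrate the idea of draining pending evictions before closing, here is a minimal, self-contained sketch. It is not the code from this PR: the class `EvictingCacheSketch`, its methods, and the use of a read/write lock are all assumptions made for illustration, and shards are represented by plain strings rather than `ShardEviction` objects.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical, simplified stand-in for a cache service; not the Elasticsearch class. */
class EvictingCacheSketch implements AutoCloseable {
    private final ExecutorService executor;
    // Shards whose eviction was requested but not yet processed.
    private final Set<String> pendingEvictions = ConcurrentHashMap.newKeySet();
    // Workers hold the read lock while evicting; close() takes the write lock.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> evicted = new CopyOnWriteArrayList<>();
    private boolean closed; // written under the write lock, read under either lock

    EvictingCacheSketch(ExecutorService executor) {
        this.executor = executor;
    }

    /** Records the shard as pending and hands the eviction to the thread pool. */
    void markShardAsEvicted(String shardId) {
        if (pendingEvictions.add(shardId)) {
            executor.execute(() -> {
                lock.readLock().lock();
                try {
                    processEviction(shardId);
                } finally {
                    lock.readLock().unlock();
                }
            });
        }
    }

    /** Whoever removes the shard from the pending set performs the eviction. */
    private void processEviction(String shardId) {
        if (pendingEvictions.remove(shardId) && closed == false) {
            evicted.add(shardId); // stands in for invalidating the cache files
        }
    }

    /** Waits for in-flight evictions, then processes the remaining ones before closing. */
    @Override
    public void close() {
        lock.writeLock().lock(); // blocks until no worker is mid-eviction
        try {
            for (String shardId : List.copyOf(pendingEvictions)) {
                processEviction(shardId);
            }
            closed = true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    List<String> evictedShards() {
        return List.copyOf(evicted);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        EvictingCacheSketch cache = new EvictingCacheSketch(pool);
        cache.markShardAsEvicted("shard-0");
        cache.markShardAsEvicted("shard-1");
        cache.markShardAsEvicted("shard-2");
        cache.close(); // every requested eviction has been processed at this point
        pool.shutdown();
        System.out.println("evicted " + cache.evictedShards().size() + " shards");
    }
}
```

In this sketch, a task that reaches the thread pool only after `close()` finds its shard already removed from the pending set (or the service marked closed) and does nothing, so no eviction runs against a closed cache.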

Backport of #67160 for 7.12

@tlrx tlrx added :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs backport v7.12.0 labels Jan 14, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Jan 14, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@tlrx tlrx merged commit 83d9a9e into elastic:7.x Jan 14, 2021
@tlrx tlrx deleted the improve-shards-evictions-7.x branch January 14, 2021 16:22
Labels
backport :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.12.0

2 participants