Improve shards evictions in searchable snapshot cache service #67160

Merged
merged 19 commits into elastic:master from improve-shards-evictions
Jan 14, 2021

Conversation

@tlrx tlrx commented Jan 7, 2021

The searchable snapshot's cache service is notified when cache files of a specific shard must be evicted. The notifications are usually done in a cluster state applier thread that calls the CacheService#markShardAsEvictedInCache method.

The markShardAsEvictedInCache method adds the shard to an internal set of ShardEviction and submits the eviction of the shard to the generic thread pool. Because there is nothing preventing the cache service (and persistent cache service) from being closed before all shard evictions are processed, it is possible that invalidating a cache file fails and trips an assertion (as happened in many recent test failures: #66958, #66730).

This pull request changes the CacheService so that it now waits for the evictions of shards to complete before closing the cache and persistent cache services. As before, it allows searchable snapshot shards that were previously marked as evicted to start by forcing the eviction of any existing cache files. Finally, it removes the KeyedLock used before and improves test coverage.
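For illustration, here is a rough, self-contained sketch of the behavior described above. It is not the merged Elasticsearch code: the class and member names (apart from markShardAsEvictedInCache) are made up, a plain ExecutorService stands in for the generic thread pool, and ShardEviction is reduced to a simple record.

import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the searchable snapshots CacheService.
class SimplifiedCacheService {

    record ShardEviction(String snapshotUUID, String snapshotIndexName, int shardId) {}

    private final ExecutorService genericPool = Executors.newFixedThreadPool(4); // stand-in for the generic thread pool
    private final Map<ShardEviction, CompletableFuture<Void>> pendingEvictions = new ConcurrentHashMap<>();
    private boolean allowShardsEvictions = true;

    // Usually called from a cluster state applier thread when a shard's cache files must be evicted.
    synchronized void markShardAsEvictedInCache(String snapshotUUID, String snapshotIndexName, int shardId) {
        if (allowShardsEvictions == false) {
            return; // the service is stopping; nothing will reload these files anyway
        }
        final ShardEviction shard = new ShardEviction(snapshotUUID, snapshotIndexName, shardId);
        pendingEvictions.computeIfAbsent(shard, s ->
            CompletableFuture.runAsync(() -> evictShardCacheFiles(s), genericPool)
                .whenComplete((v, e) -> pendingEvictions.remove(s)));
    }

    // The change described in this pull request: stopping waits for in-flight
    // evictions before the cache and persistent cache are closed.
    void stop() {
        synchronized (this) {
            allowShardsEvictions = false; // no new evictions can be submitted from here on
        }
        CompletableFuture.allOf(pendingEvictions.values().toArray(CompletableFuture[]::new)).join();
        // ... it is now safe to close the cache and the persistent cache
    }

    private void evictShardCacheFiles(ShardEviction shard) {
        // invalidate the cache files that belong to the shard (omitted)
    }
}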

@elasticmachine elasticmachine added the Team:Distributed (Obsolete) label Jan 7, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

/**
* Generates one or more cache files using the specified {@link CacheService}. Each cache file has been written to at least once.
*/
protected List<CacheFile> randomCacheFiles(CacheService cacheService) throws Exception {
Member Author

This has been moved from the PersistentCacheTests class and slightly changed so that it always generates at least one random cache file and always reads/writes a range in each cache file.

if (pendingFutures.isEmpty() == false) {
try {
final CountDownLatch latch = new CountDownLatch(pendingFutures.size());
pendingFutures.forEach(completableFuture -> completableFuture.whenComplete((integer, throwable) -> latch.countDown()));
Member Author

I think it is OK to only wait for running shard evictions to complete and not to process shard evictions that have not started yet: if the shard files are still on disk they will be reused or removed after the node starts again, and if the shard files have been deleted from disk they won't be reloaded at node start-up by the persistent cache logic.

Contributor

Agreed. We want to terminate as fast as we can.

I wonder if a simple read-write lock scheme could suffice? Processing the eviction takes the read-lock (and checks that state==started) and stop takes the write lock.
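A rough sketch of that read/write lock scheme, with made-up names (State enum, processShardEviction, and so on) rather than the code that was eventually merged:

import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ShardsEvictionsLockSketch {

    enum State { STARTED, STOPPING, STOPPED }

    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private volatile State state = State.STARTED;

    // Each shard eviction runs under the read lock, so many evictions can proceed
    // concurrently, but none can start once the service has begun stopping.
    void processShardEviction(Runnable evictCacheFiles) {
        lock.readLock().lock();
        try {
            if (state != State.STARTED) {
                return; // service is stopping; skip the eviction
            }
            evictCacheFiles.run();
        } finally {
            lock.readLock().unlock();
        }
    }

    // stop() takes the write lock, which blocks until all in-flight evictions
    // (read lock holders) have finished, then flips the state.
    void stop() {
        state = State.STOPPING; // new evictions will bail out
        lock.writeLock().lock();
        try {
            state = State.STOPPED;
            // safe to close the cache and persistent cache here
        } finally {
            lock.writeLock().unlock();
        }
    }
}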

tlrx commented Jan 7, 2021

@henningandersen This relates to one of your comments on #66173 - sorry for the time it took me to get there.

tlrx added a commit that referenced this pull request Jan 11, 2021
This commit removes an assertion that makes many tests fail
on CI until #67160 is merged, which has been opened to fix the
underlying issues around that assertion tripping (and brings
back the assertion).

Relates #67160
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jan 11, 2021
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jan 11, 2021
tlrx added a commit that referenced this pull request Jan 11, 2021
tlrx added a commit that referenced this pull request Jan 11, 2021
@henningandersen henningandersen (Contributor) left a comment

Thanks for addressing this. I have a few suggestions to simplify the code a bit. I might have missed a finer point on why this cannot work out.

synchronized (shardsEvictionsMutex) {
if (allowShardsEvictions) {
final ShardEviction shardEviction = new ShardEviction(snapshotUUID, snapshotIndexName, shardId);
if (addPendingShardEviction(shardEviction)) {
Contributor

I wonder if we could remove runningShardsEvictions if we make pendingShardsEvictions a Map<ShardEviction, Future> and register the future that threadPool.generic().submit() returns in the map? Since we do this under the lock, we should be able to check whether it exists before submitting and registering in the map.

evictCacheFilesIfNeeded would get from the map (under lock) and wait for the future (not under lock).
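Roughly, the suggestion could look like the sketch below. The identifiers are illustrative stand-ins: a plain ExecutorService replaces threadPool.generic(), and evictShardCacheFiles is a hypothetical placeholder for the actual eviction work.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ShardsEvictionsSketch {

    record ShardEviction(String snapshotUUID, String snapshotIndexName, int shardId) {}

    private final Object shardsEvictionsMutex = new Object();
    private final Map<ShardEviction, Future<?>> pendingShardsEvictions = new HashMap<>();
    private final ExecutorService generic = Executors.newCachedThreadPool(); // stand-in for threadPool.generic()
    private boolean allowShardsEvictions = true;

    void markShardAsEvictedInCache(ShardEviction shardEviction) {
        synchronized (shardsEvictionsMutex) {
            if (allowShardsEvictions) {
                // submit at most once per shard; the registered future tells us an eviction is in flight
                pendingShardsEvictions.computeIfAbsent(shardEviction,
                    evicted -> generic.submit(() -> evictShardCacheFiles(evicted)));
            }
        }
    }

    void evictCacheFilesIfNeeded(ShardEviction shardEviction) throws Exception {
        final Future<?> future;
        synchronized (shardsEvictionsMutex) {
            future = pendingShardsEvictions.get(shardEviction); // look up under the lock...
        }
        if (future != null) {
            future.get(); // ...but wait for it outside the lock
        }
    }

    private void evictShardCacheFiles(ShardEviction shardEviction) {
        // invalidate the shard's cache files (omitted)
    }
}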

Member Author

I agree this simplifies the code. One of the motivations for differentiating pending and running shard evictions was to allow a searchable snapshot shard to immediately execute the eviction of cache files, whereas with your suggestion the shard will have to wait for the submitted runnable to complete on the generic thread pool before moving forward with recovery. Maybe we are OK with this?

@henningandersen henningandersen (Contributor) Jan 12, 2021

That is a good point, thanks for clarifying.

With the current threading this seems unable to happen, since evictCacheFilesIfNeeded is always called on the generic thread pool. Since the eviction of shard folders must happen before the same searchable snapshot shard is restored again, the initial shard eviction job would be queued on the generic thread pool before a subsequent restore of that shard. Thus if the restore is running, the eviction job must either have started to run or completed.

I would be inclined to rely on this for now. If those were separate threads, I suppose waiting would likely be out of the question, so this would require a new design if we change the threading model.

Member Author

the initial shard evict job would come before a subsequent restore of the same searchable snapshot shard in the generic queue

Thanks Henning, I think you're right. I went with your suggestion of simplification.


try {
final CountDownLatch latch = new CountDownLatch(pendingFutures.size());
pendingFutures.forEach(completableFuture -> completableFuture.whenComplete((integer, throwable) -> latch.countDown()));
if (latch.await(shardsEvictionsStopTimeout.duration(), shardsEvictionsStopTimeout.timeUnit()) == false) {
Contributor

I am not sure we want a timeout here? I have a hard time reasoning through what it means when the wait times out - and if we can safely handle that, we should just avoid the wait altogether (not proposing that). I do see value, though, in first waiting for some seconds, then logging a warning before waiting indefinitely.

Member

I think we can also just do without a timeout; in tests we will interrupt if the shutdown takes too long anyway, so no timeout plus logging the interrupted exception is good enough IMO.

Member Author

Sure. I went with Henning's suggestion of using a read/write lock for this.

@original-brownbear original-brownbear (Member) left a comment

Some smaller comments; I didn't go through all the details yet since there are some open design questions from Henning.

}
if (pendingFutures.isEmpty() == false) {
try {
final CountDownLatch latch = new CountDownLatch(pendingFutures.size());
Member

Maybe it would be better to use a GroupedActionListener + PlainActionFuture here so we collect the Exceptions if any occur instead of suppressing them?
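The Elasticsearch listener classes are not shown here, but as a rough plain-Java analogue of the idea (surface any eviction failure to the caller instead of only counting down a latch), something like:

import java.util.List;
import java.util.concurrent.CompletableFuture;

class AwaitEvictionsSketch {
    // Combines the pending futures so that join() rethrows the first failure
    // (wrapped in a CompletionException) instead of silently swallowing it.
    static void awaitAll(List<CompletableFuture<Integer>> pendingFutures) {
        CompletableFuture.allOf(pendingFutures.toArray(CompletableFuture[]::new)).join();
    }
}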

@@ -127,6 +130,13 @@
Setting.Property.NodeScope
);

public static final Setting<TimeValue> SNAPSHOT_CACHE_SHARD_EVICTIONS_SHUTDOWN_TIMEOUT = Setting.timeSetting(
Member

I don't think this really needs a timeout setting, does it? It's really only relevant for tests to begin with and we can just hard-code a reasonable value like 10s (that's what we did in pretty much all other similar places, like waiting for recoveries to finish/cancel etc.).

Member Author

No it does not; I went with a hard-coded timeout of 10s and then waiting indefinitely.
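Something along these lines, assuming the write lock from the read/write lock scheme sketched earlier; the method name and warning message are made up for the example and are not the merged code:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class EvictionsShutdownSketch {
    // Try the write lock for a fixed 10 seconds, then warn and wait for as long as it takes.
    static void waitForShardEvictions(ReentrantReadWriteLock shardsEvictionsLock) throws InterruptedException {
        final var writeLock = shardsEvictionsLock.writeLock();
        if (writeLock.tryLock(10L, TimeUnit.SECONDS) == false) {
            System.err.println("shard cache evictions still running after 10s, waiting indefinitely");
            writeLock.lock(); // second wait has no timeout
        }
        try {
            // all in-flight evictions have completed; the caches can now be closed
        } finally {
            writeLock.unlock();
        }
    }
}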

try {
if (evictedShards.remove(shardEviction)) {
runnable.run();
CompletableFuture<Integer> processShardEvictionIfNeeded(ShardEviction shardEviction) {
Member

NIT: This can just return Future

assert runningShardsEvictions.get(shardEviction) == null : "found a running shard eviction for " + shardEviction;
return CompletableFuture.completedFuture(0);

} else if (runningShardsEvictions.containsKey(shardEviction)) {
Member

I think this could just be simplified with computeIfAbsent in an else branch?


@tlrx tlrx merged commit 99340fe into elastic:master Jan 14, 2021
@tlrx tlrx deleted the improve-shards-evictions branch January 14, 2021 14:38
tlrx commented Jan 14, 2021

Thanks a lot @henningandersen for the multiple comments and suggestions!

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jan 14, 2021
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jan 14, 2021
tlrx added a commit that referenced this pull request Jan 14, 2021 (#67519)
tlrx added a commit that referenced this pull request Jan 14, 2021 (#67517)
Labels
:Distributed Coordination/Snapshot/Restore, >enhancement, Team:Distributed (Obsolete), v7.11.1, v7.12.0, v8.0.0-alpha1