
Exception in CacheLoader prevents further refreshing #1807

Closed
mkeller-ergon opened this issue Dec 5, 2024 · 3 comments

mkeller-ergon commented Dec 5, 2024

When using a synchronous Caffeine Cache with refreshAfterWrite, throwing an exception in the CacheLoader leads to the cache not being refreshed anymore.
Scenario:

  • The first call to the cache successfully loads a value "value1"
  • Time passes and the value must now be refreshed
  • The second call produces an exception. This is swallowed and "value1" is returned to the caller (OK)
  • Any further call will return "value1" without calling the CacheLoader anymore, even after the refresh timeout has expired again.

Expected behavior: the exception should be swallowed completely and the existing entry should remain eligible for refresh, so that the third call triggers a reload again.

Minimal example using Caffeine 3.1.8:

import static org.assertj.core.api.Assertions.assertThat;

import java.time.Duration;
import java.util.concurrent.TimeUnit;

import org.junit.jupiter.api.Test;

import com.github.benmanes.caffeine.cache.CacheLoader;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

class CacheTest {
	@Test
	void shouldRefetchAfterException() throws Exception {
		LoadingCache<String, String> cache = Caffeine.newBuilder()
			.initialCapacity(1)
			.refreshAfterWrite(2, TimeUnit.SECONDS)
			.build(new CacheLoader<>() {
				private int i = 0;

				@Override
				public String load(String key) {
					i++;
					if (i == 2) {
						throw new RuntimeException("exception");
					}
					return "value" + i;
				}
			});

		assertThat(cache.get("key")).isEqualTo("value1");
		Thread.sleep(Duration.ofSeconds(3));
		assertThat(cache.get("key")).isEqualTo("value1"); // produces an exception internally and thus returns the old value (OK)
		assertThat(cache.get("key")).isEqualTo("value3"); // assertion fails; should call the CacheLoader again but doesn't
	}
}

Caffeine also logs an exception on the failing call, which is probably fine since it records the underlying problem:

java.util.concurrent.CompletionException: java.lang.RuntimeException: exception
	at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315)
	at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:320)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1770)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1760)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
Caused by: java.lang.RuntimeException: exception
	at CacheTest$1.load(CacheTest.java:29)
	at CacheTest$1.load(CacheTest.java:22)
	at com.github.benmanes.caffeine.cache.CacheLoader.reload(CacheLoader.java:169)
	at com.github.benmanes.caffeine.cache.CacheLoader.lambda$asyncReload$2(CacheLoader.java:196)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
	... 6 more

Interestingly, the above test is green when adding this method to the CacheLoader:

@Override
public CompletableFuture<String> asyncReload(String key, String oldValue, Executor executor) {
	return completedFuture(load(key));
}

It appears that even though my cache was configured to be synchronous, some calls are still performed asynchronously.

mkeller-ergon changed the title from “Exception in CacheLoader prevents further refresing” to “Exception in CacheLoader prevents further refreshing” on Dec 5, 2024
ben-manes (Owner) commented Dec 5, 2024

The refresh feature is defined as an asynchronous load, so it always delegates to asyncReload, which is passed the Caffeine.executor setting. That defaults to ForkJoinPool.commonPool().

/**
* Specifies that active entries are eligible for automatic refresh once a fixed duration has
* elapsed after the entry's creation, or the most recent replacement of its value. The semantics
* of refreshes are specified in {@link LoadingCache#refresh}, and are performed by calling
* {@link AsyncCacheLoader#asyncLoad}.
* <p>
* Automatic refreshes are performed when the first stale request for an entry occurs. The request
* triggering the refresh will make a synchronous call to {@link AsyncCacheLoader#asyncLoad} to
* obtain a future of the new value. If the returned future is already complete, it is returned
* immediately. Otherwise, the old value is returned.
* <p>
* <b>Note:</b> <i>all exceptions thrown during refresh will be logged and then swallowed</i>.
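
To make the delegation concrete, the default asyncReload is roughly equivalent to the following sketch (simplified; see CacheLoader in the Caffeine sources for the exact code). This is why the load above ran on a ForkJoinPool worker thread and its exception arrived wrapped in a CompletionException:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;

import com.github.benmanes.caffeine.cache.CacheLoader;

interface SketchLoader extends CacheLoader<String, String> {
	@Override
	default CompletableFuture<String> asyncReload(String key, String oldValue, Executor executor) {
		// reload(key, oldValue) delegates to load(key) by default, so the
		// RuntimeException thrown there surfaces on the executor's thread.
		return CompletableFuture.supplyAsync(() -> {
			try {
				return reload(key, oldValue);
			} catch (Exception e) {
				throw new CompletionException(e);
			}
		}, executor);
	}
}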

The unit test thread races with the refresh, so it asserts before the reload has completed. If you wait until the reload completes, the test is successful. Without changing your configuration, adding this line before the final assertion causes the test to pass:

ForkJoinPool.commonPool().awaitQuiescence(1, TimeUnit.SECONDS);
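
For context, this is where that line goes in the tail of the original test (the other lines are unchanged; java.util.concurrent.ForkJoinPool needs to be imported):

		assertThat(cache.get("key")).isEqualTo("value1");
		Thread.sleep(Duration.ofSeconds(3));
		assertThat(cache.get("key")).isEqualTo("value1"); // triggers the refresh that fails
		// Wait for the background reload on the common pool to finish before asserting.
		ForkJoinPool.commonPool().awaitQuiescence(1, TimeUnit.SECONDS);
		assertThat(cache.get("key")).isEqualTo("value3");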

A different executor could be specified, such as Runnable::run, to perform any potentially asynchronous work on the caller's thread. In that setting it would perform similarly to your workaround.
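
For illustration, a minimal sketch of that configuration, where loader is the CacheLoader from the test above:

LoadingCache<String, String> cache = Caffeine.newBuilder()
	.refreshAfterWrite(2, TimeUnit.SECONDS)
	// Run refresh (and other asynchronous) work on the calling thread rather
	// than ForkJoinPool.commonPool(), so the reload finishes before get() returns.
	.executor(Runnable::run)
	.build(loader);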

When the future completes immediately we will override the return value, which was the stale value looked up, and return the reloaded one instead. Of course, that usually won't happen if the reload is asynchronous, but a few users requested that the new value be returned when the future is already complete (#520, #688, #699).

mkeller-ergon (Author) commented

Hi, thanks for the explanation. However, I find this highly unintuitive. It means that if I use Caffeine to cache an HTTP response for, say, 1 hour, and a request finally comes in 2 hours later, it will still receive the old, completely outdated response, and only the next call will see the updated one?
Getting completely outdated results is not what I'd expect from an explicitly synchronous cache...

ben-manes (Owner) commented Dec 6, 2024

It sounds like you want expiration, in which the entry is not usable and must be loaded anew. In that case you would use expireAfterWrite. If you think of data's lifecycle as fresh, stale, or spoiled, then prior to the refresh period it is fresh (drinkable milk), within the period it is stale and a fresh copy should be optimistically obtained (sour milk, add to shopping list), and after expiration it is spoiled and a new copy must be obtained (curdled milk, thrown away, must go to the store immediately for your breakfast cereal). This is sometimes referred to as soft vs. hard time-to-live expiration, or refresh-ahead expiration.
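
As a sketch of that for the HTTP scenario above (fetchHttpResponse is a hypothetical loader function), expiration makes the outdated entry unusable rather than serving it:

LoadingCache<String, String> responses = Caffeine.newBuilder()
	// After 1 hour the entry is spoiled: the next get() blocks and loads a
	// fresh value instead of returning the outdated one.
	.expireAfterWrite(1, TimeUnit.HOURS)
	.build(key -> fetchHttpResponse(key));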

refreshAfterWrite is intended to be used in conjunction with an expiration policy as a latency-hiding technique. If the entry is popular, such as an auth token used on every lookup, then when it expires many requests would observe a periodic latency spike as they wait for the new value to be loaded. An optimistic reload can instead be performed in the background to hide this penalty for a better user experience.
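
A sketch of the combined policies for that auth-token example (fetchAuthToken is hypothetical):

LoadingCache<String, String> tokens = Caffeine.newBuilder()
	.expireAfterWrite(60, TimeUnit.MINUTES)   // hard TTL: entry becomes unusable
	.refreshAfterWrite(50, TimeUnit.MINUTES)  // soft TTL: optimistic background reload
	.build(key -> fetchAuthToken(key));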

Some users confused this feature with a periodic full reload of the cache contents where they never want eviction, like in your example. That is an easy mistake of confusing a replica with a cache, which are similar but different concepts, and the main design choice is the replication strategy (hot, warm, cold). In the periodic case (warm replication), using a scheduled task to reload an unmodifiable Map is simpler and does not require an external library to implement.
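
For comparison, a minimal sketch of that scheduled-reload approach (warm replication), assuming a hypothetical loadAll() that fetches the entire data set:

import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class Replica {
	private volatile Map<String, String> snapshot = Map.of();

	Replica() {
		ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
		// Periodically swap in a fresh, unmodifiable copy of the full data set.
		scheduler.scheduleWithFixedDelay(() -> snapshot = Map.copyOf(loadAll()),
			/* initialDelay */ 0, /* delay */ 5, TimeUnit.MINUTES);
	}

	String get(String key) {
		return snapshot.get(key);
	}

	// Hypothetical bulk load of the source data.
	private Map<String, String> loadAll() {
		return Map.of();
	}
}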

Regarding terminology, synchronous refers to whether the caller's API is blocking or non-blocking ("happening or existing at the same time"). The refresh is an optimistic load, so by its nature it is non-blocking and generally not a user-visible API, except when explicitly called using LoadingCache.refresh(key), which returns a future. It is not synchronized, which would mean the data operations are sequential, as this is a concurrent cache ("the act of coordinating events to operate a system in unison"). The API and documentation try to be careful in terminology and clear about behaviors, but also use terms lightly because there can be conflicting definitions that lead to misunderstandings (e.g. Go defines parallel and concurrent as the opposite of Java's definitions, simply ignoring precedent as it tries to popularize its own novel usage).
