
LoadingCache::get returns value after invoking RemovalListener #177

Closed
jandam opened this issue Aug 15, 2017 · 14 comments

Comments

@jandam

jandam commented Aug 15, 2017

LoadingCache::get returns the identical instance on which the RemovalListener was already called. I would expect the value to be invalidated/evicted before RemovalListener::onRemoval is called.

    private static final int PAGE_COUNT = 1 << 10;

    LoadingCache<Long, Page> cache = Caffeine.newBuilder()
            .maximumSize(PAGE_COUNT)
            .executor(Runnable::run)  // run notifications on the calling thread
            .removalListener(removalListener)
            .build(loader);

I'm experiencing the issue in the following sequence (Tn = thread number n):

  • BEGIN
    T1: cache.get(key1)
  • the get of value1 for key1 starts (value1 is present in the cache)
    T2: cache.get(key2)
  • the value for key2 is missing, so it is automatically loaded
    T2: loader.load(key2)
  • triggers eviction of key1, cause=SIZE
    T2: return cache.get(key2)
  • returns the newly created value for key2
    T1: removalListener.onRemoval(key1, value1, cause=SIZE)
  • notification that value1 for key1 was removed
  • another issue:
    I think that xValue = cache.getIfPresent(key1) must satisfy xValue == null || xValue != value1. But sometimes I get xValue == value1, meaning value1 is still present in the cache.
    T1: return cache.get(key1)
  • ERROR (sketched below)
  • returns the same instance of value1 for key1 on which onRemoval was already called
  • I would expect that value1 for key1 is protected from eviction, or that loader.load(key1) is called.
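
A condensed sketch of the failing expectation (Page is my value type from the setup above):

    // T1, after removalListener.onRemoval(key1, value1, cause=SIZE) has fired:
    Page again = cache.get(key1);  // expected: a freshly loaded instance
    assert again != value1;        // ERROR: sometimes this is the released value1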

Thank you for any advice/analysis.

@ben-manes
Owner

Sorry, I don't understand the problem. Can you provide a test case?

@jandam
Author

jandam commented Aug 15, 2017

Problem:
LoadingCache.get returns a value on which RemovalListener.onRemoval was already called.
I want to use onRemoval for recycling resources, so I "release" the resource there. But LoadingCache.get returns the already "released" resource. I would expect it to return a newly "loaded" instance.
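
The pattern looks roughly like this (Page, PAGE_COUNT, and Page.load/release are simplified stand-ins for my real resources):

    // simplified sketch of the recycling pattern
    LoadingCache<Long, Page> cache = Caffeine.newBuilder()
            .maximumSize(PAGE_COUNT)
            .removalListener((Long key, Page page, RemovalCause cause) ->
                    page.release())         // recycle the resource on removal
            .build(key -> Page.load(key));  // acquire a fresh resource on a miss

    Page page = cache.get(key);             // sometimes an already released Page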

Tomorrow I will try to send a test case.

@ben-manes
Owner

You may be observing that the removal listener is asynchronous from the operation. It is not called atomically within the map operation that deleted the mapping; it happens sometime afterwards.

* Notifies the listener that a removal occurred at some point in the past.
* <p>
* This does not always signify that the key is now absent from the cache, as it may have already
* been re-added.

That's nicer in the common case where you don't want to delay other writes or the calling thread, but it only works when the work is fairly decoupled. If you need to run code synchronously with the removal then you can try CacheWriter instead. The writer is invoked within the atomic computation, so a failure bubbles up to the caller.
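
A rough sketch with your Page type (not a drop-in; the write side is a no-op here):

    // CacheWriter#delete runs inside the map computation, so the release
    // completes before the removal is visible to other threads
    LoadingCache<Long, Page> cache = Caffeine.newBuilder()
            .maximumSize(PAGE_COUNT)
            .writer(new CacheWriter<Long, Page>() {
              @Override public void write(Long key, Page page) {
                // nothing to persist on insert or update
              }
              @Override public void delete(Long key, Page page, RemovalCause cause) {
                page.release();  // synchronous with the removal
              }
            })
            .build(loader);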

I'm not sure if the rest is similar or not. Since the operations delegate to ConcurrentHashMap they should be atomic.

@ben-manes
Owner

hmm, I am not sure then. The internal node might be recycled for expiration or weak/soft values, but that isn't the case here. I agree it shouldn't be able to give you a stale read after the mapping is gone, though a reader could obtain the value just prior to that.

@Maaartinus

I guess I understand the OP. The problem is not that the notification happens later, but rather that after the notification containing value1, the same value1 is still present. The test case in pseudocode could look like

cache.put(key1, value1);
cache.addListener(new Listener() {
    onRemoval(K key1, V value1, Cause cause) {
        assertTrue(value1 != cache.getIfPresent(key1));
    }
});
causeEviction();

No idea if this really happens, but I'd bet I'm interpreting it right, as he speaks about recycling resources.
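
In Caffeine the listener is registered at build time, so a concrete version of the pseudocode might look like this (page1/page2 are hypothetical values; eviction is forced by exceeding maximumSize):

    // same-thread executor so the notification runs synchronously
    AtomicReference<Cache<Long, Page>> ref = new AtomicReference<>();
    Cache<Long, Page> cache = Caffeine.newBuilder()
            .maximumSize(1)
            .executor(Runnable::run)
            .removalListener((Long key, Page value, RemovalCause cause) ->
                    assertTrue(value != ref.get().getIfPresent(key)))
            .build();
    ref.set(cache);

    cache.put(1L, page1);
    cache.put(2L, page2);  // exceeds maximumSize(1)
    cache.cleanUp();       // force eviction and the removal notification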

@ben-manes
Owner

I just checked all calls to BoundedLocalCache#notifyRemoval and it is invoked after the computation completes. If @jandam is using a version prior to 2.4.0, there was a stale notification bug that caused the listener to observe the wrong value. A similar problem could exist, so I'm not discounting having bugs and flawed tests. If he's able to reproduce it in a unit test then hopefully I can work it out from there and resolve it.

@jandam
Author

jandam commented Aug 15, 2017 via email

@jandam
Author

jandam commented Aug 16, 2017

Tested on 2.5.2 and 2.5.4

@jandam
Author

jandam commented Aug 16, 2017

Attached a test with CacheLoader. The same issue occurs with the two-argument cache.get(key, mappingFunction).
caffeine-issue-177.zip

@ben-manes
Owner

In your test, I think there is an inherent race that can't be solved by the cache. The live value can be read from the cache by T1, then the entry evicted and released by T2, while the value is still in use by T1. A context switch makes that entirely possible. Imagine if you reduced the cache size to zero: you would expect that to happen too. If you change your code to that, it fails even without threads.

The cache size and eviction policy make all the difference in your test, since entries are loaded and not reused. If the cache size is large enough then the race won't occur, because eviction is delayed just long enough not to crash. Switching to Guava works at your cache size but fails at smaller ones. That's due to its use of LRU, where eviction of the accessed entry is delayed by O(n) operations. Caffeine may evict much sooner to avoid cache pollution, which increases the hit rate by favoring frequency over recency.

In your case you need to provide additional synchronization around the value to handle races on mutable state. A simple approach is to synchronize on the page during usage, validating it before use under the lock. If that is too slow, you could use reference counting (sketched below), where the page's count covers in-cache plus active readers, the removal listener decrements, and the last reader performs the final clean-up when the count reaches zero. You might also consider phantom references, to let the GC do that implicitly so that releasing is delayed until the value is unreachable (e.g. via a proxy); then use the referent to look up the page and release it. There are other techniques, like optimistic reads with validation afterwards (e.g. StampedLock). Regardless, you'll need to manage these types of races.
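
A rough sketch of the reference counting idea (a hypothetical Page, not a drop-in implementation):

    import java.util.concurrent.atomic.AtomicInteger;

    // count = in-cache (1) + active readers; the last owner frees the resource
    final class Page {
      private final AtomicInteger refs = new AtomicInteger(1);  // 1 while cached

      /** A reader calls this before use; false means the page was already freed. */
      boolean tryAcquire() {
        for (int n = refs.get(); n > 0; n = refs.get()) {
          if (refs.compareAndSet(n, n + 1)) {
            return true;
          }
        }
        return false;
      }

      /** Called by each reader when done, and by the removal listener on eviction. */
      void release() {
        if (refs.decrementAndGet() == 0) {
          free();  // last owner performs the final clean-up
        }
      }

      private void free() { /* recycle buffers, close handles, etc. */ }
    }

A reader then retries if it loses the race with eviction:

    Page page = cache.get(key);
    if (page.tryAcquire()) {
      try { /* use the page */ } finally { page.release(); }
    }  // else: raced with eviction, call cache.get(key) again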

@jandam
Author

jandam commented Aug 16, 2017 via email

@ben-manes
Owner

No problem, happy to be of help.

@ben-manes
Owner

@jandam I began a FAQ, so please let me know as you encounter gotchas we should add.

@jandam
Author

jandam commented Aug 22, 2017 via email
