-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multiple Threads BLOCKED at BoundedLocalCache.removeWithWriter #340
Comments
Can you please upgrade to the latest release. The timestamp is within the computation which should address part of your concern. I think this fixes it but you'd have to monitor to determine if it shifts the problem to another area to investigate. This seems to have been fixed in b3e5412 for #212, and released in version v2.6.1. Please take a look at the code for the current release below. If it resolves your issues that would be wonderful. If not then let's gather some more information after the upgrade and investigate a fix. The subsequent versions should be backwards compatible with only improvements and bug fixes.
Does this mean you are maintaining an external expiration mechanism in addition to expiration support in Caffeine? It seems irrelevant to this problem, but I am curious about your setup in general. caffeine/caffeine/src/main/java/com/github/benmanes/caffeine/cache/BoundedLocalCache.java Lines 2128 to 2161 in c06c5cc
|
Thank you for the response and thanks for pointing out the commit that may fix our issue. I'll upgrade to version 2.8.0 and exercise the code. Yes, the code is maintaining the object expiration, not that I implemented it - I just inherited it. It seems the code is not using the expiration functionality of Caffeine, based upon the builder configuration. The following is from our Cache Monitor that initializes Caffeine.
The implementation seems odd to me as it does not appear to be utilizing Caffeine for the cache but rather for notification of subscribers. The Cache Monitor is invoked for all operations but doesn't appear to return any of the values retrieved from Caffeine. There is a consumer of the Cache Monitor which is actually performing the caching and retrieval of objects. (Very odd) Thanks again for the assistance on this issue. |
That is a bit odd. If you do not use the built-in expiration then that stale timestamp wouldn't matter, as the cache never expires an entry. However you did say earlier that the cache has just I would recommend using the cache's expiration logic (fixed or variable), rather than manually removing the stale entry. By default the removal occurs lazily when other activity on the cache triggers maintenance. If you have promptness requirements, like to notify even if no cache activity is occuring, then you can use A slight nit is that you can remove that casting to let the cache builder properly infer types, e.g. this.lruCache = Caffeine.newBuilder()
.weigher((Weigher<CacheKey<K> key, V value) -> weigher.weigh(key.getKey(), value)) It's also good to keep somewhat up to date on dependency versions. If you use Gradle then you might find my gradle-versions-plugin helpful. Maven has this built-in, others have similar I'm sure. Just a good practice. |
I think the reason for not using the cache's expiration logic is because each cache item can have a different expiration value. However, I was just reviewing the docs on Caffeine and I see that it does indeed have a varying expiration policy. I think this would work for us but would require some changes on our end to define a common interface for all objects placed into the cache. (more work to do) The cache contains more entires than 2, many more.... it is the ThreadPool size that was set to 2. Thanks for the information on the scheduler. That indeed does look interesting and my allow us to evict items in a non-lazy fashion. I'll investigate that further. We are currently using maven so the dependencies should be pulled in properly. It appears that version 2.5.2 was the latest version when it was included in the project but it has never been updated since it was added. (Surprise!) I'll need to add it to our list of libraries that need to be maintained on a regular basis. Thank you again for all of your assistance. |
In addition to evaluating via For Maven, the task versions:display-dependency-updates is incredibly helpful. It's useful to run periodically or, if no one normally pays attention, setup a CI job to email them once a quarter. I'm going to close this for now, but feel free to re-open or create new issues. I'm hoping that upgrading, getting more familiarity with the code, and refactorings might resolve this naturally. If I can be of help or whatever, please do ping me. |
We are using version 2.5.2 of caffeine. We are seeing an issue with threads being BLOCKED when calling BoundedLocalCache.remove.
We are initializing the cache with ExecutorService.newFixedThreadPool with a size of 2. I don't know that this has any bearing on the situation but I thought I'd mention it.
http-STAGING-10-181646-3 #476 daemon prio=5 os_prio=0 tid=0x00007f67b48af800 nid=0x4c92 waiting for monitor entry [0x00007f6031de4000] java.lang.Thread.State: BLOCKED (on object monitor) at com.github.benmanes.caffeine.cache.BoundedLocalCache.removeWithWriter(BoundedLocalCache.java:1810) at com.github.benmanes.caffeine.cache.BoundedLocalCache.remove(BoundedLocalCache.java:1734) at com.github.benmanes.caffeine.cache.LocalManualCache.invalidate(LocalManualCache.java:74)
This occurs when load is high on our server with 6000 concurrent users. We are performing a get and are able to locate the object container. We then check the expirationTime of the object and if it is expired we then call remove. This is where the blocking occurs and appears to be related to the BoundedLocalCache.removeWithWriter method.
`
long now = expirationTicker().read();
RemovalCause[] cause = new RemovalCause[1];
`
The above code acquires the current timestamp (now) prior to the synchronized block, which seems as if it might be problematic if multiple threads were to be blocked on the object monitor.
Our theory is that if multiple threads are requesting the same cache key and the associated cache item is expired, each thread will be calling into BoundedLocalCache.remove, which calls into BoundedLocalCache.removeWithWriter, and when the synchronized block is hit only one thread will succeed while the other threads wait. However, at some point while this is occurring a new object is loaded into the cache for this key. As the remaining BLOCKED threads are waiting to acquire the monitor, at some point one will succeed and it will then remove the object that was just added as the evaluation checks the timestamp (now) value that was acquired prior to the synchronized block.
I can't imagine that our situation is unique but this is the only block of code in our execution that contains a synchronized block.
Thanks for the assistance.
The text was updated successfully, but these errors were encountered: