Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doubts about the code In UnboundedLocalCache.java #869

Closed
q871795224 opened this issue Feb 16, 2023 · 3 comments
Closed

Doubts about the code In UnboundedLocalCache.java #869

q871795224 opened this issue Feb 16, 2023 · 3 comments

Comments

@q871795224
Copy link

Caffeine is the best caching library I have ever used, and I am also learning the programming ideas contained in Caffeine.

When I use Caffeine, I found the getAllPresent method in UnboundedLocalCache.java. Lines 149 to 151 of this method first put several key-value pairs whose value is null in the HashMap. I want to know what the purpose of this operation is.
image

@ben-manes
Copy link
Owner

haha, yes. This is an optimization hack to return the results in the given key order, drop duplicate keys to avoid redundant lookups, and minimize allocations as a minor performance trick.

The non-concurrent Java collections like ArrayList, HashMap, etc. all allow for null values, despite that being very troublesome in practice. The general complaint is it is unclear what Map.get(key) returning null means, so you might sometimes see Map.containsKey(key) to determine if absent or a null mapping. Very ugly.

In that method's usage, it iterates over the preallocated results and modifies the Map.Entry instances directly. If the item was found then it sets the value, else uses the iterator to discard the null mapping. Afterwards the results are returned under an unmodifiable wrapper. This way we avoided extra copying of intermediate collections.

The trick is a little more helpful for LoadingCache.getAll(keys) due to the CacheLoader.loadAll(keys) being able to return additional entries than those requested (e.g. to prefetch a group). That means we wouldn't return the loaded entries directly as it could have keys not explicitly requested. By filling in the holes of the result mapping we side step having to do that extra checking.

Of course this is not style of code to write in normal application logic, but I thought that I could get away with this micro optimization for fun in my own code without any coworkers seeing and frowning at me in a pr had I done such non-sense in company code. 🙂

public Map<K, V> getAllPresent(Iterable<? extends K> keys) {
var result = new LinkedHashMap<Object, Object>(calculateHashMapCapacity(keys));
for (Object key : keys) {
result.put(key, null);
}
int uniqueKeys = result.size();
for (var iter = result.entrySet().iterator(); iter.hasNext();) {
Map.Entry<Object, Object> entry = iter.next();
Object value = data.get(entry.getKey());
if (value == null) {
iter.remove();
} else {
entry.setValue(value);
}
}
statsCounter.recordHits(result.size());
statsCounter.recordMisses(uniqueKeys - result.size());
@SuppressWarnings("unchecked")
var castedResult = (Map<K, V>) result;
return Collections.unmodifiableMap(castedResult);

@ben-manes
Copy link
Owner

ben-manes commented Feb 16, 2023

Note that when I make fun of my optimization hack here, it is also because being clever backfired. It exposed a bug in the Java 8 rewrite of HashMap which broke this and wasn't caught since everyone avoids doing this (#176, JDK-8186171).

@ben-manes
Copy link
Owner

It does highlight a snippet where we can switch from raw Object. In Guava this is getAllPresent(Iterable<? extends Object>). There are some subtle reasons for that while Java 5 generics were being adopted. We later switched to the stricter K input collections, but it seems that I forgot to clean up the usage there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants