Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Support cacheNullValue flag in the CaffeineSpec #391

Closed
DanielYWoo opened this issue Feb 5, 2020 · 9 comments
Closed

Comments

@DanielYWoo
Copy link

Sometimes a null value means the data does not exist and the null value should be cached to avoid extra expensive database access. But sometimes, the underlying database or external system is very fast to return value if it's null and in that case the null value is not need to be cached.
Is it possible to add cacheNullValue flag in the CaffeineSpec?

@ben-manes
Copy link
Owner

For negative caching, using an Optional as the value is clearer. This avoids confusion with apis that can return null to indicate no mapping, eg HashMap.get has this confusion as null values are allowed. An optional wrapping is clearer and doesn’t require anything special in the cache or user code.

@DanielYWoo
Copy link
Author

DanielYWoo commented Feb 5, 2020

Yes, that's how Spring Cache did, they use a ValueWrapper

protected Object toStoreValue(@Nullable Object userValue) {
	if (userValue == null) {
		if (this.allowNullValues) {
			return NullValue.INSTANCE;
		}
	    throw new IllegalArgumentException(
			"Cache '" + getName() + "' is configured to not allow null values but null was provided");
	}
	return userValue;
}

The NullValue.INSTANCE is actually a special object. They introduced a lot of extra code to handle this, and some messy generic class cast issue. It makes sense to handle this on top of Caffeine with Optional or any kind of wrapper. But if the Caffeine can support it then everybody will code less ;-)

@ben-manes
Copy link
Owner

It would cause even worse code if we tried to embed the pattern internally, but also many api challenges. I think it’s fair to wrap the cache and even build better libraries on top, especially for broader scoped goals. A case like Spring does it once for all of its users and hides the cache over its own abstraction, which I think is ideal. If you think of this library as a fancy Map then it’s okay to hide within a different abstraction. Null values simply cause too many quirks with Java collections (e.g. computations) that I don’t think we can do anything more intelligent than suggest wrapping the cache with a more opinionated api.

@ben-manes
Copy link
Owner

For example AsyncCache and the asMap() view would not be friendly to null mappings. It’s also confusing for callbacks, like Weigher, to become null aware. These mix concepts where a surrogate value is explicit and understood. Unfortunately I can’t see a means that doesn’t break compatibility while also not becoming a confusing (and errorprone) construct. Sorry that I can’t convince myself of a means to add this that is best long term.

@DanielYWoo
Copy link
Author

That's fine, I agree with your point. Thanks anyway.

@sangelopoulosTP
Copy link

sangelopoulosTP commented Jan 7, 2024

For negative caching, using an Optional as the value is clearer. This avoids confusion with apis that can return null to indicate no mapping, eg HashMap.get has this confusion as null values are allowed. An optional wrapping is clearer and doesn’t require anything special in the cache or user code.

Sorry to comment on such an old issue @ben-manes but I would appreciate it if you helped me understand your point of view on how the empty optional and the null are different.
For example I have this:

@Cacheable
public Person findById(String id){
// hit the db
}

@Cacheable
public Optional<Person> findById(String id){
// hit the db
}

To me it makes sense handling the null as absent and hitting the database again. But why not do the same on the empty optional? Why should the user expect a difference in caching?

@DanielYWoo
Copy link
Author

@sangelopoulosTP Without breaking the API compatibility, you can use Optional, or NullValue.INSTANCE as I suggested in this thread, or your customized class like Empty or NegativeCache to distinguish "not found" or "not cached" yet.

Your example is based on Spring which is not a good example and cannot demonstrate the problem. You may look at the query API of Caffeine.

@sangelopoulosTP
Copy link

@DanielYWoo I know about the alternatives, I'm just trying to understand the creator's POV on why a return value of Optional.empty() is different than null and should be handled differently.

@ben-manes
Copy link
Owner

To me it makes sense handling the null as absent and hitting the database again. But why not do the same on the empty optional? Why should the user expect a difference in caching?

In some cases one wants to use negative caching to avoid calls where they know that it won't be found. If the entry was found and cached then the performance would be predictable on a miss, but if absent then every call would perform the resource load, incurring a latency penalty as it was fetched anew. An attacker could exploit this to cause a denial of service attack by overwhelming the underlying resource with many negative lookups. In those cases negative caching can be required for performance and reliability, and a user might set it use a different expiration time-to-live to revalidate more or less frequently.

I'm just trying to understand the creator's POV on why a return value of Optional.empty() is different than null and should be handled differently.

HashMap allows get(key) returning null to mean either no key mapping or a mapping where the key mapped to null. The only way to distinguish this is to also use Map.containsKey to determine which it meant. That is not only a confusing api, but it does not work in a concurrent map where another thread may change the map, such as adding a mapping, so there is no way to get the actual result's meaning.

The newer methods like computeIfAbsent have similar problems when they allow null mappings and it is not distinguishable as to whether it was absent or has a null value. Thus, if you want to use null for negative caching you would be surprised as the method is spec'd to always invoke the function, which is done atomically for ConcurrentHashMap style implementations, so the intended fast read is instead an operation under an exclusive lock.

In computeIfPresent a null new value typically means that the mapping should be removed. However in HashMap if the prior value is null then returning null as the new value has no effect. The mapping remains unchanged so there is no way to remove the entry via a computation.

Similarly putIfAbsent (Java Collection Puzzlers) was intended to returns null to indicate no prior mapping. When this was redefined to allow for null mappings, that signal is unclear as it returns null if the prior value was null.

A null mapping is very confusing as to what methods do in a non-concurrent map. Even worse in a concurrent one it cannot be distinguished because the secondary operations break atomicity. No matter what it will be a surprising api for the user as they cannot be sure about what behavior to expect.

Disallowing null and storing Optional as a value makes the semantics clear. A value of null means there was no mapping found and a value of Optional.empty() means it is mapped to an empty sentinel. The apis like gets, computes, puts, removes are all very clear and the concurrency / performance is predictable. The behavior of a cached optional can be tuned, e.g. expire empty ones sooner than non-empty for a short-lived negative cache.

I don't see how overloading null or Optional with two potential meanings does not cause confusion or become a limitation. Spring Cache has a very constrained api with fewer features and for advanced cases recommends using the vendor's. That's a reasonable approach to make the common case simple and idiomatic, but not restrict users from getting advanced work done. I think Caffeine's api is pretty good at clarity and flexibility given the breadth of user needs. I see no harm in applications using lighter abstractions for ease of use, especially since Caffeine is meant to be a hidden implementation detail and not part of an application's api like Spring Cache's is between domain services.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants