Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for limiting expiry of cached objects #75

Closed
blschatz opened this issue May 3, 2016 · 18 comments
Closed

Support for limiting expiry of cached objects #75

blschatz opened this issue May 3, 2016 · 18 comments

Comments

@blschatz
Copy link

blschatz commented May 3, 2016

I would like to limit the expiry process of cached values, based not only on time, but also on properties of the object. I can see two ways to achieve this, both of which would require API changes:

  1. Modify the expiry process to evaluate a predicate against the cached value and skip expiry of a value if the predicate applies;
  2. Use a removal listener to apply the predicate, then add the entry back in it it applies.

The latter approach leaves a window of time where a new value could be created, then replaced by the put() (a putIfAbsent() would appear to solve this).

Is there a way to achieve the above use case using the current API?

@ben-manes
Copy link
Owner

If (2) is acceptable, then you can use the Cache.asMap() view.

The cache uses only O(1) algorithms to provide consistent performance regardless of size and activity. As fixed expiration is often the most useful, it benefits by also being easily modeled by shuffling entries in a linked list. The proposal in (1) could suffer O(n) scans.

A common approach to expiration (e.g. memcached) is to perform it lazily and rely on a capacity constraint. If the fetched entry is expired then emulate a miss (e.g. get/load, remove(k,v) if expired, get/load). This pollutes the cache with expired entries, but the size eviction ensures that they are removed eventually. If applicable you could implement that strategy by evaluating your custom criteria.

Otherwise you're left with potentially O(lg n) and similar approaches. That means there isn't anything more intelligent that we can do, or that feels less hacky, then what you would have to do in your code. The API tries to be malleable enough to help you get your job done, but sometimes its better to use a different tool if its not the right fit.

Usually domain knowledge in the eviction policy either has a huge gain or none whatsoever, with the latter being more likely. So if you go down that path you might want to measure to see if your optimization is helpful in practice.

@ben-manes
Copy link
Owner

ben-manes commented May 3, 2016

If the window in (2) is not acceptable a very hacky but similar idea is to use a CacheWriter. That is called within a compute block so it cannot call back into the cache. However, it could populate a victim cache with the recoverable entries and schedule an async task that re-populates the cache. If the cache loader also checks the victim cache, then the window won't be visible except if you use getIfPresent and conditional (e.g. asMap.replace) lookups. I'm not claiming that's good and may be sketchy.

Basically you want to disable expiration for certain entries. That's kind of like how an entry can have a weight of 0 indicating it won't be evicted by a size constraint. In that case the policy skips over those entries, but a pending task is to move them into a dedicated queue to avoid the scan. It hasn't been a problem since the feature is rarely used, except for AsyncLoadingCache for incomplete futures.

The request seems to be that a predicate evaluates an entry to determine if it is eligible for expiration and place it on the proper internal queue. That would be doable, but leads to a clumsy API if an ad hoc predicate. If like Weigher it calculates an expiration time (per-entry expiration) the API might be okay but it is no longer O(1). I've been resistant to providing that due to the time complexity and poor API evolution (e.g. http caching might want put(k, v, duration) not a callback evaluation).

JCache requires per-entry expiration and, like the vendors, ours is currently performed lazily by relying on a maximum size constraint. I have been planning on playing with hashed timer wheel as an alternative, since JCache doesn't enforce that a size constraint must be applied. At best that might be promoted into the core library with a warning regarding the impact on the time complexity of maintenance operations.

@blschatz
Copy link
Author

blschatz commented May 5, 2016

Thanks for the detailed replies Ben. I ended up using your CacheWriter recommendation to implement what I need.

I think you have captured what I am trying to achieve, with the clarification that the predicate would only be evaluated if the time based expiry evaluation indicated that the entry was expirable. My way of looking at it is as an override that might cause a "reset" on the expiry timer of a cached object when the timer expires and the predicate matches.

@ben-manes
Copy link
Owner

ben-manes commented May 5, 2016

Sorry if I'm overly detailed in my responses. Its a habit as I think through everything, perhaps talking to myself half the time :-)

From an API standpoint the predicate to opt out of expiration is too ad hoc. Too many of those in a library (or language) results in something clumsy and hard to maintain. So short term its not the right solution due to Bloch's conceptual weight rule and your current workaround is an acceptable loss.

But iterating towards an efficient variable expiration scheme will probably be a good resolution. The cost then would be losing O(1) behavior for expirable entries, but timer wheels have been shown to scale well in real systems. But they can have poor cascading effects without small adaptions like adding random jitter. I need to follow up on the latest techniques, such as being tickless, so that we have a good implementation to support the resulting API.

@blschatz
Copy link
Author

blschatz commented May 5, 2016

No need to apologise. I appreciate the thought put into the answers.

I agree, the current workaround is ugly but works fine, and my proposal is rather specific. I look forward to a general approach evolving. Happy to keep the dialog moving forward as your design crystalises.

@blschatz
Copy link
Author

In case you weren't aware, Agrona has adopted Netty's hashed timer wheel. https://github.com/real-logic/Agrona

@ben-manes
Copy link
Owner

You might be interested in JCTools #109. Since @mjpt777 subsequently tweeted a link I posted and that it is deprecated, there's a good chance there's doesn't include many additional optimizations.

But to be honest I haven't spent any time on evaluating this approach. I think it would work out well, but I'm hesitant to add it before doing a deep dive into the techniques to improve the worst-case scenarios. Maybe the holidays will be a good time to shift focus, as it would be a fun experiment.

@ben-manes
Copy link
Owner

ben-manes commented Mar 24, 2017

I've started to think about this again. As discussed, the implementation would be based on a hierarchical timer wheel (see slide 10). The next challenge is to design an API that feels natural but also performs well. I'd appreciate help on this part as I am unhappy with the stub described below.

Expiration response:

  • Never expires
  • Don't change current expiration
  • Expire in ... from now
  • Already expired

Constraints:

  • Avoid allocations of response value
  • Callback API for loading caches
public interface Expirer<K, V> {
  /**
   * @param now current time, in nanos
   * @return
   *   r < 0 : don't expire (eternal)
   *   r == 0: no change
   *   r > 0 : the amount of time before expiring in nanos
   */
  long create(K key, V value, long now);
  long update(K key, V value, long now);
  long read(K key, V value, long now);

  default long expiresIn(long expirationTime, @Nonnull TimeUnit unit, long now) {
    return Math.max(1L, unit.toNanos(expirationTime) - now);
  }
  default long noChange() {
    return 0L;
  }
  default long neverExpires() {
    return -1L;
  }
}

I don't think already expired is a good response to support. To avoid allocation and the lack of value types, we either have to use a primitive or hope escape analysis would kick in. If we use a primitive then we need special codes, which is confusing. To try to simplify that, we could use helper methods that users could call to translate their response correctly. But if they don't realize it could be error prone if doing a simple math operation (e.g. value.creationTime() + 5m - now).

I'm also unhappy with the interface name: Expirer, Expiry, etc are all a bit ugly.

Would might also want to consider a Comparator or Collector style functional builders after the interface is settled upon.

@blschatz @oakad @jgoldhammer @yrfselrahc @Maaartinus @lowasser

@Maaartinus
Copy link

  • r < 0 : don't expire (eternal)
  • r == 0: no change
  • r > 0 : the amount of time before expiring in nanos

To me, this feels wrong. Eternal and in 2**63 nanoseconds (292 years) is about the same, so I'd go for r == Long.MAX_VALUE instead of eternal (no special casing needed, saturated math does the job). Any negative value should mean "already expired".

No idea how to handle "Don't change current expiration" properly.

@ben-manes
Copy link
Owner

ben-manes commented Mar 25, 2017

You're right. I think I confused myself and wrote this up too quickly during my morning commute.

Already expired is a quirky concept. It would mean that a newly loaded value could have been expired, we should not return it to the user, and notify the removal listener. It might make sense if one we wanted to let the user hold the timestamp instead, but then we'd have to query it when shuffling the entry in the timer wheel or allocate a wrapper to publish into the read buffer. Any foreign interface can throw exceptions and we lose ability to dictate the memory visibility constraints.

Perhaps to handle "no change", we pass in the current expiration time as well. If no change is desired then the user should return that duration. So we could reduce this down to a single answer of "expire in ..." which has to be positive.

@ben-manes
Copy link
Owner

ben-manes commented Mar 25, 2017

public interface Expiry<K, V> {
  /** Returns the duration in nanoseconds until the entry has expired. */
  long expireAfterCreate(K key, V value, long currentTimeNanos) {
    return TimeUnit.MINUTES.toNanos(5);
  }
  long expireAfterUpdate(K key, V value, long currentTimeNanos, long expirationTimeNanos) {
    return expirationTimeNanos - currentTimeNanos;
  }
  long expireAfterRead(K key, V value, long currentTimeNanos, long expirationTimeNanos) {
    return expirationTimeNanos - currentTimeNanos;
  }
}

The above example would satisfy the requests for expireAfterCreate. The stub interface handles never expiring (via large value), no changes (by calculating the current delta), and expiring at a future time. I'm still not comfortable supporting a already expired (zero / negative value), but if we did then loosening the constraints by not throwing an exception would be backwards compatible.

I don't know of a use-case for update to be provided with the old and new values. It would also restrict when the evaluation is made.

I still dislike all of the interface names that I can think of.

Thoughts?

@cruftex
Copy link
Contributor

cruftex commented Mar 25, 2017

I did think about that quite a lot. Actually, the current interfaces in cache2k for expiry went through several redesigns in the last years after we found some caveats in production. The interface is called ExpiryPolicy there are five return values with a special meaning as described in ExpiryTimeValues. Besides that, there is also an interface that can be implemented by the value class that provides the expiry time, ValueWithExpiryTime. It also needs to be noted, that the ExpiryPolicy is used for the refresh ahead timing as well and there is a special policy that controls expiry times for caching or suppression of exceptions, called ResiliencePolicy. Seriously speaking, the whole thing is quite over engineered, but it proofs very useful. (Sorry, I am aware this has an evangelical touch. However, we already have some practical experience on exactly that topic.)

Some comments on the actual proposal:

Expiry interface with three methods create(), update() and read()

Looks similar to JCache. So far, I never had a use case for a different expiry on create(). Maybe it exists, but it is not the majority. IMHO an interface with a single method for the common use case is better to avoid code bloat. Any more sophisticated use cases could be covered by another interface AdvandedExpiry. I would also recommend to separate the read() use case. The cache "should know" whether there is a special handling for read or not, to do useful optimizations.

duration

With a time reference in the parameter returning a duration or point in time can do the same things. I decided for point in time for two reasons:

  • If an entry needs to expire at a point in time, the value just passes through, since we need the point in time for the internal timer. Returning point in time saves redundant calculations.
  • point in time seems more future proof for further enhancements. For example, there can be proper support for platforms that might have wall clock changes, such as mobile phones.

resolution

A millisecond resolution seems to be enough and more practical. OTOH the nano resolution better fits the design philosophy of Caffeine. See below.

randomization, sharp or lagging expiry

cache2k doesn't do expiry time randomization currently, however we experience some spikes in production and see a need for it. The interface already has a distinction between expiry at an exact point in time (called sharp expiry) and expiry that should happen approximately at a time, so the cache can optimize and expire, which is in reality mostly a refresh, when there are sufficient resources.

Talking about randomization as example, cache2k would typically have a configuration switch, while Caffeine, with the library philosophy, would probably go with a randomization adapter on the user expiry policy. In that way using nanosecond resolution makes sense.

However, with the current proposal the cache implementation needs to decide for one semantic:

  • expire exactly when the duration has passed
  • expire approximately when the duration has passed and optimize on resource usage

Typically, it would be the first one. But, OTOH, if exireAfterWrite() is specified, an exact expiry isn't needed and the cache could do optimizations and randomization.

I don't know of a use-case for update to be provided with the old and new values. It would also restrict when the evaluation is made.

We call the use case "dynamic expiry". The duration can be adjusted, depending on how aggressively the data is changing. For example if the available seats in a flight change from 10 to 3 you might do a shorter duration for refresh, then when they change from 180 to 179. You need also the load time of the previous entry, that's why we put in a complete CacheEntry as parameter which has getLastModification(). Storing the additional value, is producing some additional memory overhead, however, knowing the last modification or load time, is useful at other places, too. I am feeling quite ambivalent at that point. In Caffeine, if a user likes to do it, it can store the time in the value object.

Side note: Providing the previous value is useful for the loader, too. See the comments in AdvancedCacheLoader

Just shoot, if there are any questions.

For the next release of cache2k I am planing a reimplementation on the actual expiry process, to improve resource usage, so I am looking forward to see some more thoughts on how to implement it efficiently.

@blschatz
Copy link
Author

I'm not sure I follow. Is the proposal that at expiry time, the Expirer.update() method would be called, wherein the specific logic (where we want to keep the entry alive) could return a time in the future, thus keeping the entry alive?

@ben-manes
Copy link
Owner

Thanks Jens. That practical, well thought out, and pragmatic feedback that I appreciate. In regards to your points,

interface

For most usages, the expireAfterWrite or expireAfterAccess would still be preferred. That is both more efficient (linked lists) and simpler to configure.

The 3 method interface could be reduced via a functional builder, e.g. Comparator or Collector. So the noise is easier to reduce since we can assume Java 8, whereas you remain Java 6 compatible.

duration

When I originally tried to model it as point in time it resulted in an ugly interface similar to others. The duration model appears to result in a cleaner API and adds only a two cycle penalty (one subtraction in user code, one in cache). I think we can iterate and, I think, the duration style is likely to come out on top.

resolution

Nano matches Ticker, so it makes sense to be consistent. There are interesting quirks about the cost and accuracy of System.nanoTime vs System.currentTimeInMillis(). Many of them disappear on Linux, thankfully.

randomization

You're right, I would leave approximations to a decorator or custom variable policy. It is nice to provide utility methods or documentation to assist, though. The stampeding effect is something users have to consider for all resources, but this would allow for that compared to the existing API.

dynamic expiry

I'd probably lean towards asking users to store the timestamps in their value, as you suggest. I also think developers over estimate the value of domain-specific policies. Most often more general strategies are better and easier to reason about, e.g. using pub-sub instead of trying to dynamically tune the refresh time.

_ efficiency_

See slide 10 for a very short overview of timer wheels. They can either rely on a ticking thread or be tickless, the latter being what Caffeine would use. There are many good papers, articles, and code examples that discuss in depth. While that is amortized O(1), for fixed expiration we use LRU/FIFO time bounded queues which are strict O(1).

@ben-manes
Copy link
Owner

@blschatz

This would not cover the scenario of resurrection. In your approach of an expired entry querying a predicate, this could have a storm of calls on near expiration. That's because reads would have to call before faking a miss. If the predicate was implemented as a synchronized or I/O operation (e.g. check database) then this would be harmful. I don't think that approach would be supported. Instead your 2nd approach would be doable with a CacheWriter without a gap in visibility if all queries are to a CacheLoader, allowing you to do intermediate shuffling by stashing it into a shared map.

The proposal is for expiration to be configured per-entry. Whenever a read or write operation is performed the user can determine when the entry will expire. They can use fields of the value to determine what makes sense, e.g. if the duration should be based on the entry's database timestamp. Then one entry might have a ten minute lifetime and another an hour lifetime, without requiring O(lg n) operations to sort them. This proposal would still require periodic operations on the cache entry to revisit its expiration time.

Sorry if that proposal doesn't fit your use-case (there were a few open issues, and yours was the most technical discussion). It would give you a little more flexibility in the re-evaluation during the writer/loader shuffling if you wanted to promote an entry to eternal or use a non-fixed expiration interval.

@cruftex
Copy link
Contributor

cruftex commented Mar 25, 2017

Most often more general strategies are better and easier to reason about, e.g. using pub-sub instead of trying to dynamically tune the refresh time.

Cannot agree more, but so far no system we had to integrate provided pub-sub :(

ben-manes added a commit that referenced this issue May 1, 2017
The expiration time can now be customized on a per entry basis to allow
them to expire at different rates. This is acheived in O(1) time using
a timer wheel and evaluating using the new Expiry interface. This
setting can be combined with refreshAfterWrite, but is incompatible
with the fixed expiration types (expireAfterAccess, expireAfterWrite).

While the test suite was updated to incorporate this new configuration
option, there is still remaining work before this should be released.
 - New tests specific to this feature (such as exceptional conditions)
   have not yet been written
 - Incorporate a data integrity check for the timer wheel into the
   validation listener
 - Inspection through cache.policy()
 - JCache integration
 - Documentation
@ben-manes
Copy link
Owner

This feature is implemented, but additional tests are needed before release.

ben-manes added a commit that referenced this issue May 1, 2017
The expiration time can now be customized on a per entry basis to allow
them to expire at different rates. This is acheived in O(1) time using
a timer wheel and evaluating using the new Expiry interface. This
setting can be combined with refreshAfterWrite, but is incompatible
with the fixed expiration types (expireAfterAccess, expireAfterWrite).

While the test suite was updated to incorporate this new configuration
option, there is still remaining work before this should be released.
 - New tests specific to this feature (such as exceptional conditions)
   have not yet been written
 - Incorporate a data integrity check for the timer wheel into the
   validation listener
 - Inspection through cache.policy()
 - JCache integration
 - Documentation
ben-manes added a commit that referenced this issue May 1, 2017
The expiration time can now be customized on a per entry basis to allow
them to expire at different rates. This is acheived in O(1) time using
a timer wheel and evaluating using the new Expiry interface. This
setting can be combined with refreshAfterWrite, but is incompatible
with the fixed expiration types (expireAfterAccess, expireAfterWrite).

While the test suite was updated to incorporate this new configuration
option, there is still remaining work before this should be released.
 - New tests specific to this feature (such as exceptional conditions)
   have not yet been written
 - Incorporate a data integrity check for the timer wheel into the
   validation listener
 - Inspection through cache.policy()
 - JCache integration
 - Documentation
ben-manes added a commit that referenced this issue May 1, 2017
The expiration time can now be customized on a per entry basis to allow
them to expire at different rates. This is acheived in O(1) time using
a timer wheel and evaluating using the new Expiry interface. This
setting can be combined with refreshAfterWrite, but is incompatible
with the fixed expiration types (expireAfterAccess, expireAfterWrite).

While the test suite was updated to incorporate this new configuration
option, there is still remaining work before this should be released.
 - New tests specific to this feature (such as exceptional conditions)
   have not yet been written
 - Incorporate a data integrity check for the timer wheel into the
   validation listener
 - Inspection through cache.policy()
 - JCache integration
 - Documentation
@ben-manes
Copy link
Owner

Released 2.5.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants