Refresh as a non-blocking write #56

Closed
ben-manes opened this issue Mar 4, 2016 · 31 comments

@ben-manes
Owner

Currently refreshAfterWrite and LoadingCache.refresh(key) are performed as blocking writes (computations). This was a simple approach to implement: it does not obstruct reads, avoids duplicate in-flight calls, and postpones write expiration. However, it has a few drawbacks that a fully non-blocking version would not,

  • Blocks other writers to that entry (explicit or eviction).
    • May also block independent writes due to ConcurrentHashMap locking at the bin level
  • Requires an explicit executor task, hinting that the reload should use a same-thread executor
    • May waste a thread if asyncReload(key, executor) does not use the provided executor

A fully asynchronous refresh must handle conflicts and notify the removal listener appropriately. A conflict occurs when the refresh completes and the mapping has changed (different or absent value). This is detectable as a reload computes a new value based on the key and old value. The refresh cannot know whether the change is benign or whether it would insert stale data.

In Guava the refresh clobbers or inserts the entry, but due to races dropping is the only safe default choice. That default is potentially wasteful, so the user should have the option to resolve the conflict instead. This can be done by adding a new default method to AsyncCacheLoader,

// Placeholder name, suggestions welcome!
default boolean retainOnRefreshConflict(K key, V currentValue, V oldValue, V newValue) {
  return false;
}

The removal listener must be notified to avoid resource leaks. In Guava the removal cause is REPLACED, but that cause is intended for user actions rather than automatic ones. Since we need to signal when a reload replaced or dropped the value, a REFRESHED cause should be added to the enum to avoid confusion.

The post-processing of a reload will be implemented by attaching a whenComplete to the future returned by asyncReload. This avoids the unnecessary-thread case described above for AsyncLoadingCache. It is worth noting that cancelling the refreshing future if the entry is removed was rejected. This is because, unlike Future, cancelling a CompletableFuture does not interrupt the task to hint that it should stop, and the removal listener would still need to be notified if the value materialized.
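The mechanics above can be sketched with plain JDK types. This is only an illustration of the proposed design, not Caffeine's internals: RefreshSketch, retainOnRefreshConflict, and the REFRESHED cause are placeholder names from this discussion, and the reload future is passed in so completion timing is explicit.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

class RefreshSketch {
  final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();

  // Placeholder hook: false keeps the pessimistic default of dropping the
  // refreshed value when the mapping changed while the reload was in flight.
  boolean retainOnRefreshConflict(String key, String current, String old, String fresh) {
    return false;
  }

  CompletableFuture<String> refresh(String key, CompletableFuture<String> reload) {
    String oldValue = cache.get(key);
    if (oldValue == null) {
      return CompletableFuture.completedFuture(null);
    }
    // Post-process by attaching a completion handler, so no hash-bin lock is
    // held during the reload and no extra thread is required.
    return reload.whenComplete((newValue, error) -> {
      if (error != null || newValue == null) {
        return;
      }
      // An atomic replace succeeds only when the mapping is unchanged.
      if (!cache.replace(key, oldValue, newValue)) {
        String current = cache.get(key);
        if (current != null && retainOnRefreshConflict(key, current, oldValue, newValue)) {
          cache.replace(key, current, newValue); // user chose the refreshed value
        }
        // Otherwise the refreshed value is dropped; a real cache would notify
        // the removal listener here with the proposed REFRESHED cause.
      }
    });
  }
}
```

In the no-conflict case the replace succeeds and the refreshed value is visible; if a write lands between the snapshot and the reload's completion, the default drops the refreshed value.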

The final detail is whether an in-flight refresh should postpone expiration. This is currently the case for expireAfterWrite (but not expireAfterAccess) due to reusing the entry's writeTime to avoid an extra field to indicate that a refresh was triggered. A stricter expireAfterWrite would require an additional per-entry field and is more prone to a redundant load. For now we'll keep it as is, but be willing to change if requested.

The above is from brainstorming with Etienne Houle @ Stingray (@ehoule).

@abatkin, @yrfselrahc, @lowasser might have opinions on this design

@ben-manes
Owner Author

This is progressing on the refresh branch. The changes closely match the above, but further work is needed to complete it. The existing test cases pass. The pending tasks include,

  • Add test cases for when RemovalCause.REFRESHED is emitted
    • Currently it probably emits REPLACED on success, but it should now emit the new cause
  • Adding tests for retainOnRefreshConflict returning true
  • Adding JavaDoc for the API additions
  • Ignoring expireAfterAccess while the entry is being refreshed

@ghost

ghost commented Mar 14, 2016 via email

@ben-manes
Owner Author

I'm not sold on the method name either and am hoping to hear suggestions. I figured that I should try to get the rest of the code ready in the meantime, though.

@ghost

ghost commented Mar 15, 2016

One idea would be resolveRefreshConflict, and then have it return V instead
of boolean.

Charles


@ben-manes
Owner Author

The concern with that approach is if the resolved value is neither the current value nor the refreshed value (newValue). Then the rejected values would be sent to the removal listener and the cache updated to hold the resolved value. This user behavior doesn't make a lot of sense and defeats the intent of the method (to pick between the current or new value), but we'd have to account for it. I could see a user deciding to resolve by fetching from the database again to return that value, whereas preferably they might compare a modified timestamp on the values instead.

@ehoule

ehoule commented Mar 15, 2016

Also, if the user decides to return a new object, it means he would probably be blocking the thread because he might need to load that object from somewhere. So we possibly want to discourage that...
Alternatively we could assert that V is either the current or new value.

@ghost

ghost commented Mar 16, 2016

Alternatively the method could be called resolveRefreshConflict and it
could return an enum indicating how to resolve the conflict.

Charles


@ben-manes
Owner Author

An enum indicates that the options might expand, which could only be to reload yet again. It also carries the weight of increasing the API further and of dealing with null as an error. That raises the question of what to do, regardless of the method signature, if an exception is thrown.

Perhaps this method doesn't carry its conceptual weight? I'm sure very few users will utilize the conflict handling, yet some resolution is needed. Guava's clobber/insert is simple and correct in most cases, though it arguably could have (hopefully benign) races. Dropping would be safest but less efficient, and requires a new removal cause. If this method shouldn't be included, at least yet, then I'd be inclined to follow Guava's lead here. I don't recall the internal discussion for refresh and why conflicts were resolved that way.

@ben-manes
Owner Author

It occurred to me that this same race occurs on a loadAll for a synchronous cache. That is because there is an inherent race: the map's entries cannot be locked and unlocked in aggregate. This is not the case for an asynchronous cache, where we insert incomplete futures, bulk load, and then populate those futures.

LoadingCache#getAll(keys) is documented as allowing implementors to race with the behavior of,

In the case of overlapping non-blocking loads, the last load to complete will replace the existing entry.

That is slightly too narrowly worded, because the last load to complete will clobber an existing entry regardless of how it got there, including by an explicit insert. This allows clobbering to insert stale data.

It seems better to generalize this enhancement to resolve conflicts for any type of load. That means the method would be called retainOnConflict, a new removal cause CONFLICT would be added, the getAll documentation would be updated, and getAll would be updated to resolve conflicts. The default behavior would be pessimistic by dropping the newly loaded value.
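The clobbering versus pessimistic behaviors can be contrasted with plain ConcurrentHashMap operations. This is a sketch only; the names BulkLoadRace, clobbering, and pessimistic are illustrative, not API names under discussion.

```java
import java.util.concurrent.ConcurrentHashMap;

class BulkLoadRace {
  // Last load wins: an unconditional put by a completing bulk load clobbers
  // any mapping that appeared in the meantime, including an explicit insert.
  static String clobbering(ConcurrentHashMap<String, String> cache, String key, String loaded) {
    cache.put(key, loaded);
    return cache.get(key);
  }

  // Pessimistic default: the (possibly stale) loaded value is dropped if any
  // mapping appeared while the load was in flight.
  static String pessimistic(ConcurrentHashMap<String, String> cache, String key, String loaded) {
    cache.putIfAbsent(key, loaded);
    return cache.get(key);
  }
}
```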

@ben-manes
Owner Author

I plan on cutting a minor release (2.3) soon and, given how close this is, it bugs me not to finish it off. I think my only mental blocker is the name.

  • retainOnConflict: a query, but it is not clear which value is being retained
  • resolveOnConflict: an action that sounds heavy, but is really just a boolean response
  • preserveOnConflict: a query, indicates that the current value would be preserved

There are probably a few other good options in a better thesaurus. Ideally the method chosen sounds like a question, and it is obvious what the reaction to the answer will be. I think "preserve" is the closest word I can find to describe the intent and minimize confusion.

@abatkin

abatkin commented Apr 20, 2016

If I had to vote, I'd say that you can intuitively understand what "retain" and "preserve" will do without having to think, whereas with "resolve" it is not clear exactly what that means.

@ben-manes
Owner Author

The good thing about "resolve" is that since you can't guess, you have to look it up :-)

For "retain" I don't know if I'm confusing myself and no one else would be. The caller just loaded a value and sees that a different one exists in the cache. From its perspective "retain" might be misread as not discarding the loaded value. Its definition, "continue to have; keep possession of", would mean that the current value is kept and the refreshed value is discarded. I don't know if I'm confusing myself with all of the other details, or if others wouldn't immediately know what "retain" meant.

"preserve" is a very strong word, "maintain (something) in its original or existing state." That is extremely clear to me that the current value should be kept and a newly loaded value discarded. So I'm leaning towards that name, which I hadn't thought of using until today.

Do you have a preference?

@abatkin

abatkin commented Apr 20, 2016

I see what you are saying. And names are important. I have no preference between "retain" and "preserve", but vote against "resolve" (even if it might drive more folks to read the documentation).

@ben-manes
Owner Author

Thanks. I'm going to see if I can finish this task up in the next couple of days. I think that's mostly tests and documentation. Then I shouldn't have a good excuse for punting on #7, other than finding the time to focus on it.

@ghost

ghost commented Apr 20, 2016

My preference is for retainOnConflict or resolveOnConflict, mainly since
they sound more like caching behaviors. I'd go with retainOnConflict for
the currently proposed boolean return value, or resolveOnConflict if it
were to return an enum.

I continue to feel that returning an enum, while strictly unnecessary, goes a long way towards increasing the readability of the method and its use in practice.

Looking back at this after previous discussions I had to reread your text
to figure out if the retain/preserve was referring to the refreshed element
or the modified element. The use of an enum would make that very explicit.

Charles


@ehoule

ehoule commented Apr 20, 2016

I think with retainOnConflict it's not clear if it means retain the modified value or the refreshed one. I think 'preserve' is clearer that it should keep the one already there (the modified one). I also like the 'resolve' with an enum.

Alternatively we could invert the query, e.g. clobberOnConflict (or any synonym of clobber).

@ben-manes
Owner Author

I agree that "retain" and "resolve" feel more natural in the context of a cache. I think "preserve" is less ambiguous. I'm a little wary of an enum, but could be convinced.

I'm unsure if I have created a more confusing mess by deviating from Guava's clobbering behavior, or at least by not making it the default. Take for example the timeline,

  • t0: k => v1
  • t1: LoadingCache#refresh(k)
  • t2: k => null (expired, evicted, invalidated)
  • t3: k => ?, refresh completes

In Guava, Charles had this insert the new value. But we don't know why it changed underneath us - it could be benign like expiration, or we could be loading old data if explicitly invalidated. If we reject conflicts by default then the refresh drops the new value.

I suspect Charles would argue that the consistency around caches is murky and users should source from the system of record if they need strict accuracy. That then leads to clobbering being natural, because inconsistency may be unavoidable. If so, then exposing some conflict handling to the user isn't too important. And if it is exposed but conflicts are generally unavoidable, then clobbering as the default highlights that.

What would you expect / want?

@ghost

ghost commented Apr 20, 2016

When you phrase it that way I actually prefer the clobbering.

It sounds like the refresh behavior simply needs to be documented,
indicating that the routine refreshes could clobber other asynchronous
changes to the cache.

Charles


@ben-manes
Owner Author

Do you prefer clobbering with documentation (e.g. Guava), or clobbering by default with documentation?

My concern with blindly clobbering is that it hides races that could cause inconsistency. Yet you're right that it's usually benign and not too worrisome. Is there tangible value in letting the user opt in to deciding the conflict resolution, or do you view that as adding unnecessary conceptual weight?

@ghost

ghost commented Apr 20, 2016

Well, I favor a well-defined clobbering that does not expose a knob to control it. I think a reasonable argument could be made for discarding the refreshed result if the value present when the refresh began has since changed. That would be semantically equivalent to cancelling a refresh when a write occurred.

Doesn't a similar problem exist when an explicit write occurs while a
computation is pending? What do you do in that case?

Charles


@ben-manes
Owner Author

Doesn't a similar problem exist when an explicit write occurs while a computation is pending? What do you do in that case?

Do you mean a put(k, v) when a LoadingCache#get(k) is in progress? Just like in Guava, it blocks the write until complete. An AsyncLoadingCache won't block, but will attach a completion handler to forward to the removal listener. Unlike Guava, an invalidate(key) will block until the computation materializes.

I think a reasonable argument could be made for discarding the refreshed result if that value at the time the refresh began had changed.

A change like a put(k, v) is easy to identify, but what about an invalidate? The preferred idiom in Guava is to not insert directly, but to invalidate and wait for the next load. How would a refresh know when it is safe to insert when oldValue != currentValue?

@ben-manes
Owner Author

I read through this again and now I see your point. In LocalCache you clobbered the computing entry and, when the computation completes, it notices this fact and publishes a notification that it was replaced. So Guava clobbers at every opportunity, making it very consistent.

In ConcurrentHashMap it locks the bin (ideally one per entry). Any subsequent write blocks on the bin waiting for the preceding one to complete. I think that's the only reasonable approach for the map because it doesn't have a removal listener, so clobbering could have surprising side-effects like resource leaks.

Since we don't clobber on computations, it doesn't give us enough of a pattern to emulate.

@ben-manes
Owner Author

Maybe clobbering is the only consistent option? If we think of this in terms of the ABA problem there might not be any other correct way.

I was assuming that if currentValue == oldValue there is no conflict. Otherwise we have to decide whether to store newValue or discard it. But the value is a user-supplied object and the timeline could be,

  • t0. put (k1, v1)
  • t1. store in SoR (k1, v2)
  • t2. refresh(k1)
  • t3. restore in SoR (k1, v1)
  • t4. put(k1, v1)
  • t5. refresh returns v2; finds (v1 == v1); stores (k1, v2)

That is convoluted, but technically possible. The tag to compare would be the write time, which makes sense for refreshAfterWrite. But constructing a LoadingCache doesn't require that the write time be stored on the entry, and we wouldn't want to start storing it just for an ABA-safe refresh.
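The timeline above can be reproduced with JDK types to show why an identity comparison alone is not ABA-safe. AbaDemo is purely illustrative: the check at "t5" passes even though the mapping changed twice while the refresh was in flight.

```java
import java.util.concurrent.ConcurrentHashMap;

class AbaDemo {
  // Returns whether a refresh's identity check would detect a conflict after
  // the ABA sequence: put v1, update to v2, restore the same v1 instance.
  static boolean refreshSeesConflict() {
    ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    String v1 = "v1";
    String v2 = "v2";
    cache.put("k1", v1);               // t0: put(k1, v1)
    String observed = cache.get("k1"); // t2: refresh(k1) captures the old value
    cache.put("k1", v2);               // concurrent update to v2
    cache.put("k1", v1);               // t4: put(k1, v1) restores the instance
    // t5: refresh completes; the identity check spuriously sees no conflict
    return cache.get("k1") != observed;
  }
}
```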

So we either block other writes (the current behavior), clobber, or have ABA-unsafe conflict handling. We don't want the last one, so the whole conflict-method nonsense is out. We want to avoid blocking writes and the potential of wasting an async thread, and the precedent is set in getAll.

I guess I went the long way around to arrive back at what you did in Guava :-)

@anuraaga

anuraaga commented Apr 22, 2016

Looking forward to this :) (had to temporarily roll back using Caffeine, which was causing this deadlock in our asynchronous server; not sure why exactly it happened, but pretty sure full asynchronicity would cause it to go away - https://gist.github.com/anuraaga/49eb6d92a9b4c656126743314599a56c)

As a user of the cache the clobbering sounds fine - as you say, I wouldn't expect perfect consistency from a LoadingCache and generally would expect an extra refresh cycle to catch updates in cases like this one. A part of me doesn't like how it means evicted entries would come back to life thanks to the refresh, but I guess some level of thrashing at the low end of the LRU scale is expected.

Also FWIW, I guess in practice, most use-cases of a LoadingCache probably only ever call get(), and some may call refresh() (critical read). IMO, as long as puts aren't completely broken, it's probably not worth worrying about their consistency too much

@ben-manes
Owner Author

Writing tests shows that I'm bad at keeping all the details of Guava, Caffeine, and ConcurrentHashMap straight.

Guava's LoadingValueReference is an intermediate value in the hash table entry. If the entry is modified (updated or removed), then the loading value is discarded upon completion. Otherwise the value is stored directly. This avoids the ABA problem, such as a refresh loading in an invalidated entry.

There's not much I can do here, unfortunately. The options appear to be,

  1. Leave as is with all the caveats
  2. Best effort detect changes (old==current and write time if available)
  3. Emulate Guava by storing a forwarding node that unwraps itself, etc.

I think (3) is possible and could be done without being excessively invasive. It requires care, but I think it is doable. However I'd probably also define the behavior as (2) being acceptable. That allows integrating the current patch and, if (3) is done, not needing to do the same for the fully unbounded cache (a simple decorator for CHM). The latter would avoid adding a value wrapper for this case.

This seems reasonable. The worst cases of (2) failing are similar to issues Guava has, e.g. that an invalidate(key) ignores loading values. That's exactly what would happen here for a prefetch, a SoR write, an invalidate, and the materialization of the stale value. The only addition is an ABA of exact instances for a LoadingCache#refresh(k), where we do not have the write time to check. For the more common refreshAfterWrite the time check is available, so it is not an issue. Anyone needing the precision of strict consistency would not obtain it from Guava today, nor from most infrastructure, and therefore has to add numerous countermeasures for protection.
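Option (2)'s best-effort detection can be stated compactly. This is a hypothetical sketch: BestEffortCheck and mayReplace are illustrative names, and the write-time comparison only applies when the cache records write times (i.e. refreshAfterWrite).

```java
class BestEffortCheck {
  // A refresh may store its result only if the value instance is unchanged
  // and, when available, the entry's write time matches the one captured at
  // the start of the refresh. This is best-effort, not ABA-safe in general.
  static boolean mayReplace(Object oldValue, Object currentValue,
                            long writeTimeAtRefresh, long currentWriteTime) {
    return (oldValue == currentValue) && (writeTimeAtRefresh == currentWriteTime);
  }
}
```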

@ben-manes
Owner Author

Thanks @anuraaga. I think with the above-mentioned comments it would guard against evicted entries being resurrected to some degree.

Thanks for the stack trace and sorry to hear there's an existing problem. It would be nice to figure out how to reproduce that in a unit test to validate that my patch won't suffer that effect.

I hope to finish my patch tonight or tomorrow, so a release Friday or Saturday is likely.

@anuraaga

Thanks - I've looked into our thread dump in more detail and put what details seemed relevant in #69, maybe it provides clues on a repro.

@ben-manes
Owner Author

ben-manes commented Apr 22, 2016

A part of me doesn't like how it means evicted entries would come back to life thanks to the refresh, but I guess some level of thrashing at the low end of the LRU scale is expected.

Guava performs a putIfAbsent if the old LoadingValueReference isn't present. This means that an explicit invalidate or an eviction will resurrect the entry from an in-flight refresh. My current patch does not do this, and therefore the new unit test passes for Caffeine and fails for Guava. The reason for this appears to be the interpretation of the JavaDoc on refresh(key).

Loads a new value for key {@code key}, possibly asynchronously. While the new value is loading
the previous value (if any) will continue to be returned by {@code get(key)} unless it is evicted. If the new value is loaded successfully it will replace the previous value in the cache...

This indicates the previous value is the same instance. If it was updated, Guava discards the refresh, and the JavaDoc explicitly calls out eviction. The statement of replace indicates to me that the putIfAbsent was a mistake. However, if Guava had done this, then it is not clear what the removal cause should have been, since the refresh action wouldn't know why the entry disappeared. In the patch I'm using EXPLICIT, since the refresh is a user action and a conditional failure would be an explicit discard rather than an implicit eviction. Yet that is misleading for refreshAfterWrite, which is implicit and has the same scenario.

What is the desired behavior here?

@ghost

ghost commented Apr 22, 2016

My personal preference with cache loading is to think of it in terms of
linearizability, which requires that it appear as if the get which
triggered the load appears to have taken place at some unique sequential
point between the start and end of that call. If the entry was absent when
the get began then this allows for two potential outcomes: either a newly
loaded instance is returned, or an instance created concurrently with the
get is returned. Either is acceptable. In terms of linearizability it
doesn't matter if a new entry is loaded and then discarded due to a
concurrent insert, as it will never have been visible within the cache.

I've thought less about refresh. It seems fine to me if a refresh begins
and then successfully stores its computed result even if the old value has
since been automatically evicted. This allows eviction to be used to
prevent the return of stale data, and refresh to be used to ensure that
fresh data is generally available. (Admittedly this argument applies more
to time-based eviction than size-based eviction; I would be fine if the two
were treated differently, and if a size-based eviction essentially
cancelled the refresh.) The more complicated scenario is when a user
manually inserts a new element into the cache. In this case I'd be inclined
to follow the precedent of cache loading, and to discard the refreshed
result in this case.

In other words, I argue for all manual user additions to the cache (as well
as size-based evictions) invoking a "cancel" on any pending loads or
refreshes, and for allowing refreshes to succeed in all other cases.

Charles


@ben-manes
Owner Author

I think what kept causing me confusion is that if a user's mutation should cancel a refresh, then arguably an invalidate should too. Yet that is not true in Guava, and it could cause surprising behavior if one assumes a happens-before ordering of actions. I believe this would violate the linearizability property, were it guaranteed.

My mistake was to try to provide a predictable sequence that properly handled the various cases. In retrospect it was unnecessary, and following Guava's lead here is preferred. That is simple and consistent. There are other places where linearizability would be broken as well, in both Guava and Caffeine, and that's acceptable.

If a user wants much stricter behavior then asMap().compute(k, func) is available. And stale data is always a factor in concurrent / distributed systems, so refresh being best-effort is good enough when other necessary precautions are taken.
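As a sketch of that stricter alternative: ConcurrentMap#compute runs the remapping function atomically for the entry, so no other write (or refresh) can interleave. Here a plain ConcurrentHashMap stands in for cache.asMap(), which exposes the same ConcurrentMap contract, and reloadStrictly is a hypothetical helper with a toy remapping function.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class StrictReload {
  // Reload under the entry's lock: the function sees the current value and
  // its result replaces it atomically; a null old value means no mapping.
  static String reloadStrictly(ConcurrentMap<String, String> map, String key) {
    return map.compute(key, (k, oldValue) ->
        oldValue == null ? null : oldValue + "-reloaded");
  }
}
```

The trade-off is the one this thread set out to remove: the computation blocks other writers to that entry while it runs, so it should be reserved for callers who genuinely need the stricter semantics.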

Despite going around in circles for too long, I think we have a conclusion. At the very least I understand the rationale and can have tests for it.

ben-manes added a commit that referenced this issue Apr 23, 2016
Previously refresh was implemented as a computation performed on the
executor. Like all writes it allowed concurrent reads, but blocked
other writes like updating, invalidating, or evicting. This provided
strong consistency at the cost of lock granularity (hash bin) and
potentially wasting a thread in an asynchronous cache. This simple
model was also shown to be broken, as per the deadlock reported.

A refresh is now performed without blocking and better matches Guava's.
The newly loaded entry is dropped if the mapping now points to a
different value. Like Guava, if the entry disappears (e.g. eviction)
the loaded value is inserted. Usage through refreshAfterWrite is
preferred and it will try to avoid redundant in-flight loads. Unlike
Guava a LoadingCache#refresh() cannot detect and ignore redundant
loads. It may be possible to strengthen the implementation, but
explicit refreshes are rare. Similar to Guava, the approach is not ABA
safe but best effort and does what users would likely prefer. For
stricter reload behavior, users should perform a Map#compute instead.

Load testing uncovered a weighted eviction bug with a cache heavily
dominated by zero-weight entries (e.g. incomplete futures). The main
space eviction would find no victims and needed to fallback to scan
the admission window.

Thanks to everyone who helped in the discussions to wrap my head around
how to properly implement this.
@ben-manes
Owner Author

Released 2.3.0 which includes non-blocking refresh and fixes the deadlock that @anuraaga discovered.
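The conflict handling described above can be modeled in plain JDK terms: swap in the loaded value only if the old value is still mapped, insert it if the entry vanished, and otherwise drop it. A simplified sketch of those semantics (not Caffeine's internals):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class RefreshCompletion {
  /**
   * Applies a completed refresh without locking out other writers.
   * Returns true if newValue was installed, false if it was dropped
   * because a conflicting write won the race.
   */
  static <K, V> boolean onRefreshComplete(ConcurrentMap<K, V> map, K key, V oldValue, V newValue) {
    if (map.replace(key, oldValue, newValue)) {
      return true; // mapping unchanged since the reload started
    }
    // Entry vanished (e.g. evicted): insert the loaded value, Guava-style.
    // If a different value is mapped, putIfAbsent fails and the load is dropped.
    return map.putIfAbsent(key, newValue) == null;
  }

  public static void main(String[] args) {
    ConcurrentMap<String, String> map = new ConcurrentHashMap<>();
    map.put("k", "v1");
    System.out.println(onRefreshComplete(map, "k", "v1", "v2")); // true
    map.put("k", "other"); // conflicting write
    System.out.println(onRefreshComplete(map, "k", "v2", "v3")); // false
  }
}
```

As the commit message notes, this is best-effort rather than ABA safe: if the value changes and then changes back, the stale refresh still installs.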
