Add stores for read-through and write-through caching #220
Conversation
Merge from twitter/storehaus
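For context: a read-through store serves gets from the cache and falls back to the backing store on a miss, while a write-through store pushes every put to the backing store and then refreshes the cache so later reads see the new value. A minimal write-through sketch, assuming storehaus's Store.put takes a (K, Option[V]) pair (illustrative only, not necessarily this PR's code):

import com.twitter.storehaus.Store
import com.twitter.util.Future

// Hypothetical write-through sketch: write to the backing store first,
// then refresh the cache with the same value.
class WriteThroughStore[K, V](backingStore: Store[K, V], cache: Store[K, V])
    extends Store[K, V] {

  override def put(kv: (K, Option[V])): Future[Unit] =
    backingStore.put(kv).flatMap { _ => cache.put(kv) }

  override def get(k: K): Future[Option[V]] = cache.get(k)
}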
class ReadThroughStore[K, V](backingStore: ReadableStore[K, V], cache: Store[K, V])
  extends ReadableStore[K, V] {

  override def get(k: K): Future[Option[V]] = {
Don't we need 'put if absent' semantics to be atomic to prevent a cache stampede? I was thinking we would have to take an AsyncMutex and flatMap on that before we do anything. Or is the expectation that the client will ensure the store is used in a thread-safe manner?
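What "take AsyncMutex and flatMap" would look like, as a minimal sketch using com.twitter.concurrent.AsyncMutex (names are illustrative):

import com.twitter.concurrent.AsyncMutex
import com.twitter.util.Future

val mutex = new AsyncMutex

// Serialize an asynchronous critical section: acquire() yields a
// Future[Permit], and the permit must be released when the body
// completes, whether it succeeds or fails.
def serialized[T](body: => Future[T]): Future[T] =
  mutex.acquire().flatMap { permit =>
    body.ensure { permit.release() }
  }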
Good question. Not sure if we need "put if absent". I guess the idea is to refresh the cache value on every write (for the write-through case).
Agree on making this threadsafe with a mutex. It'd be nice to have a more lightweight / lock-free solution here, though I can't think of one. I'll go ahead and add a mutex unless someone has a different way to fix this.
Added mutex.
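Roughly, the added mutex guards the backing-store fetch along these lines (a sketch of the shape, not necessarily the exact code in the diff):

protected[this] lazy val mutex = new AsyncMutex

// Miss path: read the backing store while holding the mutex, refresh
// the cache with the result, and release the permit whether or not the
// calls succeed.
protected def getFromBackingStore(k: K): Future[Option[V]] =
  mutex.acquire().flatMap { permit =>
    backingStore.get(k).flatMap { storeValue =>
      cache.put((k, storeValue)).map { _ => storeValue }
    }.ensure { permit.release() }
  }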
cache.get(k).flatMap { cacheValue =>
  cacheValue match {
    case None => getFromBackingStore(k)
    case some => Future.value(some)
  }
}
This may be premature optimization, so feel free to ignore, but it would be sweet if we only used the mutex when there is a cache miss and we have to go to the backing store. In the case of a cache hit there is no need to lock, as it's a read-only operation. I expect cache hits to be the most common scenario, so we should try to make that path lock-free if we can.
Makes sense. In database terms, this would give us the "read committed" isolation level, which should be good enough in this case, I think.
override def get(k: K): Future[Option[V]] =
  cache.get(k).flatMap { cacheValue =>
    cacheValue match {
      case None => getFromBackingStore(k)
      case some => Future.value(some)
    }
  }
erm.. I think there is a bug in this code now. Imagine two threads coming one after the other in a cache-miss scenario. The first thread will do a cache.get, acquire the mutex, and do a get on the backing store. In the meantime the second thread will do a cache.get, get a miss, and block in getFromBackingStore until the first thread is done. Then it will acquire the lock, do a get on the backing store again, and repopulate the cache. I think you may need to do a cache get inside the getFromBackingStore function, just to check that the cache wasn't already populated by the previous thread.
I do see a case where we could have threads queued up trying to get a hot key on a cache miss. Also, a cache get is probably going to be less expensive than getting from the backing store again. But is this expected to happen fairly frequently?
Just wondering if we need an additional cache get in the read path every time to account for this case. What do you think?
I think an additional cache get would be fine
Thought some more about this. Looks like we'll need to place the cache get inside the mutex block if we want this behavior.
The current code doesn't break any store semantics, as the latest value in the backing store is written to the cache.
So we can either keep the existing behavior and revisit if this turns out to be a perf issue, or make the mutex block larger. I propose the former.
@MansurAshraf @johnynek thoughts?
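For reference, the "larger mutex block" alternative would re-check the cache after acquiring the permit, along these lines (hypothetical sketch, not the code being merged):

// Double-checked variant: re-read the cache inside the mutex so a thread
// queued behind the first miss returns the freshly cached value instead
// of hitting the backing store a second time.
protected def getFromBackingStore(k: K): Future[Option[V]] =
  mutex.acquire().flatMap { permit =>
    cache.get(k).flatMap {
      case some @ Some(_) => Future.value(some) // populated while we waited
      case None =>
        backingStore.get(k).flatMap { storeValue =>
          cache.put((k, storeValue)).map { _ => storeValue }
        }
    }.ensure { permit.release() }
  }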
Let's merge, get it in the next release, and polish as needed.
Cool.
Addresses #216