
Add stores for read-through and write-through caching #220

Merged: 5 commits merged into twitter:develop on Mar 7, 2014

Conversation

@rubanm (Contributor) commented Feb 26, 2014

Addresses #216

class ReadThroughStore[K, V](backingStore: ReadableStore[K, V], cache: Store[K, V])
    extends ReadableStore[K, V] {

  override def get(k: K): Future[Option[V]] = {
    // ...
A reviewer (Contributor) commented on the snippet above:

Don't we need the 'put if absent' semantics to be atomic to prevent a cache stampede? I was thinking we'd have to take an AsyncMutex and flatMap on it before doing anything. Or is the expectation that the client will ensure the store is used in a thread-safe manner?

@rubanm (author) replied:

Good question. I'm not sure we need "put if absent". I guess the idea is to refresh the cache value on every write (for the write-through case).

Agreed on making this thread-safe with a mutex. It'd be nice to have a more lightweight, lock-free solution here, though I can't think of one. I'll go ahead and add a mutex unless someone has a different way to fix this.
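For concreteness, a minimal sketch of the mutex pattern under discussion, using com.twitter.concurrent.AsyncMutex from Twitter's util library. The mutex field and the shape of getFromBackingStore are illustrative, not necessarily the code that landed:

import com.twitter.concurrent.AsyncMutex
import com.twitter.util.Future

private[this] val mutex = new AsyncMutex

// Serialize backing-store reads so that concurrent misses do not
// stampede the backing store.
private def getFromBackingStore(k: K): Future[Option[V]] =
  mutex.acquire().flatMap { permit =>
    backingStore.get(k)
      .flatMap { v => cache.put((k, v)).map { _ => v } } // refresh the cache
      .ensure { permit.release() } // release even if the read fails
  }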

@rubanm (author) commented Feb 28, 2014

- Added a mutex.
- Made all cache operations best effort.
- Backing-store failures are propagated to the client.
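A minimal sketch of what "best effort" could look like for the cache write, assuming a hypothetical refreshCache helper (not named in the PR): cache failures are swallowed, while backing-store failures still propagate to the caller.

// Write the freshly read value back to the cache, but never let a
// cache failure fail the overall read.
private def refreshCache(k: K, v: Option[V]): Future[Option[V]] =
  cache.put((k, v))
    .map { _ => v }                    // cache write succeeded
    .handle { case _: Exception => v } // best effort: ignore cache failures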

cache.get(k).flatMap { cacheValue =>
  cacheValue match {
    case None => getFromBackingStore(k)
    case some => Future.value(some)
  }
}
The reviewer (Contributor) commented on the snippet above:

This may be premature optimization, so feel free to ignore, but it would be sweet if we only used the mutex when there is a cache miss and we have to go to the backing store. In the case of a cache hit there is no need to lock, since it's a read-only operation. I feel like a cache hit will be the most common scenario, and we should try to make it lock-free if we can.

@rubanm (author) replied:

Makes sense. In database terms, this would give us the "read committed" isolation level, which I think should be good enough here.

override def get(k: K): Future[Option[V]] =
  cache.get(k).flatMap { cacheValue =>
    cacheValue match {
      case None => getFromBackingStore(k)
      case some => Future.value(some)
    }
  }
The reviewer (Contributor) commented on the snippet above:

Hmm, I think there is a bug in this code now. Imagine two threads arriving one after the other in a cache-miss scenario. The first thread does a cache.get, acquires the mutex, and does a get on the backing store. In the meantime the second thread does a cache.get, gets a miss, and blocks in getFromBackingStore until the first thread is done. Then it acquires the lock, does a get on the backing store again, and repopulates the cache. I think you may need to do a cache get inside the getFromBackingStore function, just to check that the cache wasn't already populated by the previous thread.
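A sketch of the double-check being suggested here: re-read the cache after acquiring the mutex, and only hit the backing store if the key is still missing (names as in the earlier sketch; illustrative only):

private def getFromBackingStore(k: K): Future[Option[V]] =
  mutex.acquire().flatMap { permit =>
    cache.get(k).flatMap {
      case some @ Some(_) =>
        Future.value(some) // another caller already populated the cache
      case None =>
        backingStore.get(k)
          .flatMap { v => cache.put((k, v)).map { _ => v } }
    }.ensure { permit.release() }
  }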

@rubanm (author) replied:

I do see a case where threads can queue up trying to get a hot key on a cache miss. Also, a cache get is probably less expensive than hitting the backing store again. But is this expected to happen very often?

Just wondering whether we need the additional cache get in the read path every time to account for this case. What do you think?

The reviewer (Contributor) replied:

I think an additional cache get would be fine.

@rubanm (author) replied:

Thought some more about this. It looks like we'll need to place the cache get inside the mutex block if we want this behavior.

The current code doesn't break any store semantics, since the latest value from the backing store is written to the cache.

So we can either keep the existing behavior and revisit it if it turns out to be a perf issue, or make the mutex block larger. I propose we do the former.

@MansurAshraf @johnynek thoughts?

@johnynek (Collaborator) commented Mar 7, 2014

Let's merge and get it in the next release and polish as needed.

johnynek added a commit referencing this pull request on Mar 7, 2014: Add stores for read-through and write-through caching

@johnynek merged commit d338ba6 into twitter:develop on Mar 7, 2014
@rubanm (author) commented Mar 7, 2014

Cool.

@rubanm deleted the feature/writethrough branch on Mar 12, 2014