Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/write through cache perf #234

Merged
merged 12 commits into from
Jun 17, 2014
Merged

Conversation

ianoc
Copy link
Collaborator

@ianoc ianoc commented Jun 3, 2014

-> Adds in a caliper project, and a benchmark

-> Add in a HHFiltered store, we use a local mutable CMS sketch to keep track of heavy hitters, only storing those in the cache. Yields drop in read I/O by 30%

// End of thread-unsafe update steps


final def getFilterFunc: K => Boolean = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason this is not a val?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it does a bunch of operations before returning a lambda function

ianoc added 2 commits June 4, 2014 13:40
}


class ApproxHHTracker[K](hhPct: HeavyHittersPercent, updateFreq: WriteOperationUpdateFrequency, roFreq: RollOverFrequencyMS) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe in the object, make a convenience method to set the whole thing? Or at least a comment demoing.

ianoc added 3 commits June 16, 2014 09:52
Make the CMS a proxy on a cache rather than a store.

Add a eager write through store thats simple + fast and will update the cache before we have a confirmed store write.

add a test for the Filtered Cache
private[this] val putsSincePrune = new java.util.concurrent.atomic.AtomicInteger(1)

def get(k: K) = {
val clockVal = clock()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a difference in this change. Here you are hitting the clock even when the item is absent. Originally, we only check the clock if the item is present. This will make cases of high miss rate more expensive. Can you keep it in the closure, but just make it more verbose there, like add a comment?

@ianoc
Copy link
Collaborator Author

ianoc commented Jun 17, 2014

Updated for comments

@ianoc ianoc closed this Jun 17, 2014
@ianoc ianoc reopened this Jun 17, 2014
sealed class ApproxHHTracker[K](hhPct: HeavyHittersPercent, updateFreq: WriteOperationUpdateFrequency, roFreq: RollOverFrequencyMS) {
private[this] final val WIDTH = 1000
private[this] final val DEPTH = 4
private[this] final val hh = new java.util.HashMap[K, Long]()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was wondering if this can be made a ConcurrentHashMap. But looks like it can't because an update here optionally performs pruning which needs a global view of the hash map?

johnynek added a commit that referenced this pull request Jun 17, 2014
@johnynek johnynek merged commit b5ea3a0 into develop Jun 17, 2014
@johnynek
Copy link
Collaborator

Woops, sorry @rubanm I just noticed, and agree with your point. Would be good to get a cleanup PR on that to clean this up.

In general, I do think it is a shame that this goes in storehaus-algebra rather than storehaus-cache. Looking back, it looks like storehaus-cache depends on util and algebird. What is the reason it should not be there?

@johnynek
Copy link
Collaborator

ok. it is in cache. Not paying good attention here.

Not really clear why the storehaus-algebra package is separate here. I need to review the deps here. I think it is because core does not depend on algebird (but does depend on bijection).

storehaus-algebra does depend on algebird.

I am not sure why cache has to. If this algo is the only reason, I would be willing to duplicate the hashing code (which is really short) rather than add another dep just for that to cache.

@johnynek johnynek deleted the feature/writeThroughCachePerf branch June 17, 2014 20:19
@ianoc
Copy link
Collaborator Author

ianoc commented Jun 17, 2014

the cache dep on algebird was there previously, i didn't modify that in the build file.

(It moved from algebra to cache based on a previous comment that it really was a type of cache than a store, so then its natural home was the cache)

storehaus-core depends on storehaus-cache so picks up that algebird dep transitively now undoing most of the nature of it not depending on it itself?

The other reference I see is for CyclicIncrementProvider to get Successible.

@johnynek
Copy link
Collaborator

I guess Algebird is just at the bottom. So most of the reason for these modules is wasted. :/

@rubanm
Copy link
Contributor

rubanm commented Jun 17, 2014

Right. CyclicIncrementProvider is in turn referenced in LIRSCache so ther doesn't seem to be a straightforward way to drop this dependency in storehaus-cache.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants