-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/write through cache perf #234
Conversation
// End of thread-unsafe update steps | ||
|
||
|
||
final def getFilterFunc: K => Boolean = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason this is not a val?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, it does a bunch of operations before returning a lambda function
And add a comment on the store
} | ||
|
||
|
||
class ApproxHHTracker[K](hhPct: HeavyHittersPercent, updateFreq: WriteOperationUpdateFrequency, roFreq: RollOverFrequencyMS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe in the object, make a convenience method to set the whole thing? Or at least a comment demoing.
Make the CMS a proxy on a cache rather than a store. Add a eager write through store thats simple + fast and will update the cache before we have a confirmed store write. add a test for the Filtered Cache
private[this] val putsSincePrune = new java.util.concurrent.atomic.AtomicInteger(1) | ||
|
||
def get(k: K) = { | ||
val clockVal = clock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a difference in this change. Here you are hitting the clock even when the item is absent. Originally, we only check the clock if the item is present. This will make cases of high miss rate more expensive. Can you keep it in the closure, but just make it more verbose there, like add a comment?
Updated for comments |
sealed class ApproxHHTracker[K](hhPct: HeavyHittersPercent, updateFreq: WriteOperationUpdateFrequency, roFreq: RollOverFrequencyMS) { | ||
private[this] final val WIDTH = 1000 | ||
private[this] final val DEPTH = 4 | ||
private[this] final val hh = new java.util.HashMap[K, Long]() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was wondering if this can be made a ConcurrentHashMap
. But looks like it can't because an update here optionally performs pruning which needs a global view of the hash map?
Feature/write through cache perf
Woops, sorry @rubanm I just noticed, and agree with your point. Would be good to get a cleanup PR on that to clean this up. In general, I do think it is a shame that this goes in storehaus-algebra rather than storehaus-cache. Looking back, it looks like storehaus-cache depends on util and algebird. What is the reason it should not be there? |
ok. it is in cache. Not paying good attention here. Not really clear why the storehaus-algebra package is separate here. I need to review the deps here. I think it is because core does not depend on algebird (but does depend on bijection). storehaus-algebra does depend on algebird. I am not sure why cache has to. If this algo is the only reason, I would be willing to duplicate the hashing code (which is really short) rather than add another dep just for that to cache. |
the cache dep on algebird was there previously, i didn't modify that in the build file. (It moved from algebra to cache based on a previous comment that it really was a type of cache than a store, so then its natural home was the cache) storehaus-core depends on storehaus-cache so picks up that algebird dep transitively now undoing most of the nature of it not depending on it itself? The other reference I see is for CyclicIncrementProvider to get Successible. |
I guess Algebird is just at the bottom. So most of the reason for these modules is wasted. :/ |
Right. CyclicIncrementProvider is in turn referenced in LIRSCache so ther doesn't seem to be a straightforward way to drop this dependency in storehaus-cache. |
-> Adds in a caliper project, and a benchmark
-> Add in a HHFiltered store, we use a local mutable CMS sketch to keep track of heavy hitters, only storing those in the cache. Yields drop in read I/O by 30%