You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This ticket is a follow-up on the discussion we had in the pull request 354, i.e. the CMS[K] pull request.
Algebird currently has the three data structures with overlapping functionality, and the question is whether we could/should consolidate them in some way -- primarily to reduce maintenance cost.
SketchMap[K, V] and CMS[K] are almost identical except that SketchMap also parameterizes the value.
However, the current implementation of SketchMap is significantly slower than CMS (in this experiment, SketchMap[BigInt] achieved 0.3k writes/s compared to 65k writes/s for CMS[BigInt], i.e. it was slower by two orders of magnitude).
SpaceSaver may be an alternative to a top-N CMS.
Note that a top-N CMS is not commutative, which limits is applicability in practice. (A top-% CMS is commutative though.)
Past discussions
The snippets below are taken from the conversations in GH-345 and the pull request 354, where we discussed the work on turning the old Long-based CMS into a parameterized CMS[K].
@johnynek wrote (source):
I am concerned that we have now SketchMap and CMS that are almost identical except that SketchMap parameterizes the value. Given that making the key generic didn't hurt performance, I tend to think we must just have some small bottleneck in SketchMap that could be fixed. [...] It would be great to give this a going over with yourkit and see if we can identify a few hotspots [...].
@avibryant wrote (source):
I don't think SpaceSaver replaces SketchMap, but I do think it might be a good alternative to TopNLogic. (It would be interesting to do some benchmarking of error rates between these two). But SpaceSaver doesn't even parameterize the value type currently, which is something else we should fix.
The text was updated successfully, but these errors were encountered:
@avibryant
Hey Avi, what do you mean when you say: "But SpaceSaver doesn't even parameterize the value type currently, which is something else we should fix."?
I am happy to work on SpaceSaver to fix it as needed once i understand what the issue is.
Best, Koert
This ticket is a follow-up on the discussion we had in the pull request 354, i.e. the
CMS[K]
pull request.Algebird currently has the three data structures with overlapping functionality, and the question is whether we could/should consolidate them in some way -- primarily to reduce maintenance cost.
Overlapping functionality
SketchMap[BigInt]
achieved0.3k writes/s
compared to65k writes/s
for CMS[BigInt], i.e. it was slower by two orders of magnitude).Past discussions
The snippets below are taken from the conversations in GH-345 and the pull request 354, where we discussed the work on turning the old
Long
-based CMS into a parameterizedCMS[K]
.The text was updated successfully, but these errors were encountered: