This repository was archived by the owner on Jul 7, 2020. It is now read-only.

HyperLogLogPlusPlus sparse precision 32 accuracy problem #159

Open
Enzo90910 opened this issue Aug 30, 2019 · 0 comments


@Enzo90910

I am getting very inaccurate estimates for low-cardinality HLL++ when using typical values of p (p = 11, 12, 13, 14) together with sp = 32. I suspect, but am not certain, that the problem comes from sp = 31 and sp = 32 being treated exactly the same on the following line:

sm = sp > 30 ? Integer.MAX_VALUE : 1 << sp;

This matters because, for low cardinalities, the estimate is computed this way:
return Math.round(HyperLogLog.linearCounting(sm, sm - sparseSet.length));

Using sp=31 works as expected, sp=32 does not.
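
To help narrow this down, here is a minimal stand-alone sketch of the two quoted lines. It is not the library code: the class `SparseSizeDemo` and its methods are hypothetical names, and `linearCounting` is assumed to be the textbook formula m * ln(m / V), which I have not checked against `HyperLogLog.linearCounting`. It also shows why the `sp > 30` guard exists at all: a Java `int` shift only uses the low five bits of the shift distance.

```java
// Hypothetical demo class; none of these names come from the library.
class SparseSizeDemo {

    // Mirrors the line quoted in the issue.
    static int sparseSize(int sp) {
        return sp > 30 ? Integer.MAX_VALUE : 1 << sp;
    }

    // Textbook linear counting: m * ln(m / V), where V is the number of
    // empty slots. ASSUMPTION: the library's linearCounting has this form.
    static double linearCounting(int m, double v) {
        return m * Math.log(m / v);
    }

    // Mirrors the quoted return statement, with sparseEntries standing in
    // for sparseSet.length.
    static long estimate(int sp, int sparseEntries) {
        int sm = sparseSize(sp);
        return Math.round(linearCounting(sm, sm - sparseEntries));
    }

    public static void main(String[] args) {
        // Both precisions are clamped to the same sparse size.
        System.out.println(sparseSize(31) == sparseSize(32)); // true

        // So this formula alone returns the same estimate for both.
        System.out.println(estimate(31, 1000) == estimate(32, 1000)); // true

        // Without the guard, the int shift would wrap around.
        System.out.println(1 << 32);  // 1
        System.out.println(1L << 32); // 4294967296
    }
}
```

Under these assumptions, the two quoted lines on their own produce identical results for sp = 31 and sp = 32, so whatever makes the two settings behave differently may sit in other code that depends on sp, for example where the sparse index is extracted or encoded.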
