HPPC-186: Change the iteration order of HashSet and HashMap. Remove hash order mixing. #10

bruno-roustant · 2020-08-14T14:52:49Z

Sample code for https://issues.carrot2.org/browse/HPPC-186

vsonnier · 2020-08-14T15:09:51Z

Hi @bruno-roustant !
Please note that HPPCRT-18 was sperseded by : vsonnier#38. In short, I still iterate "in reverse" in HPPC-RT but I also kept the "perturbation" policy in the end.

bruno-roustant · 2020-08-14T16:34:51Z

Hi @vsonnier !
Interesting, thanks for sharing.
Here the approach with a slot increment would pass. I think it is worth reviving this discussion again :)

dweiss · 2020-08-14T18:32:07Z

Hello both (hi @vsonnier , long time no see!). Yes, it would be great to resign from the seed mix. I admit I can't remember whether this was considered at the time - it's been a while. I only remember I also didn't anticipate the original problem and I went the really defensive way of trying to absolutely minimize the chances of this happening in a production system. We do copy/ manipulate large associative arrays and if this every happens it's effectively a live deadlock...

If we can somehow prove the iteration order (or skip) prevents this with a really low-likelihood of happening then it'd be a great improvement. Remember those maps can be copied over and over between instances.

Separately from theoretical aspects there is also the practical aspect: seed mix adds overhead but it's fairly local and should play well with CPU caches, and even hotspot optimizations (in tight loops). The same with linear iteration - it is really cache-friendly. Adding skips will affect this property.

bruno-roustant · 2020-08-14T19:56:12Z

I'll take some time to think about how to prove the really low-likehood of collision avalanche.

About the seed mix overhead, I saw a consistent and visible perf impact during the numerous benchmarks we did when evaluating worm hashing. For all benchmarks, mixed hppc had visible overhead compared to scatter, due to this slightly more complex hash function, on multiple machines and with two different benchmark implementations (both based on JMH). So I'm convinced that the hotspot optimization does not have a good effect there. At least for jdk 11 and below. Maybe that would be different with C/C++.

For the cache-friendliness of the iteration, well I suppose the JMH benchmark will be the arbitrator. Based on the CollisionAvalanche I ran with the sample code, the small skip seems to work well since it provides 33% speed improvement for scatter hppc set compared to standard hppc set. We'll need to run more benchmarks for more sizes to confirm this.

Ok, this makes me confident I can continue to work on the subject. That's what I wanted to discuss first.

dweiss · 2020-08-14T20:02:04Z

I agree the "scatter" version is slightly faster - it is! That's why I left it in the code (for some applications it's useful). The question remains what happens when you don't have full control over how your collections are being used - this is what I'm concerned with.

As for JVM/hotspot optimizations and cache influence: it's a really, really tough material to provide insight on since each and every architecture (not just the CPU but also memory controller, etc.) will have varying results... I really miss simple computers sometimes when you could just read clock ticks out of each instruction and just know for certain...

bruno-roustant · 2020-08-17T13:06:55Z

I have tried a few things.

First I quickly verified that the small prime increment (let’s call it skip-iteration-order) is indeed an iteration order: it retrieves all slots and only once. I ran brute force from 2^0 to 2^28. It works also for any starting slot (could be random instead of always 0).

Then I instrumented KTypeHashSet to count each time we scan linearly the next slots after a hash collision. I called this collisionScanCount. I ran tests to copy Sets for many source and target sizes, focusing mainly on heavy loaded sets (as in CollisionAvalanche benchmark). Each time I measured the collisionRatio = collisionScanCount / targetSet.size() to estimate the penalty for collisions. For the case of heavy loaded set, I got collisionRatio=0.99 for KTypeHashSet and collisionRatio=0.75 for KTypeScatterSet. This is nearly 25% less collision slots scanned. And this corresponds to the perf gain I see for B004_HashSet_CollisionAvalanche: hppc_scatter is 28% faster.

Then I modified B004_HashSet_CollisionAvalanche to have copy cycles: a source set is copied to an empty target, then the target becomes the source and is copied to another target and so on. Same perf gain 28%.

Actually when a set is copied, the skip-iteration-order is used to read the source set but once copied the keys are in the target hash (mixPhi) order. So for any copy between two hppc sets the iteration order is always different than the target hash order. That’s why it avoids collisions. I don’t think we need to include randomness to ensure this property.
So this could protect forEach(), iterator(), toArray().

What if a non-hppc collection is copied to a hppc set? The source collection could have an iteration order based on the mixPhi order. This is actually the case with Fastutil and Koloboke collections, they use the same mixPhi. In this case only the current keyMixer approach guarantees the protection (I could imagine a buffered-shuffling iterable wrapper to iterate the keys in modified order, but the caller would have to remind it). I don’t know how much you want to protect this case Dawid.

dweiss · 2020-08-17T13:13:46Z

Can you file a patch with this, Bruno? Doesn't have to be commitable, I'd just look and give it some thought but it'd be clearer for me to see what you want to do with those prime increments, etc.

bruno-roustant · 2020-08-17T13:35:27Z

This PR already provides the code for protecting toArray().
I'll add the iteration protection for forEach() and iterator().

dweiss · 2020-08-18T07:25:26Z

Hi Bruno.

bq. First I quickly verified that the small prime increment (let’s call it skip-iteration-order) is indeed an iteration order: it retrieves all slots and only once. I ran brute force from 2^0 to 2^28. It works also for any starting slot (could be random instead of always 0).

I think this one is relatively easy to prove mathematically too - if the "skip" is a prime m and underlying table is of size 2^n then the two are relatively prime to each other so for any sequence of i=0..(2^n-1), i * m mod 2^n is guaranteed to fill any bucket within 0..2^n-1. This also applies if you add an initial "shift" (it merely rotates the initial sequence of i's. The same idea is used in open addressing with double hashing.

As for practical repercussions - I think you may be right. I'll have to spend some time on this tinkering with actual possibilities but please do feel free to provide a patch that does implement this idea.

bruno-roustant · 2020-08-18T12:19:11Z

I pushed the code. Now it is on KTypeHashSet directly instead of KTypeScatterSet.
The idea is that each time we make a copy of a HashSet, the iteration order is kind of "shuffled" to ensure it is not the same as the hash order in the target set.

CollisionAvalanche benchmark before the change:

B004_HashSet_CollisionAvalanche.run __ HPPC _____________ 0.024 ± 0.002 s/op
B004_HashSet_CollisionAvalanche.run __ HPPC_SCATTER __ 1.062 ± 0.059 s/op
B004_HashSet_CollisionAvalanche.run __ FASTUTIL _________ 1.084 ± 0.087 s/op
B004_HashSet_CollisionAvalanche.run __ KOLOBOKE ________ 1.519 ± 0.036 s/op

CollisionAvalanche benchmark after the change:

B004_HashSet_CollisionAvalanche.run __ HPPC _____________ 0.024 ± 0.001 s/op
B004_HashSet_CollisionAvalanche.run __ HPPC_SCATTER __ 0.016 ± 0.001 s/op
B004_HashSet_CollisionAvalanche.run __ FASTUTIL ________ 1.058 ± 0.020 s/op
B004_HashSet_CollisionAvalanche.run __ KOLOBOKE _______ 1.529 ± 0.081 s/op

dweiss · 2020-08-18T12:58:22Z

Looks good to me. I wonder if we should just drop the seed-mixed version entirely and just use this instead. The same would apply to maps too, I guess? I can fold this in and do this, let's see what happens. :)

bruno-roustant · 2020-08-18T13:10:57Z

Yes, cool! To drop the seed-mixed version is the next step, and apply to map.
I'm ready to help if you want. I could push the change for the map. I could also help to drop stuff.

dweiss · 2020-08-18T13:34:11Z

Please do go ahead with this if you have spare cycles!

…a 20% perf penalty.

Make hash iteration order vary for each KTypeHashSet iteration.

bruno-roustant · 2020-08-19T14:39:43Z

First step: I removed HashOrderMixingStrategy from KTypeHashSet (KTypeScatterSet to be removed).

In the process I got a significant test failure with HashCollisionsClusteringTest.testHashSetClusteringAtFront2().
Awesome tests!
It caught a naughty weakness of the hash order with increment: when we repeatedly iterate the keys of sets A1,A2,...,An (in the test the top 5K keys) and add them to the same set B, we actually follow the same hash iteration order for the sets A, so the distribution in B is stacking at the beginning of the array. With the very nice visualizeKeyDistribution() in this test I could confirm that. It provoked collision avalanches.

So I fixed this weakness by making the hash iteration order change between iterations. There is an iteration seed, which changes for each iteration start, that we use to determine the iteration slot start (not always 0) and the iteration increment (only needs to be an odd number). That fixed immediately the test (and the cool visualization showed a uniform distribution).
This works with multi-threads without synchronization.

CollisionAvalanche benchmark after the change:

B004_HashSet_CollisionAvalanche.run __ HPPC _____________ 0.017 ± 0.001 s/op
B004_HashSet_CollisionAvalanche.run __ HPPC_SCATTER __ 0.017 ± 0.001 s/op

bruno-roustant · 2020-08-19T14:55:34Z

The HashCollisionsClusteringTest.testHashSetClusteringAtFront2() test is fixed but the problem is not completely fixed.
The test reused the same A by clearing it and refilling it. So the iteration seed changed.
But if the test was creating new A instances each time, it would currently have the same initial seed.
I'll fix that in the next commit.

dweiss · 2020-08-19T19:13:11Z

Hi Bruno. Just wanted to say I do follow your updates but haven't had the time to look at the code so far. Good the tests are catching odd cases - I'm sure there are more of these lurking around the corner. :)

dweiss · 2020-08-19T19:23:12Z

The "format violations" can be automatically corrected - just run ./gradlew spotlessApply and it'll reformat the code. There is a Google format plugin for IntelliJ as well but they're not always in perfect sync (with respect to imports, for example). I am generally happy with the formatting done by google's formatter - it is very convenient with cross-code refactorings and generally leaves the code in much better shape over time.

…rmScatterMap.

bruno-roustant · 2020-08-20T15:21:14Z

I removed ScatterSet and ScatterMap. Actually now HashSet and HashMap are like old scatter versions with a kind of shuffled iteration order.
Tests are passing and the speed is good.
Next step will be to remove HashOrderMixingStrategy and associated classes.

During this change I fixed ObjectIdentityHashSet.hashKey to use System.identityHashCode() (it didn't use it although I think the intent was to use it like in KTypeVTypeIdentityHashMap).

dweiss · 2020-08-20T15:21:35Z

hppc/src/main/java/com/carrotsearch/hppc/ObjectIdentityHashSet.java

@@ -46,7 +45,7 @@ public ObjectIdentityHashSet(ObjectContainer<? extends KType> container) {
  @Override
  protected int hashKey(KType key) {
    assert key != null; // Handled as a special case (empty slot marker).
-    return BitMixer.mixPhi(key);
+    return BitMixer.mixPhi(System.identityHashCode(key));


Uh. this looks like a bug in existing code too.

Corrected this on master, you may want to merge.

dweiss · 2020-08-20T15:24:32Z

Yep, I saw that. Looks like a legitimate bug to me.

dweiss · 2020-08-21T10:55:36Z

Finally had a chance to take a closer look. Thanks for all the work, Bruno. This approach looks very good to me indeed. I do have some minor concerns about iteration order of elements from the same container changing from call to call -- whether this will bite somebody is probably not a question of 'if' but 'when... :)

This said, for anybody who did use those collections the iteration order between runs would have changed anyway so I don't think it's a major issue.

Do you think it's ready to be merged in? If anything pops up, it can be another PR.

bruno-roustant · 2020-08-21T11:45:38Z

This was my last commit for this PR: to remove now unused HashOrderMixing and BitMixer's mix with seeds methods.
I tried to follow the code style. But sure there must be stuff you want to adapt/modify, please do so.
Note that I removed an example class and a section in overview.html about hash vs scatter. Maybe you need to restore some text/example, or add some comment about the iteration order change.
On my side it's ready (and now I'll have less spare cycles :).

dweiss · 2020-08-21T12:05:56Z

I'll merge it asap. Thanks Bruno.

There are 3 changes in hppc that impact OpenSearch: - ObjectIntScatterMap has been removed. I could have replaced it with ObjectIntHashMap, but a plain java.util.HashMap seems to solve the problem well so I used that instead. - `indexRemove(int index)` is a new method in two interfaces that we extend in a delegate pattern, so I have had to add that method as a pass-through in two places. - The seeded version of `BitMixer.mix(int, int)` [has been removed][1]. The implementation that was removed simply xor'd the value to the seed before calling the single-int variant of the method. I have changed the OpenSearch code to do the xor directly, so this should be equivalent to the previous implementation. [1]: carrotsearch/hppc#10 Signed-off-by: Andrew Ross <[email protected]>

HPPC-186: Protect the iteration order of KTypeHashSet

6696dbf

bruno-roustant force-pushed the master branch from cd19b46 to 6696dbf Compare August 18, 2020 12:06

bruno-roustant added 4 commits August 18, 2020 18:20

Also protect rehash() iteration.

ee4641a

HPPC-186: Protect the iteration order of KTypeVTypeHashMap.

3c05a75

Do not protect rehash. It was working fine before and protection has …

216e7c7

…a 20% perf penalty.

Remove HashOrderMixingStrategy from KTypeHashSet.

a689bd1

Make hash iteration order vary for each KTypeHashSet iteration.

Each KTypeHashSet instance has a different initial hash iteration seed.

285a174

bruno-roustant added 3 commits August 20, 2020 11:27

Remove KTypeScatterSet.

85a6744

Remove KTypeVTypeScatterMap.

f4da87a

Update KTypeVTypeHashMap with changing iteration. Remove KTypeVTypeWo…

32d8386

…rmScatterMap.

bruno-roustant changed the title ~~HPPC-186: Sample to change the iteration order of KTypeScatterSet.toArray()~~ HPPC-186: Change the iteration order of HashSet and HashMap. Remove hash order mixing. Aug 20, 2020

dweiss reviewed Aug 20, 2020

View reviewed changes

Remove unused HashOrderMixing and BitMixer's mix with seeds methods.

238f17f

dweiss merged commit eaba889 into carrotsearch:master Aug 21, 2020

karussell mentioned this pull request Feb 5, 2021

deterministic iteration vs. performance degradation #14

Closed

rjernst mentioned this pull request Feb 18, 2022

Upgrade hppc to 0.9.1 elastic/elasticsearch#84168

Closed

andrross mentioned this pull request May 10, 2022

Bump hppc from 0.8.1 to 0.9.1 in /server opensearch-project/OpenSearch#3199

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HPPC-186: Change the iteration order of HashSet and HashMap. Remove hash order mixing. #10

HPPC-186: Change the iteration order of HashSet and HashMap. Remove hash order mixing. #10

bruno-roustant commented Aug 14, 2020

vsonnier commented Aug 14, 2020

bruno-roustant commented Aug 14, 2020

dweiss commented Aug 14, 2020

bruno-roustant commented Aug 14, 2020

dweiss commented Aug 14, 2020

bruno-roustant commented Aug 17, 2020 •

edited

Loading

dweiss commented Aug 17, 2020

bruno-roustant commented Aug 17, 2020 •

edited

Loading

dweiss commented Aug 18, 2020

bruno-roustant commented Aug 18, 2020

dweiss commented Aug 18, 2020

bruno-roustant commented Aug 18, 2020

dweiss commented Aug 18, 2020

bruno-roustant commented Aug 19, 2020 •

edited

Loading

bruno-roustant commented Aug 19, 2020

dweiss commented Aug 19, 2020

dweiss commented Aug 19, 2020

bruno-roustant commented Aug 20, 2020

dweiss Aug 20, 2020

dweiss Aug 21, 2020

dweiss commented Aug 20, 2020

dweiss commented Aug 21, 2020

bruno-roustant commented Aug 21, 2020 •

edited

Loading

dweiss commented Aug 21, 2020

HPPC-186: Change the iteration order of HashSet and HashMap. Remove hash order mixing. #10

HPPC-186: Change the iteration order of HashSet and HashMap. Remove hash order mixing. #10

Conversation

bruno-roustant commented Aug 14, 2020

vsonnier commented Aug 14, 2020

bruno-roustant commented Aug 14, 2020

dweiss commented Aug 14, 2020

bruno-roustant commented Aug 14, 2020

dweiss commented Aug 14, 2020

bruno-roustant commented Aug 17, 2020 • edited Loading

dweiss commented Aug 17, 2020

bruno-roustant commented Aug 17, 2020 • edited Loading

dweiss commented Aug 18, 2020

bruno-roustant commented Aug 18, 2020

CollisionAvalanche benchmark before the change:

CollisionAvalanche benchmark after the change:

dweiss commented Aug 18, 2020

bruno-roustant commented Aug 18, 2020

dweiss commented Aug 18, 2020

bruno-roustant commented Aug 19, 2020 • edited Loading

CollisionAvalanche benchmark after the change:

bruno-roustant commented Aug 19, 2020

dweiss commented Aug 19, 2020

dweiss commented Aug 19, 2020

bruno-roustant commented Aug 20, 2020

dweiss Aug 20, 2020

Choose a reason for hiding this comment

dweiss Aug 21, 2020

Choose a reason for hiding this comment

dweiss commented Aug 20, 2020

dweiss commented Aug 21, 2020

bruno-roustant commented Aug 21, 2020 • edited Loading

dweiss commented Aug 21, 2020

bruno-roustant commented Aug 17, 2020 •

edited

Loading

bruno-roustant commented Aug 17, 2020 •

edited

Loading

bruno-roustant commented Aug 19, 2020 •

edited

Loading

bruno-roustant commented Aug 21, 2020 •

edited

Loading