Consider not dropping entries on panicky rehashes #444

Manishearth · 2023-07-16T14:22:17Z

Context: I maintain elsa, a crate which contains append-only collections that can be appended to with only &self references. Recently @steffahn brought this bug to my attention, where elsa's assumption that "a hashmap that only experiences insertions (not replacements or removals) will never drop values". This is incorrect in the (relatively rare) case Hash panics on subsequent calls. (It's a very interesting edge case, nice catch @steffahn!)

I do have a way to fix this: we can ensure that Hash is only called by our code (by using a wrapper type with a cached hash; a common pattern). There are other potential fixes too.

However, I do think that the behavior that insert() may drop values not being replaced is surprising, and while HashMap makes no promises of its behavior when you write silly Hash and Eq implementations, I do think a different route for the garbage-in-garbage-out behavior is possible: when Hash panics during realloc, stuff the rest of the entries back into the hashmap in order regardless of hash. It will be totally unpredictable on .get() (but we knew that; that's what is expected when you write silly Hash and Eq implementations), but it will end up calling the destructors at the right time. An alternate route would be to just leak the values.

Basically, I don't think the current behavior of going out of our way to clean stuff up is super necessary, and it makes things harder for unsafe code (elsa is a crate that heavily relies on this, but I can imagine other unsafe code writers making similar assumptions in more specific cases).

Thoughts?

The text was updated successfully, but these errors were encountered:

Amanieu · 2023-07-17T13:13:12Z

The relevant code is here: https://github.com/rust-lang/hashbrown/blob/3b348eaac93c5ee60f7ce1ff5a246e86782497d1/src/raw/mod.rs#L2277C45-L2277C46

I suppose we could just mark the elements as having a hash value 0 instead of deleting them, however this is a deep implementation detail of the hash table. Consider the future where we replace the standard library hash table implementation yet again. It would not be possible to guarantee that this behavior remains the same.

My recommendation would be to look into a different workaround in your own code.

Manishearth · 2023-07-17T15:34:07Z

I am looking into other workarounds in my code; but I think this is a useful property for both hashbrown and the stdlib to try and maintain for the uses of all unsafe code, not just mine. And it's not just useful, it's an easy assumption to make.

Append-only collections are a common pattern and I've seen this kind of logic used outside of elsa as well. I think others will make this assumption in the future; and ideally they would be able to.

I could ask upstream that this property be codified in the docs once we have it settled here.

Manishearth mentioned this issue Jul 16, 2023

FrozenMap unsound if Hash panics in such a way as to prevent HashMap rehashing to succeed. Manishearth/elsa#48

Open

steffahn mentioned this issue Nov 22, 2023

OnceMap unsound if Hash panics in such a way as to prevent HashMap/HashTable rehashing to succeed a1phyr/once_map#3

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider not dropping entries on panicky rehashes #444

Consider not dropping entries on panicky rehashes #444

Manishearth commented Jul 16, 2023

Amanieu commented Jul 17, 2023

Manishearth commented Jul 17, 2023

Consider not dropping entries on panicky rehashes #444

Consider not dropping entries on panicky rehashes #444

Comments

Manishearth commented Jul 16, 2023

Amanieu commented Jul 17, 2023

Manishearth commented Jul 17, 2023