You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Context: I maintain elsa, a crate which contains append-only collections that can be appended to with only &self references. Recently @steffahn brought this bug to my attention, where elsa's assumption that "a hashmap that only experiences insertions (not replacements or removals) will never drop values". This is incorrect in the (relatively rare) case Hash panics on subsequent calls. (It's a very interesting edge case, nice catch @steffahn!)
I do have a way to fix this: we can ensure that Hash is only called by our code (by using a wrapper type with a cached hash; a common pattern). There are other potential fixes too.
However, I do think that the behavior that insert() may drop values not being replaced is surprising, and while HashMap makes no promises of its behavior when you write silly Hash and Eq implementations, I do think a different route for the garbage-in-garbage-out behavior is possible: when Hash panics during realloc, stuff the rest of the entries back into the hashmap in order regardless of hash. It will be totally unpredictable on .get() (but we knew that; that's what is expected when you write silly Hash and Eq implementations), but it will end up calling the destructors at the right time. An alternate route would be to just leak the values.
Basically, I don't think the current behavior of going out of our way to clean stuff up is super necessary, and it makes things harder for unsafe code (elsa is a crate that heavily relies on this, but I can imagine other unsafe code writers making similar assumptions in more specific cases).
Thoughts?
The text was updated successfully, but these errors were encountered:
I suppose we could just mark the elements as having a hash value 0 instead of deleting them, however this is a deep implementation detail of the hash table. Consider the future where we replace the standard library hash table implementation yet again. It would not be possible to guarantee that this behavior remains the same.
My recommendation would be to look into a different workaround in your own code.
I am looking into other workarounds in my code; but I think this is a useful property for both hashbrown and the stdlib to try and maintain for the uses of all unsafe code, not just mine. And it's not just useful, it's an easy assumption to make.
Append-only collections are a common pattern and I've seen this kind of logic used outside of elsa as well. I think others will make this assumption in the future; and ideally they would be able to.
I could ask upstream that this property be codified in the docs once we have it settled here.
Context: I maintain
elsa
, a crate which contains append-only collections that can be appended to with only&self
references. Recently @steffahn brought this bug to my attention, where elsa's assumption that "a hashmap that only experiences insertions (not replacements or removals) will never drop values". This is incorrect in the (relatively rare) caseHash
panics on subsequent calls. (It's a very interesting edge case, nice catch @steffahn!)I do have a way to fix this: we can ensure that
Hash
is only called by our code (by using a wrapper type with a cached hash; a common pattern). There are other potential fixes too.However, I do think that the behavior that
insert()
may drop values not being replaced is surprising, and whileHashMap
makes no promises of its behavior when you write sillyHash
andEq
implementations, I do think a different route for the garbage-in-garbage-out behavior is possible: whenHash
panics during realloc, stuff the rest of the entries back into the hashmap in order regardless of hash. It will be totally unpredictable on.get()
(but we knew that; that's what is expected when you write sillyHash
andEq
implementations), but it will end up calling the destructors at the right time. An alternate route would be to just leak the values.Basically, I don't think the current behavior of going out of our way to clean stuff up is super necessary, and it makes things harder for unsafe code (
elsa
is a crate that heavily relies on this, but I can imagine other unsafe code writers making similar assumptions in more specific cases).Thoughts?
The text was updated successfully, but these errors were encountered: