remove rehash! from Dict constructor #24345
-        new(copy(d.slots), copy(d.keys), copy(d.vals), 0, d.count, d.age, d.idxfloor,
-            d.maxprobe)
+        new(copy(d.slots), copy(d.keys), copy(d.vals), d.ndel, d.count, d.age,
+            d.idxfloor, d.maxprobe)
Should we do `dnew = new(...)` and then `rehash!(dnew)` if `d.ndel > 0`?
This is exactly what I considered (as noted in the PR description). I decided not to add it because it introduces a performance penalty for `Dict` creation. @JeffBezanson can probably provide guidance here.
Another approach would be to test `d.ndel` against some value greater than 0 (i.e. if there are many deleted entries we call `rehash!`, and if there are only a few we leave the copy without rehashing).
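The threshold idea could be sketched roughly as follows. This is only an illustration: `TableStats`, `should_rehash_on_copy`, and the `0.25` default are hypothetical names and values, not part of `Base`, and the break-even fraction would have to come from benchmarks.

```julia
# Hypothetical stand-in for the two Dict fields that matter for the decision.
struct TableStats
    nslots::Int  # total number of slots in the table
    ndel::Int    # number of slots marked as deleted
end

# Rehash on copy only when the deleted fraction exceeds `threshold`.
# The default of 0.25 is a placeholder, not a measured value.
should_rehash_on_copy(t::TableStats; threshold::Float64 = 0.25) =
    t.ndel > threshold * t.nslots
```

With a policy like this, a copy of a table with only a few deleted slots stays a cheap bare copy, while a heavily fragmented table pays for one rehash up front.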
On second thought, my recommendation would be to remove `rehash!` from the constructor. Its main benefit is that later we do not have to `rehash!` both dictionaries twice. But this is something that we cannot avoid anyway.
If we believe that being able to clean a dictionary is an important feature, I would rather recommend to:
- rename `rehash!` to `_rehash!`;
- expose a new `rehash!` as an exported function and define it for `Dict` and `Set`. This new `rehash!` should perform an additional check that the size is not decreased (as `_rehash!` is designed for growing only).
Note that the |
Yes, this is ok as a (quasi-)bug fix. As a later enhancement, if the fraction of deleted keys is high we should probably copy the pairs by |
Good idea. I will do some benchmarks to find a reasonable break-even point and propose another PR.
CI failure seems unrelated. Should this be merged?
This PR follows https://discourse.julialang.org/t/is-there-a-bug-in-dict/6672/4.
It implements:
- the `Dict{K,V}(d::Dict{K,V})` constructor performs a bare copy of `d` (without `rehash!`);
- `ht_keyindex2` is renamed to `ht_keyindex2!` (this is not strictly necessary for this PR, so it can be omitted, but it cleans up the related code, as `ht_keyindex2` may in some cases call `rehash!` and thus mutate the passed dictionary).
I recommend removing `rehash!` from the constructor. Another approach would be to rehash a newly created `Dict`, but I do not see any significant benefit in this: `rehash!` will be called anyway, if needed, by `setindex!` or `get!`, and it adds a performance penalty during construction.
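To make the trade-off between a bare copy and a rehashing copy concrete, here is a minimal sketch on a toy open-addressing table. Everything in it (`ToyTable`, `toy_insert!`, `toy_delete!`, `bare_copy`, `rehashed_copy`) is hypothetical and only mimics the `slots`/`keys`/`ndel`/`count` bookkeeping visible in the diff; it is not the `Base.Dict` implementation and it assumes the table never fills up.

```julia
# Toy open-addressing table with linear probing and tombstones.
# Hypothetical illustration only; this is not how Base.Dict is implemented.
mutable struct ToyTable
    slots::Vector{UInt8}  # 0x0 = empty, 0x1 = filled, 0x2 = deleted (tombstone)
    keys::Vector{Int}
    ndel::Int             # number of tombstones
    count::Int            # number of live keys
end

ToyTable(n::Int) = ToyTable(zeros(UInt8, n), zeros(Int, n), 0, 0)

# First slot to probe for key `k` (1-based), and the slot after `i` (wrapping).
first_slot(t::ToyTable, k::Int) = mod(k, length(t.slots)) + 1
next_slot(t::ToyTable, i::Int) = i % length(t.slots) + 1

# Index of live key `k`, or 0 if absent. Assumes at least one empty slot remains.
function toy_index(t::ToyTable, k::Int)
    i = first_slot(t, k)
    while t.slots[i] != 0x0
        t.slots[i] == 0x1 && t.keys[i] == k && return i
        i = next_slot(t, i)
    end
    return 0
end

function toy_insert!(t::ToyTable, k::Int)
    toy_index(t, k) != 0 && return t   # already present
    i = first_slot(t, k)
    while t.slots[i] != 0x0            # tombstones are not reused in this toy
        i = next_slot(t, i)
    end
    t.slots[i] = 0x1
    t.keys[i] = k
    t.count += 1
    return t
end

function toy_delete!(t::ToyTable, k::Int)
    i = toy_index(t, k)
    if i != 0
        t.slots[i] = 0x2               # leave a tombstone so probing still works
        t.ndel += 1
        t.count -= 1
    end
    return t
end

# Bare copy: tombstones and `ndel` are carried over unchanged.
bare_copy(t::ToyTable) = ToyTable(copy(t.slots), copy(t.keys), t.ndel, t.count)

# Rehashing copy: reinsert live keys into a fresh table, dropping tombstones.
function rehashed_copy(t::ToyTable)
    fresh = ToyTable(length(t.slots))
    for i in eachindex(t.slots)
        t.slots[i] == 0x1 && toy_insert!(fresh, t.keys[i])
    end
    return fresh
end
```

Both copies hold the same live keys; the bare copy is a few array copies and keeps the source's tombstones, while the rehashing copy pays for reinserting every live key in exchange for a clean table.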