Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deepcopy/serialize when used after delete! #112

Merged
merged 4 commits into from
Oct 9, 2022

Conversation

theogf
Copy link
Collaborator

@theogf theogf commented Sep 21, 2022

Hey @andyferris !

It looks my deepcopy and serialize implementation have a bug.
I wrongly assumed that indices.values == collect(indices).
It is visibly not true and create undef references when using non isbits structure.

I used collect(indices) instead and took care of updating the tests accordingly.

(version patch is bumped as well)

@theogf
Copy link
Collaborator Author

theogf commented Sep 21, 2022

Oh no there are actually more issues with the values of the Dictionary itself

@andyferris
Copy link
Owner

I see the problem.

What we can do is take a shortcut if holes == 0. Otherwise we need to use the kind of logic you see in rehash!. I suggest one of the two following options:

  1. Just call rehash! if holes != 0 and then continue the current logic.
  2. Iterate the tokens and collect the results if holes != 0

The disadvantage of the first one is it won't be thread safe when you have multiple readers (I'm not even sure it is safe under single-threaded concurrency).

Probably should go with the second with code like:

serialize_type(s, T, false)
if indices.holes == 0
    return serialize(s, ind.values)
else
    is = Vector{I}(undef, length(dict))
    @inbounds for t in tokens(dict)
        is[i] = ind.values[t]
    end
    return serialize(s, is)
end

It might be even better if we could stream out the indices in the second branch rather than collecting them in memory, but at least this is safe.

Will need equivalent code for dictionaries, and for deepcopy.

@theogf
Copy link
Collaborator Author

theogf commented Sep 22, 2022

So I was thinking of an even more straightforward solution with dispatching deepcopy_internal on the Dictionary and collecting the values themself.

Something like

deepcopy(d::Dictionary{I,T}) where{I,T} = Dictionary{I,T}(deepcopy(collect(keys(d))), deepcopy(collect(d))) 

This way we just build a whole new Dictionary and avoid shenanigans ensuring that the holes and slots are consistent?

@theogf
Copy link
Collaborator Author

theogf commented Sep 22, 2022

So I just implemented my proposal

@codecov
Copy link

codecov bot commented Sep 22, 2022

Codecov Report

Base: 78.86% // Head: 79.33% // Increases project coverage by +0.47% 🎉

Coverage data is based on head (ce7b08b) compared to base (7d596e8).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #112      +/-   ##
==========================================
+ Coverage   78.86%   79.33%   +0.47%     
==========================================
  Files          20       20              
  Lines        2342     2347       +5     
==========================================
+ Hits         1847     1862      +15     
+ Misses        495      485      -10     
Impacted Files Coverage Δ
src/Dictionary.jl 80.52% <100.00%> (+0.52%) ⬆️
src/Indices.jl 91.04% <100.00%> (+2.89%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@theogf
Copy link
Collaborator Author

theogf commented Sep 26, 2022

@andyferris Would you rather have me use your solution instead?

@0x0f0f0f
Copy link

@theogf news here?

@theogf
Copy link
Collaborator Author

theogf commented Oct 5, 2022

Hi @andyferris, any update on this :) ?

@andyferris
Copy link
Owner

Sorry I have been smashed.

Correct is more important the fast, so lets merge this. Thanks @theogf

@andyferris andyferris merged commit 83d4105 into andyferris:master Oct 9, 2022
@andyferris
Copy link
Owner

@theogf theogf deleted the fix-deepcopy-2 branch November 24, 2022 14:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants