Add setwith! #117

Merged
merged 10 commits into from
Jan 29, 2024

theogf (Collaborator) commented Jan 9, 2023

This PR adds setwith!, which is equivalent to calling mergewith! with a Dictionary containing only one element.
It is much faster because it avoids building a Dictionary on every call.
I did not use the token API because the get-based approach benchmarked faster.
Here are some benchmarks:

julia> using BenchmarkTools, Dictionaries
julia> function tokensetwith!(f, d::AbstractDictionary{I}, i, value) where {I}
           # safe_convert is qualified here because it is used as Dictionaries.safe_convert later in this thread
           i2 = Dictionaries.safe_convert(I, i)
           hastoken, token = gettoken!(d, i2)
           hastoken ? settokenvalue!(d, token, f(gettokenvalue(d, token), value)) : settokenvalue!(d, token, value)
           return d
       end
julia> @btime mergewith!(+, d, Dictionary(Ref(2), Ref(2.0))) setup=(d=Dictionary(1:5, rand(5)))
  131.679 ns (6 allocations: 448 bytes)
julia> @btime setwith!(+, d, 2, 2.0) setup=(d=Dictionary(1:5, rand(5)))
  7.993 ns (0 allocations: 0 bytes)
julia> @btime tokensetwith!(+, d, 2, 2.0) setup=(d=Dictionary(1:5, rand(5)))
  8.785 ns (0 allocations: 0 bytes)

Additionally, I started using safe_convert instead of converting and then checking equality, to avoid code duplication.
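As a rough illustration of the equivalence claimed above, the sketch below compares setwith! with mergewith! on a single-element Dictionary. It assumes a Dictionaries.jl release that already includes this PR; the helper name make_dict and the sample values are made up for the example.

```julia
using Dictionaries

# Hypothetical helper: builds the same starting dictionary twice.
make_dict() = Dictionary(1:3, [10.0, 20.0, 30.0])

d1 = make_dict()
d2 = make_dict()

# Existing key: both calls combine the old and new values with +.
setwith!(+, d1, 2, 2.0)
mergewith!(+, d2, Dictionary([2], [2.0]))

# Missing key: both calls insert the new value as-is.
setwith!(+, d1, 7, 1.5)
mergewith!(+, d2, Dictionary([7], [1.5]))

println(d1[2] == d2[2], " ", d1[7] == d2[7])
```

The only intended difference is cost: the mergewith! calls allocate a temporary Dictionary each time, while setwith! works on the key and value directly.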

theogf (Collaborator, Author) commented Jan 23, 2023

Ping @andyferris :)

andyferris (Owner) left a comment

@theogf very sorry I let this linger - life is too full!

I like the idea here, and the name is good. However, one issue (which we sometimes see in Base functions too) is that setwith! doesn't work well with dictionaries that contain nothing values. Note that tokensetwith! and setwith! actually behave differently when the existing value is nothing.

I suspect that the tokensetwith! implementation needs @inbounds around the settokenvalue! statements to improve speed. Also, the conditional could be moved inside the function call so that only the value argument differs. We can also add a second method of safe_convert for the case where no conversion is necessary, making it a no-op, since it currently wastes time on isequal etc. (In any case it's a massive speed improvement over the mergewith! method, right?)
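The difference with nothing values can be reproduced with a small sketch. The get-based function below is a hypothetical stand-in (the PR's original setwith! body is not shown in this thread) and assumes nothing is used as the "key is missing" sentinel; the token-based function follows the suggestion above, with the conditional moved inside the call. All function names here are illustrative.

```julia
using Dictionaries

# Hypothetical get-based variant: `nothing` doubles as the "missing key" sentinel.
function sentinel_setwith!(f, d::AbstractDictionary, i, value)
    old = get(d, i, nothing)
    set!(d, i, old === nothing ? value : f(old, value))
    return d
end

# Token-based variant: gettoken! reports presence, so no sentinel is needed.
function token_setwith!(f, d::AbstractDictionary, i, value)
    (had_token, token) = gettoken!(d, i)
    @inbounds settokenvalue!(d, token, had_token ? f(gettokenvalue(d, token), value) : value)
    return d
end

# A combiner that can tell whether it was called with a stored `nothing`.
combine(old, new) = old === nothing ? -1 : old + new

# Hypothetical helper: a dictionary whose only entry holds `nothing`.
function make_dict()
    d = Dictionary{Int, Union{Nothing, Int}}()
    insert!(d, 1, nothing)
    return d
end

a = sentinel_setwith!(combine, make_dict(), 1, 5)  # stored `nothing` is mistaken for a missing key
b = token_setwith!(combine, make_dict(), 1, 5)     # combine(nothing, 5) is actually called

println(a[1], " ", b[1])
```

With the sentinel approach the combiner is silently skipped for a stored nothing, whereas the token approach always calls it when the key exists.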

@theogf theogf closed this Jan 27, 2024
@theogf theogf reopened this Jan 27, 2024
andyferris (Owner) commented Jan 28, 2024

One more thing I'm wondering about here is what to do when you want to insert a new mutable object in the "default" case, like an empty vector that you will push! to later. Or what if you want a random number, or want to grab the value from an external database?

For get! we have a form where the first argument is a zero-argument function that gets called in order to construct the value. I'm wondering if that is the pattern we should use here: f has two methods, one with one argument and another with two arguments. (Or we take two functions. Or we keep the existing setwith! method and add a second one. Or we add a method to setwith! that accepts a Pair.) What do you think?

andyferris (Owner) commented Jan 28, 2024

So I guess I wonder if something like this would work ok?

function setwith!(f, d::AbstractDictionary{I}, i, value) where {I}
    i2 = safe_convert(I, i)
    (had_token, token) = gettoken!(d, i2)
    settokenvalue!(d, token, had_token ? f(gettokenvalue(d, token), value) : f(value))
    return d
end

(Also, note the usage of gettoken!, gettokenvalue and settokenvalue! here, which is faster and gets around needing a nothing sentinel.)
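To make the two-method idea concrete, here is a minimal sketch of how such a function might be used as a general fold. It is not the merged implementation: the name foldwith! is hypothetical (chosen to avoid clashing with the exported setwith!), the safe_convert step is omitted for brevity, and settokenvalue! is used as the setter.

```julia
using Dictionaries

# Hypothetical variant of the proposal: f(value) initialises, f(acc, value) folds.
function foldwith!(f, d::AbstractDictionary, i, value)
    (had_token, token) = gettoken!(d, i)
    @inbounds settokenvalue!(d, token, had_token ? f(gettokenvalue(d, token), value) : f(value))
    return d
end

# The accumulator (an Int count) has a different type from the incoming value (a String).
count_seen(value) = 1
count_seen(acc, value) = acc + 1

counts = Dictionary{String, Int}()
for word in ["a", "b", "a", "a"]
    foldwith!(count_seen, counts, word, word)
end

println(counts["a"], " ", counts["b"])
```

The one-argument method is what distinguishes a fold from a plain reduction: the first value is transformed into an accumulator instead of being stored directly.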

andyferris (Owner) commented

Hmm, I see now that the original mergewith! only supports simple reductions (not general folds). I also note that the "mutation" case is already served well by mutating the result of get!.

So probably the remaining work is to use gettoken! etc here.
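For reference, the "mutate the result of get!" pattern mentioned above might look like the following sketch, which uses the function-first form of get! so that each new key gets its own freshly constructed vector. The dictionary name and keys are made up for the example.

```julia
using Dictionaries

groups = Dictionary{Symbol, Vector{Int}}()

# get! calls the zero-argument function only when the key is missing,
# then returns the stored vector, which push! mutates in place.
push!(get!(() -> Int[], groups, :a), 1)
push!(get!(() -> Int[], groups, :a), 2)
push!(get!(() -> Int[], groups, :b), 3)

println(groups[:a], " ", groups[:b])
```

Because the default is built by a function call rather than passed as a value, the two keys never share the same underlying vector.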

andyferris (Owner) left a comment

Okay - now we are using tokens, and I think this function makes sense the way you have it.

I'll note here for posterity that at some point in the future Base might have a function like update! or modify with similar behavior, and we might need to adjust the name to suit. But the current name works extremely well with set!, so it's OK to roll with it.

codecov bot commented Jan 29, 2024

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (0598d92) 80.12% compared to head (7f7c65b) 80.40%.

Files             Patch %   Lines
src/insertion.jl  89.47%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #117      +/-   ##
==========================================
+ Coverage   80.12%   80.40%   +0.28%     
==========================================
  Files          20       20              
  Lines        2360     2358       -2     
==========================================
+ Hits         1891     1896       +5     
+ Misses        469      462       -7     


andyferris (Owner) commented

And just to check the benchmark at the top:

julia> using BenchmarkTools, Dictionaries

julia> function tokensetwith!(f, d::AbstractDictionary{I}, i, value) where {I}
           i2 = Dictionaries.safe_convert(I, i)
           hastoken, token = gettoken!(d, i2)
           @inbounds(hastoken ? settokenvalue!(d, token, f(gettokenvalue(d, token), value)) : settokenvalue!(d, token, value))
           return d
       end
tokensetwith! (generic function with 1 method)

julia> @btime mergewith!(+, d, Dictionary(Ref(2), Ref(2.0))) setup=(d=Dictionary(1:5, rand(5)))
  122.996 ns (6 allocations: 448 bytes)
5-element Dictionary{Int64, Float64}
 1 │ 0.6762046077805732
 2 │ 1796.0759530049804
 3 │ 0.15434878980223343
 4 │ 0.27622327088960563
 5 │ 0.6259347665546298

julia> @btime setwith!(+, d, 2, 2.0) setup=(d=Dictionary(1:5, rand(5)))
  8.033 ns (0 allocations: 0 bytes)
5-element Dictionary{Int64, Float64}
 1 │ 0.6825175785534227
 2 │ 1998.4092353771443
 3 │ 0.9883831681856018
 4 │ 0.8871562355778311
 5 │ 0.2561037264701246

julia> @btime tokensetwith!(+, d, 2, 2.0) setup=(d=Dictionary(1:5, rand(5)))
  8.039 ns (0 allocations: 0 bytes)
5-element Dictionary{Int64, Float64}
 1 │ 0.6663394005518313
 2 │ 1998.6949422352689
 3 │ 0.48414147097987303
 4 │ 0.32928579879421094
 5 │ 0.26274353692197994

@andyferris andyferris merged commit 8a5d0d5 into andyferris:master Jan 29, 2024
1 of 5 checks passed
andyferris (Owner) commented

Thanks again @theogf - only took me a year...

@theogf theogf deleted the tgf/setwith! branch January 8, 2025 09:45