Remove permute!! and invpermute!! #44869
Ah, I had marked this as a 2.0 change because I thought these were exported, but they aren't, so we could remove them in a minor release. |
Note that in practice, there are several packages using |
Good catch, @stevengj. Going through all the packages that use
DataArrays has been deprecated since Julia 1.0
Some packages would need to switch from
Clustering would almost certainly see a speedup as the eltype it's permuting is
If folks want to go through with this PR, I can make the necessary PRs to those packages. |
If these PRs are expected to be performance wins, it sounds like they should be made whether we want this PR or not. |
The performance wins are of the form " |
In that case, I think the order should be
|
Waiting for the release of 1.9 to begin step 2. |
|
The benchmarks in the OP do not exclude GC overhead. |
They do not, but IIUC they don't test for cases where |
I'm not going to reproduce the above figure with 1000x iterations because that would take too long. Here's a single point from the figure in the OP, benchmarked while reusing both parameters 1000 times. This result is consistent with the figures in the OP.
julia> x2 = rand(1000); perm = Vector{Int}(undef, 1000); perm2 = Vector{Int}(undef, 1000); @benchmark Base.permute!!($x2, copyto!($perm2, perm)) setup=(rand!($x2); randperm!($perm)) gcsample=false gctrial=false evals=1000 samples=100
BenchmarkTools.Trial: 100 samples with 1000 evaluations.
Range (min … max): 2.932 μs … 4.214 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 3.284 μs ┊ GC (median): 0.00%
Time (mean ± σ): 3.293 μs ± 215.969 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▃▃ ▃ ▃▁ ▁ ▁▁▁ ▆█▆▆ ▃▆▃▆
▄▄▄▁▄██▄█▁▄██▁█▁███▇▁████▇████▇▇▁▄▄▄▇▄▁▁▄▄▁▁▁▁▁▁▄▁▄▄▁▁▁▁▇▁▇ ▄
2.93 μs Histogram: frequency by time 3.83 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> x2 = rand(1000); perm = Vector{Int}(undef, 1000); perm2 = Vector{Int}(undef, 1000); @benchmark Base.permute!($x2, copyto!($perm2, perm)) setup=(rand!($x2); randperm!($perm)) gcsample=false gctrial=false evals=1000 samples=100
BenchmarkTools.Trial: 100 samples with 1000 evaluations.
Range (min … max): 1.018 μs … 8.594 μs ┊ GC (min … max): 0.00% … 79.15%
Time (median): 1.323 μs ┊ GC (median): 0.00%
Time (mean ± σ): 1.730 μs ± 1.108 μs ┊ GC (mean ± σ): 22.74% ± 23.38%
█
▄▄██▄▃▂▂▃▃▁▁▁▁▁▃▁▄▃▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▂ ▂
1.02 μs Histogram: frequency by time 6.83 μs <
Memory estimate: 7.94 KiB, allocs estimate: 1. |
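The single 7.94 KiB allocation in the second trial is consistent with one temporary holding the 1000 `Float64` values. A minimal sketch of a copy-based permutation, assuming that is roughly the shape of the allocating method (the name `permute_copy!` is hypothetical and is not Base's implementation):

```julia
# Hypothetical copy-based permutation (illustrative, not Base's code):
# gather into a temporary, then write back. The permutation `p` is left
# untouched, at the cost of one allocation the size of `v`.
function permute_copy!(v::AbstractVector, p::AbstractVector{<:Integer})
    tmp = v[p]          # allocates: tmp[i] == v[p[i]]
    copyto!(v, tmp)     # overwrite v with the permuted values
    return v
end

v = [10, 20, 30, 40]
p = [3, 1, 4, 2]
permute_copy!(v, p)     # v is now [30, 10, 40, 20]; p is unchanged
```

Every element is moved twice here (once into the temporary, once back), which is the cost the GC columns above reflect.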
Note that these follow-the-cycles algorithms become more of a win when the element size is large, as in |
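For readers unfamiliar with the term, here is a rough sketch of a follow-the-cycles permutation. It assumes `p` is a valid permutation of `1:length(a)` and, like the `!!` methods discussed here, it destroys `p` (entries are zeroed to mark visited slots). The name `cycle_permute!` is hypothetical and this is not the code being removed from Base:

```julia
# Hypothetical follow-the-cycles permutation (a sketch under the stated
# assumptions). Computes a .= a[p] in place using one temporary element
# per cycle and no array allocation. `p` is overwritten with zeros.
function cycle_permute!(a::AbstractVector, p::AbstractVector{<:Integer})
    for start in eachindex(p)
        p[start] == 0 && continue   # slot already handled by an earlier cycle
        temp = a[start]             # save the first element of this cycle
        ptr = start
        next = p[ptr]
        while next != start
            a[ptr] = a[next]        # pull the element that belongs at ptr
            p[ptr] = 0              # mark ptr as done
            ptr = next
            next = p[ptr]
        end
        a[ptr] = temp               # close the cycle
        p[ptr] = 0
    end
    return a
end

a = [10, 20, 30, 40]
p = [3, 1, 4, 2]
cycle_permute!(a, p)    # a is now [30, 10, 40, 20]; p is all zeros
```

Each element is written once, so the amount of data moved is about half that of a copy-based approach, which matters more as elements get wider.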
If you want to go ahead with this, then I agree it should be dropped ASAP in 1.11/master, early in 1.11 development, in case we need to revert it later.
If you drop this in 1.11 I would argue doing it in 1.10 too before its release, in case it will be LTS. It would be easy to add it back in 1.10.1, but if it stays in then it will be a (practically) breaking change to drop it in a 1.10.x, seemingly bad for an LTS. |
😢 |
After Rerunning after JuliaRegistries/General#91067 |
The package evaluation job you requested has completed - possible new issues were detected. |
@nanosoldier |
The package evaluation job you requested has completed - possible new issues were detected. |
@nanosoldier |
Trying again after fixing a bug in my custom registry setup: |
The package evaluation job you requested has completed - possible new issues were detected. |
tehe, lots of noise and trial and error, but not too much load on the nanosoldier machines as I experiment with this. I think my most recent invocation was ignored because I miscopied it. @nanosoldier |
The package evaluation job you requested has completed - possible new issues were detected. |
@nanosoldier |
The package evaluation job you requested has completed - no new issues were detected. |
@nanosoldier |
The package evaluation job you requested has completed - possible new issues were detected. |
Failures due to folks using old versions of PooledArrays and StructArrays. I'll try again in a while. |
Eh, whatever. This is good enough. The implementations are gone and the methods are deprecated. |
I don't think that these algorithms provide an unambiguous performance improvement that justifies their inclusion in base.
Perhaps a more complex dispatch system that looks at length and eltype might be warranted, but I'm in favor of excision and putting fancy and high-efficiency permutations somewhere like Permutations.jl. Here are some benchmarks that shed light on the consequences of this PR, green is an improvement and red is a regression:
Roughly, I'm benchmarking
[inv]permute!(Vector{NTuple{width, Int}}(undef, length), shuffle!(collect(1:length)))
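As a rough, runnable reading of that expression (not the actual benchmark script): `width` and `len` below are illustrative values, and `len` stands in for `length` only to avoid shadowing `Base.length`.

```julia
using Random  # for shuffle!

width, len = 4, 1000                        # illustrative sizes, assumed here
v = Vector{NTuple{width, Int}}(undef, len)  # payload; contents are arbitrary
p = shuffle!(collect(1:len))                # random permutation of 1:len

v0 = copy(v)
permute!(v, p)      # v becomes v0[p]
invpermute!(v, p)   # undoes the permutation, restoring v0
```

Varying `width` changes the element size and varying `len` changes the vector length, the two axes along which the comparison is made.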
Benchmark code