Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove permute!! and invpermute!! #44869

Closed
wants to merge 6 commits into from

Conversation

LilithHafner
Copy link
Member

I don't think that these algorithms provide an unambiguous performance improvement that justifies their inclusion in base.

Perhaps a more complex dispatch system that looks at length and eltype might be warranted, but I'm in favor of excision and putting fancy and high-efficiency permutations somewhere like Permutations.jl. Here are some benchmarks that shed light on the consequences of this PR, green is an improvement and red is a regression:

permute
invpermute

Roguhly, I'm benchmarking [inv]permute!(Vector{NTuple{width, Int}}(undef, length), shuffle!(collect(1:length)))

Benchmark code
using Random, Plots
_permute!(a, p::AbstractVector) = copyto!(a, a[p])
_invpermute!(a, p::AbstractVector) = copyto!(a, a[invperm(p)])

function t(::Type{T}, n, fs...) where T
    m = max(1, 10^6 ÷ max(10, n * sizeof(T)))
    A = [Vector{T}(undef, n) for _ in 1:m]
    P = [shuffle!(collect(1:n)) for _ in 1:m]
    [@elapsed(for (a,p) in zip(A,P) 
        f(a,p)
    end) for f in fs]
    [@elapsed(for (a,p) in zip(A,P) 
        f(a,p)
    end) for f in fs]
end

function compute()
    global fs = permute!,_permute!,invpermute!,_invpermute!
    global data = [t(NTuple{x,Int},2^y,fs...) 
        for x in 1:15, y in 1:20]
end
function plot()
    for (i,f) in collect(enumerate(fs))[[1,3]]
        display(heatmap(
            log10.(getindex.(data, i+1) ./ getindex.(data, i)),
            c=cgrad([:black, :green, :white, :red, :black]),
            xlabel="log2(length)", ylabel="width",
            title="log10(_$f / $f)",
            clims=(-2,2)))
    end
end

@LilithHafner LilithHafner added speculative Whether the change will be implemented is speculative performance Must go faster excision Removal of code from Base or the repository labels Apr 6, 2022
@StefanKarpinski StefanKarpinski modified the milestone: 2.0 Apr 6, 2022
@StefanKarpinski
Copy link
Member

Ah, I had marked this as a 2.0 change because I thought these were exported but they aren't so we could remove them in a minor release.

@stevengj
Copy link
Member

stevengj commented Apr 9, 2022

Note that in practice, there are several packages using Base.permute!!: DataArrays, StructArrays, DataFrames, IndexedTables, PooledArrays, … so this change will be breaking.

@LilithHafner
Copy link
Member Author

LilithHafner commented Apr 11, 2022

Good catch, @stevengj.

Going through all the packages that use Base.permute!! or Base.ipermute!!

DataArrays has been deprecated since julia 1.0
DataFrames no longer uses Base.permute!! in its master branch, it now uses a much faster approach.
PooledArrays only uses Base.permute!! to define Base.permute!! on the PooledArray type.

Some packages would need to switch from Base.permute!! to permute!. All of these packages use Base.permute!! instead of permute! only for performance reasons but may experience a speedup switching to permute! after this PR compared to the current Base.permute!!.

Clustering would almost certainly see a speedup as the eltype it's permuting is Int.
IndexdTables could be much (perhaps 2x) faster if it used a similar approach as DataFrames instead of Base.permute!!, whether it sees a speedup depends on user eltype size and length.
Qaintessent may see a minor regression.
StructArrays uses Base.permute!! on an eltype of references, so I suspect it would also be faster after this PR to simply call permute!.

If folks want to go through with this PR, I can make the necessary PRs to those packages.

@oscardssmith
Copy link
Member

If these PRs are expected to be performance wins, it sounds like they should be made whether we want this PR or not.

@LilithHafner
Copy link
Member Author

LilithHafner commented Apr 11, 2022

The performance wins are of the form "permute! after PR is better than Base.permute!! which is better than permute! before PR", so without this PR they would have to switch to copyto!(v, v[p]) rather than permute!(v, p), that aside, yes.

@oscardssmith
Copy link
Member

In that case, I think the order should be

  1. fix performance of permute! / invpermute!.
  2. make PRs to packages.
  3. decide whether we want to deprecate permute!!

@LilithHafner
Copy link
Member Author

LilithHafner commented Apr 11, 2022

@LilithHafner LilithHafner added this to the 1.10 milestone Jul 6, 2022
@LilithHafner LilithHafner marked this pull request as draft July 6, 2022 00:24
@LilithHafner
Copy link
Member Author

Waiting for the release of 1.9 to begin step 2.

@alyst
Copy link
Contributor

alyst commented Sep 8, 2022

[inv]permute!!(v, p) doesn't allocate the new array, so while the algorithm itself might be slower than copyto!(v, v[p]), in certain contexts the GC overhead of permute!() may result in lower overall performance.

@LilithHafner
Copy link
Member Author

The benchmarks in the OP do not exclude GC overhead.

@alyst
Copy link
Contributor

alyst commented May 14, 2023

The benchmarks in the OP do not exclude GC overhead.

They do not, but IIUC they don't test for cases where v and/or p are reused 1000x times.

@LilithHafner
Copy link
Member Author

I'm not going to reproduce the above figure with 1000x iterations because that would take to long. Here's a single point from the figure in the OP, benchmarked while reusing both parameters 1000x times. This result is consistent with the figures in the OP.

julia> x2 = rand(1000); perm = Vector{Int}(undef, 1000); perm2 = Vector{Int}(undef, 1000); @benchmark Base.permute!!($x2, copyto!($perm2, perm)) setup=(rand!($x2); randperm!($perm)) gcsample=false gctrial=false evals=1000 samples=100
BenchmarkTools.Trial: 100 samples with 1000 evaluations.
 Range (min  max):  2.932 μs    4.214 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     3.284 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.293 μs ± 215.969 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

       ▃▃ ▃  ▃▁ ▁ ▁▁▁  ▆█▆▆ ▃▆▃▆                               
  ▄▄▄▁▄██▄█▁▄██▁█▁███▇▁████▇████▇▇▁▄▄▄▇▄▁▁▄▄▁▁▁▁▁▁▄▁▄▄▁▁▁▁▇▁▇ ▄
  2.93 μs         Histogram: frequency by time        3.83 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> x2 = rand(1000); perm = Vector{Int}(undef, 1000); perm2 = Vector{Int}(undef, 1000); @benchmark Base.permute!($x2, copyto!($perm2, perm)) setup=(rand!($x2); randperm!($perm)) gcsample=false gctrial=false evals=1000 samples=100
BenchmarkTools.Trial: 100 samples with 1000 evaluations.
 Range (min  max):  1.018 μs  8.594 μs  ┊ GC (min  max):  0.00%  79.15%
 Time  (median):     1.323 μs             ┊ GC (median):     0.00%
 Time  (mean ± σ):   1.730 μs ± 1.108 μs  ┊ GC (mean ± σ):  22.74% ± 23.38%

    █                                                        
  ▄▄██▄▃▂▂▃▃▁▁▁▁▁▃▁▄▃▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▂ ▂
  1.02 μs        Histogram: frequency by time       6.83 μs <

 Memory estimate: 7.94 KiB, allocs estimate: 1.

@stevengj
Copy link
Member

Note that these follow-the-cycles algorithms become more of a win when the elements size is large, as in Base.permutecols!! where we are permuting the columns of a matrix (used for sorting eigenvectors).

@LilithHafner LilithHafner removed the speculative Whether the change will be implemented is speculative label May 27, 2023
@LilithHafner LilithHafner marked this pull request as ready for review May 31, 2023 13:51
@LilithHafner LilithHafner removed the performance Must go faster label May 31, 2023
@oscardssmith oscardssmith added the needs pkgeval Tests for all registered packages should be run with this change label May 31, 2023
@PallHaraldsson
Copy link
Contributor

If you want to go ahead with this, then I agree it should be dropped ASAP in 1.11/master, early in 1.11 development, in case we need to revert it later.

We shouldn't backport this to 1.10

If you drop this in 1.11 I would argue doing it in 1.10 too before its release, in case it will be LTS. It would be easy to add it back in 1.10.1, but if it stays in then it will be a (practically) breaking change to drop in in a 1.10.x, seemingly bad for an LTS.

@LilithHafner
Copy link
Member Author

888 packages failed tests only on the current version.

😢

@LilithHafner
Copy link
Member Author

After Rerunning after JuliaRegistries/General#91067
@nanosoldier runtests()

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@LilithHafner
Copy link
Member Author

@nanosoldier runtests(["PooledArrays"], configuration=(registry="LilithHafner/General#permute-testing", ))

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@LilithHafner
Copy link
Member Author

@nanosoldier runtests(["PooledArrays"], configuration=(registry="LilithHafner/General#permute-testing", ), vs = ":master")

@LilithHafner
Copy link
Member Author

Trying again after fixing a bug in my custom registry setup:
@nanosoldier @nanosoldier runtests(["PooledArrays"], configuration=(registry="LilithHafner/General#permute-testing", ), vs = ":master")

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@LilithHafner
Copy link
Member Author

tehe, lots of noise and trial and error, but not too much load on the nanosoldier machines as I experiment with this. I think my most recent invocation was ignored because I miscopied it.

@nanosoldier runtests(["PooledArrays"], configuration=(registry="LilithHafner/General#permute-testing", ), vs = ":master")

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@LilithHafner
Copy link
Member Author

@nanosoldier runtests(["PooledArrays"])

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - no new issues were detected.
The full report is available.

@LilithHafner
Copy link
Member Author

@nanosoldier runtests()

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@LilithHafner
Copy link
Member Author

Failures due to folks using old versions of PooledArrays and StructArrays. I'll try again in a while.

@LilithHafner
Copy link
Member Author

Eh, whatever. This is good enough. The implementations are gone and the methods are deprecated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
excision Removal of code from Base or the repository needs pkgeval Tests for all registered packages should be run with this change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants