Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop using permute!! and invpermute!! #44941

Merged
merged 6 commits into from
May 11, 2022

Conversation

LilithHafner
Copy link
Member

Everything from #44869 except for actually removing permute!! and invpermute!!. See the discussion there for details.

@@ -232,7 +233,7 @@ julia> A
1
```
"""
invpermute!(a, p::AbstractVector) = invpermute!!(a, copymutable(p))
invpermute!(v, p::AbstractVector) = copyto!(v, v[invperm(p)])
Copy link
Member

@stevengj stevengj Apr 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to use something like:

w = similar(v)
w[p] = v
return w

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, that violates invpermute!(v, p) === v.

This is a candidate implementation of invpermute, a replacement of v[invperm(p)] that runs 2-3x faster, but we'd need to demonstrate a convincing use case to include it in base rather than Permutations.jl.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, the last line should be return copyto!(v, w)

Copy link
Member

@stevengj stevengj Apr 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my machine with a random length-1000 permutation and v::Vector{Float64}, my version (with the copyto!(v, w) fix) is 30% faster than copyto!(v, v[invperm(p)]) as well as allocating less memory.

Copy link
Member

@stevengj stevengj Apr 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even simpler,

invpermute!(v, p::AbstractVector) = (v[p] = v; v)

(setindex! correctly handles the aliasing.)

Copy link
Member

@stevengj stevengj Apr 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But again, if someone wants this, I'm not sure I see the point of having an invpermute! function at all. It's shorter to just write v[p] = v.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And again, I think that we should take out invpermute! in 2.0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While these changes will make the choice to have invpermute! in base look silly, given that we do have invpermute! in 1.x, I think this version is better than what we have right now because it is faster in most cases and consequently better facilitates migration away from Base.invpermute!!

julia> @benchmark invpermute!(v, p) setup=(v=rand(3000); p=shuffle(1:3000)) evals=1 # This PR
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):   4.647 μs   13.241 ms  ┊ GC (min  max):  0.00%  99.75%
 Time  (median):      6.744 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   18.594 μs ± 265.536 μs  ┊ GC (mean ± σ):  26.63% ±  2.00%

  ▇█▅▃▃▂▁▁▁▃▃▂▁                                                ▂
  ███████████████▇▇▆▆▆▆▅▅▆▅▅▆▄▄▅▅▄▅▃▄▅▅▄▄▃▃▄▅▄▁▃▁▁▃▃▅▄▃▄▁▃▁▃▁▃ █
  4.65 μs       Histogram: log(frequency) by time      90.6 μs <

 Memory estimate: 23.48 KiB, allocs estimate: 2.

julia> @benchmark invpermute!(v, p) setup=(v=rand(3000); p=shuffle(1:3000)) evals=1 # 1.7.2
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  15.289 μs   16.599 ms  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     20.938 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   45.645 μs ± 302.911 μs  ┊ GC (mean ± σ):  9.46% ± 1.99%

  █▄▆▆▃▂▁▁▁▁                                                   ▂
  ████████████▇█▇▆▆▆▆▆▅▆▅▆▅▆▄▄▄▆▄▄▄▃▃▁▄▃▄▄▄▄▄▄▄▅▁▃▁▄▄▃▁▃▄▁▁▃▄▄ █
  15.3 μs       Histogram: log(frequency) by time       377 μs <

 Memory estimate: 23.48 KiB, allocs estimate: 2.

julia> @benchmark Base.invpermute!!(v, p) setup=(v=rand(3000); p=shuffle(1:3000)) evals=1 # This PR
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):   9.758 μs    8.325 ms  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     12.275 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   20.373 μs ± 137.556 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇█▆▃▂▁  ▁▁                                                   ▂
  ███████████▇▇▇▆▅▇▆▅▅▆▅▅▆▆▆▅▅▅▆▅▃▄▅▆▅▄▅▅▄▄▄▄▃▄▄▃▄▃▁▁▃▄▃▁▅▄▁▁▄ █
  9.76 μs       Histogram: log(frequency) by time       111 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

@@ -185,7 +184,7 @@ julia> A
1
```
"""
permute!(a, p::AbstractVector) = permute!!(a, copymutable(p))
permute!(v, p::AbstractVector) = copyto!(v, v[p])
Copy link
Member

@stevengj stevengj Apr 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue I have with this is that it seems to violate the basic expectation of permute! by allocating a copy of the array v. If the user wanted v .= v[p], why would they call permute!(v, p) instead of just writing v .= v[p]?

What's the point of having this function at all if it doesn't actually work in-place in v?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that permute!(::AbstractVector, ::AbstractVector) is a natural function to define. It is an inherently inefficient way to manipulate data and so I do not believe it belongs in base, but that's a 2.0 change.

@stevengj
Copy link
Member

stevengj commented Apr 12, 2022

I don't think that permute!(::AbstractVector, ::AbstractVector) is a natural function to define. It is an inherently inefficient way to manipulate data.

It's not inherently inefficient if the element size is large.

For example, for a 1000x1000 matrix A = rand(1000,1000) and a random column permutation p = randperm(1000), I find that Base.permutecols!!(A, copy(p)) is more than 2x faster (and allocates 1000x less memory) than the out-of-place A[:, p], because the "elements" being permuted are length-1000 columns.

(Moreover, it's nontrivial to implement an in-place permutation.)

Because of that, I feel like if someone explicitly calls permute!(v, p) then we should give them what they asked for, an in-place algorithm. If they wanted out-of-place they should have done v[p] (which is faster in the common case of small elements).

@LilithHafner
Copy link
Member Author

That's a valid point, though unless I'm mistaken, the changes here won't affect dispatch to Base.permutecols!!.

Anecdotally, most uses of permute! I've seen would benefit from using this version instead. The charts on #44869 show relevant performance data, though the invpermute! data should be 30% better now due to your changes.

@stevengj
Copy link
Member

stevengj commented Apr 12, 2022

Yes, the changes here won't affect permutecols!!, but permutecols!! uses the same algorithm as permute!! (with a specialized swapcols!), so it was just an easy way to do a performance comparison with a large eltype.

I agree that most usage in practice is probably with such small element types that out-of-place permutation is preferable. But in that case the callers should be patched to use out-of-place copies. If someone explicitly requested an in-place algorithm I think we should give them what they asked for.

(If all of the extant callers in packages and Base are changed to use out-of-place copies, then since permute! is unexported we can think about moving it to Permutations.jl.)

@LilithHafner
Copy link
Member Author

LilithHafner commented Apr 12, 2022

Unfortunately, permute! is exported and has hundreds of uses.

! indicates mutation, not in-place-ness. For example, sort!(x) allocates for many inputs to allow for faster sorting. It is often convenient to have a mutating function rather than an assignment (e.g. modifying the field of an (immutable) struct). From what I've seen, the most common use case for permute! is an explicit application of a permutation via mutation where mutation is for ease of use, not for efficiency. Given that permute! is exported by base, I think it is perfectly reasonable for a user to use it for convenience in this common way and so long as we support permute!, we should optimize for the common use case.

While I'm sure they exist, FWIW I've never stumbled upon a use of permute! that has an eltype larger than Int.

@stevengj
Copy link
Member

Fair enough.

@LilithHafner
Copy link
Member Author

@oscardssmith, in light of @stevengj 's comments and subsequent discussion, are you still in favor of merging?

@oscardssmith oscardssmith added the triage This should be discussed on a triage call label Apr 15, 2022
@oscardssmith
Copy link
Member

Given the complications here, I think it's probably worth getting weigh in from triage first.

@JeffBezanson
Copy link
Member

Looks to me like the situation is that this is a speedup for the common case of small elements? Given that plus the two approved reviews triage is ok with merging this. I agree that ! simply refers to mutation and does not promise an in-place algorithm.

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Apr 28, 2022
@LilithHafner
Copy link
Member Author

Bump, I think the consensus is to merge but I don't have permissions.

@KristofferC KristofferC added the merge me PR is reviewed. Merge when all tests are passing label May 10, 2022
@KristofferC KristofferC merged commit c7e7a5d into JuliaLang:master May 11, 2022
@LilithHafner LilithHafner deleted the permute-0 branch May 12, 2022 13:42
@giordano giordano removed the merge me PR is reviewed. Merge when all tests are passing label May 13, 2022
@ViralBShah ViralBShah added the sorting Put things in order label Jul 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster sorting Put things in order
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants