Simplify indexing #10

mcabbott · 2020-11-16T23:47:17Z

Before:

julia> ca = CircularArray(rand(Int8, 10,10));

julia> @btime [$ca[i,j] for i in 1:21, j in 1:21];
  52.352 μs (886 allocations: 48.91 KiB)

After:

julia> @btime [$ca[i,j] for i in 1:21, j in 1:21]
  4.009 μs (1 allocation: 544 bytes)
21×21 Matrix{Int8}:
 -69  -22   -83  -28    -7   -20   106  -119  …  -28    -7   -20   106  -119    59  -111  -69
  25  -72  -123  118    36   -54   -81   118     118    36   -54   -81   118   -90   -77   25
...

codecov · 2020-11-16T23:48:33Z

Codecov Report

Merging #10 (a4d79af) into master (fe49430) will decrease coverage by 29.62%.
The diff coverage is 46.66%.

@@             Coverage Diff              @@
##            master      #10       +/-   ##
============================================
- Coverage   100.00%   70.37%   -29.63%     
============================================
  Files            1        1               
  Lines           18       27        +9     
============================================
+ Hits            18       19        +1     
- Misses           0        8        +8

Flag	Coverage Δ
unittests	`70.37% <46.66%> (-29.63%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
src/CircularArrays.jl	`70.37% <46.66%> (-29.63%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe49430...a4d79af. Read the comment docs.

mcabbott · 2020-11-16T23:52:36Z

One behaviour changed by this PR is:

julia> a1 = CircularArray(b_arr)
3×4 CircularArray{Int64, 2, Matrix{Int64}}:
  2   4   6   8
 10  12  14  16
 18  20  22  24

julia> a1[3]
18

julia> a1[4] # was == 2
4

So linear indexing will cover the whole matrix, not just the first column. It still has IndexStyle(a1) == IndexCartesian() but could inherit that from the parent, too.

Vexatos · 2020-11-17T19:25:07Z

Thanks for the pull request! Here are some performance results from me.

Before:

julia> ca = CircularArray(rand(Int8, 10, 10));

julia> @btime [$ca[i,j] for i in 1:21, j in 1:21];
  43.500 μs (883 allocations: 48.77 KiB)

julia> @btime [$ca[i] for i in 1:(21*21)];
  2.125 μs (1 allocation: 544 bytes)

julia> ca = CircularArray(rand(Int8, 100));

julia> @btime [$ca[i] for i in 1:(21*21)];
  2.122 μs (1 allocation: 544 bytes)

After:

julia> ca = CircularArray(rand(Int8, 10, 10));

julia> @btime [$ca[i,j] for i in 1:21, j in 1:21];
  4.855 μs (1 allocation: 544 bytes)

julia> @btime [$ca[i] for i in 1:(21*21)];
  5.965 μs (1 allocation: 544 bytes)

julia> ca = CircularArray(rand(Int8, 100));

julia> @btime [$ca[i] for i in 1:(21*21)];
  2.124 μs (1 allocation: 544 bytes)

Linear indexing of linear arrays didn't change, 2D-indexing of matrices got a lot faster just as you reported and no longer does a lot of allocations, but linear indexing of matrices has gotten slower. All these results can be reproduced for larger arrays (200×200 elements accessed) and more dimensions. Linear indexing of a higher-dimensional array is consistently about 3 times slower with the PR than without for 2D arrays, 4 times slower with 3D arrays and 6 times slower with 4D arrays.

Before:

julia> ca = CircularArray(rand(Int8, 4, 4, 4));

julia> @btime [$ca[i] for i in 1:(21*21)];
  2.010 μs (1 allocation: 544 bytes)

julia> ca = CircularArray(rand(Int8, 3, 3, 3, 3));

julia> @btime [$ca[i] for i in 1:(21*21)];
  2.123 μs (1 allocation: 544 bytes)

After:

julia> ca = CircularArray(rand(Int8, 4, 4, 4));

julia> @btime [$ca[i] for i in 1:(21*21)];
  10.064 μs (1 allocation: 544 bytes)

julia> ca = CircularArray(rand(Int8, 3, 3, 3, 3));

julia> @btime [$ca[i] for i in 1:(21*21)];
  13.392 μs (1 allocation: 544 bytes)

Linear indexing of the array has so far been undocumented behaviour, so changing the behaviour doesn't seem to be much of a problem to me, I agree with the new behaviour a lot more. However, the slowdown getting worse the more dimensions the array has is worrying. That should probably be addressed.

Adding Base.parent is a good idea, thanks for that.

Vexatos · 2020-11-17T19:37:17Z

If I had to guess, this slowdown may be caused by the linear indexing special case being removed, making linear and multi-dimensional indexing use the same method. In the old version, the special case for linear indexing did not use a splatting operator ..., which from my experience is very slow and might be the source of this slowdown during linear indexing.

Furthermore, I discovered that the arrays no longer error on indexing with bad dimensions:

Before:

julia> ca = CircularArray(rand(Int8, 3, 3, 3, 3));

julia> ca[1, 2]
ERROR: BoundsError: attempt to access 3×3×3×3 Array{Int8,4} at index [1, 2]
Stacktrace:
 [1] getindex(::Array{Int8,4}, ::Int64, ::Int64) at ./array.jl:810
 [2] getindex(::CircularArray{Int8,4,Array{Int8,4}}, ::Int64, ::Int64) at /[]/src/CircularArrays.jl:38
 [3] top-level scope at REPL[6]:1

After:

julia> ca = CircularArray(rand(Int8, 3, 3, 3, 3));

julia> ca[1, 2]
75						# equal to ca[1, 2, 1, 1]

mcabbott · 2020-11-17T19:59:06Z

I'm not sure we should call the change in time for linear indexing of a matrix a slowdown, as it's doing a different operation. Instead of accessing the first column repeatedly, it reads out all the data, and either your calculation wants one or it wants the other. But I am not surprised that reading all the data takes longer, I'd point to thinks fitting in registers or cache or something like that.

And, to go from Cartesian to Linear indices is just multiplication, but the reverse involves div which is slower. And there's more diving in higher dims. For example, compare an Array to a trivial PermutedDimsArray just to fool linear indexing:

julia> quad = randn(10,10,10,10);

julia> IndexStyle(quad)
IndexLinear()

julia> @btime [$quad[i] for i in 1:10^4];
  7.216 μs (2 allocations: 78.20 KiB)

julia> pquad = PermutedDimsArray(quad, (1,2,3,4));

julia> IndexStyle(pquad)
IndexCartesian()

julia> @btime [$pquad[i] for i in 1:10^4];
  22.557 μs (2 allocations: 78.20 KiB)

The old behaviour didn't pay this cost, because it fixed all indices but the leftmost one to 1.

What is still weird about linear indexing is that the algorithm from Base it's using to convert to Cartesian seems not to behave well with negative numbers.

julia> a1
3×4 CircularArray{Int64, 2, Matrix{Int64}}:
  2   4   6   8
 10  12  14  16
 18  20  22  24

julia> a1[-2:9]' # after this PR
1×12 LinearAlgebra.Adjoint{Int64, CircularVector{Int64, Vector{Int64}}}:
 8  10  18  2  10  18  4  12  20  6  14  22

julia> a1[-2], a1[9]
(8, 22)

julia> CircularArray(1:3)[-2:6]'  # with a vector, no conversion to cartesian, no problem:
1×9 LinearAlgebra.Adjoint{Int64, CircularVector{Int64, Vector{Int64}}}:
 1  2  3  1  2  3  1  2  3

Before was even weirder, I think:

julia> a1[-2:9]' 
1×12 LinearAlgebra.Adjoint{Int64, CircularVector{Int64, Vector{Int64}}}:
 8  10  18  2  10  18  4  12  20  6  14  22

julia> a1[-2], a1[9]  # doesn't match
(2, 18)

mcabbott · 2020-11-17T20:09:06Z

Furthermore, I discovered that the arrays no longer error on indexing with bad dimensions:

Oh I didn't see that variant. I saw this one, and added a test:

julia> a1[2,3,99]
14

julia> size(a1, 3)
1

julia> mod(99, 1:1)
1

julia> a1[2,3,1]
14

I'm not sure how weird this is, given that ordinary arrays do handle things like this:

julia> cquad[3,4] == quad[3,4,1,1] == quad[3,4,1,1,1,1,1,1]
true

But if you like I'll restore the error.

mcabbott · 2020-11-17T20:36:25Z

But when the underlying array supports fast linear indexing, then there's no need to go via Cartesian. So the latest commit speeds this up:

julia> @btime [$ca[i,j] for i in 1:21, j in 1:21];
  4.015 μs (1 allocation: 544 bytes)

julia> @btime [$ca[i] for i in 1:(21*21)];
  1.503 μs (1 allocation: 544 bytes)

and avoids some weirdness:

julia> a1
3×4 CircularArray{Int64, 2, Matrix{Int64}}:
  2   4   6   8
 10  12  14  16
 18  20  22  24

julia> a1[-2:9]'
1×12 LinearAlgebra.Adjoint{Int64, CircularVector{Int64, Vector{Int64}}}:
 8  16  24  2  10  18  4  12  20  6  14  22

julia> a1[-2], a1[9]
(8, 22)

and adds a bounds check to restore the error on too few indices.

But breaks one offset array test, not sure why yet.

Vexatos · 2020-11-17T23:12:29Z

I'm not sure we should call the change in time for linear indexing of a matrix a slowdown, as it's doing a different operation.

Of course, I didn't think about that. However, re-adding the specialized method did fix it.

I'm not sure how weird this is, given that ordinary arrays do handle things like this

It pretty much just delegated the getindex call to the inner array, if you look at the error trace it mentions the Base.getindex in array.jl being the one that errors, as normal arrays don't support non-matching indices. This behaviour should probably be kept for consistency.

julia> a = rand(2, 2, 2);

julia> a[1, 2]
ERROR: BoundsError: attempt to access 2×2×2 Array{Float64,3} at index [1, 2]
Stacktrace:
 [1] getindex(::Array{Float64,3}, ::Int64, ::Int64) at ./array.jl:810
 [2] top-level scope at REPL[18]:1

So the latest commit speeds this up

Thank you for adding the special cases back! The slowdown is indeed resolved. Although, you should not use the Integer abstract type directly as julia will fail to specialize the methods like this, causing unnecessary slowdown. The best options would be to either use i::Int or i::I where I<:Integer. Furthermore, isn't eachindex(LinearIndices(arr.data)) always the same as eachindex(arr.data), as they have the same shape?

src/CircularArrays.jl

mcabbott · 2020-11-17T23:41:35Z

Although, you should not use the Integer abstract type directly as julia will fail to specialize the methods like this, causing unnecessary slowdown.

I don't think this is true, it should specialise here. The two weird cases are ::Function and ::Type. I think that most of Base's indexing uses ::Integer, so I try to follow unless there's a reason not to.

isn't eachindex(LinearIndices(arr.data)) always the same as eachindex(arr.data), as they have the same shape?

I was concerned about things like eachindex(rand(2,2)') which gives CartesianIndices. But this may be too complicated, eachindex(LinearIndices(OffsetArray(rand(3,3), 5,10))) is still one-based, so perhaps it's enough to use length(arr.data) here?

Vexatos · 2020-11-17T23:57:59Z

I think that most of Base's indexing uses ::Integer, so I try to follow unless there's a reason not to.

In the past, I have had bad experiences with using abstract types in method signatures unless absolutely necessary, making julia unable to specialize the method for subtypes. getindex in Base seems to be often using Int instead, but not sure that's a better idea.

mcabbott · 2020-11-18T00:09:10Z

Base is confusing... there are indeed more ::Int methods than I remember, sorry. Things like rand(3,3)[1,Int32(2)] do work, but via a slightly longer path. Maybe the same path would apply here if these methods were narrower. Nevertheless I am pretty sure that this should specialise; @code_typed a1[1,2] certainly says it does (although is not quite infallible).

Vexatos · 2020-11-18T00:19:57Z

Thank you for the adjustments. I guess what's left is to figure out why it breaks with CartesianIndex. I already confirmed it has nothing to do with OffsetArrays.

julia> a = CircularArray(reshape(1:9,3,3))
3×3 CircularArray(reshape(::UnitRange{Int64}, 3, 3)):
 1  4  7
 2  5  8
 3  6  9

julia> c = CartesianIndex.(3:5, 3:5)
3-element Array{CartesianIndex{2},1}:
 CartesianIndex(3, 3)
 CartesianIndex(4, 4)
 CartesianIndex(5, 5)

julia> a[3,3]
9

julia> a[4,4]
1

julia> a[5,5]
5

julia> a[c]
3-element CircularVector(::Array{Int64,1}):
 9
 4
 8

mcabbott · 2020-11-18T00:36:36Z

That's progress. Via @less, here's where it's going wrong:

julia> a[CartesianIndex(4,4)] # wrong
4

julia> Base.to_indices(a, (CartesianIndex(4,4),))
(4, 4)

julia> Base._getindex(IndexLinear(), a, (4,4)...)
4

julia> Base._getindex(IndexCartesian(), a, (4,4)...) # right
1

julia> Base._to_linear_index(a, 4, 4)
13

julia> a[13]
4

I suspect that to_indices is the right place to fix this.

Vexatos · 2020-11-18T09:59:30Z

I don't think having to implement Base._getindex can possibly be the right solution to this. It's an internal and undocumented function and probably shouldn't be implemented.

mcabbott · 2020-11-18T10:04:20Z

Yes I don't like it at all.

What I expected to work was overloading to_indices(a, axes(a), (CartesianIndex(3,3), rest...)) which is I think an officially sanctioned way of messing with the internals... it peels off indices one by one so you can dispatch on the first element of the tuple, and that always corresponds to the first (first two, here) elements of the axes tuple.

But in fact this case goes to to_indices(a, (), (Cartesian...,)) with an empty axes tuple, as an optimisation because it was thought CartesianIndices would never need further information. I'm not sure what the right way around that is. Maybe it can be prevented from going down that path.

mcabbott · 2020-11-18T10:49:55Z

I thought I saw a way, but now I invented some more evil test cases and it fails.

Vexatos · 2020-11-22T12:44:09Z

src/CircularArrays.jl

@@ -50,6 +69,14 @@ CircularArray(def::T, size) where T = CircularArray(fill(def, size))
 CircularVector(data::AbstractArray{T, 1}) where T = CircularVector{T}(data)
 CircularVector(def::T, size::Int) where T = CircularVector{T}(fill(def, size))

+Base.IndexStyle(::Type{CircularArray{T,N,A}}) where {T,N,A} = IndexStyle(A)


The question came to me why this line would be needed. Wouldn't the issues with CartesianIndex be resolved if the IndexStyle defaulted to IndexCartesian for anyhting but CircularVector? That would result in the calls with CartesianIndex parameters to run through the second implementation of getindex instead of the first. Then all the hacks into to_indices could also be removed. I would be very interested to see your "more evil test cases" and see if they work with this, because all the current tests pass like this.

I thought this was necessary for speed, although it's entirely possible I have got lost in all the versions now. It causes eachindex(A) to prefer linear indexing, which can drop straight through to the underlying memory, whereas iterating CartesianIndices involves some multiplications by strides... necessary when the underlying array is transposed, say.

I've pushed what I had lying around.

Vexatos · 2020-11-22T15:00:46Z

With my changes, all tests appear to pass, even your new ones, and I cannot find any noticeable slowdown in any of the benchmarks listed in this issue thread, at least with standard arrays. Perhaps you can try the proposed changes, if something turns out to have broken I will revert them.

Vexatos · 2021-01-03T12:41:14Z

Since I heard no more feedback on this, I'll merge it. It seems to be working fine.

mcabbott · 2021-01-03T17:29:37Z

Sorry about vanishing. I forgot all the details now but if it passes those tests it must be working pretty well!

simplify indexing

ad0e7e8

linear indexing test

5e1846a

use fast linear indexing when possible

8a049e7

cosmetic changes

1329a79

Vexatos reviewed Nov 17, 2020

View reviewed changes

src/CircularArrays.jl Outdated Show resolved Hide resolved

simplify linear indexing, and checkbounds

e31f590

Integer -> Int

a549b56

one solution

c3f15db

work via to_indices instead

a3f3273

Vexatos reviewed Nov 22, 2020

View reviewed changes

Michael Abbott and others added 3 commits November 22, 2020 14:09

some evil test cases

bcbdc81

some inconclusive messing with to_indices

f71bf96

Fixed cartesian indexing

a4d79af

Vexatos merged commit 12b733b into Vexatos:master Jan 3, 2021

mcabbott deleted the getindex branch January 3, 2021 17:29

Vexatos mentioned this pull request Feb 7, 2021

Support deleteat! #12

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify indexing #10

Simplify indexing #10

mcabbott commented Nov 16, 2020

codecov bot commented Nov 16, 2020 •

edited

Loading

mcabbott commented Nov 16, 2020

Vexatos commented Nov 17, 2020 •

edited

Loading

Vexatos commented Nov 17, 2020 •

edited

Loading

mcabbott commented Nov 17, 2020

mcabbott commented Nov 17, 2020 •

edited

Loading

mcabbott commented Nov 17, 2020

Vexatos commented Nov 17, 2020 •

edited

Loading

mcabbott commented Nov 17, 2020

Vexatos commented Nov 17, 2020 •

edited

Loading

mcabbott commented Nov 18, 2020

Vexatos commented Nov 18, 2020

mcabbott commented Nov 18, 2020

Vexatos commented Nov 18, 2020

mcabbott commented Nov 18, 2020 •

edited

Loading

mcabbott commented Nov 18, 2020

Vexatos Nov 22, 2020 •

edited

Loading

mcabbott Nov 22, 2020

Vexatos commented Nov 22, 2020 •

edited

Loading

Vexatos commented Jan 3, 2021

mcabbott commented Jan 3, 2021

Simplify indexing #10

Simplify indexing #10

Conversation

mcabbott commented Nov 16, 2020

codecov bot commented Nov 16, 2020 • edited Loading

Codecov Report

mcabbott commented Nov 16, 2020

Vexatos commented Nov 17, 2020 • edited Loading

Vexatos commented Nov 17, 2020 • edited Loading

mcabbott commented Nov 17, 2020

mcabbott commented Nov 17, 2020 • edited Loading

mcabbott commented Nov 17, 2020

Vexatos commented Nov 17, 2020 • edited Loading

mcabbott commented Nov 17, 2020

Vexatos commented Nov 17, 2020 • edited Loading

mcabbott commented Nov 18, 2020

Vexatos commented Nov 18, 2020

mcabbott commented Nov 18, 2020

Vexatos commented Nov 18, 2020

mcabbott commented Nov 18, 2020 • edited Loading

mcabbott commented Nov 18, 2020

Vexatos Nov 22, 2020 • edited Loading

Choose a reason for hiding this comment

mcabbott Nov 22, 2020

Choose a reason for hiding this comment

Vexatos commented Nov 22, 2020 • edited Loading

Vexatos commented Jan 3, 2021

mcabbott commented Jan 3, 2021

codecov bot commented Nov 16, 2020 •

edited

Loading

Vexatos commented Nov 17, 2020 •

edited

Loading

Vexatos commented Nov 17, 2020 •

edited

Loading

mcabbott commented Nov 17, 2020 •

edited

Loading

Vexatos commented Nov 17, 2020 •

edited

Loading

Vexatos commented Nov 17, 2020 •

edited

Loading

mcabbott commented Nov 18, 2020 •

edited

Loading

Vexatos Nov 22, 2020 •

edited

Loading

Vexatos commented Nov 22, 2020 •

edited

Loading