
Cleaning up indexing and dimension mapping code. #181

Merged: 7 commits into master from dim-api on Aug 7, 2021

Conversation

@Tokazama (Member) commented Aug 3, 2021

tl;dr: I was cleaning up indexing/dimension stuff; the commit messages further down summarize the changes.


Now for the IALR (is a long read). Mapping each indexing argument to the dimensions/axes of an indexed array involves finding out:

  1. How many dimensions each index spans (Int -> 1, CartesianIndex{3} -> 3)
  2. How many dimensions in the subset of the array are represented by each index (CartesianIndices{3} -> 3, CartesianIndex{3} -> 0).

There are also indices that trail past the last dimension of the indexed array; these are either dropped (Int) or result in a final dimension that's one element long (OneTo(1), :).
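
As a rough illustration of those two quantities (a hypothetical sketch; index_span and subset_span are illustrative names, not this PR's API):

using Static: static

# 1. how many dimensions of the parent array a single index argument spans
index_span(::Type{<:Integer}) = static(1)
index_span(::Type{Colon}) = static(1)
index_span(::Type{<:CartesianIndex{N}}) where {N} = static(N)
index_span(::Type{<:CartesianIndices{N}}) where {N} = static(N)

# 2. how many dimensions the index contributes to the result: CartesianIndex{3}
# spans three parent dimensions but contributes none to the subset
subset_span(::Type{<:Integer}) = static(0)
subset_span(::Type{<:CartesianIndex}) = static(0)
subset_span(::Type{<:CartesianIndices{N}}) where {N} = static(N)

index_span(CartesianIndex{3}), subset_span(CartesianIndex{3})  # (static(3), static(0))
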
Most of this is handled in to_indices right now, which I've found easily results in ambiguities (an example in the wild).
Also, to_indices isn't as optimized as one might hope.
I've found some success in the past with small improvements to the performance of ArrayInterface.to_indices (e.g., mostly performing a few basic operations prior to the recursive to_indices call).

So the problems are:

  1. The current approach for mapping indices to array dimensions has some odd corner cases.
  2. Implementing unique behaviors for this isn't user friendly due to the propensity for ambiguities. It's also worth noting that the to_indices(::MyArray, axes, inds::Tuple{IndexType,Vararg{Any}}) pattern is somewhat against the design patterns we're aiming for in ArrayInterface.jl, because anything that wraps MyArray will lose this relationship with IndexType.
  3. to_indices has some performance drawbacks (but it sounds like that will improve someday with more compiler work).

If you look at the original comments about the problems with `end`, there's some dislike of going into the type domain.
However, I think it's pretty reasonable (and useful) to require a way to map each index type to the dimensions of an array type.
In the current draft this is done using `ndims_index`.
This is a pretty straightforward setup with the exception of accommodating things like EllipsisNotation.jl where the number of dimensions that EllipsisNotation.Ellipsis maps to is dependent on which dimensions all the other indices map to.
The best solution I've come up with that permits contextual dimension mapping is using LazyAxis to keep track of the number of dimensions.

# Map a tuple of index types onto dimensions, using LazyAxis{N,A} as a counter:
# N is the dimension the next index lands on, A is the parent array type.
@inline function ndims_index(::Type{LazyAxis{N,A}}, ::Type{I}) where {N,A,I1,I<:Tuple{I1,Vararg{Any}}}
    NI = ndims_index(LazyAxis{N,A}, I1)  # dimensions spanned by the first index
    # advance the counter by NI and recurse on the remaining index types
    (NI, ndims_index(LazyAxis{NI + N, A}, Base.tuple_type_tail(I))...)
end
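
To make the recursion concrete, here's a hypothetical set of base cases and a call (these single-index methods are placeholders for illustration, not part of this PR):

using ArrayInterface: LazyAxis

# hypothetical base cases: a scalar index spans one dimension, a
# CartesianIndex{N} spans N, and an empty index tuple ends the recursion
ndims_index(::Type{<:LazyAxis}, ::Type{Int}) = 1
ndims_index(::Type{<:LazyAxis}, ::Type{CartesianIndex{N}}) where {N} = N
ndims_index(::Type{<:LazyAxis}, ::Type{Tuple{}}) = ()

ndims_index(LazyAxis{1,Array{Float64,3}}, Tuple{Int,CartesianIndex{2}})  # (1, 2)
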

The only other idea I have right now is using axes_types so that users don't have to worry about using LazyAxis as a counter.
I generally try to avoid forcing anyone to directly interact with Tuple types to work with an interface.
On the other hand, if someone isn't comfortable digging that deep into the type system, they probably shouldn't be changing something as fundamental as index-to-dimension mapping.
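
For comparison, a rough sketch of the axes_types route (dims_per_index, span, and _drop are hypothetical helpers, not this PR's API):

using ArrayInterface: axes_types
using Base: tuple_type_tail

# dimensions spanned by a single index type
span(::Type{<:Integer}) = 1
span(::Type{<:CartesianIndex{N}}) where {N} = N

# walk the Tuple type from axes_types alongside the index types, so the
# axes tuple itself acts as the dimension counter
dims_per_index(::Type{Tuple{}}, ::Type{Tuple{}}) = ()
function dims_per_index(::Type{Axs}, ::Type{I}) where {Axs<:Tuple,I1,I<:Tuple{I1,Vararg{Any}}}
    n = span(I1)
    (n, dims_per_index(_drop(Axs, n), tuple_type_tail(I))...)
end
# drop the n axis types consumed by the current index
_drop(::Type{T}, n) where {T<:Tuple} = n === 0 ? T : _drop(tuple_type_tail(T), n - 1)

dims_per_index(axes_types(rand(2, 3, 4)), Tuple{Int,CartesianIndex{2}})  # (1, 2)
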

I'm going to sit on this for a day or two in hopes that I have an epiphany that solves this all perfectly (unless someone has a suggestion that does that for me).
Nothing here should be breaking BTW.
All the stuff I removed wasn't being used anywhere else yet.

* Realized that a lot of things I had documented internally are now in
  the online docs, likely making them appear more stable and public than
  they are right now.
* `argdims` was ambiguously documented and not used elsewhere, so I split
  it into two separate/more useful methods: `ndims_index` and `ndims_subset`
* Deleted/consolidated a bunch of indexing code
@codecov bot commented Aug 3, 2021

Codecov Report

Merging #181 (12fc825) into master (2c1fcbb) will increase coverage by 0.55%.
The diff coverage is 88.33%.


@@            Coverage Diff             @@
##           master     #181      +/-   ##
==========================================
+ Coverage   83.57%   84.13%   +0.55%     
==========================================
  Files          11       11              
  Lines        1644     1607      -37     
==========================================
- Hits         1374     1352      -22     
+ Misses        270      255      -15     
Impacted Files          Coverage Δ
src/ArrayInterface.jl   80.80% <ø> (ø)
src/indexing.jl         76.56% <87.75%> (-1.02%) ⬇️
src/dimensions.jl       96.21% <90.90%> (+0.24%) ⬆️
src/axes.jl             67.85% <0.00%> (+4.92%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

* removed `ndims_subset` for now.
* added `canonicalize` tests
@Tokazama (Member, Author) commented Aug 4, 2021

We could potentially do something like this:

# assumes static/known from Static.jl and tail/front from Base
# two-index case: when the first index has no fixed span (e.g. Ellipsis),
# resolve the last index first and let the first absorb what remains
@inline function ndims_index(axs::Tuple{Vararg{Any,NA}}, inds::Tuple{I1,I2}) where {NA,I1,I2}
    ndi = ndims_index(getfield(axs, 1), getfield(inds, 1))
    if ndi === nothing
        ndi = ndims_index(getfield(axs, NA), getfield(inds, 2))
        return (static(NA) - ndi, ndi)
    else
        # drop the axes consumed by the first index and recurse
        _, axstail = Base.IteratorsMD.split(axs, Val(known(ndi)))
        return (ndi, ndims_index(axstail, (getfield(inds, 2),))...)
    end
end
@inline ndims_index(::Tuple{Vararg{Any,NA}}, ::Tuple{}) where {NA} = ()  # recursion terminus
# general case: peel indices off the front; when the first index's span is
# unknown, resolve the trailing index first and work backwards
@inline function ndims_index(axs::Tuple{Vararg{Any,NA}}, inds::Tuple{Vararg{Any,NI}}) where {NA,NI}
    ndi = ndims_index(getfield(axs, 1), getfield(inds, 1))
    if ndi === nothing
        ndi = ndims_index(getfield(axs, NA), getfield(inds, NI))
        N = NA - ndi
        if N < 0
            return (ndims_index((), Base.front(inds))..., ndi)
        else
            # leading indices map onto the leading N axes
            axsfront, _ = Base.IteratorsMD.split(axs, Val(N))
            return (ndims_index(axsfront, Base.front(inds))..., ndi)
        end
    else
        _, axstail = Base.IteratorsMD.split(axs, Val(known(ndi)))
        return (ndi, ndims_index(axstail, tail(inds))...)
    end
end

This would allow Ellipsis to account for CartesianIndices in trailing dimensions (see the usage sketch after the list below).
However, I'm leaning towards one of the following:

  1. Implementing Ellipsis here so that it doesn't have to be part of a public API
  2. Disallowing contextual multidimensional indexing entirely so that it's not a problem
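
For reference, here's how the sketch above might be exercised (the base methods below are placeholders for illustration, not this PR's API):

using Static: static
using EllipsisNotation  # provides .., an instance of EllipsisNotation.Ellipsis

# hypothetical base methods: concrete indices report their span, while an
# ellipsis returns nothing to request contextual resolution
ndims_index(_, ::Integer) = static(1)
ndims_index(_, ::CartesianIndices{N}) where {N} = static(N)
ndims_index(_, ::EllipsisNotation.Ellipsis) = nothing

A = rand(2, 3, 4, 5)
ndims_index(axes(A), (.., CartesianIndices((4, 5))))
# -> (static(2), static(2)): the ellipsis absorbs the two leading dimensions
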

@Tokazama Tokazama marked this pull request as ready for review August 5, 2021 02:07
@Tokazama (Member, Author) commented Aug 5, 2021

For now I think we should just keep index->dim mapping an internal design. I've been quite verbose here, so it should be pretty easy to come back and implement the real deal in the future.

@Tokazama Tokazama requested a review from chriselrod August 6, 2021 17:09
@chriselrod (Collaborator) commented Aug 7, 2021

A lot of this is inherently related to index to dimension mapping. We can't do anything about how end and begin get lowered here, but the related issue of mapping each indexing argument to the appropriate dimension of an array can be accomplished here (reference: A[i, end] assumes that i indexes one dimension JuliaLang/julia#35681, Elipsis and [CartesianIndex()] not working SciML/EllipsisNotation.jl#26).

LoopVectorization also has code for handling this right now, but I hadn't gotten around to actually integrating it with begin and end support (i.e. you can't use begin or end as indices in an @turbo loop at the moment).

EDIT: Although probably not worth trying to rebase it on top of this.

@chriselrod (Collaborator) left a review comment:

Looks good.

@Tokazama Tokazama merged commit 4b8c553 into master Aug 7, 2021
@chriselrod chriselrod deleted the dim-api branch August 23, 2021 11:05