Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop preventing indexing with Numbers #12486

Closed
wants to merge 1 commit into from
Closed

Conversation

timholy
Copy link
Member

@timholy timholy commented Aug 6, 2015

This change permits AbstractArray types to perform vector indexing with any type of Number. For Interpolations, this enables B = A[linspace(1,10,1001), linspace(1,15,201)] to work (well, it will after JuliaMath/Interpolations.jl#54 gets merged...).

CC @mbauman, @jakebolewski (ref #10458)

@tomasaschan
Copy link
Member

I'm curious (and on Windows at the moment, lacking the ability to try it out myself); what happens after this change if I do

using DualNumbers
A = rand(10,25)
d = Dual(5,0)
A[d, d+8] # note: both dual numbers actually represent integers
          # and convert(Int, d) works just fine

On current master, I get a nice error message explaining what I did wrong:

ERROR: indexing Array{Float64,2} with types Tuple{DualNumbers.Dual{Int64},DualNumbers.Dual{Int64}} is not supported
 in error at error.jl:21

It would be nice to make sure that this change doesn't make arrays segfault or something because the indexing is weird...

@timholy
Copy link
Member Author

timholy commented Aug 6, 2015

You still get the same error message.

@mbauman
Copy link
Member

mbauman commented Aug 6, 2015

I'll have to look into your interpolations pull request a little more closely, but I'm afraid that this isn't going to be the way we should achieve this.

I think we need to split to_scalar_index and to_multidimensional_index.

For scalar indexing, we need to do three things to the indices:

  • Convert to Int (since that is the canonical method we require to be implemented)
  • Ensure we're using correct dimensionality, or compute the dimensionality with checkbounds and ind2sub/sub2ind.
  • Call the required canonical method (which has a nice fallback error message).

If we allow all Numbers, this will fail with a method missing for ind2sub/sub2ind, which are only implemented for Integer. We could remove that requirement, which would allow, e.g., Dual integers, but it'd give a very wrong result for fractional numbers.

Multidimensional indexing is a little simpler:

  • checkbounds
  • Use find on Array{Bool}
  • Punt to (unsafe) scalar indexing

Actually, I think all we need to do here is remove to_indexes from the unsafe_ fallbacks that call the safe versions.

@timholy
Copy link
Member Author

timholy commented Aug 6, 2015

I'm not sure we have to work so hard. The basic idea is:

  • If the type supports scalar indexing with "weird" indexes, it works. If not, it doesn't.
  • Except for logical indexing, don't do anything that changes what the user passes when using vectors.

In other words, I shouldn't have any expectation that core julia is going to massage my indexes into something my type can support---core julia's job is simply to deliver the appropriate scalars to my task.

@timholy
Copy link
Member Author

timholy commented Aug 6, 2015

In other words, I'm proposing that to_index calls probably need to be part of the type's indexing operations, not part of the core infrastructure.

@mbauman
Copy link
Member

mbauman commented Aug 6, 2015

Agreed. There are two tricky things here:

  • Indexing with Real still needs to convert to Int for 0.4 to avoid a surprising breaking change.
  • We need to avoid a stack overflow within the callbacks.

@mbauman
Copy link
Member

mbauman commented Aug 6, 2015

I've thought before about how it'll be very nice in 0.5 to just define the scalar indexing fallbacks for Integers. Perhaps we should do that now, and, as you're suggesting, do the conversion from Real to Int within the core indexing operations for the Base types.

@nalimilan
Copy link
Member

Perhaps we should do that now, and, as you're suggesting, do the conversion from Real to Int within the core indexing operations for the Base types.

Does this mean that any custom array type will have to redefine the methods to allow indexing with Real? This seems to go against the massive improvement you've done recently to simplify the API that needs to be implemented by custom types, and it will lead to inconsistencies when types forget to implement it.

@timholy
Copy link
Member Author

timholy commented Aug 6, 2015

Does this mean that any custom array type will have to redefine the methods to allow indexing with Real?

I doubt it. If your type is a wrapper around an Array, then it's just pass-through. But counterexamples would be interesting.

As far as I can tell the focused issue is this: ind2sub should only work for Integers. So I guess I'm proposing that for a 3d array, A[2.0, 3] should be fine but A[2.0, 3.0] should not (because you need to call ind2sub in both cases). If we want to support linear indexing with non-ints, I worry it gets in the way of other things. Here's another interesting example: suppose someone makes an AbstractArray where rows are indexed by Symbols and columns by Ints, and then wants to do

result = A[[:a, :b, :c], [1, 2, 3]]  # should return a 3x3 result

How should to_index work in this case?

I'm fine with waiting on this until 0.5, however. This might be too tricky to tackle at this point.

@tomasaschan
Copy link
Member

You still get the same error message.

Thanks for checking for me :)

@ScottPJones
Copy link
Contributor

Why would A[2.0, 3.0] have to fail? I would think that as long as the value of the subscript can be converted to an Int without getting an InexactError, then it should work.

@mbauman
Copy link
Member

mbauman commented Aug 6, 2015

@ScottPJones that's #10458. TL/DR: It's up to the custom types to implement non-Integer indexing. We won't do that for you in base.

@ScottPJones
Copy link
Contributor

OK, that all came up just before I found out about Julia. However, I don't understand why the direction is to make Julia more restrictive, and less user friendly? IMO, Julia is nice because it can generally infer types, you don't have to worry so much about them, but here, it seems you have to know that only one type is permitted (i.e. Int)

@timholy
Copy link
Member Author

timholy commented Aug 6, 2015

@ScottPJones, as background for some AbstractArray A of size (5,5,5), the following

A[78]
A[2,17]
A[1,2,3]

get expanded (automagically!) to

A[3,1,4]
A[2,2,4]
A[1,2,3]

I guess if we want to support ind2sub for non-integers, it doesn't have to fail. But I also think that might be kind of weird.

@ScottPJones
Copy link
Contributor

What's weird about it? Maybe it's my background, where everything was a string, and you only interpreted strings as numbers, so "78", 78.0, 7.8e1, 78, were all considered equal, and when used as a subscript, all found the same node.
That's a lot easier for users, same as type inference makes life easier.

@johnmyleswhite
Copy link
Member

Julia.js

@ScottPJones
Copy link
Contributor

Not sure what you mean by that "Julia.js", I'm not saying that Julia should be like JS, but it would seem that if the compiler can see that the type isn't Int already, it could generate a convert.
If I have something that is a UInt8, for example (maybe read from some file), I'd have to specifically call convert on it before using as an index, which seems like work the compiler should be able to do.

@StefanKarpinski
Copy link
Member

I'm in favor of allowing indexing with general numeric types (at least reals), but autoconverting strings to integers is less ok, largely because it's a pretty expensive operation that seems like it should be asked for explicitly.

@ScottPJones
Copy link
Contributor

Ah, I wasn't intending to say that strings should be autoconverted in Julia, that doesn't make sense, as in Julia you need to use parse instead of convert to go from "48" to 48.
Just things that can be converted, without any information loss (no InexactError) is what I'd imagined.

@tomasaschan
Copy link
Member

Two of the things I love most about "the Julian way" of doing things, is that there is a strong culture of trying to do the thing that makes the most sense for the user, and that there is a similarly strong culture of not assuming anything about the user's intentions that cannot be more-or-less proven to be true. Sometimes - and I think this is such a time - those two are at odds with each-other... but I think we can still do both.

It all boils down to our interpretation of what the ind2sub transformation actually means. If it's unwrapping of linear indices into multilinear ones, we could just as well support any real number and use the same transformation we do for ints, but generalized to reals.

Let's consider a 2D example for simplicity:

julia> A = reshape(1:5*4,4,5)
4x5 Array{Int64,2}:
 1  5   9  13  17
 2  6  10  14  18
 3  7  11  15  19
 4  8  12  16  20
julia> ind2sub(size(A), 14)
(2,4)

Generalizing for reals could mean (and I'm going out on a limb here...) that for a (custom, user-defined) A::AbstractArray, size(A) could return a tuple of reals, which are used as the wrapping-bounds for ind2sub. In other words, since

julia> ind2sub((3.5, 5.5), 13.3)
(2.8000000000000007,4.0)

(calculated using div and rem, not using the actual implementation...)

Note that this is not merely "allowing some weird, maybe-correct inputs to ind2sub" just for user convenience; this is actually expanding the definition to remain exact while allowing a lot of different types.

Since this effectively de-couples ind2sub from indexing, we could just add methods for reals here, and getindex(A::Array, xs::Real...) independently, and have the getindex method do convert(Int, x) on each index before actually indexing into the array. For users that don't care about types, this will Just Work(TM) as long as the real numbers convert cleanly to Int, but it won't assume anything about what to do if the input is not an int. For users that care about performance and/or code style, they can do exactly what they're doing today, and get the methods of ind2sub and getindex that only work with Ints.

EDIT: I realize the most controversial thing about this suggestion is allowing size(A::AbstractArray{T,N}) to return something that's not a NTuple{N,Int}. If we don't do that, I think it's more-or-less equivalent to just allowing ind2sub for non-integers, but it becomes much less powerful.

@mbauman
Copy link
Member

mbauman commented Aug 7, 2015

It all comes down to what being an AbstractArray means (once again). On the face of it, it really seems like a reasonable limitation that AbstractArrays have integer length dimensions that start from index 1.

But for a scaled interpolation object (which works in terms of a given axis, instead of the indices - JuliaMath/Interpolations.jl#47), that's no longer true. It doesn't seem like that big of a departure from indexed interpolation, and it sure seems like this should work similarly. But we're now in the case where there's not a good answer for Base.size(), and the first in-bounds "index" most likely isn't 1. That said, I'm pretty sure the only time these assumptions are used during indexing is in checkbounds (which reminds me, I need to rebase and push #11895 through).

@GlenHertz
Copy link
Contributor

In my custom array I overloaded interpolation into indexing but it got too confusing. When something is an Int it gets the value at the index, when a Real it interpolates on another basis -- 2 and 2.0 will give different results. I would like to use interpolate() to make it clear it is not an index as input but the interpolation dimension. In the general case I think it is too confusing to use indexing.

@timholy
Copy link
Member Author

timholy commented Aug 17, 2015

Closing in favor of @mbauman's more systematic approach (#12540, #12567).

@timholy timholy closed this Aug 17, 2015
@DilumAluthge DilumAluthge deleted the teh/index_with_numbers branch March 25, 2021 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants