Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Use typejoin of the field types for eltype of heterogeneous Tuples #21108

Merged
merged 3 commits into from
Mar 23, 2017

Conversation

martinholters
Copy link
Member

@martinholters martinholters commented Mar 20, 2017

Came up here: #21077 (comment)

# master
julia> collect((1,1.0))
2-element Array{Any,1}:
 1  
 1.0
# this PR
julia> collect((1,1.0))
2-element Array{Real,1}:
 1  
 1.0

Not sure this is actually helpful anywhere, but it probably won't hurt either...

Edit: Closes #20143.

@StefanKarpinski
Copy link
Member

FWIW, I think this behavior is preferable.

base/tuple.jl Outdated
t´ = unwrap_unionall(t)
r = Union{}
for ti in t´.parameters
r = typejoin(r, unwrapva(ti))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rewrap_unionall should be done at this point, since typejoin won't work on types with free variables (it calls <:).

@martinholters
Copy link
Member Author

Just noticed that this presently errors for Unions of Tuples. Should I add

function eltype(u::Union)
    typejoin(eltype(u.a), eltype(u.b))
end

? That would also make something like eltype(Union{Vector{Int},Vector{Float64}}) return Real instead of Any.

@JeffBezanson
Copy link
Member

Actually, allowing eltype on too many abstract types seems a bit dangerous to me, since there's a difference between a collection with element type Real, and a collection with an unknown element type that's some subtype of Real.

@martinholters
Copy link
Member Author

Yes, but there's also a difference between a collection with element type Any, and a collection with an unknown element type that's some subtype of Any. Arguably, it might be better for the eltype(::Any) fallback not to return Any but either error or return some kind of sentinel value. (Or maybe even eltype to generally return a Nullable.) But that would probably be immensely breaking.

Anyway, as said, I'm not sure of the usefulness of this myself, so no offense taken if the PR gets rejected.

@martinholters
Copy link
Member Author

Addressed the review comment but will leave the Union case for now until we have decided whether to go with this PR or not.

@StefanKarpinski
Copy link
Member

Element type <:Real versus Real?

@JeffBezanson
Copy link
Member

There isn't really any good way to program with such a result in user code. To support that, eltype would need to give answers in a lifted domain, e.g. Type{Real} and Type{T} where T<:Real, which I don't think we want to do.

@martinholters
Copy link
Member Author

Hm, if we don't do this, we should probably make iteratoreltype return EltypeUnknown() for heterogeneous Tuples.

@JeffBezanson
Copy link
Member

I think this change does make sense, at least for concrete tuple types. For example if we made heterogeneous tuples EltypeUnknown, then collect would call typejoin on the types of values it gets, which for tuples gives you no more information than if eltype for tuples did the same.

@TotalVerb
Copy link
Contributor

This fixes #20143.

@TotalVerb
Copy link
Contributor

I think it is necessary to check that the tuple type is actually a Tuple{...}; as-is,

julia> eltype(Union{Tuple{Int},Tuple{String}})
ERROR: type Union has no field parameters
Stacktrace:
 [1] eltype(::Type{Union{Tuple{Int64}, Tuple{String}}}) at ./REPL[11]:5

@martinholters
Copy link
Member Author

Added recursive handling of Union to eltype(::Type{<:Tuple}). However, that leads to

julia> eltype(Union{Tuple{Int}, Tuple{Float64}})
Real

julia> eltype(Union{Vector{Int}, Vector{Float64}})
Any

Would we rather have Any for the Union of Tuples case?

@andyferris
Copy link
Member

andyferris commented Mar 21, 2017

This is a bit related to some things I've been thinking about lately, and it came up again today.

It's been noticed that since StaticArrays uses tuples in many places, we could remove remaining calls to return_type In that package by constructing the answer (of map or broadcast or *, say) as a tuple and let an eltype-less constructor use either promote (convert the tuple elements) or typejoin (a widened eltype) to set the output array element type.

It's even quite simple to allow inhomogenous tuples as static array storage: with a bit of effort we could even make methods for map or broadcast or * fully-inferred (i.e. fast, even when not homogenous).

In this picture, eltype would be computed by something like this PR. It would have to be done by an outer constructor, to satisfy AbstractArray{T} inheretence. This seems pretty natural for small immutable arrays - but is quite different from the way large mutable Arrays behave (though I guess collect has some mechanism that I should try to understand?).

Anyway, suffice it to say that the behaviour discussed in this PR will effect those downstream working with inhomogenous collections and trying to mimic "covariantish" (for lack of a better term) behavior.

Personally, I support using typejoin and this PR but I'd prefer if typejoin didn't widen more agressively than inference does currently for the union of the same collection of types.

@martinholters
Copy link
Member Author

typejoin never returns a Union, while inference allows Unions up to a certain size, but then falls back to Any:

julia> reduce(Core.Inference.tmerge, (Int8, Int16, Int32))
Union{Int16, Int32, Int8}

julia> reduce(typejoin, (Int8, Int16, Int32))
Signed

julia> reduce(Core.Inference.tmerge, (Int8, Int16, Int32, Int64))
Any

julia> reduce(typejoin, (Int8, Int16, Int32, Int64))
Signed

I don't think implementing a mechanism that always produces the more specific of these two as part of eltype would lead to especially elegant code. Also, having eltype(Tuple{Int8, Int16, Int32}) return Union{Int16, Int32, Int8} instead of Signed seems of questionable merit to me.

@JeffBezanson
Copy link
Member

@andyferris See

function collect_to!{T}(dest::AbstractArray{T}, itr, offs, st)

That is the preferred way to write map without return_type. It checks the type of each element, calling typejoin as necessary, and if the types can be inferred then all the type-checking and promoting code gets specialized away.

@andyferris
Copy link
Member

andyferris commented Mar 22, 2017

I see it, thanks Jeff!

I wondered if this might be the approach in Base but I had been a little reluctant to implement it this way. In principle at least, with a deep enough type tree, the operation could become O(N^2) for a length-N array. Making typejoin more "optimal" (i.e. widen less, in the extreme case only using Union) would only make this problem worse.

If we were worried about this kind of slow down, one could iterate over itr once to get a wide enough type, create the array, and fill it by iterating again over itr - but crucially you would want inference to be able to completely elide the first iteration whenever it can figure it out at compile time.

(For the case of static arrays, the type join could be done as some pure or generated function over the data's tuple type, so it's relatively straightforward to elide).

@martinholters Yes, good point. Changing typejoin to return Union might be quite useful in some circumstances, since Unions of concrete types are going to become much faster than other abstract types. I see it as inference's job to make the call what types will result in best performance (also trading off compile time vs run time costs), which seems (to me) to be the same problem here in determining the eltype that will give the user the best performance. I guess what I was thinking of was "improving" tmerge such that reduce(Core.Inference.tmerge, (Int8, Int16, Int32, Int64)) == Signed and having const typejoin = Core.Inference.tmerge, but I'm sure @JeffBezanson will explain why this is a terrible idea :)

@martinholters
Copy link
Member Author

The widening in inference is not done to come up with types that are best for codegen, but to ensure convergence of inference itself. And more aggressive widening allows faster convergence. That said, there is also the "TODO: something smarter, like a common supertype" comment in tmerge right above the return Any, and in fact I've tried using typejoin there and nothing exploded immediately. But that's a discussion beyond the scope of this PR.

@martinholters
Copy link
Member Author

With the Union of Tuples case fixed, in this good to go?

@JeffBezanson JeffBezanson merged commit d694548 into master Mar 23, 2017
@martinholters martinholters deleted the mh/tuple_eltype branch March 23, 2017 18:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants