Limit the `unioncomplexity` instead of the `unionlen` in `tmerge` #27417

martinholters · 2018-06-04T11:30:31Z

martinholters · 2018-06-04T11:32:15Z

base/compiler/typelimits.jl

@@ -349,7 +349,7 @@ function tmerge(@nospecialize(typea), @nospecialize(typeb))
    # if we didn't start with any unions, then always OK to form one now
    if !(typea isa Union || typeb isa Union)
        # except if we might have switched Union and Tuple below, or would do so
-        if (isconcretetype(typea) && isconcretetype(typeb)) || !(typea <: Tuple && typeb <: Tuple)
+        if (isconcretetype(typea) && isconcretetype(typeb)) || !(typea <: Tuple || typeb <: Tuple)


This was the minimal change to make the recursive test case (f27316 below) work. It's rather ad-hoc though. Could we just simplify this to the inconcretetype part?

Yes (the second test was wrong anyways, it should have been typeintersect)

No, isconcretetype(typea) && isconcretetype(typeb) is a little too strict, it makes these tests fail:

julia/test/compiler/compiler.jl

Line 334 in 83ce7ba

@test Base.return_types(eltype, (NInt,)) == Any[Union{Type{Int}, Type{Union{}}}] # issue 21763

julia/test/compiler/compiler.jl

Line 383 in 83ce7ba

@test Base.return_types(f18037, (Int,)) == Any[Union{Type{T1},Type{T2}}]

We probably want to allow isType(...), too.

martinholters · 2018-06-04T11:32:58Z

base/compiler/typelimits.jl

@@ -399,7 +399,7 @@ function tmerge(@nospecialize(typea), @nospecialize(typeb))
        end
    end
    u = Union{types...}
-    if unionlen(u) <= MAX_TYPEUNION_LEN
+    if unioncomplexity(u) <= MAX_TYPEUNION_LEN


Worth renaming the constant? MAX_TYPEUNION_COMPLEXITY?

martinholters · 2018-06-04T11:33:49Z

base/compiler/typeutils.jl

+    end
+    return c
+end
+unioncomplexity(u::UnionAll) = unioncomplexity(u.body) * unioncomplexity(u.var.ub)


Not sure we need this case. But then again, it probably doesn't hurt, either.

martinholters · 2018-06-05T07:57:08Z

AV timeout seems to be wide-spread these days...

martinholters · 2018-06-11T08:17:31Z

Bump.

StefanKarpinski · 2018-06-11T18:29:16Z

Who is best to review this: @vtjnash, @Keno, @JeffBezanson?

vtjnash · 2018-06-11T18:47:45Z

base/compiler/typeutils.jl

@@ -125,3 +125,17 @@ function _switchtupleunion(t::Vector{Any}, i::Int, tunion::Vector{Any}, @nospeci
    end
    return tunion
 end
+
+# unioncomplexity correspondens to unionlen after recursively pulling out all unions in
+# covariant positions to the top level, e.g. Tuple{Union{A,B}} -> Union{Tuple{A},Tuple{B}}


maybe add the statement # as may be produced by tuplemerge?

vtjnash · 2018-06-11T18:49:07Z

test/compiler/compiler.jl

+    end
+    return x
+end
+@test Tuple{Tuple{Nothing}} <: Base.return_types(g27316, Tuple{})[1]


should make these == tests, so we are more likely to be alerted if the test breaks and stops testing what we intended it to test

Only if that's actually the right condition. Otherwise an arbitrarily overstrict test with no comment about what to do if it starts failing is pretty bad and will cause future pain and confusion.

The test's intention is to verify that inference finishes in a finite amount of time. In this case, it now returns Any, which I'm not sure we want to test for explicitly. The lower bounds I'm using instead are chosen somewhat arbitrarily.

We can also write this test as S <: T == Any # we may be able to improve this bound in the future

martinholters · 2018-06-12T08:30:09Z

@nanosoldier runbenchmarks(ALL, vs=":master")

martinholters · 2018-06-12T11:58:57Z

Error in testset LinearAlgebra/cholesky:
Error During Test at /private/tmp/julia/share/julia/stdlib/v0.7/LinearAlgebra/test/cholesky.jl:35
  Got exception ErrorException("return type LinearAlgebra.Cholesky{Float32,Array{Float32,2}} does not match inferred return type LinearAlgebra.Cholesky{_1,_2} where _2 where _1") outside of a @test
  return type LinearAlgebra.Cholesky{Float32,Array{Float32,2}} does not match inferred return type LinearAlgebra.Cholesky{_1,_2} where _2 where _1

That looks like it might well be related to this PR ... but why does it only happen on OSX!? Also, I would have expected the cholesky code to be completely type-stable and inferable, and then this PR should have no impact, as it only more aggressively widens Unions. As I don't have an OSX system at hand, I'm at a loss here...

nanosoldier · 2018-06-12T22:47:32Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

martinholters · 2018-06-13T08:01:15Z

The ["union", "array", "(\"map\", *, ...)"] regressions are due to this:

julia> Core.Compiler.return_type(iterate, Tuple{Base.Iterators.Zip2{Array{Union{Nothing, Bool},1},Array{Union{Nothing, Bool},1}}})
Union{Nothing, Tuple{Tuple{Union{Nothing, Bool},Union{Nothing, Bool}},Tuple{Int64,Int64}}} #master
Union{Nothing, Tuple{Tuple{Any,Any},Tuple{Any,Any}}} # this PR

For this example, increasing the cutoff to MAX_TYPEUNION_COMPLEXITY = 5 would suffice, but my feeling is that we should probably limit the length of Unions, as just an enumeration of possible types, to a relatively low value (i.e. keeping it at 4), while the complexity considering nested Tuples may be allowed to become larger. I'll try enforcing separate limits for these two measures.

martinholters · 2018-06-13T08:23:13Z

@nanosoldier runbenchmarks("union", vs=":master")

nanosoldier · 2018-06-13T10:48:32Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

martinholters · 2018-06-13T11:06:46Z

OK, benchmark regressions resolved and OSX failure is gone, too.

martinholters · 2018-06-18T11:15:05Z

Thinking some more about this, it might be better to limit the structural complexity. So I've tried with counting the number of (pairwise) Unions in covariant position, e.g. Union{Nothing,Int,Tuple{Union{Nothing,Int}}} would have complexity three:
2 for the outer Union (representing two pairwise Unions) + 1 for the inner one

@vtjnash, would you say that makes more sense? Then I'll push an update.

vtjnash · 2018-06-19T01:16:21Z

I’m not sure. I think you understand this issue better than me, so I’ll go with whatever you end up recommending.

The bound 23 seems high. I think instead of calling computing the size as a multiplication (the max size of the tuple-switch), it seems like a min over the parameters might be better (the min size set of tuples that could have formed this input). The reasoning is that we’re trying to limit here the number of times that tmerge has been called on the variable. So we care more about how we could have formed this, than we do about what might happen to it later.

martinholters · 2018-06-19T12:12:16Z

The reasoning is that we’re trying to limit here the number of times that tmerge has been called on the variable.

Yes! I think that's exactly what I was trying to achieve, except I wasn't actually aware of it.

martinholters · 2018-06-20T06:12:36Z

@nanosoldier runbenchmarks(ALL, vs=":master")

I'll merge if CI goes well and benchmarks looks good, unless anyone objects.

martinholters · 2018-06-20T10:00:37Z

32bit travis:

871.267048 /home/travis/build/JuliaLang/julia/usr/bin/julia --color=yes /home/travis/build/JuliaLang/julia/doc/make.jl linkcheck= doctest=
875.374433    Cloning default registries into /home/travis/build/JuliaLang/julia/doc/deps/registries
875.445278    Cloning registry Uncurated from "https://github.com/JuliaRegistries/Uncurated.git"
  Updating registry at `deps/registries/Uncurated`
879.706048   Updating git-repo `https://github.com/JuliaRegistries/Uncurated.git`
882.868353    Cloning git-repo `https://github.com/JuliaLang/Compat.jl.git`
  Updating git-repo `https://github.com/JuliaLang/Compat.jl.git`
 Installed Documenter ────────── v0.18.0
887.854872  Installed DocStringExtensions ─ v0.4.4
┌ Warning: Deprecated syntax `try without catch or finally` at /home/travis/build/JuliaLang/julia/contrib/build_sysimg.jl:142.
└ @ nothing build_sysimg.jl:142
910.868509 Documenter: setting up build directory.
914.627885 Documenter: expanding markdown templates.
make[2]: *** [html] Segmentation fault (core dumped)
983.611665 make[2]: Leaving directory `/home/travis/build/JuliaLang/julia/doc'
make[1]: *** [docs] Error 2
983.612004 make[1]: Leaving directory `/home/travis/build/JuliaLang/julia'
make: *** [/home/travis/build/JuliaLang/julia/doc/_build/html/en/index.html] Error 2

Anyone seen that before? Or caused by this PR?

KristofferC · 2018-06-20T10:06:49Z

Haven't seen it before at least.

fredrikekre · 2018-06-20T10:32:24Z

Does Documenters tests pass on this branch?

martinholters · 2018-06-20T11:57:05Z

I've built a 32bit version locally and make -C doc html works. Testing Documenter 0.18.0 fails with three errors/fails, which look like Documenter doesn't like to be tested if not pkg> deved. Definitely no segfault there. Same with my "normal" 64bit build.

I'm restarting the Travis job to see whether it reproduces, old log backed up to https://gist.github.com/martinholters/5603c378db04df2de477740485111ee6

nanosoldier · 2018-06-20T13:50:35Z

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against comparison commit: failed process: Process(`sudo cset shield -e su nanosoldier -- -c ./benchscript.sh`, ProcessExited(1)) [1]

Logs and partial data can be found here
cc @ararslan

martinholters · 2018-06-20T13:57:46Z

The restarted Travis build is past building the docs, and the nanosoldier failure is a segfault running the benchmarks d3c1ae3, the master commit to compare against. So I guess this PR is not to blame, but we do have an issue.

martinholters · 2018-06-21T11:44:29Z

And another try on top of #27702...

@nanosoldier runbenchmarks(ALL, vs=":master")

nanosoldier · 2018-06-21T22:56:41Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

JeffBezanson · 2018-06-21T23:44:47Z

base/compiler/typeutils.jl

+    end
+    return c
+end
+unioncomplexity(@nospecialize(x)) = 0


What about UnionAll? It looks like we should unwrap them in case there are tuples or unions inside.

I actually had that but apparently lost it in one of the updates. I wasn't sure whether it was necessary, but have now managed to come up with a test case. I also think it might be a good idea to also recurse down the upper bound, but couldn't come up with a test case. Ideas how to construct one?

The `unioncomplexity` is used as an estimate for the number of times `tmerge` had to be called to form the type.

martinholters · 2018-06-25T07:02:48Z

Another try...

@nanosoldier runbenchmarks(ALL, vs=":master")

I'll merge if CI goes well and benchmarks looks good, unless anyone objects.

martinholters · 2018-06-25T08:09:42Z

Restarted freebsd ci after it hit

command timed out: 1200 seconds without output running ['./.freebsdci.sh', 'runtests'], attempting to kill

nanosoldier · 2018-06-25T14:02:22Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

martinholters · 2018-06-26T06:23:23Z

The test failures on travis 32bit look interesting: IIUC correctly, 65535 is chosen as port number and furthermore, it is assumed that adding 1 to a port number gives a valid port number again, which in this case is obviously not true. Anyway does not look related, benchmarks are within noise threshold, so I'm finally merging this.

JeffBezanson · 2018-06-27T22:32:12Z

This seems to have caused a regression in mapslices: https://github.com/JuliaLang/julia/pull/27828/files/d184a46949c254a9b7b2e5e7cb3beaa863f60d17#r198592367

JeffBezanson · 2018-06-28T03:41:16Z

The difference in code seems to enter here:

julia> code_typed(Base.concatenate_setindex!, (Array{Float64,2}, Float64, Base.OneTo{Int64}, Vararg{Any}))
1-element Array{Any,1}:
 CodeInfo(
1951 1 ── %1  = Core.tuple(%%R)::Tuple{Array{Float64,2}}
     │    %2  = Core._apply(Base.dotview, %1, %%I)::SubArray
     │    %3  = Core.tuple(%%v)::Tuple{Float64}      
     │    %4  = Core.tuple(%3)::Tuple{Tuple{Float64}} broadcasted
     │    %5  = Core.tuple(%2)::Tuple{SubArray}╻       axes
     │    %6  = Base.getfield(%2, :(:indices))::Any     getproperty
     │    %7  = Core._apply(Base._indices_sub, %5, %6)::Tuple
     │    %8  = Base.Broadcast.typeof(%7)::Type{#s54} where #s54<:Tuple
     │    %9  = Core.apply_type(Base.Broadcast.Broadcasted, Base.Broadcast.Style{Tuple}, %8, typeof(identity), Tuple{Tuple{Float64}})::Type{Base.Broadcast.Broadcasted{Base.Broadcast.Style{Tuple},_1,typeof(identity),Tuple{Tuple{Float64}}}} where _1
     │    %10 = %9(identity, %4, %7)::Base.Broadcast.Broadcasted{Base.Broadcast.Style{Tuple},_1,typeof(identity),Tuple{Tuple{Float64}}} where _1

At this point, it seems a whole bunch of broadcast code has been inlined in spite of relatively weak type information. In contrast, the prior, faster version looks like this:

julia> code_typed(Base.concatenate_setindex!, (Array{Float64,2}, Float64, Base.OneTo{Int64}, Vararg{Any}))
1-element Array{Any,1}:
 CodeInfo(
1951 1 ─ %1 = Core.tuple(%%R)::Tuple{Array{Float64,2}}     │    
     │   %2 = Core._apply(Base.dotview, %1, %%I)::Union{SubArray{_1,_2,_3,_4,false} where _4 where _3 where _2 where _1, SubArray{_1,_2,_3,_4,true} where _4 where _3 where _2 where _1}
     │   %3 = Core.tuple(%%v)::Tuple{Float64}              │    
     │   %4 = Core.tuple(%3)::Tuple{Tuple{Float64}}        │╻    broadcasted
     │   %5 = new(Base.Broadcast.Broadcasted{Base.Broadcast.Style{Tuple},Nothing,typeof(identity),Tuple{Tuple{Float64}}}, identity, %4, nothing)::Base.Broadcast.Broadcasted{Base.Broadcast.Style{Tuple},Nothing,typeof(identity),Tuple{Tuple{Float64}}}
     │        Base.Broadcast.materialize!(%2, %5)          │    
     └──      return %%R                                   │    
) => Array{Float64,2}

martinholters · 2018-06-28T07:43:23Z

Drilled down to:

julia> Core.Compiler.tmerge(SubArray{_1,_2,_3,_4,false} where _4 where _3 where _2 where _1, SubArray{_1,_2,_3,_4,true} where _4 where _3 where _2 where _1)
# before:
Union{SubArray{_1,_2,_3,_4,false} where _4 where _3 where _2 where _1, SubArray{_1,_2,_3,_4,true} where _4 where _3 where _2 where _1}
# after:
SubArray

I guess we need to be a bit more promiscuous about when we allow a Union to be formed directly and when we fuse types with a common typename to their wrapper. I'll look into it.

martinholters · 2018-06-28T07:48:40Z

That said, the fact that the wider type information leads to more aggressive inlining which is then less performant seems to be an issue by itself.

martinholters · 2018-06-28T12:28:16Z

PR that addresses the regression at #27843.

JeffBezanson · 2018-06-28T21:58:03Z

Yes I think this is mostly an inlining issue. materialize! is declared @inline, which is causing us to inline it even when that's clearly a bad idea (the body is full of dynamic dispatches). But, if an argument has a Union type it seems the inlining logic then decides not to inline it anyway.

…7417) The `unioncomplexity` is used as an estimate for the number of times `tmerge` had to be called to form the type.

martinholters added the compiler:inference Type inference label Jun 4, 2018

martinholters requested a review from vtjnash June 4, 2018 11:30

martinholters commented Jun 4, 2018

View reviewed changes

martinholters mentioned this pull request Jun 4, 2018

Hang during inference #27316

Closed

StefanKarpinski assigned vtjnash Jun 11, 2018

vtjnash requested a review from nanosoldier June 11, 2018 18:37

vtjnash reviewed Jun 11, 2018

View reviewed changes

martinholters force-pushed the mh/unioncomplexity branch from 037a7b8 to 4ee62ae Compare June 12, 2018 08:28

martinholters force-pushed the mh/unioncomplexity branch from 4ee62ae to ad49de7 Compare June 13, 2018 08:21

martinholters force-pushed the mh/unioncomplexity branch from ad49de7 to a1954b6 Compare June 19, 2018 12:12

vtjnash approved these changes Jun 19, 2018

View reviewed changes

martinholters force-pushed the mh/unioncomplexity branch from a1954b6 to 3921171 Compare June 20, 2018 06:11

martinholters mentioned this pull request Jun 20, 2018

manually inline setfield! in Stateful popfirst! #27685

Merged

martinholters force-pushed the mh/unioncomplexity branch from 3921171 to 44c7a3b Compare June 21, 2018 11:43

JeffBezanson reviewed Jun 21, 2018

View reviewed changes

Limit the unioncomplexity instead of the unionlen in tmerge

4a43aa4

The `unioncomplexity` is used as an estimate for the number of times `tmerge` had to be called to form the type.

martinholters force-pushed the mh/unioncomplexity branch from 44c7a3b to 4a43aa4 Compare June 25, 2018 06:21

martinholters changed the title ~~RFC: Limit the unioncomplexity instead of the unionlen in tmerge~~ Limit the unioncomplexity instead of the unionlen in tmerge Jun 25, 2018

martinholters merged commit bd13437 into master Jun 26, 2018

martinholters deleted the mh/unioncomplexity branch June 26, 2018 06:23

JeffBezanson mentioned this pull request Jun 27, 2018

make dims argument to mapslices a keyword arg #27828

Merged

martinholters mentioned this pull request Jun 28, 2018

Let tmerge form a Union more often #27843

Merged

JeffBezanson mentioned this pull request Jun 29, 2018

some updates and fixes to inlining cost #27857

Merged

jrevels pushed a commit that referenced this pull request Jul 2, 2018

Limit the unioncomplexity instead of the unionlen in tmerge (#2…

f569529

…7417) The `unioncomplexity` is used as an estimate for the number of times `tmerge` had to be called to form the type.

Limit the unioncomplexity instead of the unionlen in tmerge #27417

Limit the unioncomplexity instead of the unionlen in tmerge #27417

Conversation

martinholters commented Jun 4, 2018

Choose a reason for hiding this comment

vtjnash Jun 11, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

martinholters commented Jun 5, 2018

martinholters commented Jun 11, 2018

StefanKarpinski commented Jun 11, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

martinholters commented Jun 12, 2018

martinholters commented Jun 12, 2018

nanosoldier commented Jun 12, 2018

martinholters commented Jun 13, 2018

martinholters commented Jun 13, 2018

nanosoldier commented Jun 13, 2018

martinholters commented Jun 13, 2018

martinholters commented Jun 18, 2018

vtjnash commented Jun 19, 2018

martinholters commented Jun 19, 2018

martinholters commented Jun 20, 2018

martinholters commented Jun 20, 2018

KristofferC commented Jun 20, 2018

fredrikekre commented Jun 20, 2018

martinholters commented Jun 20, 2018

nanosoldier commented Jun 20, 2018

martinholters commented Jun 20, 2018

martinholters commented Jun 21, 2018

nanosoldier commented Jun 21, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

martinholters commented Jun 25, 2018

martinholters commented Jun 25, 2018

nanosoldier commented Jun 25, 2018

martinholters commented Jun 26, 2018

JeffBezanson commented Jun 27, 2018 • edited Loading

JeffBezanson commented Jun 28, 2018

martinholters commented Jun 28, 2018

martinholters commented Jun 28, 2018

martinholters commented Jun 28, 2018

JeffBezanson commented Jun 28, 2018

Limit the `unioncomplexity` instead of the `unionlen` in `tmerge` #27417

Limit the `unioncomplexity` instead of the `unionlen` in `tmerge` #27417

vtjnash Jun 11, 2018 •

edited

Loading

JeffBezanson commented Jun 27, 2018 •

edited

Loading