
18x increase in compile time on v1.8 #46492

Closed
baggepinnen opened this issue Aug 26, 2022 · 5 comments · Fixed by #46581
Labels: compiler:latency (Compiler latency), regression (Regression in behavior compared to a previous version)

Comments

@baggepinnen (Contributor) commented Aug 26, 2022

The following snippet takes

7.833256 seconds (36.60 M allocations: 1.727 GiB, 7.05% gc time, 100.00% compilation time)

on Julia v1.7.3, but

146.158104 seconds (474.09 M allocations: 79.036 GiB, 4.20% gc time, 100.00% compilation time)

on Julia v1.8:

```julia
using ControlSystems
a_2 = [-5 -3; 2 -9]
C_212 = ss(a_2, [1; 2], [1 0; 0 1], [0; 0])
@time norm(C_212)
```
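Note that `@time` only captures compilation latency on the first call in a fresh session; a warm second call measures runtime alone. A minimal, self-contained way to see that split (using only the standard library, not ControlSystems, so `norm` here is LinearAlgebra's) might be:

```julia
# Standard-library-only sketch; not the ControlSystems `norm` from the report.
using LinearAlgebra

A = [-5.0 -3.0; 2.0 -9.0]
t1 = @elapsed norm(A)   # first call: includes compilation of norm for this type
t2 = @elapsed norm(A)   # second call: runtime only
@show t1 t2             # in a fresh session, expect t1 to dominate t2
```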

An `@profile` trace is available in HTML format here.


@KristofferC added the regression (Regression in behavior compared to a previous version) and compiler:latency (Compiler latency) labels on Aug 26, 2022
@KristofferC (Member) commented:

There are at least two regressions I think. Bisecting I see at one time it goes from ~10 seconds to

 39.839046 seconds (102.91 M allocations: 9.583 GiB, 9.93% gc time, 100.00% compilation time)

@ianatol (Member) commented Aug 26, 2022

> There are at least two regressions I think. Bisecting I see at one time it goes from ~10 seconds to
>
> 39.839046 seconds (102.91 M allocations: 9.583 GiB, 9.93% gc time, 100.00% compilation time)

Any particularly guilty commits? @KristofferC

@KristofferC (Member) commented:

Few more steps left :)

@KristofferC (Member) commented Aug 26, 2022

First 5x regression is 2c0e1bc, cc @JeffBezanson

@KristofferC (Member) commented:

The second big regression (30s -> 200s) comes from 26b0b6e, cc @aviatesk

aviatesk added a commit that referenced this issue Aug 29, 2022
`CachedMethodTable` was removed in #44240 because no performance improvement
could be confirmed at the time. However, it turns out the optimization is
critical in some real-world cases (e.g. #46492), so this commit revives
the mechanism with the following tweaks that should make it more effective:
- create the method table cache per inference run (rather than per local
  inference on a function call, as in the previous implementation)
- use the cache mechanism only for abstract types (since lookup results
  for concrete types are already cached at the next level)

As a result, the following snippet reported at #46492 recovers the
compilation performance:
```julia
using ControlSystems
a_2 = [-5 -3; 2 -9]
C_212 = ss(a_2, [1; 2], [1 0; 0 1], [0; 0])
@time norm(C_212)
```

> on master
```
julia> @time norm(C_212)
364.489044 seconds (724.44 M allocations: 92.524 GiB, 6.01% gc time, 100.00% compilation time)
0.5345224838248489
```

> on this commit
```
julia> @time norm(C_212)
 26.539016 seconds (62.09 M allocations: 5.537 GiB, 5.55% gc time, 100.00% compilation time)
0.5345224838248489
```
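The two tweaks above amount to a small memoization layer. Here is a hypothetical sketch of that idea (all names are invented for illustration and are not Julia compiler internals): lookups for abstract signature types are memoized in a cache created once per inference run, while concrete signatures bypass the cache because their results are already cached elsewhere.

```julia
# Invented illustration of the caching idea; not the actual compiler code.
struct MethodLookupCache
    cache::Dict{Any,Any}  # signature type => lookup result; one cache per inference run
end
MethodLookupCache() = MethodLookupCache(Dict{Any,Any}())

function cached_lookup(mlc::MethodLookupCache, sig::Type, do_lookup)
    # Concrete signatures skip this cache: their lookups are cached at the next level.
    isconcretetype(sig) && return do_lookup(sig)
    # Abstract signatures hit the per-inference-run cache.
    return get!(() -> do_lookup(sig), mlc.cache, sig)
end
```

Creating one `MethodLookupCache` per inference run mirrors the first bullet; the `isconcretetype` guard mirrors the second.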
aviatesk added a commit that referenced this issue Aug 30, 2022
KristofferC pushed a commit that referenced this issue Aug 30, 2022

(cherry picked from commit 8445744)
aviatesk added a commit that referenced this issue Aug 30, 2022
aviatesk added a commit that referenced this issue Aug 31, 2022
vtjnash added a commit that referenced this issue Sep 1, 2022
Added in #43415, this was too aggressive for many cases. Contrary to what the
comment suggested, it is unnecessary in many cases, so only do it when it
is expected to be maximally profitable.

Fixes #46492

```
julia> @time norm(C_212)
before 45.959497 seconds (81.85 M allocations: 6.976 GiB, 6.31% gc time, 100.00% compilation time)
after  15.781804 seconds (20.81 M allocations: 1.294 GiB, 6.32% gc time, 100.00% compilation time)
```
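The general shape of this fix, gating an expensive step behind a cheap profitability check, can be sketched generically. All names here are invented for illustration; this is not the actual inference code touched by #46581:

```julia
# Invented illustration of "only do it when it is expected to be profitable".
function maybe_run_pass(ir, expensive_pass, is_profitable)
    is_profitable(ir) || return ir   # cheap guard: skip the aggressive transformation
    return expensive_pass(ir)        # pay the compile-time cost only where it should help
end
```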
vtjnash added a commit that referenced this issue Sep 2, 2022
aviatesk pushed a commit that referenced this issue Sep 9, 2022