Latency regression in Plots on master vs 1.7 #41914
From the sysimg log, there seems to be a significant difference between the precompiled statements, and I guess we need to compile more things at runtime on master? I haven't bisected what caused this particular regression though. EDIT: well, there is not much difference between
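(Aside, not something run in this thread: one way to compare what still has to be compiled at runtime is julia's `--trace-compile` flag; the output file name below is arbitrary.)

```julia
# Record every method compiled at runtime as a precompile(...) statement.
# Diffing the resulting files between 1.7 and master shows what the sysimage
# and package precompile files no longer cover.
code = "using Plots; plot(rand(2, 2))"
run(`$(Base.julia_cmd()) --startup-file=no --trace-compile=plots_compiles.jl -e $code`)
```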
Plots is a tough case because the most effective way to be cross-version and cross-platform is to use

```julia
if ccall(:jl_generating_output, Cint, ()) == 1
    while false; end  # disable the interpreter
    do_work(small_dataset)
end
```

rather than

```julia
if ccall(:jl_generating_output, Cint, ()) == 1
    precompile(do_work, (DataSetType,))
end
```

but plotting has side effects. Still, they should probably try to transition to this style wherever possible. Currently they rely on the bot to generate precompiles.
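(A hedged sketch of how a plotting-style package might adopt the workload style despite display side effects; the module, types, and workload below are made-up illustrations, not Plots' actual code.)

```julia
module TinyPlotsLike  # hypothetical sketch, not Plots' actual code

struct Figure
    data::Matrix{Float64}
end

# Stand-in for real plotting work: "rendering" is just formatting into a string.
render(io::IO, f::Figure) = print(io, "Figure with extrema ", extrema(f.data))
make_figure(data) = Figure(float(data))

# Workload-style precompilation: actually run representative calls while the
# precompile file is being generated, instead of listing precompile() statements.
if ccall(:jl_generating_output, Cint, ()) == 1
    while false; end  # the interpreter-disabling trick from above
    let io = IOBuffer()
        render(io, make_figure(rand(2, 2)))  # side effects stay inside the buffer
    end
end

end # module
```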
Yes, looks like something regressed vs beta3?

beta3:

```julia
julia> @time using Plots
  3.277906 seconds (7.42 M allocations: 511.464 MiB, 9.06% gc time, 0.20% compilation time)

julia> @time (p = plot(rand(2,2)); display(p));
  7.475758 seconds (18.31 M allocations: 962.836 MiB, 2.53% gc time, 15.59% compilation time)
```

on #41781:

```julia
julia> @time using Plots
  4.682400 seconds (8.93 M allocations: 610.531 MiB, 7.02% gc time, 0.16% compilation time)

julia> @time (p = plot(rand(2,2)); display(p));
  9.137335 seconds (19.63 M allocations: 1.005 GiB, 2.10% gc time, 16.63% compilation time)
```

Should be fairly quick to bisect perhaps.
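(A minimal way to reproduce this measurement in a clean process; it assumes Plots is installed in the active project, and that the numbers above were taken from a fresh session in the same way, which is my assumption rather than something stated here.)

```julia
# Run the same two @time measurements in a fresh subprocess so nothing is
# already compiled in the current session.
code = "@time using Plots; @time (p = plot(rand(2, 2)); display(p));"
run(`$(Base.julia_cmd()) --startup-file=no -e $code`)
```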
I get:

1.7.0-rc1:

master:
So there are two levels of TTFP regression...

I filed an issue around
Tried again. I get:

```julia
julia> @time using Plots
# 1.6.2:
  2.795078 seconds (8.13 M allocations: 564.960 MiB, 3.26% gc time, 0.10% compilation time)
# 1.7.0-rc1:
  4.379164 seconds (10.48 M allocations: 713.449 MiB, 5.69% gc time, 0.15% compilation time)
# master:
  7.175294 seconds (16.00 M allocations: 1018.494 MiB, 3.80% gc time, 0.17% compilation time)

julia> @time display(plot(rand(10)))
# 1.6.2:
  6.755702 seconds (18.10 M allocations: 1.013 GiB, 4.01% gc time, 15.37% compilation time)
# 1.7.0-rc1:
  8.278466 seconds (18.85 M allocations: 1001.208 MiB, 1.95% gc time)
# master:
 10.549896 seconds (25.17 M allocations: 1.321 GiB, 1.90% gc time, 13.70% compilation time)
```

So 1.7 is bad and master is really bad. A 2x regression in `using` seems milestone worthy. I'll try to bisect it.
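(Aside, an assumption on my part rather than something used in this thread: on Julia 1.8 and later, `InteractiveUtils.@time_imports` breaks the `using Plots` time down per dependency, which helps localize where the extra seconds go; on 1.6/1.7 the macro does not exist.)

```julia
# Per-dependency load-time breakdown (Julia 1.8+ only).
using InteractiveUtils
@time_imports using Plots
```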
😱 okay, I will also try to bisect it (I think we currently don't have a solid way to define a latency regression).
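(A hedged sketch of what such a definition could look like, loosely anticipating the `@allocated` threshold used for the bisection below; the 5-second budget is an arbitrary number I made up.)

```julia
# Fail if `using Plots` in a clean subprocess exceeds a wall-clock budget.
code = "t0 = time(); using Plots; print(time() - t0)"
elapsed = parse(Float64, read(`$(Base.julia_cmd()) --startup-file=no -e $code`, String))
elapsed > 5.0 && error("`using Plots` took $(round(elapsed; digits=2))s, above the 5s budget")
```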
I checked out 1.6.2 locally and get:

And if I use the released version downloaded from the website:

Hard to bisect when the same Julia version can give such dramatically different results... I even cleaned out

Edit: I need to check, when I'm back at the computer, that I am not running with assertions on or something like that.
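(Aside, not from the thread: when a local checkout and an official binary of the "same" version disagree this much, it is worth confirming exactly which build each one is; these are standard Base/InteractiveUtils calls.)

```julia
using InteractiveUtils
versioninfo(verbose = true)            # OS, CPU, word size, libm/LLVM details
println(Base.GIT_VERSION_INFO.commit)  # exact commit this binary was built from
println(Base.julia_cmd())              # which binary and default flags are in use
```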
Do you have any `Make.user` flags set?
Literally just updated my comment :)
I started bisecting with this script:

```sh
# ~/julia4 tags/v1.7.0-rc1 aviatesk@amdci2
# ❯ ./usr/bin/julia -e '@time using Plots; @time plot(rand(10,3))'
#   4.071415 seconds (8.09 M allocations: 569.073 MiB, 7.24% gc time, 0.10% compilation time)
#   2.741481 seconds (3.46 M allocations: 194.945 MiB, 5.41% gc time)
#
# ~/julia master aviatesk@amdci2 4m 33s
# ❯ ./usr/bin/julia -e '@time using Plots; @time plot(rand(10,3))'
#   7.275295 seconds (12.46 M allocations: 793.529 MiB, 5.06% gc time, 0.65% compilation time)
#   2.663575 seconds (3.55 M allocations: 196.283 MiB, 2.25% gc time)

make cleanall
make -j 64
./usr/bin/julia -e 'using Pkg; Pkg.precompile()'
./usr/bin/julia -e """
let
    allocated = @allocated using Plots
    if allocated > 750_000_000 # let's set the threshold at 750 MiB
        exit(1)
    else
        exit(0)
    end
end
"""
```

And it bisects to 5405994, which is fairly reasonable: before the commit,

This suggests we may need to change our mind on how to detect latency regressions quickly?
With regards to 5405994, I think there was some consensus that

Was that your question, @aviatesk, or is it something else? I haven't bisected or done anything else like that.
Yes, thanks for your response.

Yeah, when I merged the PR, I assumed the previous

I wonder
Okay, so, embarrassingly, after making sure that I don't run with assertions, I don't see a regression in 1.7 vs 1.6. And the regression on master is explained by #41914 (comment). So I am not sure it makes sense to keep this open?
I'm trying to find the difference between 1.7 and 1.6; let me close this if I can't find the difference.
Okay, I got:

So I'd like to conclude there is not much difference between 1.6 and 1.7.
There's a very tiny difference.

Fairly recent master:

```
tim@diva:~/src/julia-master$ juliam -q --project --startup-file=no
julia> @time using Requires
  0.069032 seconds (50.07 k allocations: 4.413 MiB, 50.21% compilation time)

julia>
tim@diva:~/src/julia-master$ juliam -q --project --startup-file=no
julia> tstart = time(); using Requires; time()-tstart
0.0838780403137207
```

1.6:

```
julia>
tim@diva:~/src/julia-master$ julia -q --project --startup-file=no
julia> @time using Requires
  0.038621 seconds (108.95 k allocations: 6.865 MiB, 92.67% compilation time)

julia>
tim@diva:~/src/julia-master$ julia -q --project --startup-file=no
julia> tstart = time(); using Requires; time()-tstart
0.06404614448547363
```

There's run-to-run variability, but these particular instances are pretty typical.
```
tim@diva:~/src/julia-master$ juliam -q --startup-file=no
julia> Core.Compiler.Timings.reset_timings(); Core.Compiler.__set_measure_typeinf(true); using Requires; Core.Compiler.__set_measure_typeinf(false); Core.Compiler.Timings.close_current_timer()

julia> traw = deepcopy(Core.Compiler.Timings._timings[1]);

julia> using SnoopCompile

julia> tinf = SnoopCompile.SnoopCompileCore.InferenceTimingNode(traw)
InferenceTimingNode: 0.073816/0.073816 on Core.Compiler.Timings.ROOT() with 0 direct children
```

The 0 direct children means it's not an inference issue. Of course, this doesn't stress-test things like method table insertion, but the benchmarks above make it seem like this isn't a big issue.
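(If there had been inference time attributed to children, a hedged next step, based on my reading of the SnoopCompile 2.x API rather than anything done in this thread, would be to rank the costly frames.)

```julia
using SnoopCompile
flat = flatten(tinf)                  # one InferenceTiming per inferred frame
last(sort(flat; by = exclusive), 10)  # the ten most expensive frames by exclusive time
```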
1.7:
master:
I haven't looked deeply into it yet.