Default Thread settings (threads=true) cause bad performance on CPU #45
Comments
That isn't good. Here's what I see (on a 2-core machine, Julia 1.5.2, everything updated):
julia> @btime Zygote.gradient(f, x)[1];
9.570 ms (163 allocations: 30.53 MiB)
julia> @btime Zygote.gradient(ft, x)[1];
13.836 ms (26 allocations: 30.52 MiB)
# forward only
julia> @btime f($x);
949.564 μs (92 allocations: 6.03 KiB)
julia> @btime ft($x);
1.825 ms (0 allocations: 0 bytes)
I wonder what's causing it to be so different? Do you see a slowdown on the forward-only evaluation too?
Hey, I've got the same versions on my machine. I'm not sure what exactly happens, but in a fresh REPL I get similar results.
using Zygote, Tullio, BenchmarkTools
x = abs.(randn((500, 500)));
f(x) = (@tullio y = abs2(x[i+1, j] - x[i-1, j]) + abs2(x[i, j+1] - x[i, j-1]))
ft(x) = (@tullio threads=false y = abs2(x[i+1, j] - x[i-1, j]) + abs2(x[i, j+1] - x[i, j-1]))
@btime f($x);
@btime ft($x);
@btime Zygote.gradient($f, $x)[1];
@btime Zygote.gradient($ft, $x)[1];
returns
However, I initially tested this in a larger Jupyter notebook where additional packages were loaded. After testing each package separately, I found the source:
using Zygote, Tullio, BenchmarkTools, ImageView
x = abs.(randn((500, 500)));
f(x) = (@tullio y = abs2(x[i+1, j] - x[i-1, j]) + abs2(x[i, j+1] - x[i, j-1]))
ft(x) = (@tullio threads=false y = abs2(x[i+1, j] - x[i-1, j]) + abs2(x[i, j+1] - x[i, j-1]))
@btime f($x);
@btime ft($x);
@btime Zygote.gradient($f, $x)[1];
@btime Zygote.gradient($ft, $x)[1];
returns
Note that the threaded version is now in the ms range rather than µs. Inspecting the profile (with a for loop to increase the total computing time) suggests that some GTK functions are involved.
Without Threads:
With Threads:
Searching for GTK performance issues brings me to JuliaGraphics/Gtk.jl#503 and JuliaLang/julia#35552.
So I'm sorry that I posted it here, since it turns out not to be caused by Tullio.
Felix
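The profiling step mentioned above (wrapping the call in a for loop so that enough samples are collected) can be sketched with Julia's `Profile` standard library. This is a minimal sketch, not the exact session from this thread: `work` is a hypothetical stand-in for `f` or `ft`, and the iteration count and `maxdepth` value are arbitrary choices.

```julia
using Profile

# Hypothetical stand-in for the benchmarked function; substitute `f` or `ft`.
work(x) = sum(abs2, x)

x = abs.(randn(500, 500))

work(x)           # run once so compilation is not part of the profile
Profile.clear()   # discard samples from earlier runs

# Repeat the call so the sampling profiler collects enough data.
@profile for _ in 1:1000
    work(x)
end

# Print the call tree; frames from unrelated packages would be visible here.
Profile.print(maxdepth = 15)
```

Running the same loop once with and once without the extra package loaded makes it easier to see which frames only appear in the slow case.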
Ok, that sounds like a pretty thorny issue. Glad to hear it's not my fault though!
Hey,
I'm on a 4-core machine. I started Julia with 4 threads and observe (relatively) poor performance with the default Tullio settings
producing:
What's actually the reason for that? I observed that none of the 4 Julia threads reached 100% CPU usage: the main thread was at 50-80%, and the other three at around 20%.
Thanks,
Felix
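One way to separate Tullio from the threading runtime is to compare a hand-written serial loop with a `Threads.@threads` version of the same reduction. The sketch below uses only Base Julia. The function names are hypothetical, and it assumes that the index ranges in the `@tullio` expression from this thread resolve to the interior points `2:size(x, d)-1`.

```julia
using Base.Threads: @threads, nthreads

# Serial reference: sum of squared central differences over interior points.
function stencil_serial(x::AbstractMatrix)
    acc = zero(eltype(x))
    for j in 2:size(x, 2)-1, i in 2:size(x, 1)-1
        acc += abs2(x[i+1, j] - x[i-1, j]) + abs2(x[i, j+1] - x[i, j-1])
    end
    return acc
end

# Threaded variant: one partial sum per interior column, combined at the end.
function stencil_threaded(x::AbstractMatrix)
    cols = 2:size(x, 2)-1
    partial = zeros(eltype(x), length(cols))
    @threads for k in eachindex(cols)
        j = cols[k]
        acc = zero(eltype(x))
        for i in 2:size(x, 1)-1
            acc += abs2(x[i+1, j] - x[i-1, j]) + abs2(x[i, j+1] - x[i, j-1])
        end
        partial[k] = acc   # each iteration writes its own slot, so no data race
    end
    return sum(partial)
end

x = abs.(randn(500, 500))
println("Julia threads: ", nthreads())
println("results agree: ", stencil_serial(x) ≈ stencil_threaded(x))
```

Timing both functions with `@btime` in a fresh REPL, and again with the other packages loaded, would indicate whether a slowdown comes from threading in general rather than from Tullio.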