-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
For Loop Slowdown #11787
Comments
@JaredCrean2 |
Right, thanks for the correction. |
Also note, that data was collected on 6/18 (I didn't get around to creating this issue until today, 6/20), so those times should be 18 and 2 days old, respectively. |
That's not a problem at all. Git makes it easy to "go back in time" and see how performance has changed. I have a bisect running right now to narrow down what might have caused this — the information you gave was very valuable! |
Thanks, I'm glad I could help. |
Thanks for reporting this. Do I read correctly that the third test case got faster? |
AFAICT, it's more that construction of ArrayViews got better with recent julia---at least in my tests, there is no allocation whatsoever with ArrayViews. At one point the two were comparable: #8501 (comment). However, I'm not sure this is the only thing going on: since those benchmarks there have been several changes to SubArrays (#9329, #10296, #10505, #11296, #10525, and #11595), and I just haven't been able to find the time (and probably won't soon) to dig into this more than superficially. |
Nice detective work, @mbauman. |
I'm trying to dig into this one a bit more. With my latest performance fixes (#11833 + #11843), we're much improved (I've duplicated the $ ./julia tmp/arrayspeed/main.jl
1.138 seconds (22783 allocations: 948 KB)
527.508 milliseconds (4 allocations: 144 bytes)
double loop @time printed above
600.666 milliseconds (38441 allocations: 1635 KB)
560.828 milliseconds (4 allocations: 144 bytes)
ArrayView @time printed above
3.274 seconds (120 M allocations: 4581 MB, 8.19% gc time)
3.116 seconds (120 M allocations: 4578 MB, 8.66% gc time)
slice @time printed above Now the question is if the regression that was introduced in dadf243 was coupled to this or not… |
Hurrah, this seems to be solved. Current master: $ ./julia tmp/arrayspeed/main.jl
1.342 seconds (21263 allocations: 876 KB)
592.298 milliseconds (4 allocations: 144 bytes)
double loop @time printed above
600.586 milliseconds (38426 allocations: 1635 KB)
562.081 milliseconds (4 allocations: 144 bytes)
ArrayView @time printed above
3.253 seconds (120 M allocations: 4581 MB, 7.99% gc time)
3.103 seconds (120 M allocations: 4578 MB, 8.53% gc time)
slice @time printed above Julia @ dadf243 $ ~/Applications/julia-0.4-pre-alloca/bin/julia tmp/arrayspeed/main.jl
1.294 seconds (122 k allocations: 4513 KB)
626.594 milliseconds (4 allocations: 144 bytes)
double loop @time printed above
755.303 milliseconds (38880 allocations: 1659 KB)
639.865 milliseconds (4 allocations: 144 bytes)
ArrayView @time printed above
4.769 seconds (240 M allocations: 8243 MB, 9.22% gc time)
4.206 seconds (240 M allocations: 8240 MB, 9.57% gc time)
slice @time printed above I even went back and added my performance fixes to dadf243 just to make sure that I wasn't independently "covering up" the perf regression with independent fixes. But that had little-to-no effect on the total performance. It may be the case that there are still some non-optimal corner-cases in user (or other base) code that are pessimized here, but without a solid example I think this can be closed. |
Excellent. |
I was testing a few different ways to calculate a quantity over an array, and noticed for loops slowed down significantly from commit 317a4d1* (16 days old master). @timholy suggested I file an issue here.
The test code is here: https://github.com/JaredCrean2/julia_tests/tree/master/array_speed2. It does the calculating using a manually inlined double for loop, ArrayViews, and SubArrays.
Using Julia Version 0.4.0-dev+5149 (2015-06-01 18:58 UTC) Commit 317a4d1* (16 days old master)
Using Julia Version 0.4.0-dev+5439 (2015-06-18 13:48 UTC) Commit a23caa0* (0 days old master):
I looked at
@code_warntye getEulerFlux(q,F)
, and the a23caa0 produced a huge amount of code (>1100 lines) compared to the 317a4d1 (~70 lines), and included many lines that looked likethat were not present in 317a4d1.
The text was updated successfully, but these errors were encountered: