Use call overload in FFTW.plan_* #11994
Conversation
Also worth noting that the performance improvement is 6x the one I get by specifying
Edit: updated the number. I certainly didn't know how to divide just now....
Better benchmark. As expected, the speedup is most visible with small input sizes (in my case the size is ~ ). Tested with:

```julia
function test_fft(ary)
    println(length(ary))
    plan = plan_fft!(copy(ary), 1:1, FFTW.MEASURE)
    iplan = plan_ifft!(copy(ary), 1:1, FFTW.MEASURE)
    plan(ary)
    gc()
    @time for i in 1:10_000
        plan(ary)
        iplan(ary)
    end
end

for s in (10, 100, 10000, 16, 128, 16384)
    test_fft(rand(Complex{Float64}, s))
end
```

Current master:

This PR:
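For context on what `plan(ary)` above is doing: the PR title refers to Julia 0.4's call overloading, where `Base.call` can be defined for a type so its instances can be applied like functions. A minimal sketch of the pattern (the `ToyPlan` type and its field are made up for illustration, not the actual FFTW.jl plan types):

```julia
# Julia 0.4-era sketch of call overloading; `ToyPlan` is a stand-in,
# not FFTW's real plan type.
immutable ToyPlan
    scale::Float64            # stand-in for precomputed plan state
end

# Defining `call` makes a plan instance applicable like a function:
Base.call(p::ToyPlan, x::AbstractVector) = p.scale * x

p = ToyPlan(2.0)
p([1.0, 2.0, 3.0])            # returns [2.0, 4.0, 6.0]
```

Because the method is attached to a concrete type, calls like `p(x)` dispatch and specialize like any other method instead of going through an anonymous function.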
This will make #6193 significantly harder to merge because it will introduce lots of conflicts... Continual rebasing of 6193 is hard enough as it is.
@stevengj An easy (although possibly stupid) way would be to just revert this in #6193. I would definitely prefer having #6193 if it was not in the
I didn't read #6193 very carefully. Is the structure/main part of the current
P.S. Can you also have a look at the place that I suspect is missing an assert? (here: Line 768 in 8ad0c1c)
Edit: added a link to the code since it was not that trivial to find from my original post.
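For readers following along: the question above is about whether an `assert_applicable`-style check is missing before a plan gets executed. A rough sketch of the kind of guard meant here, with made-up type and field names rather than the actual Base/FFTW.jl code:

```julia
# Illustration only: `ToyFFTWPlan`, `sz`, and `istride` are assumed names,
# not the real FFTW.jl plan fields.
immutable ToyFFTWPlan
    sz::Tuple        # array size the plan was created for
    istride::Tuple   # input strides the plan was created for
end

# Refuse to run a plan on an array it was not planned for; executing FFTW
# on a mismatched array can corrupt memory rather than throw a Julia error.
function toy_assert_applicable(p::ToyFFTWPlan, X::StridedArray)
    size(X) == p.sz ||
        throw(ArgumentError("FFTW plan applied to wrong-size array"))
    strides(X) == p.istride ||
        throw(ArgumentError("FFTW plan applied to wrong-stride array"))
    return nothing
end
```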
@ScottPJones I am not sure what you are voting on here, or why you think one developer's time is more important than another's. It would be nice to restrict commenting to places where you are making a contribution or raising an issue. It is ok to upvote once in a while, and only if you are part of the implementation. I realize that you are excited and want to chime in on everything, but please think before chiming in, as it could end up being noise rather than adding value.
Actually this shouldn't introduce any bad conflicts, since the first commit in #6193 removes (moves/splits) this file and the conflict resolution should just be a reset. It would indeed be better if this were changed to a non-breaking, no-function-added part of #6193 (basically moving code and returning functors in
Other than that, I still believe it is nice to have this performance gain in for 0.4.
It's on the 0.5 milestone, so my impression is that it's unlikely.
Yep, that's exactly my plan. It shouldn't be too bad, and I'm waiting to see if there's anything I can change in this PR to make @stevengj's work on #6193 easier.
bump @stevengj
I'd rather just rip out the pure-Julia FFT from #6193 if people prefer that route. That should be mostly a simple matter of deleting a couple of files from that patch.
Closing in favor of #12087.
I'm aware of #6193. However, I think it is still useful to get this into 0.4 for the following reasons:
- This PR does not break the API.
- The API in this PR does not conflict with the one in WIP: new DFT api #6193 (the commits themselves might).
- Unless we're going to fix the anonymous function performance issue, this PR has a non-negligible performance improvement (see the sketch after this list).
- I have a time propagator which uses FFT to transform between position and momentum space, and this PR boosts its performance by 23% (2.6s to 2.0s), reduces the allocation count by roughly 10x (2720k -> 300k allocations), and reduces the allocated size by 130M (440M -> 310M). These numbers include the time spent on other calculations. (Cleaner benchmark: Use call overload in FFTW.plan_* #11994 (comment).)
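To make the anonymous-function point above concrete, here is a hedged comparison sketch (all names are made up for illustration; real plans wrap FFTW state, not a scale factor). In Julia 0.4, anonymous functions are not specialized by the compiler, while a call-overloaded type dispatches like any other method:

```julia
# Hypothetical micro-benchmark contrasting a closure-style "plan" with a
# call-overloaded type; neither is the actual FFTW.jl implementation.
immutable CallablePlan
    scale::Float64
end
Base.call(p::CallablePlan, x) = p.scale * x     # call-overload version

make_closure_plan(scale) = x -> scale * x        # anonymous-function version

function apply_many(p, x, n)
    s = 0.0
    for i in 1:n
        s += p(x)[1]                             # apply the "plan" repeatedly
    end
    return s
end

x = rand(128)
apply_many(CallablePlan(2.0), x, 1)              # warm up compilation
apply_many(make_closure_plan(2.0), x, 1)
@time apply_many(CallablePlan(2.0), x, 10^5)
@time apply_many(make_closure_plan(2.0), x, 10^5)
```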
A few other notes about this PR:
- assert_applicable
@stevengj?