Exploit structure of upsampled input data when doing FFTs #304
I pushed a demo for the partial FFTs as part of the #287 PR.
Any news here? Seems like an amazing performance boost! Are there any limitations as to why this shouldn't work well for both cufinufft and finufft, or for certain upsampling factors?
No news; too busy with 2.2 and various deadlines. The complication is its interaction with the vectorized interface (batching in FFTW via fftw_plan_many_dft...).
It will only make a difference for upsampfac=2 on the CPU.
To be honest, cuFFT is so fast that for the GPU it won't make much difference. (Unless maybe you have timing breakdowns in your MRI application that say otherwise... let us know.)
Happy new year, Alex
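For readers following the thread, here is a minimal numpy sketch of the pruned-FFT idea under discussion (my own illustration, not the #287 demo; the sizes and the `band` index set are made up for the example). For a type-2-style transform with upsampfac=2, only an N-wide band per axis of the 2N-wide fine grid is nonzero, so the first-axis pass can skip the all-zero rows:

```python
# Hedged sketch, not FINUFFT code: pruned 2D FFT of a zero-padded input.
# In standard FFT ordering the nonzero low-frequency band is split between
# both ends of each axis.
import numpy as np

N = 128
M = 2 * N                              # fine grid size for upsampfac=2
band = np.r_[0:N // 2, M - N // 2:M]   # wrapped indices of the nonzero band

F = np.zeros((M, M), dtype=np.complex128)
rng = np.random.default_rng(0)
F[np.ix_(band, band)] = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))

# Naive: full 2D FFT of the mostly-zero array.
ref = np.fft.fft2(F)

# Pruned: transform along axis 1 only for the N nonzero rows, scatter the
# results back, then do the unavoidable full transform along axis 0.
tmp = np.zeros_like(F)
tmp[band, :] = np.fft.fft(F[band, :], axis=1)   # N row FFTs instead of 2N
out = np.fft.fft(tmp, axis=0)                   # 2N column FFTs (dense now)

assert np.allclose(out, ref)
```

In 2D this saves half the row transforms (N of 2N), i.e. roughly 3/4 of the 1-D FFT work overall; the savings compound in higher dimensions.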
I think #463 achieves this for DUCC. I do not think supporting FFTW in the same way is worth it.
I disagree: the project of exploiting sparsity with FFTW is still interesting, with a potential 2x speedup in 3D when FFT-dominated.
I am delighted with the ducc0 performance, but we can't throw away FFTW completely...
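For what it's worth, a back-of-envelope pencil count (my own rough model, not from the codebase, ignoring memory effects and any output-side truncation) is consistent with that figure: with upsampfac=2, transforming the axes in order, axis k needs 2^(k-1)·N^(d-1) pencils instead of the naive 2^(d-1)·N^(d-1):

```python
# Rough cost model, my own estimate: 1-D FFT pencils needed when the input
# is zero outside an N-wide band per axis of a 2N-wide grid and the axes
# are transformed in order (already-transformed axes are dense).
def speedup(d: int) -> float:
    pruned = sum(2 ** (k - 1) for k in range(1, d + 1))  # = 2^d - 1
    naive = d * 2 ** (d - 1)
    return naive / pruned

for d in (1, 2, 3):
    print(d, round(speedup(d), 2))  # d=1: 1.0, d=2: 1.33, d=3: 1.71
```

That is ~1.71x in 3D from the FFT stage alone; exploiting the truncation of the output modes as well would push it toward the 2x figure.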
To clarify: I do not think adding support for sparsity in FFTW is worth the maintainability tradeoff. I am not advocating for dropping FFTW. This can be reopened in the future in case there is a strong need, or can be moved to discussions for future reference.
This issue was moved to a discussion.
Originally posted by @mreineck in #293 (comment)
I've implemented a POC for the suggestion by @mreineck that shows typical speedups of ~1.5x on 2D data vs the naive way with MKL. I'm not doing the upsampling quite the way it's done in finufft, but the answer should be correct up to a phase shift as far as I can tell. I don't have a lot of cycles to work on this right now, but this seems like a very easy way to get a huge performance bump in higher dimensions.
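For anyone who wants to experiment, here is a rough numpy reconstruction of that kind of comparison (my own sketch; the actual POC and its MKL bindings are not shown in this thread, and numpy's FFT will not reproduce the MKL timings). It exercises the type-1 direction, where only the central band of output modes is kept, so the second-axis pass can be restricted to those rows:

```python
# My own sketch of the naive-vs-pruned comparison; sizes are arbitrary.
import time
import numpy as np

N, M = 512, 1024                          # upsampfac = 2: fine grid is 2N wide
band = np.r_[0:N // 2, M - N // 2:M]      # wrapped indices of the kept modes
rng = np.random.default_rng(1)
G = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))

t0 = time.perf_counter()
ref = np.fft.fft2(G)[np.ix_(band, band)]         # naive: full 2D FFT, truncate
t1 = time.perf_counter()
tmp = np.fft.fft(G, axis=0)                      # 2N column FFTs, unavoidable
out = np.fft.fft(tmp[band, :], axis=1)[:, band]  # only N row FFTs, then truncate
t2 = time.perf_counter()

assert np.allclose(out, ref)
print(f"naive {t1 - t0:.3f}s  pruned {t2 - t1:.3f}s")
```

The phase-shift caveat mentioned above does not arise here, since both paths use the same grid ordering.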