Performance Status #77
Status Update:
The only benchmark where we are slightly slower than FINUFFT is the 3D one. Here are two remarks:
Status update: Improved. Since we are now faster than or on the same level as FINUFFT, the performance work is basically done.
Let me just hook in here.
@roflmaostc: That is a little bit more complicated:
Independently of all that: I am benchmarking large 1D / 2D / 3D transforms, and they all benefit from multi-threading. Could the issue be that your transforms are 1D and short (< 10000)?
Over the last couple of weeks I made several major performance improvements, so it makes sense to open an issue describing where we stand and what I have found out. This issue can also be used for performance tracking in the future. I estimate that I squeezed out a factor of about 3-5, and we are now on the same level as FINUFFT, the fastest NFFT package on the CPU.
Some smaller improvements are certainly still possible. Here are some ideas:
The @simd macro did not help. Probably we already use SIMD?
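One way to check the SIMD question is to look at the generated LLVM IR with and without the macro. The following is only a minimal sketch: `axpy_plain!`, `axpy_simd!` and `is_vectorized` are hypothetical names made up for illustration, and the loop is a stand-in, not a kernel from this package. The assumption is that a simple `@inbounds` loop is often auto-vectorized by LLVM even without `@simd`, which would explain why adding the macro makes no difference.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Hypothetical stand-in kernel (not from this package): y[i] += a * x[i]
function axpy_plain!(y, a, x)
    @inbounds for i in eachindex(x, y)
        y[i] += a * x[i]
    end
    return y
end

# Same loop, but explicitly annotated with @simd
function axpy_simd!(y, a, x)
    @inbounds @simd for i in eachindex(x, y)
        y[i] += a * x[i]
    end
    return y
end

# Heuristic check: does the optimized LLVM IR contain vector types
# such as <4 x double>? This only inspects the IR text.
function is_vectorized(f, types)
    io = IOBuffer()
    code_llvm(io, f, types; debuginfo=:none)
    ir = String(take!(io))
    return occursin(r"<\d+ x (double|float)>", ir)
end

T = Tuple{Vector{Float64}, Float64, Vector{Float64}}
println("plain loop vectorized: ", is_vectorized(axpy_plain!, T))
println("@simd loop vectorized: ", is_vectorized(axpy_simd!, T))
```

If both lines report `true`, the compiler already vectorizes the loop on its own and `@simd` has nothing to add. The result depends on the Julia version and the CPU, so it would have to be checked on the actual hot loops.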