Optimize convolve_sensitive_float_matrix! #486
Comments
@rgiordan -- There's a comment at the hot spot that suggests you've thought about avoiding the copy. Do you have ideas for how to do that? That loop and the one after it account for half the total runtime. The FFT takes very little time in comparison.
One somewhat invasive idea would be to make each element of a
Another crazier idea (that would require more Julia-foo than I have available off the top of my head) would be to write an
The orange bars right below the tall (and widest) towers in the profile image seem to be this line and this line, which load floating-point values into a buffer. The two towers are the DFT calculations, so loading the buffers takes far longer than computing the DFT itself. The loading seems to consume about 40-45% of the total runtime.
@jrevels and I looked a bit into this and our guess right now is that the loading is slow because of the two pointer loads and jumpy memory access pattern. This seems to be the main bottleneck right now and it can probably be optimized in various ways.
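To make the guess above concrete, here is a minimal sketch of the two access patterns. It is not the project's actual data structure: `Cell`, `load_buffer!`, and the 4-D `dense` layout are hypothetical names and assumptions chosen only to illustrate the point. The first method follows two pointers per element (matrix slot to `Cell`, then `Cell` to its own array), so consecutive reads land in unrelated heap locations. The second reads one contiguous slice of a single dense array.

```julia
# Hypothetical stand-in for a heap-allocated element type (an assumption,
# not the real type from the code base).
mutable struct Cell
    d::Matrix{Float64}   # per-pixel derivative block
end

# Pattern 1: gather entry (i, j) from every cell into a contiguous buffer.
# Each iteration loads the Cell reference, then the reference to its `d`
# array, and only then the value: two pointer loads plus a scattered read.
function load_buffer!(buf::Matrix{Float64}, cells::Matrix{Cell}, i::Int, j::Int)
    @inbounds for k in eachindex(cells)
        buf[k] = cells[k].d[i, j]
    end
    return buf
end

# Pattern 2: the same values stored in one dense 4-D array with the pixel
# indices first. Julia arrays are column-major, so dense[:, :, i, j] is a
# contiguous block and the load becomes a single linear copy.
function load_buffer!(buf::Matrix{Float64}, dense::Array{Float64,4}, i::Int, j::Int)
    copyto!(buf, view(dense, :, :, i, j))
    return buf
end

# Usage: both layouts hold the same numbers and fill the buffer identically.
h, w, n = 4, 5, 3
dense = rand(h, w, n, n)
cells = [Cell(dense[a, b, :, :]) for a in 1:h, b in 1:w]
buf1 = load_buffer!(zeros(h, w), cells, 2, 3)
buf2 = load_buffer!(zeros(h, w), dense, 2, 3)
```

Whether a layout change like this is practical here depends on how invasive it would be elsewhere in the code; the sketch only shows why the gather loop can cost more than the transform it feeds.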