64-bit FFTW #1617

ViralBShah · 2012-11-27T14:56:33Z

FFTW also has a 32-bit interface. As per the documentation, it seems that we have to use the guru interface to get 64-bit.

http://www.fftw.org/doc/64_002dbit-Guru-Interface.html#64_002dbit-Guru-Interface

ViralBShah · 2012-12-20T13:42:28Z

@stevengj Does FFTW have a 64-bit API? I found the link above in the documentation, which suggests that one has to use the Guru interface for 64-bit.

This is in the context of #1527 and #1532.

StefanKarpinski · 2012-12-20T16:12:35Z

You don't really need a 64-bit FFTW interface since you can just use the 32-bit interface in a loop.

ViralBShah · 2012-12-20T18:13:54Z

How? It would need non-trivial work. I'd just rather use the 64-bit interface if such a thing was available.

StefanKarpinski · 2012-12-20T19:52:28Z

Sorry, I was thinking of DSFMT rather than FFTW. You're right, of course, for FFTW you need to have a 64-bit interface.

ViralBShah · 2012-12-20T20:42:29Z

Ok. DSFMT is already patched up - that was easy.

stevengj · 2012-12-20T21:41:07Z

Yes, the 64-bit API in FFTW is the guru64 interface (http://www.fftw.org/fftw3_doc/64_002dbit-Guru-Interface.html). So, you can't use the basic interface if you want 64-bit support, but since you are creating the plans internally the messiness of calling the guru interface for simple things is hidden anyway.

PS. Would be nice to have a simple way to plan with FFTW_PATIENT. Also, there is the issue of data alignment (you need 16-byte aligned arrays if you want to get the benefit of SSE2, AVX, etc., and 32-byte aligned is better).

ViralBShah · 2012-12-20T22:58:02Z

All arrays in julia are all 16-byte aligned. Is 32-byte alignment required by AVX?

Could you say a bit more about planning with FFTW_PATIENT? Do you mean exposing some of the FFTW plan APIs in julia to provide more flexibility? Nobody had asked for it, until now.

stevengj · 2012-12-21T03:56:40Z

32-byte alignment is not required by AVX, it is just a bit faster. The good news is that 16-byte alignment is all that is required by FFTW to re-use a plan from one array to another of the same size.

I'm not surprised that no one has asked for it, since users rarely dig into performance issues or do careful benchmark comparisons in my experience. Even forgetting about FFTW_PATIENT, you are sacrificing a lot of performance by creating and destroying the plan for every FFT (which not only re-creates the plan but also recomputes the trigonometric tables etcetera each time).

For example, for a complex double-precision FFT of size 1024, @time fft(x) in Julia reports a running time of about 60us, whereas FFTW's tests/bench -oestimate 1024 reports a running time of 16us. Using FFTW_PATIENT (tests/bench -opatient 1024) results in a running time of 9us. For larger sizes, the overhead of re-creating and destroying the plan each time is less, but the advantage of FFTW_PATIENT is greater. e.g. for size 2^18, Julia takes 29ms while tests/bench -oestimate gives 25ms, but tests/bench -opatient gives a runtime of only 8ms (albeit after 5min to create the plan).

It would be nice to have something like:

  julia> p = plan_fft(x)
  julia> y = p(x);  # computes fft(x) efficiently
  julia> z = p(x.^2); # re-use plan for a different array of same dimensions/strides

That is, plan_fft would just return an optimized fft function (technically, a wrapper around a plan and fftw_execute_dft). (Re-using the plan in this way, at least for arrays of the same strides, is safe since Julia guarantees 16-byte alignment.) plan_fft(x) could default to FFTW_ESTIMATE, but you could also support plan_fft(x, FFTW_PATIENT) etcetera, and even plan_fft(x, FFTW_PATIENT, 10sec) and similar to use the fftw_timelimit mechanism to specify a time limit on the plan creation. And of course plan_fftn, plan_bfft, and similar variants.

Seems like this should be fairly easy to implement (I guess you can use finalizer to make the gc call fftw_destroy_plan when p is garbage-collected).

You'd also want to export the import/export_wisdom functions so that people could save plans.

ViralBShah · 2013-01-02T15:13:24Z

Fixed in #1856

ViralBShah mentioned this issue Dec 21, 2012

Improved FFTW interface #1805

Closed

stevengj mentioned this issue Dec 31, 2012

RFC: rewritten FFTW interface #1856

Merged

ViralBShah closed this as completed Jan 2, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

64-bit FFTW #1617

64-bit FFTW #1617

ViralBShah commented Nov 27, 2012

ViralBShah commented Dec 20, 2012

StefanKarpinski commented Dec 20, 2012

ViralBShah commented Dec 20, 2012

StefanKarpinski commented Dec 20, 2012

ViralBShah commented Dec 20, 2012

stevengj commented Dec 20, 2012

ViralBShah commented Dec 20, 2012

stevengj commented Dec 21, 2012

ViralBShah commented Jan 2, 2013

64-bit FFTW #1617

64-bit FFTW #1617

Comments

ViralBShah commented Nov 27, 2012

ViralBShah commented Dec 20, 2012

StefanKarpinski commented Dec 20, 2012

ViralBShah commented Dec 20, 2012

StefanKarpinski commented Dec 20, 2012

ViralBShah commented Dec 20, 2012

stevengj commented Dec 20, 2012

ViralBShah commented Dec 20, 2012

stevengj commented Dec 21, 2012

ViralBShah commented Jan 2, 2013