
Improved FFTW interface #1805

Closed
ViralBShah opened this issue Dec 21, 2012 · 15 comments
Labels
help wanted · performance

Comments

@ViralBShah
Member

From the discussion in #1617, @stevengj suggests updating the FFTW interface as follows.

I'm not surprised that no one has asked for it, since users rarely dig into performance issues or do careful benchmark comparisons in my experience. Even forgetting about FFTW_PATIENT, you are sacrificing a lot of performance by creating and destroying the plan for every FFT (which not only re-creates the plan but also recomputes the trigonometric tables etcetera each time).

For example, for a complex double-precision FFT of size 1024, @time fft(x) in Julia reports a running time of about 60us, whereas FFTW's tests/bench -oestimate 1024 reports a running time of 16us. Using FFTW_PATIENT (tests/bench -opatient 1024) results in a running time of 9us. For larger sizes, the overhead of re-creating and destroying the plan each time is less, but the advantage of FFTW_PATIENT is greater. e.g. for size 2^18, Julia takes 29ms while tests/bench -oestimate gives 25ms, but tests/bench -opatient gives a runtime of only 8ms (albeit after 5min to create the plan).

It would be nice to have something like:

julia> p = plan_fft(x)
julia> y = p(x);  # computes fft(x) efficiently
julia> z = p(x.^2); # re-use plan for a different array of same dimensions/strides

That is, plan_fft would just return an optimized fft function (technically, a wrapper around a plan and fftw_execute_dft). (Re-using the plan in this way, at least for arrays of the same strides, is safe since Julia guarantees 16-byte alignment.) plan_fft(x) could default to FFTW_ESTIMATE, but you could also support plan_fft(x, FFTW_PATIENT) etcetera, and even plan_fft(x, FFTW_PATIENT, 10sec) and similar to use the fftw_timelimit mechanism to specify a time limit on the plan creation. And of course plan_fftn, plan_bfft, and similar variants.

Seems like this should be fairly easy to implement (I guess you can use finalizer to make the gc call fftw_destroy_plan when p is garbage-collected).

You'd also want to export the import/export_wisdom functions so that people could save plans.
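For concreteness, here is a minimal sketch of the closure approach, assuming libfftw3 can be loaded by name and using present-day ccall/finalizer syntax; the names Plan1D and plan_fft_sketch are illustrative only, not the proposed API:

# A sketch only: assumes libfftw3 is on the library search path.
const libfftw = "libfftw3"
const FFTW_FORWARD  = Int32(-1)
const FFTW_ESTIMATE = UInt32(1) << 6       # planner flag from fftw3.h

mutable struct Plan1D                      # mutable so a finalizer can be attached
    p::Ptr{Cvoid}                          # opaque fftw_plan handle
end

function plan_fft_sketch(x::Vector{ComplexF64}, flags::UInt32 = FFTW_ESTIMATE)
    out  = similar(x)
    plan = Plan1D(ccall((:fftw_plan_dft_1d, libfftw), Ptr{Cvoid},
                        (Cint, Ptr{ComplexF64}, Ptr{ComplexF64}, Cint, Cuint),
                        length(x), x, out, FFTW_FORWARD, flags))
    # let the gc call fftw_destroy_plan once the plan is unreachable
    finalizer(pl -> ccall((:fftw_destroy_plan, libfftw), Cvoid, (Ptr{Cvoid},), pl.p), plan)
    # the closure re-uses the plan for any array of the same size (a real
    # implementation would also check strides and 16-byte alignment)
    return y -> begin
        z = similar(y)
        ccall((:fftw_execute_dft, libfftw), Cvoid,
              (Ptr{Cvoid}, Ptr{ComplexF64}, Ptr{ComplexF64}), plan.p, y, z)
        z
    end
end

# usage, mirroring the example above:
#   p = plan_fft_sketch(x)
#   y = p(x)
#   z = p(x.^2)
# wisdom could similarly be saved with
#   ccall((:fftw_export_wisdom_to_filename, libfftw), Cint, (Cstring,), "fftw.wisdom")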

@ViralBShah
Member Author

We may want to have an FFTW_Plan type, and then dispatch on that.

@staticfloat
Member

So would there be a "default fftw plan" that is created during the sysimg.jl call, or would this have to be recomputed every time julia starts up?

@ViralBShah
Member Author

The plan depends on the size of the input.

@staticfloat
Member

Oh, of course. So the current fft(x) would map to plan_fft(x)(x)?

@stevengj
Member

@staticfloat The only potential shortcomings of implementing fft(x) as plan_fft(x)(x) are (a) you might want fft(x) to garbage-collect the plan immediately rather than waiting for the gc to run and (b) given p = plan_fft(x), executing p(x) will have a slight overhead because p(x) will need to check to make sure that x has the same shape, strides, and alignment mod 16 (differing alignments may occur for subarrays of larger arrays) as when p was created, and this overhead is unnecessary in plan_fft(x)(x).

One solution to (b) would be to internally have an unsafe_plan_fft(x) function which produces a function that does not check the shape/strides/alignment of its argument. You would then implement

fft(x) = unsafe_plan_fft(x,FFTW_ESTIMATE,0)(x)

and something like

function plan_fft(x, flags, timelimit)
    local p = unsafe_plan_fft(x, flags, timelimit)
    # check that y matches the shape/strides/alignment the plan was created for
    return y -> equal_shape_stride_align(x, y) ? p(y) :
                throw(ArgumentError("array does not match the plan"))
end
plan_fft(x, flags) = plan_fft(x, flags, 0.0)
plan_fft(x) = plan_fft(x, FFTW_ESTIMATE)

Actually, on second thought (a) is not a problem---the lack of immediate garbage collection is actually a good thing. If the user calls fft(x) several times in a row, then if the previous plan is not yet garbage-collected FFTW will automatically re-use its table of trigonometric constants (which is destroyed when the last plan of a given size is fftw_destroy'ed).
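A minimal sketch of what that hypothetical equal_shape_stride_align guard could look like (it is not an FFTW function, just the check the safe wrapper would need):

# hypothetical helper: true if y can safely re-use a plan created for x
function equal_shape_stride_align(x::DenseArray, y::DenseArray)
    size(x) == size(y) &&
        strides(x) == strides(y) &&
        # FFTW plans can depend on 16-byte (SIMD) alignment, so compare it too
        UInt(pointer(x)) % 16 == UInt(pointer(y)) % 16
end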

@stevengj
Member

As mentioned in issue #1806, it would also be good to

  • Remove the default fft(x) = fft(x,1) behavior except for 1d arrays.
  • Support fft(x, [1,3]) and similar to transform along multiple dimensions (this is supported efficiently via FFTW's guru interface).

And similarly for plan_fft, of course. fftn could then be implemented as simply fft(x,[1:ndims(x)]), which is essentially what FFTW does internally anyway.

And for completeness you should probably have fft(x::Number) = x etc.
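As a toy illustration of that proposed dispatch (a naive O(n²) DFT stands in for FFTW, and the toy_* names are not the proposed API), transforming along all dimensions reproduces fftn:

# naive 1-d DFT, only to make the dispatch sketch self-contained
naive_dft(v::AbstractVector) =
    [sum(v[n+1] * cis(-2pi * k * n / length(v)) for n in 0:length(v)-1)
     for k in 0:length(v)-1]

toy_fft(x::AbstractVector) = naive_dft(x)      # 1-d default only for vectors
function toy_fft(x::AbstractArray, dims)       # transform along the given dims
    y = complex(float(x))
    for d in dims
        y = mapslices(naive_dft, y; dims=d)
    end
    y
end
toy_fftn(x::AbstractArray) = toy_fft(x, 1:ndims(x))   # fftn == fft over all dims
toy_fft(x::Number) = x                                # trivial scalar case

x = rand(4, 4, 4)
@assert toy_fftn(x) ≈ toy_fft(x, [3, 1, 2])    # transforms along different dims commute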

@ViralBShah
Member Author

Agreed. The FFTW interface is ripe for an overhaul. I propose an FFTWPlan type that can hold a bunch of information about how the plan was generated, along with constructors for creating plans.

We can then even have interfaces that allow reusing a plan by providing it as the first input argument.

fft(plan::FFTWPlan, args...)
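For illustration, a bare sketch of that dispatch-based shape (field names are hypothetical, and execute is a stub standing in for the eventual fftw_execute_dft wrapper):

struct FFTWPlan{N}
    handle::Ptr{Cvoid}        # opaque fftw_plan
    sz::NTuple{N,Int}         # size the plan was generated for
    flags::UInt32             # planner flags (FFTW_ESTIMATE, FFTW_PATIENT, ...)
end

execute(p::FFTWPlan, x::AbstractArray) = error("stub: would call fftw_execute_dft")

fft(p::FFTWPlan, x::AbstractArray) = execute(p, x)   # re-use an existing plan
# fft(x::AbstractArray, args...) would construct a plan internally and call the above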

@stevengj
Member

I don't quite understand the point of an FFTWPlan type. Wouldn't it be easier for the user to have plan_fft just return a function as I suggested? That way, the user doesn't have to think about an additional data structure, they just have two options: use the fft function, or a custom "optimized size-specific fft" function generated by plan_fft (with similar arguments).

It seems like the only reason to have a separate FFTWPlan type is if you want to do other things with the plan besides execute it. In theory, there are a few low-level routines that one could call, but in practice it is probably so rare for users to want these that such users can just call the FFTW interface directly (or lower-level routines hidden in fftw.jl).

@ViralBShah
Member Author

The only reason I find the FFTWPlan type appealing is that it would allow multiple dispatch on fft:

fft(plan::FFTWPlan, args...)
fft(args...)

That way, fft continues to work as expected, but you get the option of using the plan with the first argument being FFTWPlan. To me, this usage feels more idiomatic, but returning a function achieves the same outcome.

@stevengj
Member

Effectively a plan is a function, so you might as well make it one for real since you have closures; aren't higher-order functions idiomatic too?

Also, it has slightly different arguments from fft, since it only takes the array and not any extras like which dimension to transform (that would have already been fixed when the plan was created).

@tknopp
Contributor

tknopp commented Dec 23, 2012

It would be good if FFTWPlan inherited from AbstractMatrix{T}. Then one could overload the operators * and \ like

*{T}(A::FFTWPlan, x::Array{T,1}) = fftn(reshape(x, A.shape))
\{T}(A::FFTWPlan, x::Array{T,1}) = ifftn(reshape(x, A.shape))

Note that this would require storing the shape of the underlying multidimensional FFT in the plan, since the * operator should act on a vector, which can nevertheless represent a multidimensional structure. FFTWPlan might then be better named FFTMatrix or something like that.

It would be cool if one could implement this for the common linear transformations in a consistent fashion (DCT, DST, Wavelet transform, ...)

My motivation for a matrix type for common structured linear transformations is that one could then write iterative linear solvers and apply them to structured transformations without any modifications. I have written some compressed sensing algorithms that take an AbstractMatrix{T} argument and passed them a SparseFFTMatrix, which implements * and \, and this worked nicely.
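A minimal sketch of that operator idea, using the present-day FFTW.jl package purely for illustration (this thread predates it) and the FFTMatrix name suggested above; a fuller version would subtype AbstractMatrix and keep separate forward/inverse plans:

using FFTW    # illustration only; fft/ifft here are the package's functions

struct FFTMatrix{N}
    shape::NTuple{N,Int}     # shape of the underlying n-dimensional transform
end

# act on vectors so that generic iterative solvers can use the operator
Base.:*(A::FFTMatrix, x::AbstractVector) = vec(fft(reshape(x, A.shape)))
Base.:\(A::FFTMatrix, y::AbstractVector) = vec(ifft(reshape(y, A.shape)))

A = FFTMatrix((8, 8))        # a 2-d transform seen as an operator on length-64 vectors
x = rand(ComplexF64, 64)
@assert A \ (A * x) ≈ x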

@stevengj
Member

@tknopp: You are missing the point: an FFTPlan is specific to a particular direction of transform, so a single plan cannot perform both fft and ifft.

Furthermore, this opens up a can of worms, I think. For one thing, one should really be able to multiply and add linear operators to produce new ones, so you need to make sure to define a bunch of new operations if you want this to be a general feature as opposed to an application-specific representation. (In some cases operators can be combined and simplified, and in other cases you will internally just compose the functions.)

Furthermore, if you have a linear operator on 1d vectors and act it on a higher-dimensional array, it would make the most sense for it to act as a tensor contraction, i.e. act on the first dimension of the higher-dimensional array and loop over the other dimensions. And if you have a linear operator on higher-dimensional arrays and act it on a lower-dimensional one, it should just throw an error. Reshaping does not make sense to me in general, except for application-specific cases as mentioned below.

Furthermore, in the former case of acting a 1d operator on a higher-dimensional array, using an FFTWPlan here does not work, because the whole point of an FFTWPlan (or an FFT function, or whatever) is that it is an optimized FFT for one particular size and shape (and alignment, etc.); performing a loop of FFT operations is best done with a separate plan.

I totally agree that iterative linear solvers should work on AbstractMatrices that can be opaque functions, much like PETSc does. But you are really best off letting the user define such functions herself rather than trying to guess what that function will look like. For example, if one is using an iterative solver with a spectral method, the operator is almost certainly not just an FFT, it will be FFTs combined with other operations, so a predefined FFTOperator by itself will be pretty useless.

So I think it makes the most sense to keep fft and plan_fft as separate, less-abstract/lower-level operations that transform arrays and produce functions that transform arrays, respectively, from which the user can build abstract linear operators as needed for particular applications (and she need only define the operations that are needed, unlike a "generic" implementation in Julia itself).

@tknopp
Contributor

tknopp commented Dec 23, 2012

@stevengj: Ok, I had forgotten that a plan is only for one direction. Then the matrix abstraction should probably go in a higher-level package. At first sight it seemed redundant to provide both a low-level plan and a matrix type when one type could do both, but you have proven me wrong.

About the reshaping: my proposal was for a multidimensional Fourier matrix (i.e. how to build a matrix type for fftn). The reshaping is only used to apply fftn with the right shape. It would not make sense for the * operator to act on a multidimensional array if the FFTMatrix is supposed to be usable in iterative linear solvers, which I would really like to implement in a generic way.

I still think it is useful to provide a matrix type for common linear transformations, even when the user's application might need more specialized ones. As you propose, it should be possible to compose linear transformations; this could be done with a matrix-product type. Simplification would be very cool but would make it a very complex feature, and I don't know whether it is really necessary, as these simplifications are usually carried out by hand.

@stevengj
Member

stevengj commented Jan 2, 2013

Fixed by #1856.

@Keno closed this as completed Jan 2, 2013
@ViralBShah
Member Author

Relevant discussion on julia-users

https://groups.google.com/forum/#!topic/julia-users/kV-sKVFR2n0

@tknopp mentioned this issue Mar 18, 2014