WIP: new DFT api #6193
I know that we have discussed it in #1805, but it would still be nice if the plan would also allow for …
This is quite amazing. Having native code is particularly amazing. For multidimensional transforms, what would be needed to have this easily interact with …?

One other tiny question: the word "dimensions" is used somewhat inconsistently in the standard library. Sometimes it refers to the size of an array (in the same way that you might talk about the dimensions of a room in your house), and sometimes it refers to the indices of particular coordinates (i.e., the 2nd dimension). For what you typically mean by …
What about calling it …?

When it comes to selecting one particular "dimension", one could either use …
Like that suggestion; but since the same issue just came up on julia-users, and I'll feel guilty if this PR gets hijacked by a terminology bikeshed, let's move this discussion elsewhere: https://groups.google.com/forum/?fromgroups=#!topic/julia-users/VA5rtWlOdhk
Yes indeed. It is really cool to get a native Julia implementation of the FFT. Even better that this is done by someone that already has some experience with FFTs ... ;-)
@tknopp, I see two options. One would be to store both the forward and backward plans in …
Or solution 3: have some extra plan that supports both directions, and maybe have some keyword for …
Having another plan type seems messy. Maybe the best solution is to have `p \ x` or `inv(p)` compute the inverse plan lazily the first time it is needed, maybe caching it in `p`.
Well, this would be another option. This could be generalized in a way that one can specify (as a keyword argument) which of the two plans one wants to be precomputed on initialization.
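A minimal sketch of the lazy approach discussed above, in the 0.3-era syntax used throughout this thread. The names `FooPlan` and `compute_inverse` are hypothetical and not part of this PR:

```julia
# Hypothetical sketch: compute the inverse plan on first use and cache it.
import Base: inv, \

abstract Plan{T}

type FooPlan{T} <: Plan{T}
    pinv::Plan{T}        # cached inverse plan; left undefined until first use
    FooPlan() = new()
end

function inv{T}(p::FooPlan{T})
    if !isdefined(p, :pinv)
        p.pinv = compute_inverse(p)  # hypothetical helper building the backward plan
    end
    return p.pinv
end

(\)(p::FooPlan, x) = inv(p) * x      # p \ x builds (and caches) the inverse lazily
```

With this design, neither `inv(p)` nor `p \ x` pays for the backward plan unless it is actually requested, and repeated inversions reuse the cached plan.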
OK, now I have an honest-to-goodness engineering question for you. I'm aware that what I'm about to ask makes it harder to cope with my other question about …

I notice that with FFTW, transforming along dimension 2 is substantially slower than transforming along dimension 1 (all times after suitable warmup):

…

Presumably this is because of cache. In Images, I've taken @lindahua's advice to heart, and implemented operations on dimensions higher than 1 in terms of interwoven operations along dimension 1. For example, with …

Slightly faster along dimension 2 than 1! An even more dramatic example is …

Notice that the 4-fold difference with Grid's version is about the same as for FFTW. I'm not sure the FFT can be written so easily in terms of interwoven operations, but given the crucial importance of this algorithm in modern computing I thought it might be worth asking. No better time than when you're in the middle of re-implementing it from scratch.
FFTW does implement a variety of strategies for "interweaving" the transforms along the discontiguous direction (see our Proc. IEEE paper). You should notice a much smaller performance difference if you use the …
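A self-contained way to reproduce the kind of dimension-1 vs. dimension-2 comparison discussed above. This assumes the `plan_fft(A, dims)` form described in this thread; timings are machine-dependent and shown only to illustrate the effect:

```julia
# Compare transforming along the contiguous vs. the strided dimension.
A = rand(Complex128, 1024, 1024)

p1 = plan_fft(A, 1)   # transform along the contiguous (first) dimension
p2 = plan_fft(A, 2)   # transform along the strided (second) dimension

p1 * A; p2 * A        # warm up, so @time measures the transform alone
@time p1 * A
@time p2 * A          # typically slower: strided access is less cache-friendly
```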
This is hugely impressive, very nice.
This looks pretty awesome. One thing I've wondered about: would it be reasonable to use an LRU cache for FFTW plan objects created by …?
@simonster, that's a good idea, but I'd prefer to do that in a generic way that is not restricted to FFTW (i.e. a cache of …).

However, I'd prefer to put that functionality, assuming it can be implemented cleanly, into a separate PR following this one, since there's a lot going on in this PR already. Ideally, this sort of cache would be completely transparent and wouldn't affect the public API.
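A rough sketch of what such a transparent, FFTW-independent plan cache could look like, keyed by array type, size, and transform region. None of these names (`PLAN_CACHE`, `cached_plan`) are part of the actual PR:

```julia
# Hypothetical sketch of a transparent plan cache.
const PLAN_CACHE = Dict{Any,Any}()
const PLAN_CACHE_MAX = 16

function cached_plan(x, region)
    key = (typeof(x), size(x), region)
    haskey(PLAN_CACHE, key) && return PLAN_CACHE[key]
    p = plan_fft(x, region)
    # Evict an arbitrary entry when full; a real LRU would track recency of use.
    if length(PLAN_CACHE) >= PLAN_CACHE_MAX
        delete!(PLAN_CACHE, first(keys(PLAN_CACHE)))
    end
    PLAN_CACHE[key] = p
    return p
end
```

Because the cache only intercepts plan creation, `fft(x)` could call `cached_plan(x, region) * x` internally without any visible change to the public API.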
@timholy, in principle it is straightforward to get shared-memory parallelism in an FFT, since many of the loops can be executed in parallel, though getting good cache performance is hard. You can already use FFTW's shared-memory parallelism on any …
@tknopp, the latest version of the patch now supports …
great, thanks!
The review comments below refer to this snippet:

```julia
# similar to FFTW, and we do the scaling generically to get the ifft:
type ScaledPlan{T} <: Plan{T}
    p::Plan{T}
```
Since `Plan` is abstract, I think you might have to parametrize by the type of the plan itself to get type inference?
Is it possible to write something along the lines of `type ScaledPlan{P<:Plan{T}} <: Plan{T}; p::P; ...; end`? Otherwise I don't know how to make `ScaledPlan` a subtype of the correct `T` for `Plan{T}`.
Hmm, I guess I could do:

```julia
type ScaledPlan{T,P<:Plan} <: Plan{T}
    p::P
    ....
end
```

and just make sure in the constructor that `P <: Plan{T}`.
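Concretely, that constructor check might look like the following sketch. The `scale` field is an assumption about what `ScaledPlan` stores; the elided fields are omitted:

```julia
# Sketch: enforce P <: Plan{T} in the inner constructor, since the
# parameter list alone cannot express the relationship between T and P.
type ScaledPlan{T,P<:Plan} <: Plan{T}
    p::P
    scale::T   # assumed field: the normalization factor applied after p
    function ScaledPlan(p::P, scale)
        isa(p, Plan{T}) || error("p must be a Plan{$T}")
        new(p, scale)
    end
end
```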
There may be a better way, but one approach that should work is to parametrize by both `T` and the plan type.
I've been playing with this, and it seems rather hard to avoid nasty recursive type definitions. Suppose we have a type `FooFFT{T,true} <: Plan{T}` that computes FFTs, and that the inverse is of a slightly different type `FooFFT{T,false}`. I want to put a correctly typed `pinv` field in it (initially undefined) like this:

```julia
type FooFFT{T,forward}
    pinv::FooFFT{T,!forward}
    FooFFT() = new()
end
```

But this isn't allowed (no method `!(TypeVar)`). So, instead, I parameterize by the type of `pinv`:

```julia
type FooFFT{T,forward,P} <: Plan{T}
    pinv::P
    FooFFT() = new()
end
```

But then my initialization requires an infinitely recursive type:

```julia
FooFFT{T,true,FooFFT{T,false,FooFFT{T,true,...}}}()
```

One compromise would be to initialize it as

```julia
FooFFT{T,true,FooFFT{T,false,Plan{T}}}()
```

which means that `inv(inv(p))` would not have a concrete type that is detectable by the compiler.

Is there a better way?
@stevengj this is really nice. I love the new Plan API; it will make it much easier to support the same API for GPU-accelerated FFTs in CLFFT.jl and, I suspect, CUDA's FFT library.
Just rebased, and it looks like the performance problems with non-power-of-two sizes have disappeared: if I disable SIMD in FFTW (via the …

I'm guessing the recent inlining improvements get the credit here, but I would have to bisect to be sure.
This is really exciting! Have you tried whether …?
@jtravs, it looks like …
I'm seeing some weirdness in the FFTW linkage now that I've upgraded … In particular, I use

…

to get the FFTW version number. When this command runs during the Julia build in order to initialize the …

@staticfloat, could this be similar to the PCRE woes in #1331 and #3838? How can I fix it so that Julia uses its own version of FFTW consistently, regardless of what is installed in …?
@ScottPJones, the performance isn't crucial for merging as long as we still have FFTW for all of the common types, which is why I thought it didn't need to be checked off for 0.4. But it would be nice to figure out why the speed keeps oscillating over time.
Finally back! @stevengj Sorry, I didn't make myself clear, I meant that it looked like the performance issue mentioned in the top comment had been taken care of, and that somebody had struck out the text about it, but had simply forgotten to check off the box... I didn't mean at all that this PR should be held up for any reason except for any remaining things like deprecation warnings. It looks like a very worthwhile improvement to the code... I like that it uses a more flexible framework... and for people who can't use FFTW, the decent performance it has is still infinitely better than NO performance at all.
Okay, changed the …
I'd just like to register a vote here for merging this soon, even if performance and other issues need to be resolved later. I'm currently having to merge this pull request into a local julia repository and it is a bit of a pain TBH.
Yes, +1 for merging
As an additional comment, is there any chance of getting a nicer syntax for preallocated output than …?
It should also be possible to make InplaceOps.jl work for this.
Closing this as the new DFT API has been merged separately, and now a separate patch with the pure-Julia FFTs is needed.
I've rebased this branch, should I push it out?
Please do, thanks @yuyichao.
The rebase is here. I simply rebased it and checked all the conflicts. I didn't check the compatibility with the current master of the non-conflict parts, and I also didn't squash any of the non-empty commits.

Edit: the branch is on top of the current master now (after my new DFT API tweak for type inference).
@yuyichao Bump! Is this going to be a PR (now that 0.4 is branched)? :)
I don't think I'm familiar with this enough to do the PR. I rebased the branch to make sure the part I want to get into 0.4 doesn't mess up the rest of this PR too much. CC @stevengj
It's not a priority for me at the moment, but it is certainly something that could be rebased onto 0.5 if needed.
Sorry for reviving a now 4+ year old thread, but is there a differentiable FFT in Julia? IIUC, FFTW.jl is a C wrapper, so it is not, plus it is also GPL. Thanks.
We have the adjoints for FFTW.jl in Zygote.jl already, and will change that to AbstractFFT.jl very soon too; see https://github.com/FluxML/Zygote.jl/blob/17ca911b82134c4a765822cd2b7ee19e959cc8e4/src/lib/array.jl#L777 for reference.
This is not quite ready to merge yet, but I wanted to bring it to your attention to see what you think. It is a rewritten DFT API that provides:

- `p = plan_fft(x)` (and similar) now returns a subtype of `Base.DFT.Plan`. It acts like a linear operator: you can apply it to an existing array with `p * x`, and apply it to a preallocated output array with `A_mul_B!(y, p, x)`. You can apply the inverse plan with `p \ x` or `inv(p) * x`.
- Partly since we have discussed moving FFTW to a module at some point for licensing reasons, and partly to support numeric types like `BigFloat`, the patch also includes a pure-Julia FFT implementation, with the following features:
  - `AbstractVector`s of arbitrary `Complex` types (including `BigFloat`), with fast algorithms for all sizes (including prime sizes). Multidimensional transforms of arbitrary `StridedArray` subtypes.
  - ~~For non-power-of-two sizes, the performance is several times worse, but I suspect that there may be some fixable compiler problem there (a good stress test for the compiler).~~ ~~(Non-power-of-two sizes seem fine with latest Julia.)~~ (Now they suck again.)
  - `import FFTW` is enough to automatically switch over to the FFTW algorithms for supported types.

Major to-do items: …

Minor to-do items (will eventually be needed to move the FFTW code out to an external module, but should not block merge):

- `rfft` support.
- … `maximum(size(x))`.)

cc: @JeffBezanson, @timholy
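The plan API described above can be exercised like this (a sketch based on the description; exact module paths and exported names may differ from the merged code):

```julia
# Sketch of the Plan-as-linear-operator API from the PR description.
x = rand(Complex128, 256)

p = plan_fft(x)          # returns a subtype of Base.DFT.Plan
y = p * x                # apply the plan like a linear operator
A_mul_B!(y, p, x)        # fill a preallocated output array in-place
x2 = p \ y               # apply the inverse plan ...
x3 = inv(p) * y          # ... equivalently, via inv(p)
```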