
NFFT in parallel #17

Closed

DeVerMyst opened this issue Sep 29, 2016 · 40 comments

@DeVerMyst

Hello, while trying to update a piece of software to Julia v0.5 I ran into this problem:

nx = 64
nb = 195
nw = 20

serial

using NFFT

println(" serial ")
fr = rand(2, nb) - 0.5
plan = NFFTPlan(fr, (nx, nx))
x = rand(nx, nx) + 0.0im
F = nfft(plan, x)

parallel 1

println(" p1 ")
Fx = SharedArray(Complex{Float64}, (nb, nw))

# plan constructed inside the loop body
@sync @parallel for n in 1:nw
    println("worker")
    plan = NFFTPlan(fr, (nx, nx))
    x = rand(nx, nx) + 0.0im
    Fx[:, n] = nfft(plan, x)
end

# plan constructed once, outside the loop
plan = NFFTPlan(fr, (nx, nx))
@sync @parallel for n in 1:nw
    println("worker")
    x = rand(nx, nx) + 0.0im
    Fx[:, n] = nfft(plan, x)
end

julia> include("testnfft.jl")
serial
p1
From worker 2: worker

signal (11): Segmentation fault: 11
while loading no file, in expression starting on line 0
fftw_execute_dft at /Applications/Julia-0.5.app/Contents/Resources/julia/lib/julia//libfftw3.dylib (unknown line)

at ./fft/FFTW.jl:624
@robertdj
Collaborator

robertdj commented Sep 29, 2016

On Julia v0.4.5 the code runs without errors. Let me find a v0.5 and test.

@robertdj
Collaborator

I can run your code without errors in Julia v0.5.0 on Linux Mint. What does versioninfo() return on your computer?

@DeVerMyst
Author

Hello, I forgot to mention that I only have problems with this code when Julia is using several workers (julia -p nprocs). Similar code worked perfectly with version 0.4.5 of Julia.

julia> versioninfo()
Julia Version 0.5.0
Commit 3c9d753 (2016-09-19 18:14 UTC)
Platform Info:
System: Darwin (x86_64-apple-darwin13.4.0)
CPU: Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
LAPACK: libopenblas64_
LIBM: libopenlibm
LLVM: libLLVM-3.7.1 (ORCJIT, sandybridge)

Regards

@robertdj
Collaborator

I get seg faults on both Julia v0.4.5 and v0.5.0, but the fault is happening in fft (in Base Julia).

A recent change in NFFT is to use a precomputed plan to perform the ordinary FFT part: What used to be fft!(p.tmpVec) is now p.forwardFFT * p.tmpVec. Changing back to the old way makes your code run without errors.

I don't have experience with @sync and @parallel, but it seems weird to me that you can reuse the same NFFT plan in parallel loops: Each of them overwrites p.tmpVec for intermediate calculations. Have you verified that you get the correct results?
It seems to me that the only explanation is that multiple workers are not allowed to use the same ordinary FFT plan...
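
For illustration, the change boils down to something like this (untested sketch; tmpVec and forwardFFT here are just stand-ins for the plan fields mentioned above, not the actual NFFT.jl internals):

# Stand-ins for the relevant NFFTPlan fields.
tmpVec = zeros(Complex{Float64}, 32, 32)
forwardFFT = plan_fft!(tmpVec)   # precomputed in the process that runs this line

# Old approach: the FFTW plan is created implicitly on every call,
# in whatever process executes it.
fft!(tmpVec)

# New approach: reuse the precomputed plan. If forwardFFT was created on the
# master and this line runs on a worker, FFTW executes a foreign plan.
forwardFFT * tmpVec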

@robertdj
Collaborator

robertdj commented Oct 1, 2016

Testing with the ordinary FFT confirms my suspicion: Precomputed plans for FFT do not work in parallel. Check e.g. the following:

A = rand(10, 4) + im    # complex input
P = plan_fft( A[:,1] )  # precomputed plan, created on the master process
B = similar(A)

@sync @parallel for n in 1:4
    A_mul_B!( B[:,n], P, A[:,n] )   # apply the master's plan on a worker
end

This seg faults with julia -p 2 in Julia v0.5.0.

It's unfortunate that the precomputed FFT plans in NFFT cause problems, as this approach allows us to have only one nfft! method (and use less time/memory).

@tknopp : Any thoughts on this? Maybe it's better to ask wiser people for help with the FFT.

@tknopp
Member

tknopp commented Oct 2, 2016

If I remember correctly there is an option in FFTW to make a plan thread safe. This is a little different since it is about multi-process parallelism but it may work.

But our plan is certainly not safe to be used in different threads, because of the tmp vector.

@robertdj
Collaborator

robertdj commented Oct 6, 2016

@DeVerMyst : Have you looked into the correctness of your results when using NFFT in parallel?

@DeVerMyst
Author

@robertdj : I use the NFFT in parallel in PAINTER.jl.
Before deciding whether or not to parallelize some parts of the algorithm I checked that the results were equal (up to a small approximation), but only for the case I am studying. So I can confirm that for the data set I used for the test, the results of NFFT in parallel were correct, and nicely fast.

@robertdj
Collaborator

robertdj commented Oct 7, 2016

I have next to no experience with parallel computing in Julia, but I don't understand why the different threads are not overwriting the same temporary array...
We could revert to not using precomputed FFT plans, but I would like to figure out why this parallelization works.

@tknopp
Member

tknopp commented Oct 8, 2016

@DeVerMyst: It's not surprising that your code works, because in

https://github.com/andferrari/PAINTER.jl/blob/3a4dc61699ed23060422f6851085378a708b6f81/src/paintertools.jl#L62

you have n plans and use a dedicated plan for each process. So you have simply worked around the fact that our API is not thread safe.
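
Roughly, the pattern is one plan per parallel iteration, so no two iterations ever share a plan or its temporary buffer (paraphrased sketch, not the actual PAINTER.jl code; whether the plans survive being sent to the workers is a separate question, see below):

using NFFT

nx, nb, nw = 64, 195, 20
fr = rand(2, nb) - 0.5

plans = [NFFTPlan(fr, (nx, nx)) for n in 1:nw]   # one independent plan per iteration

Fx = SharedArray(Complex{Float64}, (nb, nw))
@sync @parallel for n in 1:nw
    x = rand(nx, nx) + 0.0im
    Fx[:, n] = nfft(plans[n], x)   # dedicated plan, so tmpVec is never shared
end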

@robertdj
Collaborator

robertdj commented Oct 8, 2016

That explains it!

@DeVerMyst
Author

Hello,

Do you have any idea how to make this work on Julia 0.5?

Regards

@tknopp
Member

tknopp commented Oct 10, 2016

@DeVerMyst your code should work on Julia 0.5. The example that you posted at the top of this issue is not what you did here:

https://github.com/andferrari/PAINTER.jl/blob/3a4dc61699ed23060422f6851085378a708b6f81/src/paintertools.jl#L62

As long as you have a dedicated plan for each process, it should work.

@DeVerMyst
Author

In order to present my problem I gave a lighter version of the code I use, without thinking that it could have a problem with a temporary variable. I wrote to you because it was not working with Julia 0.5. Other libraries had issues too, but I tried them one by one and narrowed this problem down to NFFT and/or FFTW.
For example, this code, which is totally equivalent to what is done in PAINTER.jl, works with Julia 0.4.5 but not with Julia 0.5 with additional workers.

https://github.com/DeVerMyst/ForJulia5

Best regards

@tknopp
Member

tknopp commented Oct 10, 2016

That code works for me and does not segfault

@tknopp
Member

tknopp commented Oct 10, 2016

on a Mac

@DeVerMyst
Author

Hello,

For me, when I launch Julia 0.5 without workers, and so not in parallel, it is working.
If I launch Julia 0.5 with workers ("julia -p 2") I get a seg fault.

Here is my versioninfo():

Julia Version 0.5.0
Commit 3c9d753 (2016-09-19 18:14 UTC)
Platform Info:
System: Darwin (x86_64-apple-darwin13.4.0)
CPU: Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
LAPACK: libopenblas64_
LIBM: libopenlibm
LLVM: libLLVM-3.7.1 (ORCJIT, sandybridge)

I tried on a second Mac and we also get a segfault; here is the versioninfo():

julia> versioninfo()
Julia Version 0.5.0
Commit 3c9d753 (2016-09-19 18:14 UTC)
Platform Info:
System: Darwin (x86_64-apple-darwin13.4.0)
CPU: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
LAPACK: libopenblas64_
LIBM: libopenlibm
LLVM: libLLVM-3.7.1 (ORCJIT, haswell)

Regards

@tknopp
Member

tknopp commented Oct 11, 2016

OK, confirmed. It segfaults here as well.

But this is an issue with FFTW, and you should be able to reproduce it with a similar test case that does not involve any NFFT. So this would be more suitable as a bug report in the JuliaLang/julia repository. Or maybe @stevengj could chime in here and say whether the segfault is expected (I have no experience using FFTW with multi-process shared-memory Julia).

@stevengj
Member

Seems like it needs FFTW 3.3.5 and fftw_make_planner_thread_safe.

@stevengj
Member

Oh, sorry, this issue is not about threading...

@stevengj
Member

FFTW plans can't be serialized and sent to other processes, I don't think. Seems like Julia should throw an exception if you try to do this.

@tknopp
Member

tknopp commented Oct 11, 2016

Ah, I just saw that we switched in this package to the low-level FFTW interface, where we store the plan:

https://github.com/tknopp/NFFT.jl/blob/master/src/NFFT.jl#L205

I originally used fft!, which would create the plan on the remote process.

@tknopp
Member

tknopp commented Oct 11, 2016

So one would have to modify the code in

https://github.com/DeVerMyst/ForJulia5/blob/master/painternnfftpar.jl

to somehow create the plans on the remote workers and not all in the master process. But I don't have a clue how to do that...
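
Perhaps something in this direction could work (completely untested sketch; nfft_on_worker is just an illustrative helper, not part of NFFT.jl):

@everywhere using NFFT

# Build the NFFTPlan inside a function that runs on the worker, so the
# cached FFTW plan is created in the same process that executes it.
@everywhere function nfft_on_worker(fr, sz, x)
    plan = NFFTPlan(fr, sz)   # FFTW plan created locally on this worker
    return nfft(plan, x)
end

nx, nb, nw = 64, 195, 20
fr = rand(2, nb) - 0.5

results = pmap(n -> nfft_on_worker(fr, (nx, nx), rand(nx, nx) + 0.0im), 1:nw)

Rebuilding the plan on every call is wasteful, of course; caching one plan per worker would be the next step.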

@robertdj
Collaborator

I did indeed start saving the plan at some point (as noted earlier in this thread). For now, it's probably easier to revert to using fft!.

@robertdj
Collaborator

I'd forgotten about this issue. Let me revert to fft! and tag a new release.

@tknopp
Member

tknopp commented Oct 28, 2016

I am not so sure about that. In a single-process environment what you do is actually what we want. And also for the multi-threading case it is possible to cache the FFTW plan, as Steve indicated above.
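
For the threading case I guess it would boil down to a single call like this (untested sketch; it assumes the FFTW binary Julia links against is >= 3.3.5 and resolvable under the name libfftw3, and it only helps for threads within one process, not for the multi-process case in this issue):

# Untested: make FFTW's planner safe to call from multiple threads.
ccall((:fftw_make_planner_thread_safe, "libfftw3"), Void, ())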

@tknopp
Member

tknopp commented Oct 28, 2016

Maybe an optional parameter and an if/else to switch between both modes?

@robertdj
Collaborator

robertdj commented Oct 28, 2016

I agree with:

In a single-process environment what you do is actually what we want.

Didn't Steven write:

FFTW plans can't be serialized and sent to other processes, I don't think.

I read that as saying it cannot be done..?

Since the precomputed plans break stuff, I'm thinking that we can revert to fft! for now and tag that as a patch. If we can figure out how to use plans with multithreading, that can go into a new release.

@robertdj
Collaborator

Regarding the optional parameter, I suppose we can check if procs() only returns one number...
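
E.g. something like this (sketch; the variable name is just for illustration):

# Only cache the FFTW plan when running single-process.
use_cached_fft_plan = length(procs()) == 1   # equivalently, nprocs() == 1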

@tknopp
Member

tknopp commented Oct 28, 2016

One has to distinguish between multiprocessing and multithreading. In my opinion, with Julia 0.5 supporting multithreading, the multiprocessing use case is not so important anymore.

@robertdj
Collaborator

Right. If e.g. the Painter package is using multiprocessing, don't you think it is worth supporting (for now)?

P.s.: Sorry if I rushed through this.

@tknopp
Member

tknopp commented Oct 28, 2016

That's fine.

Let's just have a parameter cacheFFTWPlan, which is set to true by default but can be set to false.
I would still create the cached plan when cacheFFTWPlan==false, but when the FFT is actually applied, cacheFFTWPlan==false will fall back to the non-cached version. Then we can teach the Painter package to set cacheFFTWPlan==false.
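
Roughly like this (sketch of the idea with the names from this thread, not a concrete patch):

# p stands for the NFFTPlan; cacheFFTWPlan would be stored on (or passed to) it.
function apply_forward_fft!(p, cacheFFTWPlan::Bool)
    if cacheFFTWPlan
        p.forwardFFT * p.tmpVec   # reuse the plan cached at NFFTPlan construction
    else
        fft!(p.tmpVec)            # fall back to creating the FFTW plan on the fly
    end
    return p.tmpVec
end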

@robertdj
Collaborator

I just finished writing up a fix before you answered: https://github.com/robertdj/NFFT.jl/blob/master/src/NFFT.jl#L208

What do you think of this?

@tknopp
Member

tknopp commented Oct 28, 2016

Yes, that's also a good approach. Let's stick with that.

@robertdj
Collaborator

Cool. I'll do the tagging later and close this issue.

@tknopp
Member

tknopp commented Oct 28, 2016

Nice. In the meantime I fixed a small issue with the density compensation methods; I don't know if you use those (the sdc function). They broke when the new type parameters were introduced.

My research group is starting to do MRI research and will therefore use this package more and more in the future.

@robertdj
Collaborator

@tknopp : Have you published the tag v0.1.3 to METADATA? Otherwise I'll do it now along with a new v0.1.4.

@tknopp
Member

tknopp commented Oct 28, 2016

No, that was by accident. Please do it, and thanks.

@tknopp
Member

tknopp commented Oct 28, 2016

I was/am playing around with PkgDev. Making releases seems pretty easy nowadays.

@robertdj
Collaborator

It is pretty easy :-) Now the releases await approval.
