
updated fix for #19892 (FFTW threads initialization) #21169

Merged: stevengj merged 2 commits into JuliaLang:master from stevengj:fftwfix2 on Mar 28, 2017

Conversation

stevengj (Member)

Initialize FFTW threads the first time the planner is called. (I tried to be clever in #21127, but it incorrectly prevented threads from being used at all.) Fixes #21163.
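For illustration, a minimal sketch of the deferred-initialization pattern described here; the names (libfftw_threads, threads_initialized, plan_with_threads_init) are hypothetical, not the actual Base code:

const libfftw_threads = "libfftw3_threads"  # hypothetical; the library name varies by build
const threads_initialized = Ref(false)

function plan_with_threads_init(planner, args...)
    if !threads_initialized[]
        # fftw_init_threads returns zero on failure; call it exactly once,
        # lazily, the first time any planner runs.
        ccall((:fftw_init_threads, libfftw_threads), Cint, ()) == 0 &&
            error("could not initialize FFTW threads")
        threads_initialized[] = true
    end
    return planner(args...)  # then dispatch to the real planner
end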

ararslan added the bugfix label Mar 27, 2017
tkelman (Contributor) commented Mar 27, 2017

Are there any simple benchmarks that could be added that would have caught this?

giordano (Contributor)

@tkelman I'm not sure. I spotted the problem while benchmarking a package of mine, but the regression was easily noticeable only in the first run (compilation) of plan_fft: the whole benchmark took far too long to run, yet the times measured by @benchmark weren't that bad.
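A hedged illustration of the kind of check that would have caught it: timing the first planner call directly (the size and flags below are arbitrary):

FFTW.set_num_threads(2)
# The first call pays the full planning cost; wall-clock timing it exposes a
# slowdown that @benchmark's repeated steady-state runs average away.
t_first = @elapsed plan_fft(zeros(Complex128, 1 << 14), flags = FFTW.MEASURE)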

stevengj (Member, Author)

We could check for the use of threads by:

julia> FFTW.set_num_threads(2)

julia> contains(string(plan_fft(Array{Complex128}(1<<14), flags = FFTW.ESTIMATE)), "dft-thr")
true

stevengj (Member, Author)

Pushed a test.

giordano (Contributor) commented Mar 27, 2017

Is it normal that the plan changes with this patch? With 4 threads:

On d694548:

julia> plan_rfft(randn(10^5,1), flags = FFTW.MEASURE)
FFTW real-to-complex plan for 100000×1 array of Float64
(rdft2-ct-dit/16
  (hc2c-direct-16/60/0 "hc2cfdftv_16_avx"
    (rdft2-r2hc-direct-16 "r2cf_16")
    (rdft2-r2hc01-direct-16 "r2cfII_16"))
  (dft-thr-vrank>=1-x4/1
    (dft-vrank>=1-x2/1
      (dft-ct-dit/25
        (dftw-direct-25/96 "t1fv_25_avx")
        (dft-vrank>=1-x25/1
          (dft-ct-dit/25
            (dftw-direct-25/192 "t2fv_25_avx")
            (dft-direct-10-x25 "n1fv_10_avx")))))
    (dft-vrank>=1-x2/1
      (dft-ct-dit/25
        (dftw-direct-25/96 "t1fv_25_avx")
        (dft-vrank>=1-x25/1
          (dft-ct-dit/25
            (dftw-direct-25/192 "t2fv_25_avx")
            (dft-direct-10-x25 "n1fv_10_avx")))))
    (dft-vrank>=1-x2/1
      (dft-ct-dit/25
        (dftw-direct-25/96 "t1fv_25_avx")
        (dft-vrank>=1-x25/1
          (dft-ct-dit/25
            (dftw-direct-25/192 "t2fv_25_avx")
            (dft-direct-10-x25 "n1fv_10_avx")))))
    (dft-vrank>=1-x2/1
      (dft-ct-dit/25
        (dftw-direct-25/96 "t1fv_25_avx")
        (dft-vrank>=1-x25/1
          (dft-ct-dit/25
            (dftw-direct-25/192 "t2fv_25_avx")
            (dft-direct-10-x25 "n1fv_10_avx")))))))

On https://github.com/stevengj/julia/tree/fftwfix2:

julia> plan_rfft(randn(10^5,1), flags = FFTW.MEASURE)
FFTW real-to-complex plan for 100000×1 array of Float64
(rdft2-ct-dit/16
  (hc2c-direct-16/60/0 "hc2cfdftv_16_avx"
    (rdft2-r2hc-direct-16 "r2cf_16")
    (rdft2-r2hc01-direct-16 "r2cfII_16"))
  (dft-thr-vrank>=1-x4/1
    (dft-vrank>=1-x2/1
      (dft-ct-dit/5
        (dftw-direct-5/16 "t1fv_5_avx")
        (dft-vrank>=1-x5/1
          (dft-ct-dit/5
            (dftw-direct-5/8 "t3fv_5_avx")
            (dft-vrank>=1-x5/1
              (dft-ct-dit/25
                (dftw-direct-25/96 "t1fv_25_avx")
                (dft-direct-10-x25 "n1fv_10_avx")))))))
    (dft-vrank>=1-x2/1
      (dft-ct-dit/5
        (dftw-direct-5/16 "t1fv_5_avx")
        (dft-vrank>=1-x5/1
          (dft-ct-dit/5
            (dftw-direct-5/8 "t3fv_5_avx")
            (dft-vrank>=1-x5/1
              (dft-ct-dit/25
                (dftw-direct-25/96 "t1fv_25_avx")
                (dft-direct-10-x25 "n1fv_10_avx")))))))
    (dft-vrank>=1-x2/1
      (dft-ct-dit/5
        (dftw-direct-5/16 "t1fv_5_avx")
        (dft-vrank>=1-x5/1
          (dft-ct-dit/5
            (dftw-direct-5/8 "t3fv_5_avx")
            (dft-vrank>=1-x5/1
              (dft-ct-dit/25
                (dftw-direct-25/96 "t1fv_25_avx")
                (dft-direct-10-x25 "n1fv_10_avx")))))))
    (dft-vrank>=1-x2/1
      (dft-ct-dit/5
        (dftw-direct-5/16 "t1fv_5_avx")
        (dft-vrank>=1-x5/1
          (dft-ct-dit/5
            (dftw-direct-5/8 "t3fv_5_avx")
            (dft-vrank>=1-x5/1
              (dft-ct-dit/25
                (dftw-direct-25/96 "t1fv_25_avx")
                (dft-direct-10-x25 "n1fv_10_avx")))))))))

stevengj (Member, Author)

@giordano, the MEASURE plan creation is non-deterministic (it depends on runtime measurements), and it is normal for it to vary from run to run because there are many possible algorithms with similar performance.
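If run-to-run reproducibility matters, one option is FFTW's wisdom mechanism, sketched here assuming the export_wisdom/import_wisdom wrappers; "wisdom.dat" is an arbitrary filename:

# After planning once with MEASURE, persist the measured plans...
FFTW.export_wisdom("wisdom.dat")
# ...then a later session can reuse them, so the same algorithm is chosen.
FFTW.import_wisdom("wisdom.dat")
p = plan_rfft(randn(10^5, 1), flags = FFTW.MEASURE)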

JeffBezanson (Member)

Ready to merge?

stevengj merged commit 378ed8a into JuliaLang:master Mar 28, 2017
nalimilan (Member)

@stevengj Any idea why the new test would fail with FFTW 3.3.3 (CentOS 7)? Is that expected?

stevengj deleted the fftwfix2 branch April 2, 2017 11:18
stevengj (Member, Author) commented Apr 2, 2017

@nalimilan, yes: fftw_sprint_plan wasn't available in FFTW 3.3.3. (For bureaucratic reasons, FFTW doesn't follow semver.) Hence show of a plan in FFTW 3.3.3 doesn't actually show the plan contents, and the test will fail.

So, the test should only run if FFTW.version ≥ v"3.3.4", sorry.
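A sketch of how the gated test might look (the surrounding structure is illustrative):

using Base.Test

# fftw_sprint_plan exists only in FFTW ≥ 3.3.4, so on older versions
# string(plan) omits the plan contents and the check below would fail.
if FFTW.version >= v"3.3.4"
    FFTW.set_num_threads(2)
    p = plan_fft(Array{Complex128}(1 << 14), flags = FFTW.ESTIMATE)
    @test contains(string(p), "dft-thr")
    FFTW.set_num_threads(1)
end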

iamnapo mentioned this pull request Apr 5, 2017
tkelman pushed a commit that referenced this pull request May 2, 2017
* updated fix for #19892; initialize FFTW threads the first time the planner is called (#21127 incorrectly prevented threads from being used at all)

* add test for #21163

(cherry picked from commit 378ed8a)
tkelman pushed a commit that referenced this pull request May 3, 2017
* updated fix for #19892; initialize FFTW threads the first time the planner is called (#21127 incorrectly prevented threads from being used at all)

* add test for #21163

(cherry picked from commit 378ed8a)