Skip to content

Commit

Permalink
doc trouble type
Browse files Browse the repository at this point in the history
  • Loading branch information
ahbarnett committed Dec 5, 2020
1 parent 213adb3 commit 9afe77d
Show file tree
Hide file tree
Showing 4 changed files with 151 additions and 3 deletions.
2 changes: 1 addition & 1 deletion docs/refs.rst
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ References
Please cite the following two papers if you use this software:

[FIN]
A parallel non-uniform fast Fourier transform library based on an ``exponential of semicircle'' kernel.
A parallel non-uniform fast Fourier transform library based on an "exponential of semicircle" kernel.
A. H. Barnett, J. F. Magland, and L. af Klinteberg.
SIAM J. Sci. Comput. 41(5), C479-C504 (2019). `arxiv version <https://arxiv.org/abs/1808.06736>`_

Expand Down
2 changes: 1 addition & 1 deletion docs/trouble.rst
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ If FINUFFT is slow (eg, less than $10^6$ nonuniform points per second), here is

- Try printing debug output to see step-by-step progress by FINUFFT. Do this by setting ``opts.debug`` to 1 or 2 then looking at the timing information.

- Try reducing the number of threads, either those available via OpenMP, or via ``opts.nthreads``, perhaps down to 1 thread, to make sure you are not having collisions between threads, or slowdown due to thread overheads. Hyperthreading (more threads than physical cores) rarely helps much. Thread collisions are possible if large problems are run with a large number of (say more than 64) threads. Another ase causing slowness is very many repetitions of small problems; see ``test/manysmallprobs`` which exceeds $10^7$ points/sec with one thread via the guru interface, but can get ridiculously slower with many threads; see https://github.com/flatironinstitute/finufft/issues/86
- Try reducing the number of threads, either those available via OpenMP, or via ``opts.nthreads``, perhaps down to 1 thread, to make sure you are not having collisions between threads, or slowdown due to thread overheads. Hyperthreading (more threads than physical cores) rarely helps much. Thread collisions are possible if large problems are run with a large number of (say more than 64) threads. Another case causing slowness is very many repetitions of small problems; see ``test/manysmallprobs`` which exceeds $10^7$ points/sec with one thread via the guru interface, but can get ridiculously slower with many threads; see https://github.com/flatironinstitute/finufft/issues/86

- Try setting a crude tolerance, eg ``tol=1e-3``. How many digits do you actually need? This has a big effect in higher dimensions, since the number of flops scales like $(\log 1/\epsilon)^d$, but not quite as big an effect as this scaling would suggest, because in higher dimensions the flops/RAM ratio is higher.

Expand Down
Binary file modified finufft-manual.pdf
Binary file not shown.
150 changes: 149 additions & 1 deletion matlab/finufft_plan.m
Original file line number Diff line number Diff line change
@@ -1,4 +1,152 @@

% FINUFFT_PLAN is a class which wraps the guru interface to FINUFFT.
%
% Full documentation is given in ../finufft-manual.pdf and online at
% http://finufft.readthedocs.io
% Also see examples in the matlab/examples and matlab/test directories.
%
% PROPERTIES
% mwptr - opaque pointer to a C++ finufft_plan object (see MWrap manual),
% whose properties cannot be accessed directly
% floatprec - either 'double' or 'single', tracks what precision of C++
% library is being called
% type, dim, n_modes, n_trans, nj, nk - other plan parameters
% Note: the user should never alter these plan properties directly! Rather,
% the below methods should be used to create, use, and destroy plans.
%
% METHODS
% finufft_plan - create guru plan object for one/many general nonuniform FFTs.
% setpts - process nonuniform points for general FINUFFT transform(s).
% execute - execute single or many-vector FINUFFT transforms in a plan.
%
% General notes:
% * use delete(plan) to remove a plan after use.
% * See ERRHANDLER, VALID_*, and this code for warning/error IDs.
%
%
%
% =========== Detailed description of guru methods ==========================
%
% 1) FINUFFT_PLAN create guru plan object for one/many general nonuniform FFTs.
%
% plan = finufft_plan(type, n_modes_or_dim, isign, ntrans, eps)
% plan = finufft_plan(type, n_modes_or_dim, isign, ntrans, eps, opts)
%
% Creates a finufft_plan MATLAB object in the guru interface to FINUFFT, of
% type 1, 2 or 3, and with given numbers of Fourier modes (unless type 3).
%
% Inputs:
% type transform type: 1, 2, or 3
% n_modes_or_dim if type is 1 or 2, the number of Fourier modes in each
% dimension: [ms] in 1D, [ms mt] in 2D, or [ms mt mu] in 3D.
% Its length sets the dimension, which must be 1, 2 or 3.
% If type is 3, in contrast, its *value* fixes the dimension
% isign if >=0, uses + sign in exponential, otherwise - sign.
% eps relative precision requested (generally between 1e-15 and 1e-1)
% opts optional struct with optional fields controlling the following:
% opts.debug: 0 (silent, default), 1 (timing breakdown), 2 (debug info).
% opts.spread_debug: spreader: 0 (no text, default), 1 (some), or 2 (lots)
% opts.spread_sort: 0 (don't sort NU pts), 1 (do), 2 (auto, default)
% opts.spread_kerevalmeth: 0: exp(sqrt()), 1: Horner ppval (faster)
% opts.spread_kerpad: (iff kerevalmeth=0) 0: don't pad to mult of 4, 1: do
% opts.fftw: FFTW plan mode, 64=FFTW_ESTIMATE (default), 0=FFTW_MEASURE, etc
% opts.upsampfac: sigma. 2.0 (default), or 1.25 (low RAM, smaller FFT)
% opts.spread_thread: for ntrans>1 only. 0:auto, 1:seq multi, 2:par, etc
% opts.maxbatchsize: for ntrans>1 only. max blocking size, or 0 for auto.
% opts.nthreads: number of threads, or 0: use all available (default)
% opts.floatprec: library precision to use, 'double' (default) or 'single'.
% for type 1 and 2 only, the following opts fields are also relevant:
% opts.modeord: 0 (CMCL increasing mode ordering, default), 1 (FFT ordering)
% opts.chkbnds: 0 (don't check NU points valid), 1 (do, default)
% Outputs:
% plan finufft_plan object (opaque pointer)
%
% Notes:
% * For type 1 and 2, this does the FFTW planning and kernel-FT precomputation.
% * For type 3, this does very little, since the FFT sizes are not yet known.
% * Be default all threads are planned; control how many with opts.nthreads.
% * The vectorized (many vector) plan, ie ntrans>1, can be much faster
% than repeated calls with the same nonuniform points. Note that here the I/O
% data ordering is stacked rather than interleaved. See ../docs/matlab.rst
% * For more details about the opts fields, see ../docs/opts.rst
%
%
% 2) SETPTS process nonuniform points for general FINUFFT transform(s).
%
% plan.setpts(xj)
% plan.setpts(xj, yj)
% plan.setpts(xj, yj, zj)
% plan.setpts(xj, [], [], s)
% plan.setpts(xj, yj, [], s, t)
% plan.setpts(xj, yj, zj, s, t, u)
%
% When plan is a finufft_plan MATLAB object, brings in nonuniform point
% coordinates (xj,yj,zj), and additionally in the type 3 case, nonuniform
% frequency target points (s,t,u). Empty arrays may be passed in the case of
% unused dimensions. For all types, sorting is done to internally store a
% reindexing of points, and for type 3 the spreading and FFTs are planned.
% The nonuniform points may be used for multiple transforms.
%
% Inputs:
% xj vector of x-coords of all nonuniform points
% yj empty (if dim<2), or vector of y-coords of all nonuniform points
% zj empty (if dim<3), or vector of z-coords of all nonuniform points
% s vector of x-coords of all nonuniform frequency targets
% t empty (if dim<2), or vector of y-coords of all frequency targets
% u empty (if dim<3), or vector of z-coords of all frequency targets
% Input/Outputs:
% plan finufft_plan object
%
% Notes:
% * For type 1 and 2, the values in xj (and if nonempty, yj and zj) must
% lie in the interval [-3pi,3pi). For type 1 they are "sources", but for
% type 2, "targets". In contrast, for type 3 there are no restrictions other
% than the resulting size of the internal fine grids.
% * s (and t and u) are only relevant for type 3, and may be omitted otherwise
% * The matlab vectors xj,... and s,... should not be changed before calling
% future execute calls, because the plan stores only pointers to the
% arrays (they are not duplicated internally).
% * The precision (double/single) of all inputs must match that chosen at the
% plan stage using opts.floatprec, otherwise an error is raised.
%
%
% 3) EXECUTE execute single or many-vector FINUFFT transforms in a plan.
%
% result = plan.execute(data_in);
%
% For plan a previously created finufft_plan object also containing all
% needed nonuniform point coordinates, do a single (or if ntrans>1 in the
% plan stage, multiple) NUFFT transform(s), with the strengths or Fourier
% coefficient inputs vector(s) from data_in. The result of the transform(s)
% is returned as a (possibly multidimensional) array.
%
% Inputs:
% plan finufft_plan object
% data_in strengths (types 1 or 3) or Fourier coefficients (type 2)
% vector, matrix, or array of appropriate size. For type 1 and 3,
% this is either a length-M vector (where M is the length of xj),
% or an (M,ntrans) matrix when ntrans>1. For type 2, in 1D this is
% length-ms, in 2D size (ms,mt), or in 3D size (ms,mt,mu), or
% each of these with an extra last dimension ntrans if ntrans>1.
% Outputs:
% result vector of output strengths at targets (types 2 or 3), or array
% of Fourier coefficients (type 1), or, if ntrans>1, a stack of
% such vectors or arrays, of appropriate size.
% Specifically, if ntrans=1, for type 1, in 1D
% this is a length-ms column vector, in 2D a matrix of size
% (ms,mt), or in 3D an array of size (ms,mt,mu); for types 2 and 3
% it is a column vector of length M (the length of xj in type 2),
% or nk (the length of s in type 3). If ntrans>1 its is a stack
% of such objects, ie, it has an extra last dimension ntrans.
%
% Notes:
% * The precision (double/single) of all inputs must match that chosen at the
% plan stage using opts.floatprec, otherwise an error is raised.
%
%
% 4) To deallocate (delete) a nonuniform FFT plan, use delete(plan)
%
% This deallocates all stored FFTW plans, nonuniform point sorting arrays,
% kernel Fourier transforms arrays, etc.
classdef finufft_plan < handle

properties
Expand Down

0 comments on commit 9afe77d

Please sign in to comment.