Spreadinterponly (CPU and GPU) #602
Open
ahbarnett wants to merge 30 commits into master from spreadinterponly
Commits
838245b  WIP  (chaithyagr)
a033f3b  Added support to do spread interp only  (chaithyagr)
a67b717  double free and wrap  (chaithyagr)
d77cbb1  Remove unwanted creation  (chaithyagr)
2094338  Remove unwanted resizing  (chaithyagr)
d0d60fe  Working codes with mri-nufft  (chaithyagr)
305482b  Fixes, update API  (chaithyagr)
09a9d0c  remove unwanted changes  (chaithyagr)
90a0675  remove span  (chaithyagr)
85e1c4e  WIP: merge ChaithyaGR CPU spreadinterponly, tidy up, add to fort, mat…  (ahbarnett)
2e490a1  got chaithya CPU spreadinterp working, added example; couple minor GP…  (ahbarnett)
8170567  better doc example/spreadinterponly.cpp  (ahbarnett)
0d4c0e4  added test (and into CI) for CPU spreadinterponly=1  (ahbarnett)
244791f  test spreadinterponly give utils namespace to fix windows CI which ne…  (ahbarnett)
d8f42d6  setup_spreader needed to know if spreadinterponly=1 to switch off lar…  (ahbarnett)
29940c2  doc CPU opts.spreadinterponly  (ahbarnett)
d13dade  spreadtest fix extra arg of setup_spreader()  (ahbarnett)
2b8786f  comment in example  (ahbarnett)
587ddf2  2d spreadinterponly matlab demo with plot  (ahbarnett)
bd96294  tweak opts.h  (ahbarnett)
89deb03  clarify cpu spreadinterponly behavior in docs  (ahbarnett)
175ef83  merge in matlab opts12 snafu from master  (ahbarnett)
2acdb7a  add utils to test_defs.h, fixing spreadinterp1d_test build  (ahbarnett)
b43e3a5  remove err code 23, change sionly logic in setup_spreader, correct GP…  (ahbarnett)
295cd10  added debug output to GPU setup_spreader, as CPU  (ahbarnett)
12d1760  sionly changelog  (ahbarnett)
d5b2bad  paren typo in cuda spreadinterp.cpp  (ahbarnett)
c4a78dd  2nd paren typo in cuda spreadinterp.cpp  (ahbarnett)
10342bb  set nf1=N1, etc, when gpu_spreadinterponly=1  (ahbarnett)
ba1cc8a  gpu_sionly typo in gpu impl.h; sorry for using Jenkins as a debugger;…  (ahbarnett)
@@ -0,0 +1,91 @@
// this is all you must include for the finufft lib...
#include <finufft.h>

// also used in this example...
#include <cassert>
#include <chrono>
#include <complex>
#include <cstdio>
#include <stdlib.h>
#include <vector>
using namespace std;
using namespace std::chrono;

int main(int argc, char *argv[])
/* Example of double-prec spread/interp only tasks, with basic math tests.
   Complex I/O arrays, but recall the kernel is real. Barnett 1/8/25.

   The math tests are:
   1) for spread, check sum of spread kernel masses is as expected from sum
      of strengths (ie testing the zero-frequency component in NUFFT).
   2) for interp, check each interp kernel mass is the same as from one.

   Without knowing the kernel, this is about all that can be done!
   (Better math tests would be, ironically, to wrap the spreader/interpolator
   into a NUFFT and test that :) But we already have that in FINUFFT.)

   Compile and run (static library case):

   g++ spreadinterponly1d.cpp -I../include ../lib-static/libfinufft.a -o
   spreadinterponly1d -lfftw3 -lfftw3_omp && ./spreadinterponly1d

   See: spreadtestnd for usage of internal (non FINUFFT-API) spread/interp.
*/
{
  int M = 1e7; // number of nonuniform points
  int N = 1e7; // size of regular grid
  finufft_opts opts;
  finufft_default_opts(&opts);
  opts.spreadinterponly = 1; // task: the following two control kernel used...
  double tol            = 1e-9; // tolerance for (real) kernel shape design only
  opts.upsampfac        = 2.0;  // pretend upsampling factor (really no upsampling)
  // opts.spread_kerevalmeth = 0; // would be needed for any nonstd upsampfac

  complex<double> I = complex<double>(0.0, 1.0); // the imaginary unit
  vector<double> x(M);                           // input
  vector<complex<double>> c(M);                  // input
  vector<complex<double>> F(N);                  // output (spread to this array)

  // first spread M=1 single unit-strength at the origin, only to get its total mass...
  x[0]       = 0.0;
  c[0]       = 1.0;
  int unused = 1;
  int ier    = finufft1d1(1, &x[0], &c[0], unused, tol, N, &F[0], &opts); // warm-up
  if (ier > 1) return ier;
  complex<double> kersum = 0.0;
  for (auto Fk : F) kersum += Fk; // kernel mass

  // Now generate random nonuniform points (x) and complex strengths (c)...
  for (int j = 0; j < M; ++j) {
    x[j] = M_PI * (2 * ((double)rand() / RAND_MAX) - 1); // uniform random in [-pi,pi)
    c[j] =
        2 * ((double)rand() / RAND_MAX) - 1 + I * (2 * ((double)rand() / RAND_MAX) - 1);
  }

  opts.debug = 1;
  auto t0    = steady_clock::now(); // now spread with all M pts... (dir=1)
  ier        = finufft1d1(M, &x[0], &c[0], unused, tol, N, &F[0], &opts); // do it
  double t   = (steady_clock::now() - t0) / 1.0s;
  if (ier > 1) return ier;
  complex<double> csum = 0.0; // tot input strength
  for (auto cj : c) csum += cj;
  complex<double> mass = 0.0; // tot output mass
  for (auto Fk : F) mass += Fk;
  double relerr = abs(mass - kersum * csum) / abs(mass);
  printf("1D spread-only, double-prec, %.3g s (%.3g NU pt/sec), ier=%d, mass err %.3g\n",
         t, M / t, ier, relerr);

  for (auto &Fk : F) Fk = complex<double>{1.0, 0.0}; // unit grid input
  opts.debug = 0;
  t0         = steady_clock::now(); // now interp to all M pts... (dir=2)
  ier        = finufft1d2(M, &x[0], &c[0], unused, tol, N, &F[0], &opts); // do it
  t          = (steady_clock::now() - t0) / 1.0s;
  if (ier > 1) return ier;
  csum = 0.0; // tot output
  for (auto cj : c) csum += cj;
  double maxerr = 0.0;
  for (auto cj : c) maxerr = max(maxerr, abs(cj - kersum));
  printf("1D interp-only, double-prec, %.3g s (%.3g NU pt/sec), ier=%d, max err %.3g\n",
         t, M / t, ier, maxerr / abs(kersum));
  return 0;
}
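
The two math tests described in the header comment above can be tried out without linking FINUFFT. What follows is only a minimal sketch, assuming a stand-in truncated Gaussian kernel; the real kernel is designed inside the library from tol and upsampfac and does not appear in this diff. The names PI, HALFWIDTH, kernel, spread1d, interp1d and total are hypothetical and are not part of the FINUFFT API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

const double PI = std::acos(-1.0);

// ASSUMPTION: stand-in kernel, NOT the FINUFFT kernel. A unit-width Gaussian
// in grid units, truncated to |z| <= HALFWIDTH grid points.
constexpr int HALFWIDTH = 8;
inline double kernel(double z) {
  return std::abs(z) <= HALFWIDTH ? std::exp(-0.5 * z * z) : 0.0;
}

// Spread strengths c at points x in [-pi,pi) onto the periodic grid F.
// The grid size is exactly F.size(): there is no upsampling.
void spread1d(const std::vector<double> &x, const std::vector<cplx> &c,
              std::vector<cplx> &F) {
  const long N   = (long)F.size();
  const double h = 2.0 * PI / N; // grid spacing
  std::fill(F.begin(), F.end(), cplx(0.0, 0.0));
  for (std::size_t j = 0; j < x.size(); ++j) {
    const double g = x[j] / h; // point location in grid units
    for (long i = (long)std::ceil(g - HALFWIDTH); i <= (long)std::floor(g + HALFWIDTH); ++i) {
      const long k = ((i % N) + N) % N; // periodic wrap into [0,N)
      F[k] += c[j] * kernel(g - i);
    }
  }
}

// Interpolate grid values F to points x (the adjoint of spread1d), writing c.
void interp1d(const std::vector<double> &x, const std::vector<cplx> &F,
              std::vector<cplx> &c) {
  const long N   = (long)F.size();
  const double h = 2.0 * PI / N;
  for (std::size_t j = 0; j < x.size(); ++j) {
    const double g = x[j] / h;
    cplx sum(0.0, 0.0);
    for (long i = (long)std::ceil(g - HALFWIDTH); i <= (long)std::floor(g + HALFWIDTH); ++i) {
      const long k = ((i % N) + N) % N;
      sum += F[k] * kernel(g - i);
    }
    c[j] = sum;
  }
}

// Sum of all entries: the "mass" of a grid, or the total strength.
cplx total(const std::vector<cplx> &v) {
  cplx s(0.0, 0.0);
  for (const cplx &vk : v) s += vk;
  return s;
}
```

Spreading a single unit strength at x = 0 and calling total(F) gives the kernel mass kersum. Test 1 then compares total(F) after spreading all points with kersum * total(c), and test 2 compares each value interpolated from an all-ones grid with kersum. With this stand-in kernel the agreement is only approximate (on the order of 1e-8 relative), because the kernel mass summed over the grid varies slightly with the sub-grid offset of each point.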
@chaithyagr ...and here is the corresponding GPU code. At the risk of repetition, since the user allocates an N1*N2 output array, spreading could not write to any other size without segfault. Agreed?
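
To make the sizing argument concrete, here is a rough sketch of the logic, not the code from this PR. It assumes hypothetical helpers fine_grid_size and wrapped_index, and assumes x-fastest ordering of the 2D array. In spread/interp-only mode the grid the spreader writes to is the user's N1*N2 array, so the grid dimensions have to equal N1 and N2, and every write index has to be wrapped into that range.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// HYPOTHETICAL helper (not FINUFFT's internal routine): choose the size of the
// grid that the spreader writes to along one dimension.
inline std::int64_t fine_grid_size(std::int64_t N, double upsampfac, bool spreadinterponly) {
  if (spreadinterponly) return N; // grid is the user's own array: no upsampling
  // Usual NUFFT case: an internally allocated, upsampled grid. A real
  // implementation would typically also round up to an FFT-friendly size.
  return (std::int64_t)std::ceil(upsampfac * (double)N);
}

// HYPOTHETICAL helper: flat index into an nf1*nf2 array (x fastest, an
// assumption here), with periodic wrapping so that it always lies in
// [0, nf1*nf2).
inline std::size_t wrapped_index(std::int64_t i1, std::int64_t i2, std::int64_t nf1,
                                 std::int64_t nf2) {
  const std::int64_t w1 = ((i1 % nf1) + nf1) % nf1;
  const std::int64_t w2 = ((i2 % nf2) + nf2) % nf2;
  return (std::size_t)(w1 + nf1 * w2);
}
```

If the spreader instead used nf1 = upsampfac*N1 while writing into a buffer of only N1*N2 entries, indices up to nf1*nf2 - 1 would fall outside the user's allocation, which is the segfault described in the comment.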