Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Technical issues found during the switch to templates #484

Closed
mreineck opened this issue Jul 17, 2024 · 14 comments
Closed

Technical issues found during the switch to templates #484

mreineck opened this issue Jul 17, 2024 · 14 comments

Comments

@mreineck
Copy link
Collaborator

While introducing the templates I encountered a few strange things I'd like a second opinion on...

  • in spreadinterp.h, a function spreadinterp is declared, but it is not used anywhere in the library. But there are some test codes (perftest/spreadtestnd.cpp and perftest/spreadtestndall.cpp) which are still calling this routine. Is this obsolete or should it be kept?
  • some test codes call FFTW-specific functions for forgetting wisdom and cleaning up internal state. I gather that this is mostly used during benchmarking, when FFTW should be forced to forget its internal state completely. I could provide this by adding a function like forgetInternalState() to the FFT interface. Would that be a useful change? Leaving things as they are now feels a bit awkward.
  • spreadinterp.cpp has a prologue of about 100 lines that forward-declare things which are defined later on anyway. Would you be OK with me bringing the contents of the file in the right order, such that all these declarations are no longer necessary? I'm asking because the introduction of the templates will make this rather uglier than nicer.

Apart from this, I think I see a clear way towards the introduction of templates. The individual steps would be:

@DiamonDinoia
Copy link
Collaborator

DiamonDinoia commented Jul 17, 2024

Hi Martin,

Thanks for looking into this. Answers:

  1. we usespreadinterp, at some point we might provide FINUFFT_MEASURE (or similar) where we'll be likely to use that to gather data.
  2. I think is a good idea to wrap FFT specific functionality and abstract them away
  3. I do not have any opinion on forward declaring I think the code is understandable in any order. If you believe is better to rid of those go ahead. The file also got way too big we should consider splitting it into 3 files: kernel, spreader and interp as compilation now takes way too long (no parallelism...)

@mreineck
Copy link
Collaborator Author

Thanks for the answers!

Concerning 3), I fully agree that this should be split at some point. If that is planned anyway, then I won't reorder the file now; all the cosmetics can be done at a later point.

@DiamonDinoia
Copy link
Collaborator

DiamonDinoia commented Jul 17, 2024

Maybe worth splitting it before reordering? My issue on splitting is the makefile... Any change has to be propagate there

@mreineck
Copy link
Collaborator Author

I agree with splitting before reordering. My point was that this should probably be postponed until after templatization, to limit the amount of potential chaos :-)

@DiamonDinoia DiamonDinoia added this to the 3.0 milestone Jul 17, 2024
@ahbarnett
Copy link
Collaborator

ahbarnett commented Jul 17, 2024 via email

@mreineck
Copy link
Collaborator Author

Random numbers are only used in test codes; I feel that is a distraction from the main goal. We should resist the temptation to try and clean up all aspects of the code at once. Gradual change is better!

I agree in principle, but in this particular case I had to do something. By switching from a macro containing a call to rand_r to an inline function with a call to rand_r, the compiler got to actually see it, and on Windows rand_r doesn't exist any more, so I got CI failures. Rather than patching around the issue I thought I'd do the proper switch right away, especially since there was a comment in the sources that this should happen at some point.

@ahbarnett
Copy link
Collaborator

ahbarnett commented Jul 17, 2024 via email

@DiamonDinoia
Copy link
Collaborator

Wenda suggested using random123 #200 which is an external dependency. I do not think is a bad idea to use a faster generator for the tests. cmake can pull it for test only. I personally implemented xoshiro in Google Highway at some point which I think is still the fastest rng on CPU: https://github.com/google/highway/tree/master/hwy/contrib/random

But before that, std::minstd_rand might be fast enough. https://en.cppreference.com/w/cpp/numeric/random

@mreineck
Copy link
Collaborator Author

I'd not use anything complicated for just running unit tests; std::minstd_rand should do the trick just fine.

But anyway, let's skip the RNG pull request; once the 1/n PR is in, I'll prepare follow-ups without the RNG changes.

@DiamonDinoia DiamonDinoia removed this from the 2.3 milestone Jul 17, 2024
@ahbarnett
Copy link
Collaborator

ahbarnett commented Jul 17, 2024 via email

@mreineck
Copy link
Collaborator Author

The multithreaded random number generation can stay in any case and is used with the standard C++ RNGs in the changes I made. I don't think it's possible to hide that layer of complexity fro the users ... they have to know how exactly the random numbers are generated.

@mreineck
Copy link
Collaborator Author

Adjusting finufft.cpp and simple_interfaces.cpp turns out to be a deep rabbit hole. Complications are caused for example by

  • having two slightly different sets of macro magic (one in finufft.h and friends, one in defs.h)
  • having precision-independent and precision-dependent declarations mixed in defs.h
  • practically all tests relying on knowledge of implementation details by including defs.h

So the last step of this adjustment will touch very many files, and potentially change them a lot. Some of the header files will have to be splt.
Also, as a byproduct of the change, we get a "natural" C++ interface to the library, and should probably think how it should look like exactly. The rough outline will be something like

Finufft_plan<double> plan(opts);
plan.setpts(x,y,z); // arguments can be pointers or vectors
plan.execute(uni, nonuni); // arguments can be pointers or vectors
// no need to destroy the plan, this is done by the destructor

Anyway this is not urgent, since 2.3 has to be released first. I just wanted to point this out as I found it.

@DiamonDinoia
Copy link
Collaborator

Having a nice c++ interface would be nice. We can always use it internally until it is stable enough and release it in a future maybe major version.

We should consider that CPU/GPU API should be consistent so these changes needs to be propagated before release.

I like the way it looks, then we can also allow mixed precision with can help speedup especially GPU code.

@DiamonDinoia
Copy link
Collaborator

DiamonDinoia commented Jul 18, 2024

@mreineck, I would like to provide attention to is that we use -fPIC in finufft quite often. This causes the compiler not to inline some functions causing a performance overhead source. When changing let's make symbols static of in anonymous namespaces so that does not happen. Not sure if there are any compilation tricks that we can pull here, like building all the internal functionality in a separate lib with no -fPIC and static link against it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants