RANLUX++: Add compatibility engines #8383

Merged
merged 1 commit into from
Jun 23, 2021


hahnjo
Member

@hahnjo hahnjo commented Jun 9, 2021

These engines can be used to obtain the same sequences of numbers as RANLUX generators using recursive subtract-with-borrow steps, but with enhanced performance. Apart from the choice of parameters, the main difference between the various implementations is the way of seeding the initial state of the generator.

This commit includes engines for compatibility with:

  • the original implementation by Fred James, producing floating point numbers from 24 bits of randomness, with parameters for
    • luxury level 3 (p = 223), also matching gsl_rng_ranlux, and
    • luxury level 4 (p = 389), also matching gsl_rng_ranlux389;
  • the family of generators using a second-generation version of the RANLUX algorithm as implemented in the GNU Scientific Library:
    • gsl_rng_ranlxs[012] using 24 bits per floating point number, and
    • gsl_rng_ranlxd[12] using 48 bits per floating point number;
  • the implementation by Martin Lüscher written in C that uses four states per generator; similar to GSL, there are ranlxs[012] with 24 bits per number and ranlxd[12] with 48 bits per number; and
  • the generators std::ranlux{24,48} defined by the C++ standard.

The values in the tests were extracted directly from the mentioned implementations, showing that the LCG implementation is equivalent to the RANLUX algorithm.

I am not adding compatibility engines for CLHEP because its semantics are very weird: While CLHEP::RanluxEngine::setSeed yields the same sequences as the original implementation by James, the seed is treated differently when passed as an argument to the constructor.
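
For readers unfamiliar with the "recursive subtract-with-borrow steps" mentioned above, here is a minimal sketch using only the C++ standard library. It is not code from this PR. It hand-writes the recurrence with the parameters of std::ranlux24_base (24 bits, lags 10 and 24) and the block discarding of std::ranlux24 (keep 23 numbers out of every p = 223), then compares against the standard engines. The names SimpleSwb24, SimpleRanlux24 and MatchesStd are made up for this illustration.

```cpp
#include <array>
#include <cstdint>
#include <random>

// Subtract-with-borrow recurrence with the parameters of std::ranlux24_base:
// X_i = (X_{i-10} - X_{i-24} - carry) mod 2^24.
class SimpleSwb24 {
   static constexpr int kR = 24, kS = 10;
   std::array<std::uint32_t, kR> fState{};
   std::uint32_t fCarry = 0;
   int fPos = 0; // index of the oldest entry, X_{i-24}

public:
   explicit SimpleSwb24(std::uint32_t seed = 19780503u)
   {
      // Seeding as the C++ standard specifies for subtract_with_carry_engine:
      // an LCG with multiplier 40014 and modulus 2147483563 fills the state.
      std::uint64_t lcg = (seed == 0 ? 19780503u : seed) % 2147483563u;
      if (lcg == 0)
         lcg = 1;
      for (auto &x : fState) {
         lcg = (40014u * lcg) % 2147483563u;
         x = static_cast<std::uint32_t>(lcg) & 0xffffff;
      }
      fCarry = (fState[kR - 1] == 0) ? 1 : 0;
   }

   std::uint32_t operator()()
   {
      std::int64_t y = std::int64_t(fState[(fPos + kR - kS) % kR]) - std::int64_t(fState[fPos]) - fCarry;
      fCarry = (y < 0) ? 1 : 0; // borrow for the next step
      if (y < 0)
         y += std::int64_t(1) << 24;
      fState[fPos] = static_cast<std::uint32_t>(y);
      fPos = (fPos + 1) % kR;
      return static_cast<std::uint32_t>(y);
   }
};

// Block discarding as in std::ranlux24: return 23 numbers, then skip 200.
class SimpleRanlux24 {
   SimpleSwb24 fBase;
   int fUsed = 0;

public:
   std::uint32_t operator()()
   {
      if (fUsed == 23) {
         for (int i = 0; i < 200; i++)
            fBase();
         fUsed = 0;
      }
      fUsed++;
      return fBase();
   }
};

// Returns true if the first n outputs agree with the standard engines.
bool MatchesStd(int n)
{
   SimpleSwb24 base;
   std::ranlux24_base refBase;
   SimpleRanlux24 lux;
   std::ranlux24 refLux;
   for (int i = 0; i < n; i++) {
      if (base() != refBase() || lux() != refLux())
         return false;
   }
   return true;
}
```

The discarded numbers are where the classic implementations spend most of their time. The engines in this PR reproduce the same outputs through the equivalent LCG instead of stepping through every discarded value.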

@hahnjo hahnjo requested review from Axel-Naumann and lmoneta June 9, 2021 11:54
@hahnjo hahnjo self-assigned this Jun 9, 2021
@phsft-bot
Collaborator

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14

@hahnjo
Member Author

hahnjo commented Jun 9, 2021

To elaborate a bit on CLHEP: RanluxppCompatEngineJamesP3 rng(314159265) yields the same sequence as

CLHEP::RanluxEngine r;
r.setSeed(314159265);

but directly passing the seed to the constructor à la CLHEP::RanluxEngine r(314159265) gives different numbers. The reason is that the constructor, after invoking setSeed which works as documented, also calls setSeeds with the given seed parameter as the only entry in the seed table. That procedure is subtly different and could be mimicked as follows:

diff --git a/math/mathcore/src/RanluxppEngineImpl.cxx b/math/mathcore/src/RanluxppEngineImpl.cxx
index 100f8d8638..bbf508a6a8 100644
--- a/math/mathcore/src/RanluxppEngineImpl.cxx
+++ b/math/mathcore/src/RanluxppEngineImpl.cxx
@@ -219,13 +219,14 @@ public:
       // Multiplicative Congruential generator using formula constants of L'Ecuyer
       // as described in "A review of pseudorandom number generators" (Fred James)
       // published in Computer Physics Communications 60 (1990) pages 329-344.
-      int64_t seed = s;
+      int64_t seed = s & 0xffffff;
       auto next = [&]() {
          const int a = 0xd1a4, b = 0x9c4e, c = 0x2fb3, d = 0x7fffffab;
+         int64_t oldSeed = seed;
          int64_t k = seed / a;
          seed = b * (seed - k * a) - k * c ;
          if (seed < 0) seed += d;
-         return seed & 0xffffff;
+         return oldSeed & 0xffffff;
       };
 
       // Iteration is reversed because the first number from the MCG goes to the

That would add compatibility for the constructor, but leave no way to call SetSeed on an existing object. Moreover, this scheme only uses the lower 24 bits of the user's seed...
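
To make the two seeding variants in the diff easier to compare, here is a small standalone sketch (not part of the PR). It assumes only what the diff shows: the step with constants a, b, c, d is Schrage's way of computing (40014 * seed) mod 2147483563 without overflow, and the two variants differ in whether they return the low 24 bits of the new or of the old state. The function names are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// One MCG step, written exactly as in the diff (a = 53668, b = 40014,
// c = 12211, d = 2147483563). Valid for 0 <= seed < d.
std::int64_t SchrageStep(std::int64_t seed)
{
   const int a = 0xd1a4, b = 0x9c4e, c = 0x2fb3, d = 0x7fffffab;
   std::int64_t k = seed / a;
   seed = b * (seed - k * a) - k * c;
   if (seed < 0)
      seed += d;
   return seed;
}

// The same step as a plain modular multiplication in 64-bit arithmetic.
std::int64_t DirectStep(std::int64_t seed)
{
   return (40014 * seed) % 2147483563;
}

// Variant on the '-' lines: advance first, return 24 bits of the new state.
std::vector<std::uint32_t> SeedValuesNewState(std::int64_t s, int n)
{
   std::vector<std::uint32_t> out;
   std::int64_t seed = s;
   for (int i = 0; i < n; i++) {
      seed = SchrageStep(seed);
      out.push_back(static_cast<std::uint32_t>(seed & 0xffffff));
   }
   return out;
}

// Variant on the '+' lines: mask the seed to 24 bits, then return 24 bits
// of the old state before each step.
std::vector<std::uint32_t> SeedValuesOldState(std::int64_t s, int n)
{
   std::vector<std::uint32_t> out;
   std::int64_t seed = s & 0xffffff;
   for (int i = 0; i < n; i++) {
      out.push_back(static_cast<std::uint32_t>(seed & 0xffffff));
      seed = SchrageStep(seed);
   }
   return out;
}
```

In the second variant the first value is simply the masked seed, and every later value is what the first variant would produce when started from that masked seed. This is why only the lower 24 bits of the user's seed matter there.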

@Axel-Naumann
Member

Can you motivate why we should include those in ROOT's interface? I understand the motivation for testing! I'm sure you have a good reason to also expose them, I'd just like to see the reasons :-)

@Axel-Naumann Axel-Naumann removed their request for review June 9, 2021 12:48
@hahnjo
Member Author

hahnjo commented Jun 9, 2021

@Axel-Naumann yes, testing is one of the motivations, in particular continuous testing to prevent future regressions (now we can check against an external implementation, instead of just copying the current values and declaring them "known-good").

The other reason, and why I think this might provide benefit for users, is performance: The original RANLUX implementation by James (at least its implementation in GSL) needs 40 seconds to sum 1 million numbers at luxury level 3, and gsl_rng_ranlux389 (luxury level 4) takes a bit more than 1 minute. Each of these sequences takes less than 8 seconds with the matching RanluxppCompatEngineJamesP[34] (due to the LCG, you don't even pay for higher decorrelation!).
The difference is even larger for std::ranlux48 (used directly, not through std::uniform_real_distribution which eats up more than one number per iteration): 2m55s compared to 12 seconds with RanluxppCompatEngineStdRanlux48. And because we can generate the same sequence, switching the generator won't change the output of a simulation / analysis / ... (only the interface is slightly different). Plus the users get the possibility to skip in the very same sequence without generating the intermediate numbers.

Now we could argue that all users should switch to RanluxppEngine2048, which on top of that provides better seeding and even higher decorrelation. On the other hand, the implementations above have been around for some time now and are so widely available (std::ranlux{24,48} comes with any C++ compiler) that they will remain used...
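
As a side note on the remark about std::uniform_real_distribution: the effect can be seen with a counting wrapper around std::ranlux48. The sketch below is standard library only and not part of the PR. It assumes the distribution is built on std::generate_canonical, as is typical for common standard library implementations. A double asks for 53 bits and the engine delivers 48 per call, so at least two calls are consumed per number. CountingRanlux48, CallsPerCanonicalDouble and DirectDouble are names invented for this example.

```cpp
#include <cmath>
#include <random>

// Wrapper that counts how often the underlying std::ranlux48 is invoked.
struct CountingRanlux48 {
   using result_type = std::ranlux48::result_type;
   std::ranlux48 fEngine;
   unsigned long fCalls = 0;

   static constexpr result_type min() { return std::ranlux48::min(); }
   static constexpr result_type max() { return std::ranlux48::max(); }
   result_type operator()()
   {
      ++fCalls;
      return fEngine();
   }
};

// Engine calls consumed by a single std::generate_canonical<double, 53>.
unsigned long CallsPerCanonicalDouble()
{
   CountingRanlux48 eng;
   (void)std::generate_canonical<double, 53>(eng);
   return eng.fCalls;
}

// "Used directly": one engine call per number, scaled by 2^-48 into [0, 1).
double DirectDouble(std::ranlux48 &eng)
{
   return std::ldexp(static_cast<double>(eng()), -48);
}
```

Timing comparisons between generators are only like for like when both sides consume the same number of engine calls per output, which is why the figures above use the engine directly.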

@hahnjo
Member Author

hahnjo commented Jun 23, 2021

ping @lmoneta

Member

@lmoneta lmoneta left a comment

Looks good to me.
I agree that it is good to expose the compatibility engines, which can generate the same sequences as the old implementations but faster.
Very nice contribution!

@hahnjo hahnjo merged commit 86f17eb into root-project:master Jun 23, 2021
@hahnjo hahnjo deleted the RANLUX++-compat branch June 23, 2021 15:43
pzhristov pushed a commit to alisw/root that referenced this pull request Aug 27, 2021