GSL integration produces different results if adding a printing statement to the integrand in X-PSI installed in Lisa #270

thjsal · 2023-02-07T12:42:11Z

This might be hardware specific problem, but reporting it still here (have not managed to re-produce this same problem in other systems than in Lisa cluster so far):

The integration over this function:

xpsi/xpsi/likelihoods/default_background_marginalisation.pyx

Line 80 in 4760c72

cdef double marginal_integrand(double B, void *params) nogil:

using GSLgsl_integration_cquadfunction here:

xpsi/xpsi/likelihoods/default_background_marginalisation.pyx

Line 503 in 4760c72

gsl_integration_cquad(&f, lower, upper,

gives a different result, depending on whether a print statement (just printf("????")) is added somewhere inside or before the for-loop of that function. Without adding the print statement, the result is in contradiction with the result obtained in other systems. After adding the print statement the result agrees with what is obtained elsewhere.

Here are the input values for the gsl_integration_cquad function (same in both cases):
lower,upper,epsabs,epsrel,w=0.00452244, 0.00638843, 0.00000000, 0.00000001, 1227.38282439
And here are the output values without the print statement in Lisa:
result, abserr, nevals = 0.00000000, 0.00000000, 33
And here are the output values with the print statement in Lisa, and always in other systems:
result, abserr, nevals = 0.00023387, 0.00000000, 299

This difference causes a big effect on the final result since in the former case, likelihood evaluation is not performed to the end, but a negative number close to minus infinity is returned instead (there is an if-condition requiring that the result variable must be larger than zero).

This problem happens when running examples/examples_modeling_tutorial/TestRun_BB.py in Lisa with X-PSI version 2.0.0-rc, GSL version 2.7, and Cython version 0.29.32.

Would need to still separate the function from X-PSI to test if the problem exists also without X-PSI.

The text was updated successfully, but these errors were encountered:

sguillot · 2023-02-21T17:39:58Z

If you have a suggestion to test the gsl_integration_cquad (or marginal_integrand) function separate from X-PSI, I'll try it

ychatonnier · 2023-02-24T10:13:09Z

I found this:
https://stackoverflow.com/questions/28085423/the-notorious-printf-fix
it seems this magic printf fix is known, there are some answer track in this post.

thjsal · 2023-02-27T10:51:26Z

@sguillot : I was thinking of making a simple Cython program that runs this same GSL integration (over the same marginal_integrand function) with the same input parameters I shared above. And see if that integration still gives a different result depending on whether or not having that print statement inside the integrand. Before the integration, we need probably also call the gsl_integration_cquad_workspace_alloc() to get it working.

@ychatonnier: Thanks for sharing this discussion! I think have also before encountered/heard about issues like this. Unlike there, however, our code goes back to "non-working" when taking the print statement away.

thjsal · 2023-03-18T15:43:20Z

Looks like in Lisa this problem only occurs when installing X-PSI using Intel-compiler. If installing using GCC, I get the correct answer without additional printings. I am still working on a simple test program. I noticed I still need to find out the values saved in a.data, a.n, a.SCALE, a.A, and a.star before calling this function and see if they are always the same.

EDIT: Values of these input parameters seem to be always the same, and the following:

 a.SCALE = 30759.61456563
 a.n =  32
 a.A = 39027.92894509

For a.data and a.star .... see the DATA and STAR arrays from the test program (in the zipped file) I posted in the next message.

thjsal · 2023-03-19T22:46:23Z

test_gsl.zip
Attached some simplified test program. Can be compiled by either by:
LDSHARED="icc -shared" CC=icc python setup.py install
or
CC=gcc python setup.py install
And then tested by running:
python test_gsl_py.py

Based on running this test in Lisa, it seems that the problem is indeed something related to GSL, or how it works together with Intel-compilers. If compiling the test program with gcc, the output is always the same. If compiling the test program with icc, I get the (presumably) correct output only with the "printf trick" (uncommenting the line 66 in test_gsl.pyx).

I also tried different versions of GSL in Lisa, one already existing in the cluster (version called GSL/2.7-GCC-11.3.0), some installed myself using icc, but the problem seems to exist regardless of GSL version and the way of installing it.

On the other hand, in Snellius cluster Intel compiler seems to still work fine with GSL (i.e. no printf tricks needed), at least with the GSL version 2.5 that already natively exists there, and also with the latest GSL version (2.7.1) installed from the source (same that did not work for Lisa). When installing that in Snellius, however, I had to add some additional "-fPIC" and "-host" flags ../configure FC=ifort CC=icc CFLAGS='-O3 -xAVX -axCORE-AVX2 -mieee-fp -fPIC -funroll-loops' --prefix=$HOME/gsl -host=x86_64, to make the test program compilation to work (following otherwise instructions here: https://github.com/xpsi-group/xpsi/blob/v0.7.12/docs/source/SURFsara_systems.rst).

Intel compiler version should be the same both in Lisa and Snellius (icc (ICC) 2021.6.0 20220226).

thjsal · 2023-03-30T15:16:08Z

So after a lot of testing, it appears the problem is caused by over-aggressive optimization, caused by numerical issues in the integrand function of X-PSI, when using Intel-compilers on Intel CPUs (which are used in Lisa but not in Snellius). The suggested solution to make integration and likelihood calculation work on all systems is provided in the linked pull request.

thjsal self-assigned this Mar 18, 2023

thjsal linked a pull request Mar 30, 2023 that will close this issue

Gsl integration bugfix #300

Merged

thjsal closed this as completed in #300 Apr 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GSL integration produces different results if adding a printing statement to the integrand in X-PSI installed in Lisa #270

GSL integration produces different results if adding a printing statement to the integrand in X-PSI installed in Lisa #270

thjsal commented Feb 7, 2023 •

edited

Loading

sguillot commented Feb 21, 2023

ychatonnier commented Feb 24, 2023

thjsal commented Feb 27, 2023

thjsal commented Mar 18, 2023 •

edited

Loading

thjsal commented Mar 19, 2023 •

edited

Loading

thjsal commented Mar 30, 2023 •

edited

Loading

GSL integration produces different results if adding a printing statement to the integrand in X-PSI installed in Lisa #270

GSL integration produces different results if adding a printing statement to the integrand in X-PSI installed in Lisa #270

Comments

thjsal commented Feb 7, 2023 • edited Loading

sguillot commented Feb 21, 2023

ychatonnier commented Feb 24, 2023

thjsal commented Feb 27, 2023

thjsal commented Mar 18, 2023 • edited Loading

thjsal commented Mar 19, 2023 • edited Loading

thjsal commented Mar 30, 2023 • edited Loading

thjsal commented Feb 7, 2023 •

edited

Loading

thjsal commented Mar 18, 2023 •

edited

Loading

thjsal commented Mar 19, 2023 •

edited

Loading

thjsal commented Mar 30, 2023 •

edited

Loading