Mod2Pi Test failure in 0.31 and master branch when compiled with Intel tool chain and MKL #8799

Closed
starrynight opened this issue Oct 24, 2014 · 17 comments
Labels
building Build system, or building Julia or its dependencies

Comments

@starrynight

Environment: Ubuntu 14.04, icc 15.0, ifort 15.0, MKL 11.2
source /opt/intel/composerxe/bin/compilervars.sh intel64
source /opt/intel/mkl/bin/mklvars.sh intel64 ilp64
export LD=/opt/intel/composerxe/bin/xild
export AR=/opt/intel/composerxe/bin/xiar
export MKL_INTERFACE_LAYER=ILP64
export MKLROOT=/opt/intel/composerxe/mkl

Make.user:
USE_INTEL_MKL = 1
USE_INTEL_MKL_FFT = 1
USE_INTEL_LIBM = 1
USEICC = 1
USEIFC = 1

The build succeeded. However, when running make test, it stopped at the mod2pi unit test:

ERROR: assertion failed: |totalErrNew - 0.0| <= 1.7763568394002505e-11
totalErrNew = 12.56641004881809
0.0 = 0.0
difference = 12.56641004881809 > 1.7763568394002505e-11
in error at error.jl:22
in test_approx_eq at test.jl:109
in testModPi at mod2pi.jl:182
in runtests at /home/mickey/Sources_ML/julia/test/testdefs.jl:5
in anonymous at multi.jl:855
in run_work_thunk at multi.jl:621
in anonymous at task.jl:855
while loading mod2pi.jl, in expression starting on line 184
ERROR: assertion failed: |totalErrNew - 0.0| <= 1.7763568394002505e-11
totalErrNew = 12.56641004881809
0.0 = 0.0
difference = 12.56641004881809 > 1.7763568394002505e-11
in anonymous at task.jl:1367
while loading mod2pi.jl, in expression starting on line 184
while loading /home/mickey/Sources_ML/julia/test/runtests.jl, in expression starting on line 35

@ViralBShah
Member

Does this happen on master too? I have icc 14 where I had tried everything before. Trying the 0.3 release branch now.

@ViralBShah ViralBShah added the building Build system, or building Julia or its dependencies label Oct 24, 2014
@starrynight
Author

Yes, it happens on master too. I've tried 0.32 as well, but 0.32 has some other errors. So far, 0.31 and master are the branches with the fewest errors.

@jiahao
Member

jiahao commented Feb 13, 2015

The problem lies in Base.Math.ieee754_rem_pio2, which is a wrapper around libopenspecfun:__ieee754_rem_pio2.

~/julia-release/julia -e 'println(Base.Math.ieee754_rem_pio2(17.0))'
(11,[-0.2787595947438628,-7.421924678439073e-18])
~/julia-intel/julia -e 'println(Base.Math.ieee754_rem_pio2(17.0))'
(10,[-2.782002278622783e-16,0.0])
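For readers comparing the two outputs: assuming the usual fdlibm convention, where the function returns a multiple n and a two-part remainder such that x ≈ n·π/2 + (y_hi + y_lo), the following Python sketch checks which pair reproduces 17.0. It is not part of the thread's code; the helper name reconstruct is made up for illustration, and math.pi is only a double-precision stand-in for π.

```python
import math

def reconstruct(n, y_hi, y_lo):
    """Rebuild x from a rem_pio2-style result, assuming x ~= n*(pi/2) + y_hi + y_lo."""
    return n * (math.pi / 2.0) + (y_hi + y_lo)

# Values copied from the two runs quoted above.
release_build = (11, -0.2787595947438628, -7.421924678439073e-18)
intel_build = (10, -2.782002278622783e-16, 0.0)

for label, result in (("release", release_build), ("intel", intel_build)):
    x = reconstruct(*result)
    print(label, "reconstructs x =", x, "error vs 17.0 =", abs(x - 17.0))
```

Under that assumption the release build's pair is consistent with the input 17.0, while the Intel build's pair corresponds to a value close to 10·π/2 instead.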

@ViralBShah
Member

It would be nice to have a julia implementation. We have other problems with rem_pio2 as well in openlibm, where it produces the wrong result on i486.

@simonbyrne
Contributor

I agree on having a Julia implementation: it would also avoid the overhead of allocating a 2-element array on each call.

@tkelman
Contributor

tkelman commented Feb 14, 2015

It's not in openlibm any more, but presumably moving it to openspecfun didn't fix it on i486. Replacing it with a Julia implementation would be a good way of making openspecfun smaller (along with #8536). Openspecfun probably doesn't need to exist forever as an independent library, especially since we're not far at all from really only needing Faddeeva from it.

@simonbyrne
Contributor

If anyone really wants to debug this, it seems like these are the lines causing the problem:
https://github.com/JuliaLang/openspecfun/blob/381db9bc865e51de67be9dcaa1610a6f90029c72/rem_pio2/e_rem_pio2.c#L132-L138

My usual guess for things like this is unexpected use of x87 80-bit floats: one option would be to try playing around with compiler options and see if that fixes it.

I've made a start on translating it to Julia here:
https://github.com/simonbyrne/libm.jl/blob/master/src/rempio2.jl

It still requires an implementation of the Payne-Hanek reduction for large numbers (I might not have much time to look at this for a week or two), but it at least solves the above issue.
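To make the suspicion concrete, here is a minimal Python sketch. It assumes (the linked lines are not reproduced in this thread) that the reduction rounds x·(2/π) to the nearest integer with the common add-and-subtract-a-large-constant trick, which only works when every intermediate is rounded to a 64-bit double. The function names are hypothetical, and math.pi/2 replaces the split high/low constants a real implementation would use. The second function mimics what would happen if a compiler simplified (t + MAGIC) - MAGIC to t or otherwise skipped the rounding.

```python
import math

MAGIC = float.fromhex("0x1.8p52")  # 1.5 * 2**52; adding it forces rounding to an integer
INV_PIO2 = 2.0 / math.pi
PIO2 = math.pi / 2.0

def reduce_strict(x):
    """Every operation is rounded to double, so fn is the nearest integer to x*(2/pi)."""
    fn = (x * INV_PIO2 + MAGIC) - MAGIC
    return int(fn), x - fn * PIO2

def reduce_if_rounding_elided(x):
    """Hypothetical failure mode: the add/subtract pair has no effect, so fn keeps its fraction."""
    fn = x * INV_PIO2
    # int() truncates, and the remainder collapses to roughly zero.
    return int(fn), x - fn * PIO2

print(reduce_strict(17.0))
print(reduce_if_rounding_elided(17.0))
```

The strict version gives n = 11 with a remainder near -0.2788, and the elided version gives n = 10 with a remainder near zero. That has the same shape as the Intel-built result above, which is suggestive but not a confirmed diagnosis.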

@ViralBShah
Member

Glenn Chisholm mentioned that this can be fixed with

CFLAGS = -mieee-fp -std=c99 -Wall -O3

@ViralBShah
Member

I don't think icc 14 has -mieee-fp. We already use -fp-model precise, but this is not passed to openlibm and openspecfun. It is worth trying that to fix this issue.

@gchisholm

Decided to run it quickly but yes:

CFLAGS = -fp-model precise -std=c99 -Wall -O3

[julia]$ ./julia test/runtests.jl mod2pi
* mod2pi
SUCCESS

[julia]$ icc -v
icc version 15.0.1 (gcc version 4.8.2 compatibility)

@nolta nolta closed this as completed in 3ab9af1 May 8, 2015
nolta added a commit that referenced this issue May 14, 2015
(cherry picked from commit 3ab9af1)

Conflicts:
	deps/Makefile
@tkelman
Contributor

tkelman commented May 14, 2015

backported the fix to release-0.3 in 7c42e5a

@tkelman
Contributor

tkelman commented May 14, 2015

@garrison
Member

Thanks @tkelman!

@waldyrious
Contributor

> We have other problems with rem_pio2 as well in openlibm, where it produces the wrong result on i486.

Ref: JuliaMath/openlibm#57

> It's not in openlibm any more, but presumably moving it to openspecfun didn't fix it on i486.

So, should the issue above be closed / copied over to the openspecfun repo?

> Replacing it with a Julia implementation would be a good way of making openspecfun smaller

Am I reading you correctly that this would solve the problem mentioned above? And if so, is there an issue tracking this conversion?

@tkelman
Contributor

tkelman commented Jun 1, 2015

> So, should the issue above be closed / copied over to the openspecfun repo?

Yes.

> Am I reading you correctly that this would solve the problem mentioned above?

Which problem? The i486 problem? Hard to say; we would then depend on whether LLVM handles everything correctly.

> And if so, is there an issue tracking this conversion?

Don't think so. @simonbyrne has some initial code apparently, but missing some pieces.

@simonbyrne
Contributor

While I often enjoy a bit of paleo-computing myself, I'm not sure that this is worth spending a lot of time on. The problem is almost certainly an extended-precision issue on the x87. You would need to dig through the Intel architecture manual to see which operations have changed over various processor iterations.

I don't know if implementing it in Julia would help: technically, LLVM only supports Pentium and later.

As a fallback, you could switch to the assembly implementations of the trig functions: the argument reductions for these aren't quite as accurate as those in the C code, but it should be fine for "reasonable" values.
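As a rough illustration of why a less accurate argument reduction can be fine for "reasonable" values but not for large ones, the Python sketch below compares a naive reduction by the double-precision value of 2π against a 60-digit reference. This is only an analogy: it does not model the x87 instructions, which use their own internal approximation of π, and the function names are made up for this example.

```python
import math
from decimal import Decimal, getcontext, ROUND_FLOOR

getcontext().prec = 60
# 2*pi to about 50 decimal places, used as the high-precision reference.
TWO_PI = 2 * Decimal("3.14159265358979323846264338327950288419716939937510")

def naive_mod2pi(x):
    """Reduce by the double nearest to 2*pi; the error grows with x / (2*pi)."""
    return math.fmod(x, 2.0 * math.pi)

def reference_mod2pi(x):
    """Reduce the exact value of the double x by a 60-digit 2*pi."""
    d = Decimal(x)  # exact conversion of the double
    k = (d / TWO_PI).to_integral_value(rounding=ROUND_FLOOR)
    return float(d - k * TWO_PI)

for x in (17.0, 1e6, 1e15):
    print(x, abs(naive_mod2pi(x) - reference_mod2pi(x)))
```

For small arguments the two agree to within a few ulps. For arguments around 1e15 the naive result is off in the second decimal place, which is the kind of case the Payne-Hanek reduction mentioned earlier is meant to handle.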

@waldyrious
Contributor

I'll leave it to you guys to decide the best course of action regarding issue JuliaMath/openlibm#57, then. Although I initially reported it, it's not a critical issue for me.

mbauman pushed a commit to mbauman/julia that referenced this issue Jun 6, 2015
tkelman pushed a commit to tkelman/julia that referenced this issue Jun 6, 2015