Stack overflow segfault in getrf_parallel using Breeze / Netlib-Java on Debian 8 Jessie #1082
Comments
You mentioned four products. Can you clarify which are installed from the Debian repositories and which you unzipped yourself? Debian builds the pthread variant of OpenBLAS and patches out the thread-safety warning (oops). One fix is to build your own OpenBLAS 0.2.10 with OpenMP or with no threading (no threading gives better control to Scala, which is smart enough to partition computations across many CPUs). The other fix is to convince the Debian OpenBLAS maintainer to rebuild a thread-safe, i.e. OpenMP, OpenBLAS, copying what Ubuntu does.
If you are prepared to build your own OpenBLAS, you could reduce the (probably arbitrary) limit of MAX_CPU_NUMBER introduced by 5d33121 to something like 16 and see if this is the only problem in your context. (I am not sure how big the overhead of malloc is across all supported architectures.) Alternatively, wouldn't running java with -Xss4m or thereabouts take care of this as well, or is this no longer an option in JDK 8?
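For callers who cannot change the JVM command line, a rough alternative to -Xss4m is to run the native call on a dedicated thread that requests a larger stack. The sketch below uses the standard `Thread(ThreadGroup, Runnable, String, long stackSize)` constructor; the class name `BigStackRunner`, the helper `callWithStack`, and the 4 MiB figure are illustrative assumptions, and the JVM documents the stack size argument as a hint that some platforms may ignore. It also cannot turn a native segfault into a Java exception; it only gives the call more room.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper, not part of Breeze, Netlib-Java, or OpenBLAS.
public class BigStackRunner {

    /** Runs task on a fresh thread that requests stackBytes of stack and returns its result. */
    public static <T> T callWithStack(long stackBytes, Supplier<T> task) {
        AtomicReference<T> result = new AtomicReference<>();
        AtomicReference<Throwable> failure = new AtomicReference<>();

        // The last constructor argument is the requested stack size in bytes (a hint only).
        Thread worker = new Thread(null, () -> {
            try {
                result.set(task.get());
            } catch (Throwable t) {
                failure.set(t);
            }
        }, "big-stack-worker", stackBytes);

        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        if (failure.get() != null) {
            throw new RuntimeException("task failed on worker thread", failure.get());
        }
        return result.get();
    }

    public static void main(String[] args) {
        // Stand-in for a call that ends up in native LAPACK code.
        Integer answer = callWithStack(4L * 1024 * 1024, () -> 6 * 7);
        System.out.println(answer);
    }
}
```

In a Breeze program the supplier would wrap the matrix operation that reaches getrf, so only that call pays for the larger stack.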
I installed OpenBLAS and OpenJDK 8 from Jessie (Backports?). I also installed OpenBLAS 0.2.19 from Sid and built my own 0.2.12 and 0.2.19 (the crash reproduced on all of them with NUM_THREADS=64 and USE_OPENMP=0). Scala was downloaded from the Scala website, and Breeze and Netlib-Java came from building Apache Spark 2.0.2 from source. All of the above fixes and workarounds work. My objection is more that something simple breaks badly out of the box. If the consensus of this thread is that OpenMP is the correct solution, I'd be happy to file an issue with Debian requesting that they build with it in future.
It should be fairly easy to check on Ubuntu that your Scala code runs without pain out of the box.
According to https://anonscm.debian.org/git/debian-science/packages/openblas.git/tree/debian/patches?h=debian/0.2.12-1, 5d33121 is not in this Debian package.
Thank you for checking. I suspect that even with 0.2.19 or develop, the current default threshold value of 80 (meanwhile set centrally in common.h) will exceed the default Java stack size of 1m, given that for OS_WINDOWS it is explicitly set to 32, citing the 1m stack limit for that platform. As I do not see how OpenBLAS could know at build time that it is going to be used in a Java context later, my gut feeling is that telling Java to increase its stack size is the way to go here(?).
There is no way to detect the stack size at runtime. There is no rlimit in Java either.
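From the Java side, a caller can at least check whether a -Xss option was passed before handing work to native code. This does not tell the native library how much stack the calling thread really has, so it does not contradict the point above; it is only a sanity check. The sketch reads the JVM arguments through the standard `RuntimeMXBean.getInputArguments()` call; the class `XssCheck`, the helper `parseXssBytes`, and the assumption that the last -Xss occurrence is the one that applies are illustrative, and other ways of setting the stack size (such as -XX:ThreadStackSize) are not handled.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical helper for illustration only.
public class XssCheck {

    /** Returns the size in bytes of the last -Xss<size>[k|m|g] argument, or empty if none parses. */
    public static OptionalLong parseXssBytes(List<String> jvmArgs) {
        OptionalLong found = OptionalLong.empty();
        for (String arg : jvmArgs) {
            if (!arg.startsWith("-Xss") || arg.length() == 4) {
                continue;
            }
            String spec = arg.substring(4).toLowerCase();
            char unit = spec.charAt(spec.length() - 1);
            long multiplier = 1L; // no suffix means plain bytes
            if (unit == 'k') {
                multiplier = 1024L;
            } else if (unit == 'm') {
                multiplier = 1024L * 1024;
            } else if (unit == 'g') {
                multiplier = 1024L * 1024 * 1024;
            }
            String digits = Character.isDigit(unit) ? spec : spec.substring(0, spec.length() - 1);
            try {
                found = OptionalLong.of(Long.parseLong(digits) * multiplier);
            } catch (NumberFormatException e) {
                // Malformed value: skip it and keep any earlier match.
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        OptionalLong xss = parseXssBytes(jvmArgs);
        if (xss.isPresent()) {
            System.out.println("-Xss requested " + xss.getAsLong() + " bytes");
        } else {
            System.out.println("no -Xss argument found; the JVM default stack size applies");
        }
    }
}
```

A program could log a warning when no -Xss is found and it is about to call into a pthread build of OpenBLAS with a large NUM_THREADS.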
My point was just that I think running Java code that links to OpenBLAS as "java -Xss4m" is preferable to lowering the default threshold for all non-Windows platforms to 32 just in case someone might want to call it from Java later. (That is, unless someone can state with certainty that there is no longer any reason to prefer stack allocation of job_t, on all supported platforms.)
Hi
I got a segfault using Breeze 0.12 on Debian 8 Jessie.
I gdb'ed down into 4 recursions of getrf_parallel before it overflowed the stack.
Jessie's OpenBLAS is 0.2.12 with some patches (I think including 5d33121). Most importantly it sets NUM_THREADS = 64, which causes the getrf_parallel stack overflow from #246 (and probably #912) to easily blow out Java's 1 MB Stack.
Could you just always heap allocate job_t?
Details:
JVM Open JDK 8
Scala 2.11.8
Breeze 0.12
Code: