Dacapo sub-benchmark "eclipse" crashes when run with large load #7452
Comments
@mpirvu @vijaysun-omr FYI
I don't know if that's correct though.
Flagging as high priority as it's a crash
There is a large table lookup at the top of the jitdump in the attached javacores. The recompile did not generate any useful information in that dump. The table lookup has about 600 entries, and many of them go to the same place: block_422, which is a return. I managed to reproduce the crash on my local machine by running the benchmark as specified, so I'm now gathering details about where the crash happens to see what the problem is.
The crash appears to be at https://github.com/eclipse/omr/blob/42aab208572031037e55f0c915a36e6b40def177/compiler/optimizer/SwitchAnalyzer.cpp#L1347, due to a divide by zero. According to the assert at that location, the zero divisor indicates an unreachable successor. I'm continuing to investigate what caused an unreachable successor and how it happened...
OK, here's the problem: in the crash, block_422 is a successor of multiple table cases (256 to be exact), so many that the int8_t counter for the number of successors going to the block overflows. This results in a division by zero if the overflow produces a zero (note that signed overflow behaviour is undefined by the spec, but an overflow to zero is one possible outcome). The case is rare because if the counter overflows to some non-zero value the division still works; the result just won't make much sense. This code has been unmodified for a long time; for some reason we have only now started hitting the corner case of 0. Increasing the size of the counter to int32_t will ensure we don't overflow. Pull request against OMR: eclipse-omr/omr#4450.
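A minimal sketch of the wraparound described above, not the actual SwitchAnalyzer code. `countCasesTargetingBlock` is a hypothetical helper introduced only for illustration, and the sketch assumes a typical two's-complement platform where narrowing a value back to int8_t wraps modulo 256 (this is guaranteed from C++20 onward).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not OMR code): counts how many switch cases target
// one block, keeping the running total in a counter of type CounterT.
template <typename CounterT>
CounterT countCasesTargetingBlock(int numCases) {
    assert(numCases >= 0);
    CounterT count = 0;
    for (int i = 0; i < numCases; ++i) {
        // The addition is done in int; the cast narrows the result back to
        // CounterT, which is where an 8-bit counter wraps around.
        count = static_cast<CounterT>(count + 1);
    }
    return count;
}

// With an 8-bit counter, 256 cases wrap all the way back to zero, so any
// later division by the counter would be a divide by zero.
inline bool narrowCounterWrapsToZero() {
    return countCasesTargetingBlock<std::int8_t>(256) == 0;
}

// With a 32-bit counter the same 256 cases are counted correctly.
inline bool wideCounterIsCorrect() {
    return countCasesTargetingBlock<std::int32_t>(256) == 256;
}
```

Any case count that is a non-zero multiple of 256 lands on zero with the 8-bit counter; other counts above 127 give a wrong but non-zero value, which matches the observation that the division usually "works" without making sense.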
OMR change has been merged. I believe this is now fixed so marking closed. Please reopen if this is still a problem with the change added to the VM.
I am seeing an intermittent crash when running dacapo-eclipse on aarch64:
[2021-01-06T17:39:24.624Z] Index workspace ....................#0: /home/jenkins/workspace/Test_openjdk8_j9_sanity.perf_aarch64_linux/openjdkbinary/j2sdk-image/jre/lib/aarch64/compressedrefs/libj9jit29.so(+0x6e0594) [0xffffa640d594]
https://trss.adoptopenjdk.net/output/test?id=5ff616932bad78506601ca3a
How can I know which aarch64 build was used in the run?
It looks like it failed by accessing the address R3 + 0x10.
It is not the JDK 8 0.24m2 build.
@smlambert Thank you. The binary from https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk8u/job/jdk8u-linux-aarch64-openj9/186/ no longer seems to be available.
It's no longer available from the Jenkins job, but I presume it has been pushed to our nightly release repo, so the Jan 6th nightly build should be available from the API/webpage. https://adoptopenjdk.net/nightly.html?variant=openjdk11&jvmVariant=openj9 Guessing this is the direct link:
Thank you. This looks like an unknown problem. I don't think I have seen this before.
The crash in
Java -version output
openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-201910110340-b08)
Eclipse OpenJ9 VM (build master-fe39103a7, JRE 1.8.0 Linux amd64-64-Bit Compressed References 20191011_432 (JIT enabled, AOT enabled)
OpenJ9 - fe39103
OMR - bc5ceea
JCL - 660f897065 based on jdk8u232-b08)
Summary of problem
The benchmark runs fine with the default and small workload sizes. The issue is observed only with the "large" size.
A HotSpot build works fine even with the large size.
Command used to run the benchmark:
java -jar ../dacapo/dacapo-9.12-MR1-bach.jar -s large eclipse
Platform: Linux X64
Below is the error I get:
javacores.zip