Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmdLineTester_criu_nonPortableRestore_7_FAILED testTimeCompensation() hang #18471

Closed
JasonFengJ9 opened this issue Nov 16, 2023 · 1 comment · Fixed by ibmruntimes/openj9-openjdk-jdk21#64
Labels
comp:vm criu Used to track CRIU snapshot related work test failure

Comments

@JasonFengJ9
Copy link
Member

JasonFengJ9 commented Nov 16, 2023

Failure link

From https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/67/tapResults/

07:29:57  openjdk version "21.0.1-internal" 2023-10-17
07:29:57  OpenJDK Runtime Environment (build 21.0.1-internal-adhoc.****.BuildJDK21s390xlinuxPersonal)
07:29:57  Eclipse OpenJ9 VM (build HEAD-81bae17c368, JRE 21 Linux s390x-64-Bit Compressed References 20231115_124 (JIT enabled, AOT enabled)
07:29:57  OpenJ9   - 81bae17c368
07:29:57  OMR      - 07a175a4d23
07:29:57  JCL      - 4b666c7419b based on jdk-21.0.1+12)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

variation: -Xgcpolicy:gencon -Xgcthreads64 -XX:CheckpointGCThreads=1
JVM_OPTIONS:  -Xgcpolicy:gencon -Xgcthreads64 -XX:CheckpointGCThreads=1 

Testing: Create CRIU checkpoint image and restore once - testTimeCompensation
Test start time: 2023/11/16 08:43:26 Eastern Standard Time
Running command: bash /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/java " -Xgcpolicy:gencon -Xgcthreads64 -XX:CheckpointGCThreads=1  -Xtrace:print={j9criu,j9jcl.533} --add-exports java.base/openj9.internal.criu=ALL-UNNAMED" org.openj9.criu.TimeChangeTest testTimeCompensation 1 false false
Time spent starting: 1 milliseconds
***[TEST INFO 2023/11/16 08:48:26] ProcessKiller detected a timeout after 300000 milliseconds!***
INFO: Cannot find '/usr/bin/gdb' using 'gdb' from the path.
***[TEST INFO 2023/11/16 08:48:26] executing gdb -batch -x /tmp/debugger13618158602440569193.txt bash 29649***
java.io.IOException: Cannot run program "gdb": error=2, No such file or directory
	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
	at java.base/java.lang.Runtime.exec(Runtime.java:681)
	at java.base/java.lang.Runtime.exec(Runtime.java:530)
	at Test$ProcessKiller.captureCoreForProcess(Test.java:697)
	at Test$ProcessKiller.captureCoreForProcess(Test.java:646)
	at Test$ProcessKiller.run(Test.java:596)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:295)
	at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:225)
	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
	... 6 more
***[TEST INFO 2023/11/16 08:48:26] executing kill -ABRT 29649***
Time spent executing: 300056 milliseconds
Test result: FAILED
Output from test:
 [OUT] start running script
 [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
 [OUT] export LD_BIND_NOT=on
 [OUT] /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/java -XX:+EnableCRIUSupport  -Xgcpolicy:gencon -Xgcthreads64 -XX:CheckpointGCThreads=1  -Xtrace:print={j9criu,j9jcl.533} --add-exports java.base/openj9.internal.criu=ALL-UNNAMED -cp /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criu.jar org.openj9.criu.TimeChangeTest testTimeCompensation 1


---TEST RESULTS---
***[TEST INFO 2023/11/16 08:48:26] kill -ABRT signal sent***
Number of PASSED tests: 56 out of 57
***[TEST INFO 2023/11/16 08:48:26] ABRT completed***
Output from test:
Number of FAILED tests: 1 out of 57

---SUMMARY OF FAILED TESTS---
Create CRIU checkpoint image and restore once - testTimeCompensation
-----------------------------

-----------------------------------
cmdLineTester_criu_nonPortableRestore_7_FAILED

50x grinder

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/67/functional_test_output.tar.gz

1XMCURTHDINFO  Current thread
3XMTHREADINFO      "main" J9VMThread:0x00000000000A0500, omrthread_t:0x000003FF9C00A220, java/lang/Thread:0x0000000011540458, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x2, isDaemon:false)
3XMJAVALTHRCCL            jdk/internal/loader/ClassLoaders$AppClassLoader(0x000000001153DF48)
3XMTHREADINFO1            (native thread ID:0x73D6, native priority:0x5, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00041020)
3XMTHREADINFO2            (native stack address range from:0x000003FFA0736000, to:0x000003FFA0776000, size:0x40000)
3XMCPUTIME               CPU usage total: 0.192125256 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=20552 (0x5048)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at jdk/internal/misc/Unsafe.park(Native Method) <--- park indefinitely at single thread mode
4XESTACKTRACE                at java/util/concurrent/locks/LockSupport.park(LockSupport.java:221)
4XESTACKTRACE                at java/util/concurrent/locks/AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:754)
4XESTACKTRACE                at java/util/concurrent/locks/AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:990)
4XESTACKTRACE                at java/util/concurrent/locks/ReentrantLock$Sync.lock(ReentrantLock.java:153(Compiled Code))
4XESTACKTRACE                at java/util/concurrent/locks/ReentrantLock.lock(ReentrantLock.java:322(Compiled Code))
4XESTACKTRACE                at jdk/internal/misc/InternalLock.lock(InternalLock.java:74(Compiled Code))
4XESTACKTRACE                at java/io/PrintStream.writeln(PrintStream.java:824)
4XESTACKTRACE                at java/io/PrintStream.println(PrintStream.java:1168)
4XESTACKTRACE                at org/openj9/test/util/TimeUtilities.showThreadCurrentTime(TimeUtilities.java:36)
4XESTACKTRACE                at org/openj9/test/util/TimeUtilities.checkElapseTime(TimeUtilities.java:47)
4XESTACKTRACE                at org/openj9/criu/TimeChangeTest$1.run(TimeChangeTest.java:155)
4XESTACKTRACE                at org/eclipse/openj9/criu/CRIUSupport.lambda$registerCheckpointHookHelper$2(CRIUSupport.java:642)
4XESTACKTRACE                at org/eclipse/openj9/criu/CRIUSupport$$Lambda/0x0000000000000000.run(Bytecode PC:8)
4XESTACKTRACE                at org/eclipse/openj9/criu/J9InternalCheckpointHookAPI$J9InternalCheckpointHook.runHook(J9InternalCheckpointHookAPI.java:143)
4XESTACKTRACE                at org/eclipse/openj9/criu/J9InternalCheckpointHookAPI.runHooks(J9InternalCheckpointHookAPI.java:98)
4XESTACKTRACE                at org/eclipse/openj9/criu/J9InternalCheckpointHookAPI.runPreCheckpointHooksSingleThread(J9InternalCheckpointHookAPI.java:107)
4XESTACKTRACE                at org/eclipse/openj9/criu/CRIUSupport.checkpointJVMImpl(Native Method)
4XESTACKTRACE                at org/eclipse/openj9/criu/CRIUSupport.checkpointJVM(CRIUSupport.java:812)
4XESTACKTRACE                at org/openj9/criu/CRIUTestUtils.checkPointJVMNoSetup(CRIUTestUtils.java:103)
4XESTACKTRACE                at org/openj9/criu/TimeChangeTest.testTimeCompensation(TimeChangeTest.java:183)
4XESTACKTRACE                at org/openj9/criu/TimeChangeTest.main(TimeChangeTest.java:74)


3XMTHREADINFO      "testTimeCompensation() preCheckpoint timer delayed 2s" J9VMThread:0x00000000003DDC00, omrthread_t:0x000003FF54010430, java/lang/Thread:0x000000008C31CEF0, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x5B, isDaemon:false)
3XMJAVALTHRCCL            jdk/internal/loader/ClassLoaders$AppClassLoader(0x000000001153DF48)
3XMTHREADINFO1            (native thread ID:0x7431, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00200001)
3XMTHREADINFO2            (native stack address range from:0x000003FF74A40000, to:0x000003FF74A80000, size:0x40000)
3XMCPUTIME               CPU usage total: 0.000816913 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=4096 (0x1000)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/lang/StringCoding.implEncodeAsciiArray(StringCoding.java:82)
4XESTACKTRACE                at java/lang/Access.encodeASCII(Access.java:425)
4XESTACKTRACE                at sun/nio/cs/UTF_8$Encoder.encodeArrayLoop(UTF_8.java:456)
4XESTACKTRACE                at sun/nio/cs/UTF_8$Encoder.encodeLoop(UTF_8.java:564)
4XESTACKTRACE                at java/nio/charset/CharsetEncoder.encode(CharsetEncoder.java:586)
4XESTACKTRACE                at sun/nio/cs/StreamEncoder.implWrite(StreamEncoder.java:370)
4XESTACKTRACE                at sun/nio/cs/StreamEncoder.implWrite(StreamEncoder.java:357)
4XESTACKTRACE                at sun/nio/cs/StreamEncoder.lockedWrite(StreamEncoder.java:158)
4XESTACKTRACE                at sun/nio/cs/StreamEncoder.write(StreamEncoder.java:139)
4XESTACKTRACE                at java/io/OutputStreamWriter.write(OutputStreamWriter.java:219)
4XESTACKTRACE                at java/io/BufferedWriter.implFlushBuffer(BufferedWriter.java:178)
4XESTACKTRACE                at java/io/BufferedWriter.flushBuffer(BufferedWriter.java:163)
4XESTACKTRACE                at java/io/PrintStream.implWriteln(PrintStream.java:848)
4XESTACKTRACE                at java/io/PrintStream.writeln(PrintStream.java:826)
4XESTACKTRACE                at java/io/PrintStream.println(PrintStream.java:1168)
4XESTACKTRACE                at org/openj9/test/util/TimeUtilities.showThreadCurrentTime(TimeUtilities.java:36)
4XESTACKTRACE                at org/openj9/test/util/TimeUtilities.checkElapseTime(TimeUtilities.java:82)
4XESTACKTRACE                at org/openj9/test/util/TimeUtilities$TimeTestTask.run(TimeUtilities.java:143)
4XESTACKTRACE                at java/util/TimerThread.mainLoop(Timer.java:605)
4XESTACKTRACE                at java/util/TimerThread.run(Timer.java:543)

FYI @tajila

@JasonFengJ9 JasonFengJ9 added comp:vm test failure criu Used to track CRIU snapshot related work labels Nov 16, 2023
@JasonFengJ9
Copy link
Member Author

From aqa-tests/TKG/output_17001384796330/cmdLineTester_criu_nonPortableRestore_7/testOutput

testTimeCompensation() starts, current thread name: main, Thu Nov 16 08:43:26 EST 2023, System.currentTimeMillis(): 1700142206994, System.nanoTime(): 1700142206991684294
testTimeCompensation() wait 100ms checkElapseTime starts, current thread name: main, Thu Nov 16 08:43:27 EST 2023, System.currentTimeMillis() 1700142207132, System.nanoTime() 1700142207129868528
testTimeCompensation() wait 100ms: startMillisTime (1700142207030ms) startNanoTime (1700142207027803262ns) currentMillisTime (1700142207130ms) currentNanoTime (1700142207127997600ns) elapsedMillisTime (100ms) elapsedNanoTime (100194338ns)
testTimeCompensation() wait 100ms checkElapseTime ends, current thread name: main, Thu Nov 16 08:43:27 EST 2023, System.currentTimeMillis() 1700142207147, System.nanoTime() 1700142207144163230
testTimeCompensation() sleep 100ms checkElapseTime starts, current thread name: main, Thu Nov 16 08:43:27 EST 2023, System.currentTimeMillis() 1700142207247, System.nanoTime() 1700142207244664131
testTimeCompensation() sleep 100ms: startMillisTime (1700142207147ms) startNanoTime (1700142207144209381ns) currentMillisTime (1700142207247ms) currentNanoTime (1700142207244507439ns) elapsedMillisTime (100ms) elapsedNanoTime (100298058ns)
testTimeCompensation() sleep 100ms checkElapseTime ends, current thread name: main, Thu Nov 16 08:43:27 EST 2023, System.currentTimeMillis() 1700142207248, System.nanoTime() 1700142207245103608
testTimeCompensation() preCheckpoint timer delayed 100ms checkElapseTime starts, current thread name: testTimeCompensation() preCheckpoint timer delayed 100ms, Thu Nov 16 08:43:27 EST 2023, System.currentTimeMillis() 1700142207349, System.nanoTime() 1700142207346258673
testTimeCompensation() preCheckpoint timer delayed 100ms: startMillisTime (1700142207248ms) startNanoTime (1700142207245157758ns) currentMillisTime (1700142207349ms) currentNanoTime (1700142207346123350ns) elapsedMillisTime (101ms) elapsedNanoTime (100965592ns)
testTimeCompensation() preCheckpoint timer delayed 100ms checkElapseTime ends, current thread name: testTimeCompensation() preCheckpoint timer delayed 100ms, Thu Nov 16 08:43:27 EST 2023, System.currentTimeMillis() 1700142207349, System.nanoTime() 1700142207346843329
testTimeCompensation() preCheckpoint timer delayed 2s checkElapseTime starts, current thread name: testTimeCompensation() preCheckpoint timer delayed 2s, Thu Nov 16 08:43:29 EST 2023, System.currentTimeMillis() 1700142209251, System.nanoTime() 1700142209248992994
testTimeCompensation() preCheckpoint checkElapseTime starts, current thread name: main, Thu Nov 16 08:43:29 EST 2023, System.currentTimeMillis() 1700142209257, System.nanoTime() 1700142209254262568
testTimeCompensation() preCheckpoint: startMillisTime (1700142207249ms) startNanoTime (1700142207246545888ns) currentMillisTime (1700142209257ms) currentNanoTime (1700142209254238563ns) elapsedMillisTime (2008ms) elapsedNanoTime (2007692675ns)
testTimeCompensation() preCheckpoint checkElapseTime ends, current thread name: main, Thu Nov 16 08:43:29 EST 2023, System.currentTimeMillis() 1700142209257, System.nanoTime() 1700142209254927086
testTimeCompensation() preCheckpoint timer delayed 2s: startMillisTime (1700142207248ms) startNanoTime (1700142207245851571ns) currentMillisTime (1700142209250ms) currentNanoTime (1700142209247148962ns) elapsedMillisTime (2002ms) elapsedNanoTime (2001297391ns)
Performing CRIUSupport.checkpointJVM(), current thread name: main, Thu Nov 16 08:43:29 EST 2023, System.currentTimeMillis(): 1700142209258, System.nanoTime(): 1700142209255159946
13:43:29.282*0xa0500          j9criu.9        > Java_org_eclipse_openj9_criu_CRIUSupport_checkpointJVMImpl
13:43:29.282 0xa0500          j9criu.18       - Taking a checkpoint with active clinit
JVMDUMP034I User requested Java dump using '/home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/aqa-tests/TKG/output_17001384796330/cmdLineTester_criu_nonPortableRestore_7/javacore.20231116.084329.29653.0001.txt' through CRIUSingleThreadModeJVMCRIUException
JVMDUMP010I Java dump written to /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_s390x_linux_Personal_testList_1/aqa-tests/TKG/output_17001384796330/cmdLineTester_criu_nonPortableRestore_7/javacore.20231116.084329.29653.0001.txt
13:43:29.313 0xa0500          j9criu.10       < Java_org_eclipse_openj9_criu_CRIUSupport_checkpointJVMImpl
testTimeCompensation() preCheckpoint timer delayed 2s checkElapseTime ends, current thread name: testTimeCompensation() preCheckpoint timer delayed 2s, Thu Nov 16 08:43:29 EST 2023, System.currentTimeMillis() 1700142209261, System.nanoTime() 1700142209258677000
Exception in thread "main" org.eclipse.openj9.criu.JVMCheckpointException: Exception thrown when running user pre-checkpoint
        at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.lambda$registerCheckpointHookHelper$2(CRIUSupport.java:644)
        at openj9.criu/org.eclipse.openj9.criu.J9InternalCheckpointHookAPI$J9InternalCheckpointHook.runHook(J9InternalCheckpointHookAPI.java:143)
        at openj9.criu/org.eclipse.openj9.criu.J9InternalCheckpointHookAPI.runHooks(J9InternalCheckpointHookAPI.java:98)
        at openj9.criu/org.eclipse.openj9.criu.J9InternalCheckpointHookAPI.runPreCheckpointHooksSingleThread(J9InternalCheckpointHookAPI.java:107)
        at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.checkpointJVMImpl(Native Method)
        at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.checkpointJVM(CRIUSupport.java:812)
        at org.openj9.criu.CRIUTestUtils.checkPointJVMNoSetup(CRIUTestUtils.java:103)
        at org.openj9.criu.TimeChangeTest.testTimeCompensation(TimeChangeTest.java:183)
        at org.openj9.criu.TimeChangeTest.main(TimeChangeTest.java:74)
Caused by: org.eclipse.openj9.criu.JVMCheckpointException: Blocking operation is not allowed in CRIU single thread mode.
        at java.base/jdk.internal.misc.Unsafe.park(Native Method)
        at java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
        at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:754)
        at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:990)
        at java.base/java.util.concurrent.locks.ReentrantLock$Sync.lock(ReentrantLock.java:153)
        at java.base/java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:322)
        at java.base/jdk.internal.misc.InternalLock.lock(InternalLock.java:74)
        at java.base/java.io.PrintStream.writeln(PrintStream.java:824)
        at java.base/java.io.PrintStream.println(PrintStream.java:1168)
        at org.openj9.test.util.TimeUtilities.showThreadCurrentTime(TimeUtilities.java:36)
        at org.openj9.test.util.TimeUtilities.checkElapseTime(TimeUtilities.java:47)
        at org.openj9.criu.TimeChangeTest$1.run(TimeChangeTest.java:155)
        at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.lambda$registerCheckpointHookHelper$2(CRIUSupport.java:642)
        ... 8 more
testTimeCompensation() preCheckpoint timer delayed 4s checkElapseTime starts, current thread name: testTimeCompensation() preCheckpoint timer delayed 4s, Thu Nov 16 08:43:31 EST 2023, System.currentTimeMillis() 1700142211249, System.nanoTime() 1700142211246892314
testTimeCompensation() preCheckpoint timer delayed 4s: startMillisTime (1700142207249ms) startNanoTime (1700142207246486491ns) currentMillisTime (1700142211249ms) currentNanoTime (1700142211246728638ns) elapsedMillisTime (4000ms) elapsedNanoTime (4000242147ns)
FAILED: testTimeCompensation() preCheckpoint timer delayed 4s elapsedMillisTime (4000ms) should NOT be less than minElapsedMillisTime (6000ms)
testTimeCompensation() preCheckpoint timer delayed 4s checkElapseTime ends, current thread name: testTimeCompensation() preCheckpoint timer delayed 4s, Thu Nov 16 08:43:31 EST 2023, System.currentTimeMillis() 1700142211257, System.nanoTime() 1700142211254692885

The parking indefinitely in single thread mode was caught indeed and org.eclipse.openj9.criu.JVMCheckpointException was thrown.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:vm criu Used to track CRIU snapshot related work test failure
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant