tool.drcachesim.threads-with-config-file occasional timeout on A64 Jenkins #4954
Comments
The coherence test hit the same hang. It runs the same app so likely the same problem: http://139.178.84.19:8080/job/DynamoRIO-AArch64-Precommit/126/consoleFull
We hit this again: http://139.178.84.19:8080/job/DynamoRIO-AArch64-Precommit/147/console
Xref #3971 where coherence hung on Windows. Hung again on AArch64 Jenkins: http://139.178.84.19:8080/job/DynamoRIO-AArch64-Precommit/161/consoleFull
I think we should add this test to the ignore list. Failures have become very frequent. It's blocking #4941 right now. I see the original issue was a timeout, but we're seeing a regex mismatch too now: http://139.178.84.19:8080/job/DynamoRIO-AArch64-Precommit/178/console, http://139.178.84.19:8080/job/DynamoRIO-AArch64-Precommit/234/
Adds nativeexec tests to ignore list for x86-64. We've been seeing a lot of red on CI due to failures on different options for these tests.
Also ignores common.fib for one combination of options on vs2017-32.
Also ignores tool.drcachesim.threads-with-config-file failures on AArch64.
Issue: #5010, #1807, #4954
Hit again on coherence test on Jenkins: http://139.178.84.19:8080/job/DynamoRIO-AArch64-Precommit/271/consoleFull
These all only hang in release build; can't reproduce a problem in debug
Adds private loader redirection of open, close, read, and write to DR's syscall-wrapper versions (plus file descriptor isolation, for open and close). The libc write invokes pthread code for cancel features, and we are not able to create a private libpthread or isolate pthread resources (#956) which leads to poor interactions with application pthread uses and observed hangs.

Tested on the AArch64 Jenkins machine where these tests all hung every 5 to 10 runs in release build before and now they succeed 20,000 times in a row:

```
--------------------------------------------------
derek@dynamorio:~/dr/build_rel$ for i in sim.threads\$ sim.TLB-threads sim.coherence sim.threads-with; do echo $i; ctest --repeat-until-fail 20000 -R $i > RUN-$i 2>&1; done
sim.threads$
sim.TLB-threads
sim.coherence
sim.threads-with
derek@dynamorio:~/dr/build_rel$ grep -c Passed RUN-*
RUN-sim.coherence:20000
RUN-sim.threads$:20000
RUN-sim.threads-with:20000
RUN-sim.TLB-threads:20000
derek@dynamorio:~/dr/build_rel$ grep failed RUN-*
RUN-sim.coherence:100% tests passed, 0 tests failed out of 1
RUN-sim.threads$:100% tests passed, 0 tests failed out of 1
RUN-sim.threads-with:100% tests passed, 0 tests failed out of 1
RUN-sim.TLB-threads:100% tests passed, 0 tests failed out of 1
--------------------------------------------------
```

While at it, removes drcachesim.invariants which was tested as well and has no failures, under the theory that the original failures were these same release-build hangs. Today, it's a debug-only test.

Issue: #4928, #4954, #2417, #956
Fixes #4928
Fixes #4954
Fixes #2892
Hit a timeout in PR #4952 on tool.drcachesim.threads-with-config-file: took 90.00s:
http://139.178.84.19:8080/job/DynamoRIO-AArch64-Precommit/98/consoleFull
It ends there, in iteration 3.
This happened on PR #4949 too: http://139.178.84.19:8080/job/DynamoRIO-AArch64-Precommit/95/
It passed on re-running and took very little time:
So it seems like it's not just always close to the time limit, and it's maybe an actual hang or something?
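One way to tell an intermittent hang from a test that is merely slow is to run it many times under a hard time limit and count how each run ends. The following is a minimal sketch, not something taken from the DynamoRIO tree: `classify_runs` is a hypothetical helper name, and it relies only on the coreutils `timeout` command, which exits with status 124 when the limit expires.

```shell
# Hypothetical helper: run a command repeatedly under a hard time limit and
# classify each run as a pass (exit 0), a hang (timeout's exit status 124),
# or an ordinary failure (any other status).
# Usage: classify_runs <runs> <limit-seconds> <command> [args...]
# The function body is a subshell so its variables do not leak to the caller.
classify_runs() (
    runs=$1
    limit=$2
    shift 2
    pass=0
    fail=0
    hang=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        status=0
        # Discard the command's output; only its exit status matters here.
        timeout "$limit" "$@" > /dev/null 2>&1 || status=$?
        case $status in
            0) pass=$((pass + 1)) ;;
            124) hang=$((hang + 1)) ;;
            *) fail=$((fail + 1)) ;;
        esac
        i=$((i + 1))
    done
    echo "pass=$pass fail=$fail hang=$hang"
)
```

For example, `classify_runs 20 60 ctest -R threads-with-config-file` from a build directory would print one summary line. The 60-second limit is an arbitrary assumption chosen to sit below the 90 s seen above: if ctest's own per-test timeout fires first, the run is counted as a failure rather than a hang. A command that itself exits with status 124 would also be miscounted as a hang.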