Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

All tests hang on alpha #200

Open
matoro opened this issue Oct 7, 2022 · 8 comments
Open

All tests hang on alpha #200

matoro opened this issue Oct 7, 2022 · 8 comments

Comments

@matoro
Copy link

matoro commented Oct 7, 2022

Hi, in both of the two extant distros supporting alpha (Debian & Gentoo) all tests in the test suite timeout. I tried increasing the meson timeout past the 150-second limit but the tests hung indefinitely.

Test log from my run (Gentoo):

>>> Test phase: dev-util/umockdev-0.17.13
 * .alpha: running multilib-minimal_abi_src_test
meson test --print-errorlogs -C /var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha --num-processes 1 --no-suite fails-valgrind
ninja: Entering directory `/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha'
ninja: no work to do.
umockdev / umockdev time out (After 150 seconds)
1/4 umockdev / umockdev     TIMEOUT        150.10s   killed by signal 15 SIGTERM
>>> GI_TYPELIB_PATH=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha MALLOC_CHECK_=3 MALLOC_PERTURB_=218 G_DEBUG=fatal-warnings,fatal-criticals,gc-friendly LD_LIBRARY_PATH=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha PATH=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha:/var/tmp/portage/dev-util/umockdev-0.17.13/temp/python3.11/bin:/usr/lib/portage/python3.10/ebuild-helpers/xattr:/usr/lib/portage/python3.10/ebuild-helpers:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin TOP_SRCDIR=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13 /var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13/src/umockdev-wrapper /var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha/test-umockdev

umockdev / ioctl-tree time out (After 150 seconds)
2/4 umockdev / ioctl-tree   TIMEOUT        150.07s   killed by signal 15 SIGTERM
>>> GI_TYPELIB_PATH=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha MALLOC_CHECK_=3 G_DEBUG=fatal-warnings,fatal-criticals,gc-friendly MALLOC_PERTURB_=47 LD_LIBRARY_PATH=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha PATH=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha:/var/tmp/portage/dev-util/umockdev-0.17.13/temp/python3.11/bin:/usr/lib/portage/python3.10/ebuild-helpers/xattr:/usr/lib/portage/python3.10/ebuild-helpers:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin TOP_SRCDIR=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13 /var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13/src/umockdev-wrapper /var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha/test-ioctl-tree

umockdev / umockdev-run time out (After 150 seconds)
3/4 umockdev / umockdev-run TIMEOUT        150.06s   killed by signal 15 SIGTERM
>>> GI_TYPELIB_PATH=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha MALLOC_CHECK_=3 G_DEBUG=fatal-warnings,fatal-criticals,gc-friendly MALLOC_PERTURB_=130 LD_LIBRARY_PATH=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha PATH=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha:/var/tmp/portage/dev-util/umockdev-0.17.13/temp/python3.11/bin:/usr/lib/portage/python3.10/ebuild-helpers/xattr:/usr/lib/portage/python3.10/ebuild-helpers:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin TOP_SRCDIR=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13 /var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13/src/umockdev-wrapper /var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha/test-umockdev-run

umockdev / static-code time out (After 150 seconds)
4/4 umockdev / static-code  TIMEOUT        150.05s   killed by signal 15 SIGTERM
>>> GI_TYPELIB_PATH=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha MALLOC_CHECK_=3 G_DEBUG=fatal-warnings,fatal-criticals,gc-friendly MALLOC_PERTURB_=131 LD_LIBRARY_PATH=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha PATH=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13-.alpha:/var/tmp/portage/dev-util/umockdev-0.17.13/temp/python3.11/bin:/usr/lib/portage/python3.10/ebuild-helpers/xattr:/usr/lib/portage/python3.10/ebuild-helpers:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin TOP_SRCDIR=/var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13 /var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13/src/umockdev-wrapper /var/tmp/portage/dev-util/umockdev-0.17.13/work/umockdev-0.17.13/tests/test-static-code


Summary of Failures:

1/4 umockdev / umockdev     TIMEOUT        150.10s   killed by signal 15 SIGTERM
2/4 umockdev / ioctl-tree   TIMEOUT        150.07s   killed by signal 15 SIGTERM
3/4 umockdev / umockdev-run TIMEOUT        150.06s   killed by signal 15 SIGTERM
4/4 umockdev / static-code  TIMEOUT        150.05s   killed by signal 15 SIGTERM

Ok:                 0
Expected Fail:      0
Fail:               0
Unexpected Pass:    0
Skipped:            0
Timeout:            4

Downstream Gentoo bug: https://bugs.gentoo.org/805083

Debian: https://buildd.debian.org/status/fetch.php?pkg=umockdev&arch=alpha&ver=0.17.13-1&stamp=1663045653&raw=1
(build history at https://buildd.debian.org/status/logs.php?pkg=umockdev&arch=alpha )

I have real hardware available that I can provide access to if it would be helpful for debugging. Thanks!

@martinpitt
Copy link
Owner

SSH access would be useful indeed, as there are no Debian porter boxes for alpha. my SSH key

As a first diagnosis, can you please run a single test in verbose mode, to see how far that gets? (in the build tree):

LD_LIBRARY_PATH=. LD_PRELOAD=libumockdev-preload.so ./test-umockdev

Does this simplest possible test case work?

LD_LIBRARY_PATH=. LD_PRELOAD=libumockdev-preload.so ./test-umockdev -p /umockdev-testbed/empty

@matoro
Copy link
Author

matoro commented Oct 14, 2022

SSH access would be useful indeed, as there are no Debian porter boxes for alpha. my SSH key

As a first diagnosis, can you please run a single test in verbose mode, to see how far that gets? (in the build tree):

LD_LIBRARY_PATH=. LD_PRELOAD=libumockdev-preload.so ./test-umockdev

Does this simplest possible test case work?

LD_LIBRARY_PATH=. LD_PRELOAD=libumockdev-preload.so ./test-umockdev -p /umockdev-testbed/empty

Nope, just tried - both commands just hang with no output. They don't respond to SIGINT either, have to kill them.

I've added you as a user - should be able to log in to martinpitt [at] alphadev [dot] matoro [dot] tk (IPv6 only). Let me know if you're able to take a look - thank you so much!!

@martinpitt
Copy link
Owner

@matoro Thanks! I can log in, but umockdev's build dependencies are missing. I started a tmux session if you want to attach/have a look. First error:

meson.build:1:0: ERROR: Could not execute Vala compiler: valac

Could you install these? Or is there some kind of build chroot on that box which I can access without root?

@matoro
Copy link
Author

matoro commented Oct 20, 2022

@matoro Thanks! I can log in, but umockdev's build dependencies are missing. I started a tmux session if you want to attach/have a look. First error:

meson.build:1:0: ERROR: Could not execute Vala compiler: valac

Could you install these? Or is there some kind of build chroot on that box which I can access without root?

Oops, sorry - gobject-introspection is installed now. Vala was actually already installed, but Gentoo does not symlink the plain valac command, so you have to set VALAC=valac-0.56, see our ebuild: https://github.com/gentoo/gentoo/blob/master/dev-util/umockdev/umockdev-9999.ebuild

@martinpitt
Copy link
Owner

martinpitt commented Oct 21, 2022

Initial notes:

LD_LIBRARY_PATH=. LD_PRELOAD=libumockdev-preload.so ./test-umockdev -p /umockdev-testbed/empty

This locks up hard, not even Control-C or Control-\ help. Needs to be SIGKILLed. This makes gdb unusable as well, and strace isn't installed. @matoro, installing strace would be much appreciated!

UMOCKDEV_DEBUG=all LD_PRELOAD=libumockdev-preload.so ls

that works fine.

UMOCKDEV_DEBUG=all LD_LIBRARY_PATH=. ./umockdev-run ls

that segfaults immediately. This once again means that I can't run gdb inside of umockdev-run, and running it outside is rather pointless as the only backtrace that it gives me is for the Process.raise (Process.term_sig (child_status)) (i.e. passing on ls'es signal).

Same conundrum with a core dump -- dumping "outside" isn't useful, and dumping "inside" with

UMOCKDEV_DEBUG=all LD_LIBRARY_PATH=. ./umockdev-run -- bash -c 'ulimit -c unlimited; ls' 

crashes before bash loads properly. So it seems this needs some fine-grained printf debugging..

The closest that I can get to is the _access("/tmp/umockdev.T0BOU1/disabled", F_OK) call in the preload's path_exists(). No idea why this would hang. Replacing it with

struct statx st;
res = _statx(AT_FDCWD, path, 0, STATX_MODE, &st);

doesn't help either. Disabling the whole path_exists() check also doesn't help; this smells like a memory corruption that happened before already.

@martinpitt
Copy link
Owner

martinpitt commented Oct 21, 2022

@matoro , can you please install strace and valgrind? This is a really hairy one, I'm afraid..

@matoro
Copy link
Author

matoro commented Oct 21, 2022

@matoro , can you please install strace and valgrind? This is a really hairly one, I'm afraid..

Normally I have strace installed, 5.19 is just blocked by strace/strace#220 at the moment. But I just made a local repo and installed 5.18 from it for now.

Valgrind unfortunately requires platform-specific code, so there is of course no support for most of the more obscure platforms: https://valgrind.org/info/platforms.html

Let me know if there is anything else that would help the investigation. Thank you so much for looking at this!

@martinpitt
Copy link
Owner

Ah right, no valgrind.. Perhaps there's something simpler like electric-fence. I ran with strace, but this is inconclusive. With printf debugging and

pkill -9 -e chatter; rm -rf /tmp/umockdev.*; meson compile && G_MESSAGES_DEBUG=all UMOCKDEV_DEBUG=all LD_LIBRARY_PATH=. ./umockdev-run ./chatter /dev/null

it keeps hanging when calling the real glibc functions like access() and open(), and I don't know how to sensibly debug that. OTOH, when calling ls instead of chatter, it crashes much earlier. It's likely that there's some memory corruption somewhere, but I don't know how to figure this out. Valgrind on x86 and other architectures does not find anything.

If someone has another idea how to debug this, I'm all ears -- but there's only so much time I can throw at this, sorry.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants