Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] tests/int: run bats tests in parallel #2759

Closed
wants to merge 16 commits into from

Conversation

kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Jan 21, 2021

Currently this includes and is based on both #2741 and #2757.

Won't work: see #2759 (comment)

All the tests are run with cd to bundle directory, so
it does not make ense to explicitly specify it.

This relies on the feature of functions like set_cgroups_path
or update_config to use current directory by default.

In those cases where we need to change the directory
(runc create/run -b), save the current one and use it later.

Signed-off-by: Kir Kolyshkin <[email protected]>
Since commit abd0515 it is no longed used.

Signed-off-by: Kir Kolyshkin <[email protected]>
It is only used in a couple of places and is easy to replace
with ROOT=xxx teardown_running_container.

Signed-off-by: Kir Kolyshkin <[email protected]>
All tests are run with cd to bundle directory, so
let's just use $(pwd).

Signed-off-by: Kir Kolyshkin <[email protected]>
... by using retry, to avoid code duplication.

While at it, make it check the number of arguments and error out on
invalid usage.

Signed-off-by: Kir Kolyshkin <[email protected]>
... instead of more generic "retry", to simplify code.

Signed-off-by: Kir Kolyshkin <[email protected]>
Overloading $HELLO_BUNDLE with a second purpose of having an alternative
runc root is confusing.

Instead, let's create a temp directory in setup() and clean it up in
teardown().

Signed-off-by: Kir Kolyshkin <[email protected]>
It's a little known fact, but BATS_TMPDIR is just /tmp, and it is not
being cleaned by bats after the test.

BATS_RUN_TMPDIR (available since bats 1.2.1) is $BATS_TMPDIR/bats-run-$$
(i.e. it is per bats run and includes PID), and it is cleaned up after
the test (unless --no-tmpdir-cleanup is given).

Use BATS_RUN_TMPDIR for better isolation and cleanup.

Signed-off-by: Kir Kolyshkin <[email protected]>
1. Do not use which as it's not available in some installs.
   Change it to "command -v" which is a shell built-in.

2. Do not redirect stderr to /dev/null (there's no need).

Signed-off-by: Kir Kolyshkin <[email protected]>
Use the value of $(pwd) as we cd to bundle dir before running a test.

Signed-off-by: Kir Kolyshkin <[email protected]>
1. Do not depend on $DEBIAN_BUNDLE, instead use the current directory.

2. Simplify setup() by calling teardown().

3. In teardown(), check that LIBPATH is set.

4. Move global variables into setup.

Signed-off-by: Kir Kolyshkin <[email protected]>
It is not used anywhere since commit 30601ef.

Signed-off-by: Kir Kolyshkin <[email protected]>
This simplifies and optimizes getting container images used for tests.

Currently, we have three different ways of getting images:

1. (for hello-world) the image is in this repo under tests/integration/testdata.

2. (for busybox) download it from github (the repo that is used for
   preparing official Docker image) using curl.

3. (for debian) download from Docker hub, using skopeo and umoci.

To further complicate things, we have to do this downloading in multiple
scenarios (at least 4): locally, in github CI, from Dockefile, inside a
Vagrant VM. For each scenario, we have to install skopeo and umoci, and
those two are not yet universally available for all the distros that we
use.

Yet another complication is those images are used for tests/integration
(bats-driven tests) as well as for libcontainer/integration (go tests).
The tests in libcontainer/integration rely on busybox being available
from /busybox, and the bats tests just download the images to a
temporary location during every run.

It is also hard to support CI for other architectures, because all
the machinery for preparing images is so complicated.

This commit is an attempt to simplify and optimize getting images,
mostly by getting rid of skopeo and umoci dependencies, but also
by moving the download logic into one small shell script, which
is used from all the places.

Benefits:

 - images (if not present) are only downloaded once;
 - same images are used for both kind of tests (go and bats);
 - same images are used for local and inside-docker tests
   (because source directory is mounted into container);
 - the download logic is located within 1 simple shell script.

[v2: fix eval; more doc to get-images; print URL if curl failed]
[v3: use "slim" debian, twice as small]

Signed-off-by: Kir Kolyshkin <[email protected]>
Currently, set -e ("exit on error") is only set before running tests.
The bad consequence is the script may fail to set up test environment
but will still run the tests.

Use "set -e -u -o pipefail" for the whole script to catch potential
issues with test setup and/or the script itself.

A few caveats:

1. We have to set RUNC_USE_SYSTEMD to an empty value if it is unset, so
   that the subsequent checks like [ -z "$RUNC_USE_SYSTEMD" ] won't
   trigger a "unbound variable" error. Same for ROOTLESS_TESTPATH.

2. Functions that have code like [ -f $file ] && do_something towards
   the end may return 1 in case $file does not exist (as test aka [
   was the last operation performed, and its exit code is returned.
   This is why we had to add return 0.

Signed-off-by: Kir Kolyshkin <[email protected]>
@kolyshkin
Copy link
Contributor Author

So, runc requires a terminal (in case terminal: true) is set in config, and bats -j N (where N > 1, i.e. when GNU parallel is used) does not provide one:

$ cat b.bats 
@test 1 {
	stty size
}

@test 2 {
	stty size
}
$ bats b.bats 
 ✓ 1 
 ✓ 2 

2 tests, 0 failures

$ bats -j 2 b.bats 
 ✗ 1
   (in test file b.bats, line 2)
     `stty size' failed
   stty: 'standard input': Inappropriate ioctl for device
 ✗ 2
   (in test file b.bats, line 6)
     `stty size' failed
   stty: 'standard input': Inappropriate ioctl for device

2 tests, 2 failures

Before #2535 runc probably fails if under under bats -j, but after this commit it opens /dev/tty (as long as runc is not disowned, I guess), and sets some options for it.

Now we are doomed, because setting terminal options from a background process leads to SIGTTOU for the whole group. Here's a simple demo without runc:

$ cat c.bats 
@test 1 {
	stty size
}

@test 2 {
	stty size
}

@test 3 {
	stty -F /dev/tty sane
}
$ bats c.bats 
 ✓ 1 
 ✓ 2 
 ✓ 3 

3 tests, 0 failures

$ bats -j 3  c.bats 
# (hangs forever)

In another terminal we see

1195926 pts/5    S+     0:00  |       |   \_ /usr/bin/bash /usr/libexec/bats-core/bats -j 3 c.bats
1195933 pts/5    S+     0:00  |       |       \_ /usr/bin/bash /usr/libexec/bats-core/bats-exec-suite -j 3 -x /home/kir/go/src/github.com/opencontainers/runc/c.bats
1195944 pts/5    S+     0:00  |       |       |   \_ /usr/bin/perl /usr/bin/parallel --keep-order --jobs 3 bats-exec-file -j 3 -x {} /tmp/bats-run-1195926/test_list_file.txt
1195950 pts/5    S      0:00  |       |       |       \_ /usr/bin/bash /usr/libexec/bats-core/bats-exec-file -j 3 -x /home/kir/go/src/github.com/opencontainers/runc/c.bats /tmp/bats-run-1195926/t
1195960 pts/5    S      0:00  |       |       |           \_ /usr/bin/perl /usr/bin/parallel --semaphore -j 3 --fg -- bats-exec-test -x /home/kir/go/src/github.com/opencontainers/runc/c.bats test
1195999 pts/5    T      0:00  |       |       |               \_ /usr/bin/bash /usr/libexec/bats-core/bats-exec-test -x /home/kir/go/src/github.com/opencontainers/runc/c.bats test_3 3
1196007 pts/5    T      0:00  |       |       |                   \_ stty -F /dev/tty sane
1195934 pts/5    S+     0:00  |       |       \_ /usr/bin/bash /usr/libexec/bats-core/bats -j 3 c.bats
1195935 pts/5    S+     0:00  |       |       \_ /usr/bin/bash /usr/libexec/bats-core/bats-format-pretty

So, unless we fake a terminal (using script, ssh -tt, screen, tmux, expect or the like), we can't parallelize runc integration tests using bats -j.

@kolyshkin
Copy link
Contributor Author

Closing for now (see the comment above for reasons why). Might need to address this in bats.

@kolyshkin kolyshkin closed this Feb 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant