podman upgrade tests #8749

Merged 1 commit on Feb 26, 2021
33 changes: 33 additions & 0 deletions .cirrus.yml
@@ -598,6 +598,38 @@ rootless_system_test_task:
    main_script: *main
    always: *logs_artifacts

# FIXME: we may want to consider running this from nightly cron instead of CI.
# The tests are actually pretty quick (less than a minute) but they do rely
# on pulling images from quay.io, which means we're subject to network flakes.
#
# FIXME: how does this env matrix work, anyway? Does it spin up multiple VMs?
# We might just want to encode the version matrix in runner.sh instead
upgrade_test_task:
    name: "Upgrade test: from $PODMAN_UPGRADE_FROM"
    alias: upgrade_test
    skip: *tags
    only_if: *not_docs
    depends_on:
        - local_system_test
    matrix:
        - env:
              PODMAN_UPGRADE_FROM: v1.9.0
        - env:
              PODMAN_UPGRADE_FROM: v2.0.6
        - env:
              PODMAN_UPGRADE_FROM: v2.1.1
    gce_instance: *standardvm
    env:
        TEST_FLAVOR: upgrade_test
        DISTRO_NV: ${FEDORA_NAME}
        VM_IMAGE_NAME: ${FEDORA_CACHE_IMAGE_NAME}
        # ID for re-use of build output
        _BUILD_CACHE_HANDLE: ${FEDORA_NAME}-build-${CIRRUS_BUILD_ID}
    clone_script: *noop
    gopath_cache: *ro_gopath_cache
    setup_script: *setup
    main_script: *main
    always: *logs_artifacts

# This task is critical. It updates the "last-used by" timestamp stored
# in metadata for all VM images. This mechanism functions in tandem with
@@ -654,6 +686,7 @@ success_task:
        - local_system_test
        - remote_system_test
        - rootless_system_test
        - upgrade_test
        - meta
    container: *smallcontainer
    env:
4 changes: 4 additions & 0 deletions contrib/cirrus/runner.sh
@@ -70,6 +70,10 @@ function _run_sys() {
    dotest system
}

function _run_upgrade_test() {
    bats test/upgrade |& logformatter
}

function _run_bindings() {
    # shellcheck disable=SC2155
    export PATH=$PATH:$GOSRC/hack
1 change: 1 addition & 0 deletions contrib/cirrus/setup_environment.sh
@@ -200,6 +200,7 @@ case "$TEST_FLAVOR" in
    compose) ;&
    int) ;&
    sys) ;&
    upgrade_test) ;&
    bindings) ;&
    endpoint)
        # Use existing host bits when testing is to happen inside a container
2 changes: 1 addition & 1 deletion test/system/helpers.bash
@@ -149,7 +149,7 @@ function run_podman() {
     echo "$_LOG_PROMPT $PODMAN $*"
     # BATS hangs if a subprocess remains and keeps FD 3 open; this happens
     # if podman crashes unexpectedly without cleaning up subprocesses.
-    run timeout --foreground -v --kill=10 $PODMAN_TIMEOUT $PODMAN "$@" 3>/dev/null
+    run timeout --foreground -v --kill=10 $PODMAN_TIMEOUT $PODMAN $_PODMAN_TEST_OPTS "$@" 3>/dev/null
     # without "quotes", multiple lines are glommed together into one
     if [ -n "$output" ]; then
         echo "$output"
87 changes: 87 additions & 0 deletions test/upgrade/README.md
@@ -0,0 +1,87 @@
Background
==========

For years we have needed a way to test podman upgrades. The need
became much more critical on December 7, 2020, when Matt disclosed
a bug he had found over the weekend
([#8613](https://github.com/containers/podman/issues/8613))
in which reuse of a previously-defined field name results
in fatal JSON decode failures when current-podman tries
to read containers created with podman <= 1.8 (FIXME: confirm).

Upgrade testing is a daunting problem, but in the December 12
Cabal meeting Dan suggested using podman-in-podman. This PR
is the result of fleshing out that idea.

Overview
========

The BATS script in this directory fetches and runs an old-podman
container image from quay.io/podman, uses it to create and run
a number of containers, then uses new-podman to interact with
those containers.

As of 2021-02-23 the available old-podman versions are:

```console
$ ./bin/podman search --list-tags quay.io/podman/stable | awk '$2 ~ /^v/ { print $2}' | sort | column -c 75
v1.4.2 v1.5.0 v1.6 v1.9.0 v2.0.2 v2.1.1
v1.4.4 v1.5.1 v1.6.2 v1.9.1 v2.0.6 v2.2.1
```

Test invocation is:
```console
$ sudo env PODMAN=bin/podman PODMAN_UPGRADE_FROM=v1.9.0 PODMAN_UPGRADE_TEST_DEBUG= bats test/upgrade
```
(The path assumes you have cd'ed to the top-level podman repo.) `PODMAN_UPGRADE_FROM`
can be any of the versions above. `PODMAN_UPGRADE_TEST_DEBUG` is empty
here, but is listed so that you can set it to `1`, which leaves the
`podman_parent` container running. Interacting with this container is
left as an exercise for the reader.

The script will pull the given podman image, invoke it with a scratch
root directory, and have it perform a small set of podman operations
(pull an image, create/run some containers). This old-podman process
stays running because, if it exited, it would kill the containers
running inside the old-podman container.

We then invoke the current (host-installed) podman, using the same
scratch root directory, and perform operations on those images and
containers. Most of those operations are done in individual @tests.
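
One way to picture the hand-off to new-podman is as a set of global options that point at the scratch directories. The sketch below is illustrative only: the directory layout under the scratch root and the helper names `podman_test_opts` and `count_args` are assumptions, not code from this PR. `--root`, `--runroot` and `--tmpdir` are existing podman global options. The sketch also shows why an options variable such as `_PODMAN_TEST_OPTS` has to be expanded unquoted.

```shell
# Hypothetical sketch: point new-podman at the scratch directories that
# old-podman populated. Directory names and helpers are assumptions.

# Build podman global options for a given scratch directory.
podman_test_opts() {
    printf '%s\n' "--root=$1/root --runroot=$1/runroot --tmpdir=$1/tmp"
}

# Print the number of arguments received, to show how options arrive.
count_args() {
    echo "$#"
}

_PODMAN_TEST_OPTS="$(podman_test_opts /tmp/upgrade-scratch)"

# Unquoted: the string is split into three separate options,
# so the command sees them plus "ps" and "-a".
count_args $_PODMAN_TEST_OPTS ps -a      # prints 5

# Quoted: the whole string arrives as one (invalid) argument.
count_args "$_PODMAN_TEST_OPTS" ps -a    # prints 3
```

Because this relies on word splitting, it only works while the scratch path contains no whitespace or glob characters.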

The goal is to have this upgrade test run in CI, iterating over a
loop of known old versions. This list would need to be hand-maintained
and updated on new releases. There might also need to be extra
configuration defined, such as per-version commands (see below).
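
A minimal sketch of such a loop, assuming the version list mirrors the CI matrix in this PR (v1.9.0, v2.0.6, v2.1.1). The variable `UPGRADE_FROM_VERSIONS`, the `DRY_RUN` switch and both function names are hypothetical; the command line is the one shown under "Test invocation" above.

```shell
# Hypothetical sketch, not code from this PR: run the upgrade test once
# per entry in a hand-maintained list of old podman versions.

# Assumption: same versions as the .cirrus.yml matrix in this PR.
UPGRADE_FROM_VERSIONS="v1.9.0 v2.0.6 v2.1.1"

# Print the command line used for one version.
upgrade_test_command() {
    printf 'sudo env PODMAN=bin/podman PODMAN_UPGRADE_FROM=%s bats test/upgrade\n' "$1"
}

# Run the test for every version; with DRY_RUN set, only print the commands.
# Keeps going after a failure and returns nonzero if any version failed.
run_all_upgrade_tests() {
    local version rc=0
    # Intentionally unquoted: the list is split on whitespace.
    for version in $UPGRADE_FROM_VERSIONS; do
        if [ -n "${DRY_RUN:-}" ]; then
            upgrade_test_command "$version"
        else
            sudo env PODMAN=bin/podman PODMAN_UPGRADE_FROM="$version" \
                bats test/upgrade || rc=1
        fi
    done
    return "$rc"
}
```

Running `DRY_RUN=1 run_all_upgrade_tests` prints the three command lines without executing them, which is a cheap way to check the list after a new release is added.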

Findings
========

First, `v1.6.2` won't work on default f32/f33, which boot into
cgroups v2: the image does not include `crun`, so it can't work at all:

    ERRO[0000] oci runtime "runc" does not support CGroups V2: use system migrate to mitigate

I realize that it's kind of stupid not to test 1.6, since that's
precisely the test that would've caught #8613 early, but I just
don't think it's worth the hassle of setting up cgroupsv1 VMs.
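
If a version like `v1.6.2` were ever added to the list, the test could skip it on cgroups v2 hosts instead of failing. The following is a sketch under assumptions: the function names are made up, only `v1.6.2` is listed because it is the only version reported as failing here, and the check relies on GNU `stat` reporting `cgroup2fs` for `/sys/fs/cgroup` on a cgroups v2 host.

```shell
# Hypothetical sketch: decide whether an old version can run on this host.

# Map the filesystem type of /sys/fs/cgroup to a cgroups version.
# "cgroup2fs" means the unified (v2) hierarchy; anything else,
# including an empty string if stat fails, is treated as v1.
cgroup_version_from_fstype() {
    case "$1" in
        cgroup2fs) echo 2 ;;
        *)         echo 1 ;;
    esac
}

# Query the running host (GNU stat syntax).
host_cgroup_version() {
    cgroup_version_from_fstype "$(stat -f -c %T /sys/fs/cgroup 2>/dev/null)"
}

# Return 0 if the given old-podman version needs a cgroups v1 host.
needs_cgroups_v1() {
    case "$1" in
        v1.6.2) return 0 ;;
        *)      return 1 ;;
    esac
}

# Return 0 if the upgrade test for this version should be skipped here.
should_skip_version() {
    needs_cgroups_v1 "$1" && [ "$(host_cgroup_version)" = "2" ]
}
```

A test could call `should_skip_version "$PODMAN_UPGRADE_FROM"` early and skip rather than report a failure.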

For posterity, in an earlier incarnation of this script I tried
booting f32 into cgroupsv1 and ran into the following warnings
when running new-podman on old-containers:
```
ERRO[0000] error joining network namespace for container 322b66d94640e31b2e6921565445cf0dade4ec13cabc16ee5f29292bdc038341: error retrieving network namespace at /var/run/netns/cni-577e2289-2c05-2e28-3c3d-002a5596e7da: failed to Statfs "/var/run/netns/cni-577e2289
```

Where To Go From Here
=====================

* Tests are still (2021-02-23) incomplete, with several failing outright.
See FIXMEs in the code.

* Figuring out how/if to run rootless. I think this is possible, perhaps
even necessary, but will be tricky to get right because of home-directory
mounting.

* Figuring out how/if to run variations with different config files
(e.g. running OLD-PODMAN that creates a user libpod.conf, tweaking
that in the test, then running NEW-PODMAN upgrade tests)
11 changes: 11 additions & 0 deletions test/upgrade/helpers.bash
@@ -0,0 +1,11 @@
# -*- bash -*-

load "../system/helpers"

setup() {
    :
}

teardown() {
    :
}