Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

podman upgrade tests #8749

Merged
merged 1 commit into from
Feb 26, 2021

Conversation

edsantiago
Copy link
Member

Initial validation of using podman-in-podman to create an
old-podman root, then use new-podman to play with the
containers created therein.

Signed-off-by: Ed Santiago [email protected]

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 16, 2020
@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 16, 2020
@edsantiago
Copy link
Member Author

/hold
/do-not-merge-this-is-just-pre-alpha-not-even-wip

@containers/podman-maintainers PTAL (low priority), this is a followup from our podman upgrade testing conversation last week. I think it shows promise but am not sure how much farther I'll get to it this week.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 16, 2020
@edsantiago
Copy link
Member Author

@mheon if you have a few moments this week could you PTAL and LMK if this approach seems sensible? And if so, should I pursue it, perhaps in the nightly-cron-CI context?

In particular, is new-podman exec container-still-running-from-old-podman something we want to test? ISTR that it worked last month, but right now I see 1.9.0 -> current master fail with:

# ./bin/podman --root=/tmp/podman-upgrade-test.VSYlCFp/root --runroot=/tmp/podman-upgrade-test.VSYlCFp/runroot exec myrunningcontainer date
WARN[0000] Failed to add conmon to systemd sandbox cgroup: Invalid unit name '/libpod_parent'
Error: error opening file `/run/crun/489c36b307e9e096b16782ecfa4a3c1de7ee640a1354cbc01912974be7f3b09f/status`: No such file or directory: OCI not found

@mheon
Copy link
Member

mheon commented Jan 7, 2021

Approach LGTM.

My initial impression is that exec should not be guaranteed to work with older versions, but we should still make a good effort to ensure it continues to function - people will expect long-running containers to continue to work.

@edsantiago
Copy link
Member Author

@mheon I still can't get exec working but there's a more interesting challenge right now:

# host-podman ... stop myrunningcontainer
Error: container a5c36b57351c92fa7ebb165e3dadf9733819f00921d43888e4aea8dd7826721c conmon process missing, cannot retrieve exit code: conmon process killed

Cause: this container was created by containerized-podman, so the conmon pid in the db is of course in that namespace.

Question: is there any way I can fake the output of podman inspect, or replace it, so host-podman will get the host-space conmon PID? (Overwriting $RUNROOT/overlay-containers/sha/userdata/conmon.pid has no effect).

@mheon
Copy link
Member

mheon commented Jan 12, 2021

You'd need a manual database edit, so I don't think this is easily possible. Of course, what we're storing in the DB is JSON, so theoretically you could look for the string "conmonPid": followed by an (ASCII) integer in the DB and replace the following integer?

@edsantiago edsantiago force-pushed the upgrade_test branch 5 times, most recently from 8602bf7 to 07ea8fe Compare January 13, 2021 03:18
@edsantiago
Copy link
Member Author

Yay!

@github-actions
Copy link

A friendly reminder that this PR had no activity for 30 days.

@rhatdan
Copy link
Member

rhatdan commented Feb 22, 2021

@edsantiago What do you want to do with this PR?

@edsantiago
Copy link
Member Author

I've been running it periodically on my end, as a sanity check for 3.0. I'm still not entirely convinced that it'll find anything other than the most egregious incompatibilities, but maybe that's good enough. I'll clean it up for review and find a way to hook it into CI.

@mheon
Copy link
Member

mheon commented Feb 22, 2021

@edsantiago Most egregious sounds correct - I basically just want a guarantee we don't completely break Podman on upgrade. As long as things mostly work, we can handle the rest via bug reports.

Initial validation of using podman-in-podman to create an
old-podman root, then use new-podman to play with the
containers created therein.

Signed-off-by: Ed Santiago <[email protected]>
@edsantiago edsantiago changed the title WIP: podman upgrade tests - proof of concept podman upgrade tests Feb 23, 2021
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 23, 2021
@edsantiago
Copy link
Member Author

This is unexpected (and a little worrying): all tests passed. On my f33 laptop, conmon-2.0.26-1.fc33, the exec test fails:

✗ exec
   (from function `is' in file test/upgrade/../system/helpers.bash, line 406,
    in test file test/upgrade/test-upgrade.bats, line 216)
     `is "$output" "$RANDOM_STRING_1" "exec into myrunningcontainer"' failed
   # bin/podman exec myrunningcontainer cat /var/www/index.txt
   time="2021-02-23T11:08:50-07:00" level=warning msg="Failed to add conmon to systemd sandbox cgroup: Invalid unit name '/libpod_parent'"
   huMJJkeNdvRgLZC
   #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
   #|     FAIL: exec into myrunningcontainer
   #| expected: 'huMJJkeNdvRgLZC'
   #|   actual: 'time="2021-02-23T11:08:50-07:00" level=warning msg="Failed to add conmon to systemd sandbox cgroup: Invalid unit name '/libpod_parent'"'
   #|         > 'huMJJkeNdvRgLZC'
   #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I'm uncomfortable asking for review with a big unknown like that... but, realistically speaking, I'm not going to try to track that down. Nor the other weird skips in the tests (see FIXME comments). So, @containers/podman-maintainers , WDYT? Shall we try to get this merged, and leave the FIXMEs for a future rainy day?

@edsantiago
Copy link
Member Author

CI map showing new step (in light tan, near middle right): cirrus-map-upgrade_test

@mheon
Copy link
Member

mheon commented Feb 23, 2021

This is a warning we used to silently ignore, but the change of log-level has revealed it to the world now. I think we may have changed it back to an Info, but maybe my memory is faulty since you seem to be rebased to latest.

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :)

@rhatdan rhatdan removed the stale-pr label Feb 24, 2021
@rhatdan
Copy link
Member

rhatdan commented Feb 24, 2021

Awesome job @edsantiago
/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 24, 2021
@rhatdan
Copy link
Member

rhatdan commented Feb 24, 2021

@edsantiago do you want to throw the switch on this?

@edsantiago
Copy link
Member Author

@rhatdan I'd like confirmation that I've picked the right previous-versions (in .cirrus.yml) or at least a reasonable set, I'd also LOVE it if someone explained the FIXME failures, but there's only so much I can fantasize about...

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The versions look good to me.

I suggest to create a follow-up GitHub issue to look at the FIXMEs. We'll likely run out of time this week.

👍 to merge from my side, @containers/podman-maintainers PTAL

@rhatdan
Copy link
Member

rhatdan commented Feb 26, 2021

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 26, 2021
@openshift-merge-robot openshift-merge-robot merged commit 397aae3 into containers:master Feb 26, 2021
@edsantiago edsantiago deleted the upgrade_test branch March 1, 2021 13:45
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 23, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants