Force CgroupsV1 on Ubuntu #146

Merged: 1 commit merged into containers:main on Jul 21, 2022
Conversation

@edsantiago (Member) commented Jul 6, 2022

PR #115 removed a force-cgroups-v2 setup for Ubuntu, possibly
assuming that Ubuntu uses cgroups v1 by default? That doesn't
seem to be the case: the Ubuntu images I've looked at (via Cirrus
rerun-with-terminal) default to v2. The end result is that we've
been running CI for months without testing runc. This PR forces
cgroups v1 on Ubuntu via GRUB boot args.
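
For reference, forcing cgroups v1 on a systemd-based distro comes down to a single kernel argument. A minimal sketch of the GRUB approach (standard systemd/GRUB mechanics; the PR's actual diff may differ):

    # Tell systemd not to mount the unified (v2) hierarchy, then
    # regenerate grub.cfg; takes effect on the next boot.
    sed -i 's/^GRUB_CMDLINE_LINUX="/GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=0 /' \
        /etc/default/grub
    update-grub

    # After reboot, verify which hierarchy is mounted:
    stat -fc %T /sys/fs/cgroup   # "tmpfs" => v1, "cgroup2fs" => v2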

As of 2022-07-20 the version of criu in Ubuntu is broken, which
requires us to install from OBS (the openSUSE Build Service).
There was some OBS-installing code present, but it didn't
lend itself to reuse, so I refactored it and added a
temporary use-criu-from-obs line with a timestamped FIXME.
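
For context, installing criu from OBS on Ubuntu typically looks like the sketch below. The repository URL follows criu's usual OBS layout and is an assumption here; the refactored code in this PR may differ:

    # Add the criu OBS repo and install criu from it instead of the
    # broken Ubuntu archive package. Adjust the distro suffix
    # (xUbuntu_22.04) to match the image being built.
    REPO="https://download.opensuse.org/repositories/devel:/tools:/criu/xUbuntu_22.04"
    curl -fsSL "$REPO/Release.key" | gpg --dearmor > /usr/share/keyrings/criu-obs.gpg
    echo "deb [signed-by=/usr/share/keyrings/criu-obs.gpg] $REPO/ /" \
        > /etc/apt/sources.list.d/criu-obs.list
    apt-get update && apt-get install -y criu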

Signed-off-by: Ed Santiago <[email protected]>

@edsantiago (Member, Author)

Exactly the same problem as last week, but with a different version: that one griped about 3.3, this one gripes about 3.4. I will re-run in about 4 hours; that should fix it.

    ubuntu: The following packages have unmet dependencies:
    ubuntu:  libsystemd-dev : Depends: libsystemd0 (= 249.11-0ubuntu3) but 249.11-0ubuntu3.4 is to be installed
    ubuntu:  libudev-dev : Depends: libudev1 (= 249.11-0ubuntu3) but 249.11-0ubuntu3.4 is to be installed
    ubuntu: E: Unable to correct problems, you have held broken packages.
    ubuntu:     exit(100)
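
This is presumably the classic mirror-skew failure: the archive has published libsystemd0 249.11-0ubuntu3.4, but the matching -dev packages haven't propagated yet, so the strict '=' dependency can't be satisfied until the mirrors converge. A generic way to see the skew (illustrative apt commands, not from this PR):

    # Candidate versions for the runtime vs. -dev packages; a mismatch
    # here means the mirror is mid-sync and a later retry will pass.
    apt-cache policy libsystemd0 libsystemd-dev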

@github-actions (bot) commented Jul 7, 2022

Cirrus CI build successful. Image ID c5075989926510592 ready for use.

@edsantiago (Member, Author)

Looks like criu is broken in f35 and Ubuntu.

@edsantiago (Member, Author)

criu failure is actually b0rkage in glibc. Being tracked here for now: checkpoint-restore/criu#1935

@cevich (Member) left a comment

LGTM. Thanks @edsantiago for taking this on. Note: the AWS images don't show up yet in the 'new image ID' comment posted by the github-actions bot; you have to go into the Cirrus task for each AWS cache-image manually and pull out the "ami-" ID. I've got a Jira card to fix this in the pipeline.

@cevich (Member) commented Jul 11, 2022

For example, for the fedora-aws cache image, you can look at the manifest.json artifact to see the AMI ID.
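
If the build uses Packer's manifest post-processor (an assumption; this repo's tooling may differ), the AMI ID can be extracted mechanically:

    # Packer records artifact_id as "<region>:ami-..."; strip the
    # region prefix to get the bare AMI ID.
    jq -r '.builds[-1].artifact_id' manifest.json | cut -d: -f2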

@edsantiago (Member, Author)

/hold

@cevich thanks, but this cannot merge: criu is totally broken, and I don't know when it'll be fixed.

@github-actions (bot)

Cirrus CI build successful. Image ID c5005250640740352 ready for use.

@github-actions (bot)

Cirrus CI build successful. Image ID c5316115306905600 ready for use.

@github-actions (bot)

Cirrus CI build successful. Image ID c4996377506742272 ready for use.

@edsantiago force-pushed the ubuntu_cgroups_v1 branch 6 times, most recently from a68c90d to acf8000, on July 20, 2022
@cevich (Member) left a comment

LGTM, just one small question/change request.

Review thread on cache_images/ubuntu_packaging.sh (resolved)
@cevich (Member) commented Jul 20, 2022

Reminder: rebase this to pick up the golang 1.18 change. Also, I'm going to add labels to block the prior-fedora builds, since the team decided to suspend testing there for now (due to golang 1.18 unavailability).

@cevich added the labels no_prior-fedora ("Don't build any prior-fedora images") and no_prior-fedora_podman ("Don't build the prior-fedora_podman image") on Jul 20, 2022
@cevich (Member) commented Jul 20, 2022

Built image ID: c6706201604915200 (bot is broken ATM)

For podman-machine, the x86_64 AMI ID is: ami-0829a020372a04284

@edsantiago (Member, Author)

Well, the resulting images are manifesting all sorts of crises, but all of them seem to be our fault, not the images'.

@cevich this is ready for review at your convenience. I've confirmed that the resulting images use runc on Ubuntu, and that criu works.
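
For illustration, one quick way to confirm the resolved runtime on a built VM, using a standard podman query (not necessarily the exact check used here):

    # Should print "runc" on the Ubuntu images built by this PR:
    podman info --format '{{.Host.OCIRuntime.Name}}'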

@rhatdan (Member) commented Jul 21, 2022

LGTM

@cevich (Member) commented Jul 21, 2022

> all of them seem to be our fault, not the images.

This is fairly typical, and can be the seed of months-long podman PRs 😞 As Lokesh found, they can be quite overwhelming. I recommend focusing on one problem at a time and leaning on the team extensively for help. I'll take a look as well...

Snippet under review (cache_images/ubuntu_packaging.sh):

    # >>> PLEASE REMOVE THIS ONCE CRIU GETS FIXED IN REGULAR UBUNTU!
    # >>> (No, I -- Ed -- have no idea how to even check that, sorry).
    # Context: https://github.com/containers/podman/pull/14972
    # Context: https://github.com/checkpoint-restore/criu/issues/1935
@cevich (Member)
This is fine, thanks for the comment and links.

@cevich merged commit 4f34a04 into containers:main on Jul 21, 2022
@cevich added a commit to cevich/buildah that referenced this pull request on Jul 21, 2022
@edsantiago deleted the ubuntu_cgroups_v1 branch on July 21, 2022 at 19:41
@edsantiago added a commit to edsantiago/libpod that referenced this pull request on Jul 22, 2022:
...and enable the at-test-time confirmation, the one that
double-checks that if CI requests runc we actually use runc.
This exposed a nasty surprise in our setup: there are steps to
define $OCI_RUNTIME, but that's actually a total fakeout!
OCI_RUNTIME is used only in e2e tests; it has no effect
whatsoever on actual podman itself as invoked via the command
line, such as in system tests. Solution: use containers.conf.
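
For reference, pinning the runtime through containers.conf looks roughly like this (a minimal sketch; the actual commit may write the setting differently):

    # containers.conf is read by the podman binary itself, so unlike
    # $OCI_RUNTIME it also governs plain command-line invocations
    # (i.e., the system tests). Append an [engine] runtime setting:
    printf '[engine]\nruntime = "runc"\n' >> /etc/containers/containers.conf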

Given how fragile all this runtime stuff is, I've also added
new tests (e2e and system) that will check $CI_DESIRED_RUNTIME.

Image source: containers/automation_images#146

Since we haven't actually been testing with runc, we need
to fix a few tests:

  - handle an error-message change (make it work in both crun and runc)
  - skip one system test, "survive service stop", that doesn't
    work with runc and that I don't think we care about.

...and skip a bunch, filing issues for each:

  - containers#15013 pod create --share-parent
  - containers#15014 timeout in dd
  - containers#15015 checkpoint tests time out under $CONTAINER
  - containers#15017 networking timeout with registry
  - containers#15018 restore --pod gripes about missing --pod
  - containers#15025 run --uidmap broken
  - containers#15027 pod inspect cgrouppath broken
  - ...and a bunch more ("podman pause") that probably don't
    even merit filing an issue.

Also, use /dev/urandom in one test (was: /dev/random) because
the test is timing out and /dev/urandom does not block. (But
the test is still timing out anyway, even with this change.)

Also, as part of the VM switch we are now using go 1.18 (up
from 1.17), and this broke the gitlab tests. Thanks to @Luap99
for a quick fix.

Also, slight tweak to containers#15021: include the timeout value, and
reword message so command string is at end.

Also, fixed a misspelling in a test name.

Fixes: containers#14833

Signed-off-by: Ed Santiago <[email protected]>