-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bump VMs, to Ubuntu 2204 with cgroups v1 #14972
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -71,27 +71,20 @@ fi | |
|
||
cd "${GOSRC}/" | ||
|
||
# Defined by lib.sh: Does the host support cgroups v1 or v2 | ||
# Defined by lib.sh: Does the host support cgroups v1 or v2? Use runc or crun | ||
# respectively. | ||
# **IMPORTANT**: $OCI_RUNTIME is a fakeout! It is used only in e2e tests. | ||
# For actual podman, as in system tests, we force runtime in containers.conf | ||
case "$CG_FS_TYPE" in | ||
tmpfs) | ||
if ((CONTAINER==0)); then | ||
warn "Forcing testing with runc instead of crun" | ||
if [[ "$OS_RELEASE_ID" == "ubuntu" ]]; then | ||
# Need b/c using cri-o-runc package from OBS | ||
echo "OCI_RUNTIME=/usr/lib/cri-o-runc/sbin/runc" \ | ||
>> /etc/ci_environment | ||
else | ||
echo "OCI_RUNTIME=runc" >> /etc/ci_environment | ||
fi | ||
echo "OCI_RUNTIME=runc" >> /etc/ci_environment | ||
printf "[engine]\nruntime=\"runc\"\n" >>/etc/containers/containers.conf | ||
fi | ||
;; | ||
cgroup2fs) | ||
if ((CONTAINER==0)); then | ||
# This is necessary since we've built/installed from source, | ||
# which uses runc as the default. | ||
warn "Forcing testing with crun instead of runc" | ||
echo "OCI_RUNTIME=crun" >> /etc/ci_environment | ||
fi | ||
# Nothing to do: podman defaults to crun | ||
;; | ||
*) die_unknown CG_FS_TYPE | ||
esac | ||
|
@@ -368,7 +361,7 @@ case "$TEST_FLAVOR" in | |
slug="gitlab.com/gitlab-org/gitlab-runner" | ||
helper_fqin="registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper:x86_64-latest-pwsh" | ||
ssh="ssh $ROOTLESS_USER@localhost -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o CheckHostIP=no env GOPATH=$GOPATH" | ||
showrun $ssh go get -u github.com/jstemmer/go-junit-report | ||
showrun $ssh go install github.com/jstemmer/go-junit-report/[email protected] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A comment explaining why this is necessary would be helpful. Our setup for running the gitlab unit tests was carefully crafted based on explicit instructions from the gitlab runner maintainers. I don't recall this version-reference being in them. But in general, our task should simply follow the "latest and greatest" upstream. If we deviate and it breaks, it can be time-consuming to engage with them (they have other priorities). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. go get simply does not work with go 1.18, you have to go install. You could add There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Furthermore: I cannot find anywhere in that file where it says "this is carefully crafted from anything". If this is fragile code, it must include comments saying so and linking to exact instructions (e.g. on gitlab.com). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is not 'go installing' for podman, it's doing so for the upstream gitlab tests.
Hrmmm, I thought I had left a comment about that...Yeah, it's this one: but you're right, it's not clear at all how specialized this setup is. I s'pose was unintentionally relying on the sheer volume of setup section steps in that link. I'll post a docs PR to make it more clear. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IMHO, I think Paul's idea is probably okay, so:
This tool is only needed at the very end of testing, and AFAIK only used to check and convert the results into a useful (for upstream developers) format. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @Luap99's idea is to use a strict version, not There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Normally yes, but this is a different context. And there's always the |
||
showrun $ssh git clone https://$slug $GOPATH/src/$slug | ||
showrun $ssh make -C $GOPATH/src/$slug development_setup | ||
showrun $ssh bash -c "'cd $GOPATH/src/$slug && GOPATH=$GOPATH go get .'" | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -524,6 +524,8 @@ subdir**` | |
// See https://github.com/containers/podman/issues/13535 | ||
It("Remote build .containerignore filtering embedded directory (#13535)", func() { | ||
SkipIfNotRemote("Testing remote .containerignore file filtering") | ||
Skip("FIXME: #15014: test times out in 'dd' on f36.") | ||
|
||
podmanTest.RestartRemoteService() | ||
|
||
// Switch to temp dir and restore it afterwards | ||
|
@@ -552,7 +554,7 @@ subdir**` | |
Expect(ioutil.WriteFile(filepath.Join(subdirPath, "extra"), contents.Bytes(), 0644)). | ||
ToNot(HaveOccurred()) | ||
randomFile := filepath.Join(subdirPath, "randomFile") | ||
dd := exec.Command("dd", "if=/dev/random", "of="+randomFile, "bs=1G", "count=1") | ||
dd := exec.Command("dd", "if=/dev/urandom", "of="+randomFile, "bs=1G", "count=1") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can never remember which is the fast pseudo-random one? We should use that. The entropy-pool in these VMs is a problem we've struggled with in different contexts before. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You can safely assume that if I changed it, with a corresponding acknowledgment in the commit message, it is the correct change. But you can also confirm at
I'm guessing that we're hitting this delay, I just don't know what to do about it, so I'm leaving that as SEP (#15014). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In case anybody does some archaeology on this change: GCE uses QEMU/KVM VMs and does not provide a virtual RNG for them (we asked). That means the entropy pool is incredibly limited, and we rely on the (pre-enabled) |
||
ddSession, err := Start(dd, GinkgoWriter, GinkgoWriter) | ||
Expect(err).ToNot(HaveOccurred()) | ||
Eventually(ddSession, "10s", "1s").Should(Exit(0)) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -152,4 +152,19 @@ var _ = Describe("Podman Info", func() { | |
Expect(session.OutputToString()).To(ContainSubstring("memory")) | ||
Expect(session.OutputToString()).To(ContainSubstring("pids")) | ||
}) | ||
|
||
It("Podman info: check desired runtime", func() { | ||
// defined in .cirrus.yml | ||
want := os.Getenv("CI_DESIRED_RUNTIME") | ||
if want == "" { | ||
if os.Getenv("CIRRUS_CI") == "" { | ||
Skip("CI_DESIRED_RUNTIME is not set--this is OK because we're not running under Cirrus") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would it make sense to just define There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Absolutely not. The Makefile has no knowledge of which VMs want crun and which want runc. Only Although these tests add some duplication, the reason is that I never, ever, ever want to go through this nightmare week again. Nor do I want anyone else to. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Relating to Edit: removed comment. I read the code wrong. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
} | ||
Fail("CIRRUS_CI is set, but CI_DESIRED_RUNTIME is not! See #14912") | ||
} | ||
session := podmanTest.Podman([]string{"info", "--format", "{{.Host.OCIRuntime.Name}}"}) | ||
session.WaitWithDefaultTimeout() | ||
Expect(session).To(Exit(0)) | ||
Expect(session.OutputToString()).To(Equal(want)) | ||
}) | ||
}) |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -293,6 +293,9 @@ var _ = Describe("Podman manifest", func() { | |
}) | ||
|
||
It("authenticated push", func() { | ||
if podmanTest.Host.Distribution == "ubuntu" && IsRemote() { | ||
Skip("FIXME: #15017. Registry times out.") | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @lsm5 was fighting with this one. Not sure if he opened an issue or not. IIRC, paul suggested checking the networking and/or firewall. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah. I don't care right now. My only goal is to get this merged. Someone else will have to deal with it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I smell a massive month-long "FIXME" & bug-fix effort in the team's future 😞 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep. Absolutely. And I'm 100000% fine with that: small little chunky fixes over time. The alternative is to hold this PR until every single runc bug is fixed, which may take a month or more, and in the meantime more and more runc-related bugs will sneak in. |
||
registryOptions := &podmanRegistry.Options{ | ||
Image: "docker-archive:" + imageTarPath(REGISTRY_IMAGE), | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@edsantiago I don't remember checking this or not, so maybe a duplicate comment: In
lib.sh
there is a list of "pass-through" env. var. names. Please make sure this one is on that list to ensure the variable and value are passed into (for example) rootless and podman-in-podman CI scripts.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a really good point, thanks. It looks like the test passed in int-rootless, but not in containerized, so I have work to do. I've also added a triple-check so the test will fail if
$CIRRUS_CI
is defined but$CI_DESIRED_RUNTIME
is not.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof, one thing to watch out for here is: I don't think we set
CIRRUS_CI=true
orCI=true
inhack/get_ci_env.sh
. However that script automatically runssetup_environment.sh
for developers before handing them a shell. So checks against those vars could cause problems.BTW, have you tried
hack/get_ci_vm.sh
recently? For a number of months, it has been updated to not requiresudo
, only runs rootless, and only touches files under your~/.config/gcloud
directory. All things you took issue with in the past. Anyway...in case it's useful for you. It's certainly faster for debugging CI changes than waiting for CIrrus to run them.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did try
get_ci_vm.sh
2-3 weeks ago, rootless (thank you!), but it failed with a long complicated error that I couldn't figure out. I tried a reset(?) arg, still failed. Gave up.It's 100% OK if
$CIRRUS_DESIRED_RUNTIME
is set without$CIRRUS_CI
. What is not OK is the converse. And I think (fingers crossed) that this iteration, the one running in CI just now, handles all the corner cases. We'll see in two hours.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That may have been the SNAFU w/ AWS. Should be fixed now.
You can try completely removing
~/.config/gcloud
worst-case. Everythnig in there is generated automatically. All you'll have to do is re-init (do the browser URL copy-paste thing, select region, etc.).