test/e2e: check for stderr errors in cleanup() #18442
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: Luap99
@edsantiago WDYT? I don't expect this to work without further tweaks, but I think this is important to catch more bugs and flakes. |
I really like it! I think it might be too soon to put this in production (I've tried before and it has been impossible), but maybe once unlinkat/EBUSY gets fixed we can push this, thank you! |
This actually looks doable. Between 10 and 20 failures depending on which test you look at, and yes, several flakes as well. |
Yow. Everything looks new. Since the hard failures seem consistent, I'm treating those as SEP, and am concentrating on looking at flakes only. The |
It may be back but this is a completely different failure,
Filed #18460, trivial to reproduce. Still going through the logs to look for other hard failures. |
Much better! Only about ten flakes, all of them the |
A friendly reminder that this PR had no activity for 30 days. |
Ok, all last outstanding errors are like this:
https://api.cirrus-ci.com/v1/artifact/task/6373332413579264/html/int-podman-debian-12-root-host-boltdb.log.html Only debian but with root and rootless so could be something runc or cgroupv1 specific? |
Looks like #11784 which was closed ENOTHINGWECANDOABOUTIT |
Thanks, except this one is not a flake. It is actually trivial to reproduce when using runc
|
In fact you do not even need two containers just using
|
I tested with runc from the main branch and it no longer seems to emit these warnings. So I guess the answer is to wait a bit longer until this is released and packaged in Debian. |
There is no reliable way to terminate the container since cgroups are not used. We shouldn't allow it as well as joining another PID namespace |
Note I ran the reproducers locally on cgroupv2 so it shouldn't matter for this specific warning I assume. |
With few exceptions, commands that exit 0 should not emit any messages with level=warning or =error. Let's start enforcing that in run_podman.

Allow one-off exceptions, typically when we're testing an actual warning condition (usual case: "podman stop" where it times out to SIGKILL). Exceptions are specified via:

    run_podman 0+w subcommand...
               ^^^---- or, rarely, 0+e

"0" stands for "expect exit status 0", which is the default so it's implicit anyway. The +w / +e (or even +we) is the new part. I have added it to tests where necessary.

And, because life is what it is, add two global exceptions:

  - Debian. Because runc has too many flakes.
  - kube. Ditto. Kube commands emit lots of nasty error messages (yes, level=error) that don't seem to affect results.

Similar to containers#18442

Signed-off-by: Ed Santiago <[email protected]>
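The `0+w` / `0+e` convention described in this commit message can be illustrated with a small sketch. This is not the actual `run_podman` implementation (which is a Bash helper in the system tests); the function names `parse_expectation` and `unexpected_messages` are hypothetical, and the sketch assumes that warnings and errors appear on stderr as logrus-style `level=warning` / `level=error` fields.

```python
import re

# Hypothetical spec grammar: an exit status, optionally followed by "+w", "+e" or "+we".
_SPEC = re.compile(r"^(\d+)(?:\+([we]+))?$")


def parse_expectation(spec):
    """Split a spec such as "0+w" into (exit_status, allow_warning, allow_error)."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"unrecognized expectation: {spec!r}")
    flags = match.group(2) or ""
    return int(match.group(1)), "w" in flags, "e" in flags


def unexpected_messages(spec, stderr):
    """Return the stderr lines that the given expectation does not allow."""
    status, allow_warning, allow_error = parse_expectation(spec)
    if status != 0:
        # The commit message only describes enforcement for commands that exit 0.
        return []
    offending = []
    for line in stderr.splitlines():
        if "level=warning" in line and not allow_warning:
            offending.append(line)
        elif "level=error" in line and not allow_error:
            offending.append(line)
    return offending
```

With this sketch, a plain `0` expectation flags any warning line, while `0+w` lets warnings through but still flags `level=error`.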
Can we close this or update it? |
I want this although, I guess I can rebase and see what blows up. |
Debian still seems to print these runc warnings for some tests. I guess one way forward is to check stderr for crun/fedora only? @edsantiago WDYT? |
I'm totally fine with skipping warning checks on Debian. There is precedent, see #21191. And needless to say I would LOVE to have this enabled wherever possible. |
There are many code paths which only log via logrus but still exit 0, so this should catch more bugs. Unfortunately, runc logs way too much random stuff, so we ignore this check for runc right now.

Signed-off-by: Paul Holzinger <[email protected]>
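As a rough illustration of the check this commit describes, the following sketch flags logrus-style warning and error lines on stderr for a command that exited 0, and skips the check under runc. It is only a language-neutral sketch in Python, not the Go code in test/e2e; the function name `cleanup_stderr_problems` and the `runtime` parameter are assumptions made for the example.

```python
def cleanup_stderr_problems(exit_code, stderr, runtime):
    """Return suspicious stderr lines from a cleanup command that exited 0.

    Assumption: `runtime` is the OCI runtime name, e.g. "crun" or "runc".
    """
    if exit_code != 0:
        # A non-zero exit is already reported as a failure elsewhere.
        return []
    if runtime == "runc":
        # Per the commit message, runc output is too noisy to check for now.
        return []
    return [
        line
        for line in stderr.splitlines()
        if "level=warning" in line or "level=error" in line
    ]
```

A test harness could then fail the test whenever the returned list is non-empty.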
/lgtm thank you! |
Note: I really do not expect this to pass. I am only submitting it to gather some data and see whether it shows us more bugs. In the long term, something like this should be included. There are many code paths which only log via logrus but still exit 0, so this should catch more bugs.
Does this PR introduce a user-facing change?