-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unset SocketLabel after system finishes checkpointing #12388
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rhatdan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@edsantiago no idea if this will help the failure situation, but if this works we know we have a race condition. |
Oh, interesting idea. Thank you! |
Flake is #9597, restarted. |
And, this time, flake is #10710. Restarted. |
Oh well. Third flake is the one you were trying to prevent.
|
515d145
to
647b834
Compare
TIL that one can Re-run a test that passed. Took two retries but here's the log, and here's the (not-especially-helpful) excerpt:
|
@giuseppe WDYT? Does not look like DAC or SELinux, could this error happen because systemd refused the connection and gave permission denied? |
c5a2fcd
to
3274ad4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very quick first pass. This is way above my head, so please take my comments (except the misspelling) with a chunk of salt.
libpod/oci_conmon_linux.go
Outdated
runtime.UnlockOSThread() | ||
if oldRuntimeDirSet { | ||
if err = os.Setenv("XDG_RUNTIME_DIR", oldRuntimeDir); err != nil { | ||
logrus.Warnf("cannot unsset XDG_RUNTIME_DIR: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be "restore", not unset (unless I'm misunderstanding, which is very possible)
Also, should this be done in a defer
? I can see code paths that would bail out before reset/unset.
libpod/oci_conmon_linux.go
Outdated
} | ||
} else { | ||
if err = os.Unsetenv("XDG_RUNTIME_DIR"); err != nil { | ||
logrus.Warnf("cannot unsset XDG_RUNTIME_DIR: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one 's' in "unset"
runtime.LockOSThread() | ||
if err := label.SetSocketLabel(ctr.ProcessLabel()); err != nil { | ||
return 0, err | ||
} | ||
runtimeCheckpointStarted := time.Now() | ||
err = utils.ExecCmdWithStdStreams(os.Stdin, os.Stdout, os.Stderr, nil, r.path, args...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linter is complaining about unused err
I have seen this error when experimenting with checkpoint and restore in CRI-O but never thought that it might be a problem for Podman. It was needed for CRI-O because it keeps running, but I have never seen it with Podman. The problem I had was that if during setting the label and resetting the label another Go thread was created it was created with the wrong socket label. I see that you are using LockOSThread() and UnlockOSThread(). I was not aware that something like that exist. Good to know. Overall this is exactly what I tried in CRI-O (minus locking and unlocking OSThread) and looks like the right solution. |
This should fix the SELinux issue we are seeing with talking to /run/systemd/private. Fixes: containers#12362 Also unset the XDG_RUNTIME_DIR if set, since we don't know when running as a service if this will cause issue.s Signed-off-by: Daniel J Walsh <[email protected]>
Yay, CI is green on the first try! |
I am taking @adrianreber And @edsantiago as two LGTM |
LGTM, I had just a comment that I've fixed in a follow up PR: #12404 |
This should fix the SELinux issue we are seeing with talking to
/run/systemd/private.
Fixes: #12362
Also unset the XDG_RUNTIME_DIR if set, since we don't know when running
as a service if this will cause issue.s
Signed-off-by: Daniel J Walsh [email protected]
What this PR does / why we need it:
How to verify it
Which issue(s) this PR fixes:
Special notes for your reviewer: