-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement SD-NOTIFY proxy in conmon #8508
Conversation
Replaces: #7607 |
@goochjj I will take this one over and try to finish it up. Thanks for the PR. |
@haircommander Is this problem caused by older conmon still? |
yeah it looks like it |
@cevich PTAL |
// Bind notify proxy | ||
if c.config.SdNotifyMode == define.SdNotifyModeContainer { | ||
notifyDir := filepath.Join(c.bundlePath(), "notify") | ||
logrus.Debugf("checking notify %q dir", notifyDir) | ||
if err := os.MkdirAll(notifyDir, 0755); err != nil { | ||
if !os.IsExist(err) { | ||
return errors.Wrapf(err, "unable to create notify %q dir", notifyDir) | ||
} | ||
} | ||
if err := label.Relabel(notifyDir, c.MountLabel(), true); err != nil { | ||
return errors.Wrapf(err, "relabel failed %q", notifyDir) | ||
} | ||
logrus.Debugf("add bindmount notify %q dir", notifyDir) | ||
if _, ok := c.state.BindMounts["/run/notify"]; !ok { | ||
c.state.BindMounts["/run/notify"] = notifyDir | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly here, we need the OCI runtime to take care of it. If we pass down the notify socket directly to the container, there is a race condition between the notification being sent and systemd reading it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@giuseppe We're proxying. PODMAN is now in charge of notification. NOT the OCI runtime. That's the point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PODMAN creates /run/notify/
PODMAN tells OCI runtime to bind-mount /run/notify to the container's setup folder
CONMON forwards /run/notify/notify.sock ( in container, by way of bind mount ) to --sdnotify-socket on host
OCI runtime is NOT taking care of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe, rhatdan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Not sure what is needed from me. I peeked at the testing results and they all appear related to the changes here, no? |
We may need new VM images with a newer Conmon. |
Yes @cevich new vms need latest conmon. |
And newer Ubuntu |
Note, "No" is acceptable: in that case I'll roll up new images separate from my Ubuntu work, and they can be brought in via this PR. |
@cevich depends on when I can get new Ubuntu. A couple of days, I am fine with it. weeks, I have a problem. |
Sounds fair, if we get into next week and I'm still stuck, I'll make a new PR to just refresh the current images. There is a risk it will fail with Ubuntu 19, as we're on borrowed-time WRT the package repositories (19 is EOL) remaining available. Otherwise/also, I've just finished building fresh and am trying those out right now in #8312... |
...yeah, we've got problems with the new VMs that will probably take a while to solve. But maybe we have some luck here, b/c IIRC brent built new VMs to incorporate docker compose. It's not yet in master, @rhatdan try this see if it works: diff --git a/.cirrus.yml b/.cirrus.yml
index 0fa51be63..be4c0e459 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -30,7 +30,7 @@ env:
PRIOR_UBUNTU_NAME: "ubuntu-19"
# Google-cloud VM Images
- IMAGE_SUFFIX: "c5402398833246208"
+ IMAGE_SUFFIX: "c4704091098054656"
FEDORA_CACHE_IMAGE_NAME: "fedora-${IMAGE_SUFFIX}"
PRIOR_FEDORA_CACHE_IMAGE_NAME: "prior-fedora-${IMAGE_SUFFIX}"
UBUNTU_CACHE_IMAGE_NAME: "ubuntu-${IMAGE_SUFFIX}" |
Oh! I can check what version of conmon is in ubuntu-c4704091098054656...looks like |
I think we need 2.0.21 |
okay. Lokesh has a new OBS repo with libseccomp 2.5.1 and other updated packages for us. Building images now... |
...testing new images. |
Update: I'm down to ~2-5 failing tests on the new images. These failures seem to be caused by "too old" package versions. We're asking a lot of Lokesh to maintain all these special Ubuntu repos. in the first place, so I'm afraid to ask for updates to the (now defunct) 19.10 repo, in addition to 20.04 and 20.10. At the same time, I don't want to give you a steaming pile to work with, so I'm doing by best to get all the tests passing (PR #8312) first. |
Update: Getting very close with the new images, though I've had to add skips to workaround a few broken tests/situations. |
Update: The suite of podman CI tests should be passing now, so this PR is waiting on review and merge of #8312 |
ba737a4
to
7c193c2
Compare
Not going to make the 3.0 release. Will have to wait until 3.1 |
5d2d987
to
e3aa257
Compare
1beb54d
to
bab9e74
Compare
A friendly reminder that this PR had no activity for 30 days. |
This leverages conmon's ability to proxy the SD-NOTIFY socket. This prevents locking caused by OCI runtime blocking, waiting for SD-NOTIFY messages, and instead passes the messages directly up to the host. [NO TESTS NEEDED] existing tests cover this use case. Signed-off-by: Joseph Gooch <[email protected]> Signed-off-by: Daniel J Walsh <[email protected]>
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe, rhatdan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What's the status here? I see we have a commit changing the default type of |
I have no idea of the state. I grabbed this PR to try to push it over the edge a while back. but I have no idea what is going on at this state. If @mheon or someone else wants to finish this, I am open to them taking it. |
What is missing besides of a bunch of failing tests? These seem to fail on many pull requests, however. 🤷♂️ Don't know why.. |
@@ -1254,6 +1244,12 @@ func (r *ConmonOCIRuntime) sharedConmonArgs(ctr *Container, cuuid, bundlePath, p | |||
args = append(args, rFlags...) | |||
} | |||
|
|||
if ctr.config.SdNotifyMode == define.SdNotifyModeContainer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we also have to unset NOTIFY_SOCKET to prevent runc and crun from blocking? Not sure if I understood it correctly though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well we don't want to pass these on to the OCI Runtimes, yes, since we don't want them handling it.
@rhatdan suggested that I take over this PR to kick it over the finish line, closing. |
This leverages conmon's ability to proxy the SD-NOTIFY socket.
This prevents locking caused by OCI runtime blocking, waiting for
SD-NOTIFY messages, and instead passes the messages directly up
to the host.
Signed-off-by: Joseph Gooch [email protected]
Signed-off-by: Daniel J Walsh [email protected]