-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rootless 'podman exec' stopped working with containers created with '--uidmap' somewhere between 1b2f8679b864b882 (good) and 9b21f14eefae8ab7 (bad) #2673
Comments
Bisecting further with the actual Git tree, instead of the RPM snapshots, I think this was caused by PR #2569: commit 4a02713
|
We cannot use the RunDir for writing the conmon.pid file as we might not be able to read it before we join a namespace, since it is owned by the root in the container which can be a different uid when using uidmap. To avoid completely the issue, we will just write it to the static dir which is always readable by the unprivileged user. Closes: containers#2673 Signed-off-by: Giuseppe Scrivano <[email protected]>
thanks for the bisect. The patch helped to find the issue, which is a different one: |
also, fedora-toolbox is a very interesting use case and it is stretching rootless containers to their limits :-) once we have dedicated rootless tests, @debarshiray would you be interested to write some tests for Podman for the features you are using in fedora-toolbox? So we will avoid to introduce regressions in the future |
Nice. Although, if I am not mistaken, doesn't this mean that rootless containers created with earlier versions of podman can no longer be used with podman exec? I wonder if we can soften the blow somehow. One option that came to my mind was to not hard depend on the ConmonPidFile. Maybe we can fallback to getting the PID of the container's PID 1 through ctr.PID(), and then walking back up the process tree to find the conmon process using some heuristics? Although, I don't know what could happen if someone tricks the heuristics into a false positive. |
Sure, I will be happy to. :) |
Shouldn't break existing containers - we store the path in the database, so
older containers should still use the old default they were created with,
while new containers get the new, fixed default.
…On Mon, Mar 18, 2019, 06:49 Debarshi Ray ***@***.***> wrote:
also, fedora-toolbox is a very interesting use case and it is stretching
rootless
containers to their limits :-)
once we have dedicated rootless tests, @debarshiray
<https://github.com/debarshiray> would you be interested to
write some tests for Podman for the features you are using in
fedora-toolbox? So
we will avoid to introduce regressions in the future
Sure, I will be happy to. :)
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#2673 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AHYHCM0TtZtBu0i3fBEuUbnYz5BOgv04ks5vX4togaJpZM4b3GLN>
.
|
Umm... but this isn't about the presence of the path to the I had earlier checked with the last known good snapshot from master (ie. commit 1b2f867), and now confirmed that it's true with the 1.1.2 release too. :) If we can't soften the blow, then it would be good to have a narrower exit code that stands for "container needs to be re-created" so that scripts like the Silverblue toolbox can throw a less generic error message. Maybe we should even standardize such an exit code across all commands that work with an already created container? |
Ahhh. I would have called that not fixing existing containers instead of breaking them, but you're 100% correct. I don't know how we can easily detect this specific case; it could just as easily be someone accidentally running chown -r on a directory we use. |
One could try to guess the cause of the failure by checking if the Is it possible to find out the path to the |
If we're not already reporting it in |
And by |
Ok, submitted #2698 to export |
Podman commit 4a02713c57d874c4 [1] broke 'podman exec' for existing containers. Note that support for ConmonPidFile in 'podman inspect' was added in the same version of Podman as the one that has the breakage caused by commit 4a02713c57d874c4 [1]. Therefore, there's no need to bump the minimum required version of Podman. Interim Git snapshots between the two will only cause the extra hint in the error message to be hidden. [1] containers/podman@4a02713c57d874c4 containers/podman#2673
I added a more targetted hint to the error message from |
It seems precisely like breaking existing containers to me - as I understand the status after #2675 , if you create a container with podman < 1.2.0 [edit, fixed comparison], then unless this is resolved, upgrading to 1.2.0 makes this working container no longer work properly.
What about simply recreating ConmonPidFile when starting a stopped container? |
The problem is that #2675 writes the default I wonder if we can avoid writing the default path to the configuration when the user hadn't specified one explicitly, and keep plugging it in dynamically at run-time? If we can do that, then it might open up some possibilities that aren't as extreme as "re-create the container and start afresh". :) I don't know if it will work, but something like "you have to re-start the container" is a lot nicer. |
Not keeping the default in the database is arguably worse - we won't even
be able to find the file for running containers that haven't restarted.
This at least preserves consistency, in that we know where the file is with
certainty, even if the location has incorrect permissions.
…On Mon, Mar 18, 2019, 14:19 Debarshi Ray ***@***.***> wrote:
I don't know how we can easily detect this specific case; it could just as
easily be someone accidentally running chown -r on a directory we use.
What about simply recreating ConmonPidFile when starting a stopped
container?
The problem is that #2675 <#2675>
writes the default ConmonPidFile path to the container's configuration if
the user hadn't explicitly specified one. That makes things a bit difficult
in terms of falling back to softer recovery modes.
I wonder if we can avoid writing the default path to the configuration
when the user hadn't specified one explicitly, and keep plugging it in
dynamically at run-time? If we can do that, then it might open up some
possibilities that aren't as extreme as "re-create the container and start
afresh". :)
I don't know if it will work, but something like "you have to re-start the
container" is a lot nicer.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#2673 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AHYHCLmKhVqma_zYOE-DgUOX-IlhLK77ks5vX_TUgaJpZM4b3GLN>
.
|
@owtaylor This changes nothing for containers created for previous versions of Podman - the new default only applies to new containers. Previous containers will use the old path. I don't see how that breaks old containers - if they are broken, they were broken before the patch. |
As I understand it, since #2569 the old path doesn't work any more - #2675 fixes that for new containers, but old containers keep using the conmon default location (in the CWD of the conmon execution) Putting the new default directly into the args passed to conmon rather than writing them into the config file would seem better... then it would work for any container. |
Ah, I guess that wouldn't help locate the pid file for an existing running container instance - but I still think that a) the current code state will break the ability of people to keep using rootless containers they created with older versions of podman b) with some rearrangement of code, that isn't necessary. |
I'm still convinced that's a bad idea - we risk losing track of where the PID file lives. It also won't fix your old containers - their configs will still have the old default path, so we'll still use it. We can't rewrite existing container configs, either - that's a limitation of libpod's immutable container model. Our best bet might be trying to make the c/storage rundir accessible ( The bigger takeaway from this is that our rootless testing is massively inadequate, because this snuck in and nobody realized until toolbox started hitting it. I see two PRs open right not to run the full test suite as rootless, which is a good start. |
The old path was only written into the config since 1.1.0 - but yeah, 1.1.2 did end up in Fedora stable two weeks ago :-( so that makes it hard to handle all old containers. You could just check if the pidfile in the config file is inside the runtime dir... The upgrade issue wouldn't be caught by more testing in the test suite, but certainly that would be a great start! |
it changes only the default value. Existing containers will keep using their configuration that was created with the older podman version. |
Yes, containers created with podman <= 1.1.2 will keep using their existing configurations |
Given the permissions issues with |
we can add a workaround to fallback to the previous behaviour when we cannot read the conmon pid file. It is still broken with --user, but it is not a regression since it was already broken in the previous version. What do you think? |
PR for the workaround: #2762 |
fallback to the previous behavior of joining only the user namespace, when we cannot join the conmon userns+mount namespaces. Closes: containers#2673 Signed-off-by: Giuseppe Scrivano <[email protected]>
Makes sense to me! |
/kind bug
Description
Rootless
podman exec
stopped working somewhere between 1b2f867 (good) and 9b21f14 (bad) as can be seen when using the Silverblue toolbox. Starting from a clean state (ie., no images or containers present), this is what happens:Note that in case you are running the toolbox script directly from Git, make sure that you have a
/run/media
directory. Thetoolbox
RPM takes care of this.The command that's failing is:
Output of
podman version
:Output of
podman info --debug
:The text was updated successfully, but these errors were encountered: