podman-2.0.0 (and previous) generate large numbers of untagged containers #6801
Are you building the images with Buildah, or Podman? @TomSweeneyRedHat PTAL, this sounds concerning.
All via podman; I don't have a separate buildah installation.
We do expect a fair number of untagged `<none>` images, which is not a problem in itself. However, the inability to remove them because of leftover build containers definitely is. I would expect that a `podman system prune` should take care of those.
@srcshelton Multistage builds? (Multiple FROMs in a single build.) I'd expect podman to cache layers created during a multistage build. Some of the layers created during a multistage build are untagged top-level layers; those probably show up in `podman image ls`. Podman (buildah) should prevent you from removing such a layer while the build is still in progress. Perhaps there is an issue with this logic (if it exists). I won't go much further with my guessing 🙂
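A quick way to see those untagged entries (a sketch; `--filter dangling=true` is podman's filter for untagged images):

```sh
podman image ls                          # intermediate stages show up as <none>/<none>
podman image ls --filter dangling=true   # list only the untagged (dangling) images
```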
A fair number of multi-stage builds, yes. Something certainly appears to be leaving locks/references behind, though - is there a way I can confirm the linkages which podman believes to exist next time I get a stuck image? Also, is the expected behaviour that running `podman image prune` repeatedly will keep removing more images each time? I've actually seen up to five invocations of `podman image prune` needed before nothing further is removed.
That sounds like it could be a separate issue.
I suspect that it's related, though - it doesn't always behave like this, but the frequency of occurrence does seem related to 'stuck' images apparently linked to a non-existent container.
Edit: Not sure if there is anything wrong with prune itself. I was able to simulate similar results when I created images using buildah. I'm not sure whether my observation is in any way related to this, considering @srcshelton isn't using buildah directly. But since podman internally uses buildah, I'd imagine it's possible that if, for any reason, a build fails to delete a container created during the build, that container might not be visible in podman. I created a few images with buildah and intentionally didn't clean up the containers that were created for building. The results were similar to what @srcshelton observed. As you might expect, prune can't clean up images that still have references:
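Something along these lines (a sketch of the sequence described; `buildah from` and `buildah commit` print the name/ID of what they create):

```sh
ctr=$(buildah from docker.io/library/alpine)   # working container for the build
img=$(buildah commit "$ctr")                   # commit an untagged image; prints its ID
buildah from "$img"                            # second working container pins the untagged image
podman image prune                             # cannot remove that image: it is in use by a container
podman ps -a                                   # shows nothing - build containers are invisible here
```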
The logs show the following:
Again, as you might expect, with buildah, I can see that there are still containers:
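Presumably via something like:

```sh
buildah containers   # lists leftover build containers that `podman ps -a` does not show
```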
Now if I remove those, podman prune works properly:
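For example (a sketch):

```sh
buildah rm --all     # remove all leftover buildah working containers
podman image prune   # the formerly "in use" dangling images can now be removed
```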
I think this is a bit confusing from a user perspective. An image shouldn't be flagged as dangling if there's still a container referencing it. Considering that podman is able to figure out that it isn't allowed to delete the image, it should also understand that the image isn't dangling. The fact that podman and buildah share the image list, but don't share the container view, can be quite confusing from a user perspective. Would it be possible to somehow mark which applications are referencing an image? That would allow producing a log entry telling the user why the image can't be removed. Better yet, the images list could be enhanced to include this information.
I was able to replicate this issue and confirm my suspicions. Temporary containers created during the build process are not visible to `podman ps`. To reproduce the issue (a sketch is given below):
Console output from my test:
Example Dockerfile:
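A minimal stand-in consistent with the description might look like this (the slow `RUN` step is there purely so the build can be interrupted while a temporary build container exists; names are illustrative):

```sh
cat > Dockerfile <<'EOF'
FROM docker.io/library/alpine AS builder
RUN sleep 300
FROM docker.io/library/alpine
COPY --from=builder /etc/os-release /os-release
EOF

podman build -t repro .   # press Ctrl-C during the 'sleep' step
podman ps -a              # the interrupted build container is not listed here...
buildah containers        # ...but buildah still sees it
podman image ls           # untagged <none>/<none> layers are left behind
```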
I think that you need to make these temporary containers created with `podman build` visible to, and manageable by, podman itself.
In my build-scripts, I've added traps around any podman invocations which commit or remove images - these processes seem fragile, and interrupting podman during an image commit or removal seems to risk leaving broken state behind. Since podman integrates buildah, I've not (to date) had buildah separately installed... can these 'invisible' buildah images be managed in any way via podman's other commands or directly on the filesystem, or is having a separate buildah install pretty much a prerequisite (... and is that intentionally so?)
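A sketch of the kind of guard described (`$ctr` and `$tag` are placeholders for the build script's own variables):

```sh
# Ignore interrupts while podman performs the critical operation,
# then restore default signal handling afterwards.
trap '' INT TERM
podman commit "$ctr" "$tag"
trap - INT TERM
```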
@TomSweeneyRedHat or @ashley-cui Could you look into this? I think we have a long-term issue to manage buildah images from within Podman.
I'm honestly more concerned about the missing manifests problem that was mentioned initially - if we could get more details (error messages, a reproducer) there, it would be greatly appreciated. The lack of Buildah integration is something we've known about for a while and is on our list of things to fix, but not being able to fully delete images is very bad.
I'll post here as soon as I can reproduce... although the locking I've added does (coincidentally?) seem to have cut down on the frequency of occurrence. To summarise my thinking: podman should be safely interruptible. If any action can't be safely interrupted, then signals should be ignored until the critical action has completed. Alternatively, changes should be performed by staging updates and then atomically committing them - so that if the process is killed during staging it can subsequently be cleaned up, and once the change is committed then further processing is again safe. (This becomes more complicated where podman is invoking separate or third-party components... but assuming that these are fragile until proven otherwise is not necessarily the worst of plans...)
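A sketch of the stage-then-commit pattern being proposed (the path and the `generate_new_state` step are hypothetical):

```sh
# Write the new state to a temp file on the same filesystem, then
# rename it into place: rename(2) is atomic, so an interrupted run
# leaves only an identifiable temp file to clean up later.
tmp=$(mktemp /var/lib/containers/state.json.XXXXXX)
generate_new_state > "$tmp"                  # hypothetical: produce the updated state
mv -f "$tmp" /var/lib/containers/state.json  # atomic commit of the staged update
```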
I spent an hour and a half trying to break the system manually in a way that would not let me remove images with just podman. I simply couldn't get the system into that kind of state. It could be that I'm using podman 2.0.0 - but I doubt that. @srcshelton There isn't anything out of the ordinary with your system? I noticed that you are storing images in a non-default location - is that a local filesystem? Also, is your run root a tmpfs filesystem? And finally, are you running podman in parallel?
I'm currently still on

My setup is probably a little odd, yes - I'm migrating from an old 32-bit system image to a 64-bit one I'm building, with as many services containerised as possible. As such, the

One notable anomaly about this setup, which I'm assuming is due to starting from a gaol, is that if I

(... this does seem to make for a novel break-out solution though, faced with being 'root' in a chroot() gaol...)

All filesystems are local (although some elements which get mounted into containers are themselves NFS-mounted) and I'm not running multiple podman instances in parallel.
I've been unable to reproduce the missing manifest problem in the past few days, so perhaps the issue was resolved sometime between the various

I'm totally happy for this issue to be closed pending a recurrence, or left open a little longer in case I can get it to happen again...
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
I'm assuming that this isn't intended behaviour, but podman generates and maintains a large number of (presumably) intermediate containers:
... many of which claim to be in use by a container when attempting to delete them, even if no containers are running.
Sometimes these can be force-deleted, but often this results in podman then having stuck images for which it complains that manifests are missing - and the only solution I've found is to clear Podman's state and start again.
Steps to reproduce the issue:
Construct container images (including a mixture of 'build' and 'run/commit' stages)
podman image ls
Observe the number of `<none>/<none>` images (sketched below)
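As a sketch of those steps (image and container names are illustrative):

```sh
podman build -t example .                       # any multistage build
podman run --name work example sh -c 'touch /changed'
podman commit work example:step2                # a 'run/commit' stage
podman image ls                                 # note the <none>/<none> entries
```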
Describe the results you received:
Many untagged temporary(?) images listed
Describe the results you expected:
Untagged images to automatically be removed (without using 'prune') - or, if pruned or deleted manually, this should be a safe operation (even if forced) and shouldn't result in stuck/unreadable images which cannot be further processed.
Additional information you deem important (e.g. issue happens only occasionally):
Numerous untagged images are generated every time images are constructed. Often these claim to be associated with a container even when none exist. Sometimes they become corrupted on (forced) deletion, and appear to require podman's state to be erased in order to remove them entirely.
So there are effectively three related issues:

1. Untagged, presumably temporary images are kept after the successful completion of `build` commands. This may be intentional or designed to mirror docker behaviour, but also:
2. Often, attempting to prune or manually delete these temporary images incorrectly results in an 'image is in use by container' error, even if `podman ps -a` shows no running containers;
3. Force-deleting images apparently associated with a non-existent (or hidden?) container results in state corruption, with `podman` reporting that the image manifest is missing or corrupt (with the overlay graph driver).

Output of `podman version`:

Output of `podman info --debug`:

Package info (e.g. output of `rpm -q podman` or `apt list podman`):

Additional environment details (AWS, VirtualBox, physical, etc.):
n/a