-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure that a broken OCI spec does not break inspect #15799
Conversation
The process of saving the OCI spec is not particularly reboot-safe. Normally, this doesn't matter, because we recreate the spec every time a container starts, but if one was to reboot (or SIGKILL, or otherwise fatally interrupt) Podman in the middle of writing the spec to disk, we can end up with a malformed spec that sticks around until the container is next started. Some Podman commands want to read the latest version of the spec off disk (to get information only populated after a container is started), and will break in the case that a partially populated spec is present. Swap to just ignoring these errors (with a logged warning, to let folks know something went wrong) so we don't break important commands like `podman inspect` in these cases. [NO NEW TESTS NEEDED] Provided reproducer involves repeatedly rebooting the system Signed-off-by: Matthew Heon <[email protected]>
Should go green now, @containers/podman-maintainers PTAL |
Will this cause panics when the spec is nil? |
We're returning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I wonder if we should remove the spec from disc if it is invalid to avoid spamming this message?
Hm. We'd have to check if the container is running first, I imagine? Removing an invalid OCI spec while the container does not exist seems perfectly valid; removing an invalid OCI spec that has somehow been used to create a valid, running container seems like a bad idea. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe, mheon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cherry-pick v4.2.0-rhel |
Bot is broklen, doing it manually |
The process of saving the OCI spec is not particularly reboot-safe. Normally, this doesn't matter, because we recreate the spec every time a container starts, but if one was to reboot (or SIGKILL, or otherwise fatally interrupt) Podman in the middle of writing the spec to disk, we can end up with a malformed spec that sticks around until the container is next started. Some Podman commands want to read the latest version of the spec off disk (to get information only populated after a container is started), and will break in the case that a partially populated spec is present. Swap to just ignoring these errors (with a logged warning, to let folks know something went wrong) so we don't break important commands like `podman inspect` in these cases. [NO NEW TESTS NEEDED] Provided reproducer involves repeatedly rebooting the system Backport of GH PR containers#15799 to v4.1.1-rhel branch per RHBZ 2130237 Signed-off-by: Matthew Heon <[email protected]>
[v4.1.1-rhel] Backport #15799
The process of saving the OCI spec is not particularly reboot-safe. Normally, this doesn't matter, because we recreate the spec every time a container starts, but if one was to reboot (or SIGKILL, or otherwise fatally interrupt) Podman in the middle of writing the spec to disk, we can end up with a malformed spec that sticks around until the container is next started. Some Podman commands want to read the latest version of the spec off disk (to get information only populated after a container is started), and will break in the case that a partially populated spec is present. Swap to just ignoring these errors (with a logged warning, to let folks know something went wrong) so we don't break important commands like
podman inspect
in these cases.[NO NEW TESTS NEEDED] Provided reproducer involves repeatedly rebooting the system