Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that a broken OCI spec does not break inspect #15799

Merged
merged 1 commit into from
Sep 14, 2022

Conversation

mheon
Copy link
Member

@mheon mheon commented Sep 14, 2022

The process of saving the OCI spec is not particularly reboot-safe. Normally, this doesn't matter, because we recreate the spec every time a container starts, but if one was to reboot (or SIGKILL, or otherwise fatally interrupt) Podman in the middle of writing the spec to disk, we can end up with a malformed spec that sticks around until the container is next started. Some Podman commands want to read the latest version of the spec off disk (to get information only populated after a container is started), and will break in the case that a partially populated spec is present. Swap to just ignoring these errors (with a logged warning, to let folks know something went wrong) so we don't break important commands like podman inspect in these cases.

[NO NEW TESTS NEEDED] Provided reproducer involves repeatedly rebooting the system

NONE

@openshift-ci openshift-ci bot added release-note-none approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Sep 14, 2022
The process of saving the OCI spec is not particularly
reboot-safe. Normally, this doesn't matter, because we recreate
the spec every time a container starts, but if one was to reboot
(or SIGKILL, or otherwise fatally interrupt) Podman in the middle
of writing the spec to disk, we can end up with a malformed spec
that sticks around until the container is next started. Some
Podman commands want to read the latest version of the spec off
disk (to get information only populated after a container is
started), and will break in the case that a partially populated
spec is present. Swap to just ignoring these errors (with a
logged warning, to let folks know something went wrong) so we
don't break important commands like `podman inspect` in these
cases.

[NO NEW TESTS NEEDED] Provided reproducer involves repeatedly
rebooting the system

Signed-off-by: Matthew Heon <[email protected]>
@mheon
Copy link
Member Author

mheon commented Sep 14, 2022

Should go green now, @containers/podman-maintainers PTAL

@Luap99
Copy link
Member

Luap99 commented Sep 14, 2022

Will this cause panics when the spec is nil?

@mheon
Copy link
Member Author

mheon commented Sep 14, 2022

We're returning c.config.Spec which is still a valid spec - just not the invalid one from disk.

Copy link
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I wonder if we should remove the spec from disc if it is invalid to avoid spamming this message?

@mheon
Copy link
Member Author

mheon commented Sep 14, 2022

Hm. We'd have to check if the container is running first, I imagine? Removing an invalid OCI spec while the container does not exist seems perfectly valid; removing an invalid OCI spec that has somehow been used to create a valid, running container seems like a bad idea.

Copy link
Member

@giuseppe giuseppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 14, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 14, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe, mheon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 4fc18d0 into containers:main Sep 14, 2022
@mheon
Copy link
Member Author

mheon commented Sep 23, 2022

/cherry-pick v4.2.0-rhel

@mheon
Copy link
Member Author

mheon commented Sep 23, 2022

Bot is broklen, doing it manually

rhatdan added a commit that referenced this pull request Sep 23, 2022
mheon added a commit to mheon/libpod that referenced this pull request Sep 28, 2022
The process of saving the OCI spec is not particularly
reboot-safe. Normally, this doesn't matter, because we recreate
the spec every time a container starts, but if one was to reboot
(or SIGKILL, or otherwise fatally interrupt) Podman in the middle
of writing the spec to disk, we can end up with a malformed spec
that sticks around until the container is next started. Some
Podman commands want to read the latest version of the spec off
disk (to get information only populated after a container is
started), and will break in the case that a partially populated
spec is present. Swap to just ignoring these errors (with a
logged warning, to let folks know something went wrong) so we
don't break important commands like `podman inspect` in these
cases.

[NO NEW TESTS NEEDED] Provided reproducer involves repeatedly
rebooting the system

Backport of GH PR containers#15799 to v4.1.1-rhel branch per RHBZ 2130237

Signed-off-by: Matthew Heon <[email protected]>
openshift-merge-robot added a commit that referenced this pull request Sep 28, 2022
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. release-note-none
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants