-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mount snapshots on Windows (Isolated PR) #4425
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Build succeeded.
|
67a6182
to
4c41722
Compare
Build succeeded.
|
CGroups failure is nothing to do with my changes; the vagrant image can't resolve DNS:
|
4c41722
to
245bf6a
Compare
Build succeeded.
|
The only way to get a parentless WCOW layer is to apply a diff for a base layer into a new snapshot. In all the current cases, this means a closed-cycle of Prepare->Apply->Commit, so the only thing that will see the "bind" mount will be the `windowsDiffer`. In all other cases, a parentless layer is actually just an empty directory, and so we can deal with it without needing to use any of the "windows-layer" management tools in hcsshim. Signed-off-by: Paul "TBBle" Hampson <[email protected]>
Using symlinks for bind mounts means we are not protecting an RO-mounted layer against modification. Windows doesn't currently appear to offer a better approach though, as we cannot create arbitrary empty WCOW scratch layers at this time. For windows-layer mounts, Unmount does not have access to the mounts used to create it. So we store the relevant data in an Alternate Data Stream on the mountpoint in order to be able to Unmount later. Based on approach in containerd#2366, with sign-offs recorded as 'Based-on-work-by' trailers below. Signed-off-by: Paul "TBBle" Hampson <[email protected]> Based-on-work-by: Michael Crosby <[email protected]> Based-on-work-by: Darren Stahl <[email protected]>
245bf6a
to
f820d2c
Compare
Build succeeded.
|
Anything further on this will be part of #4419, this draft has served its purpose. |
samuelkarp
added a commit
to samuelkarp/containerd
that referenced
this pull request
Oct 7, 2024
diff: opencontainers/runc@v1.1.14...v1.1.15 Release notes: - The -ENOSYS seccomp stub is now always generated for the native architecture that runc is running on. This is needed to work around some arguably specification-incompliant behaviour from Docker on architectures such as ppc64le, where the allowed architecture list is set to null. This ensures that we always generate at least one -ENOSYS stub for the native architecture even with these weird configs. (containerd#4391) - On a system with older kernel, reading /proc/self/mountinfo may skip some entries, as a consequence runc may not properly set mount propagation, causing container mounts leak onto the host mount namespace. (containerd#2404, containerd#4425) - In order to fix performance issues in the "lightweight" bindfd protection against [CVE-2019-5736], the temporary ro bind-mount of /proc/self/exe has been removed. runc now creates a binary copy in all cases. (containerd#4392, containerd#2532) Signed-off-by: Samuel Karp <[email protected]>
k8s-infra-cherrypick-robot
pushed a commit
to k8s-infra-cherrypick-robot/containerd
that referenced
this pull request
Oct 8, 2024
diff: opencontainers/runc@v1.1.14...v1.1.15 Release notes: - The -ENOSYS seccomp stub is now always generated for the native architecture that runc is running on. This is needed to work around some arguably specification-incompliant behaviour from Docker on architectures such as ppc64le, where the allowed architecture list is set to null. This ensures that we always generate at least one -ENOSYS stub for the native architecture even with these weird configs. (containerd#4391) - On a system with older kernel, reading /proc/self/mountinfo may skip some entries, as a consequence runc may not properly set mount propagation, causing container mounts leak onto the host mount namespace. (containerd#2404, containerd#4425) - In order to fix performance issues in the "lightweight" bindfd protection against [CVE-2019-5736], the temporary ro bind-mount of /proc/self/exe has been removed. runc now creates a binary copy in all cases. (containerd#4392, containerd#2532) Signed-off-by: Samuel Karp <[email protected]>
k8s-infra-cherrypick-robot
pushed a commit
to k8s-infra-cherrypick-robot/containerd
that referenced
this pull request
Oct 8, 2024
diff: opencontainers/runc@v1.1.14...v1.1.15 Release notes: - The -ENOSYS seccomp stub is now always generated for the native architecture that runc is running on. This is needed to work around some arguably specification-incompliant behaviour from Docker on architectures such as ppc64le, where the allowed architecture list is set to null. This ensures that we always generate at least one -ENOSYS stub for the native architecture even with these weird configs. (containerd#4391) - On a system with older kernel, reading /proc/self/mountinfo may skip some entries, as a consequence runc may not properly set mount propagation, causing container mounts leak onto the host mount namespace. (containerd#2404, containerd#4425) - In order to fix performance issues in the "lightweight" bindfd protection against [CVE-2019-5736], the temporary ro bind-mount of /proc/self/exe has been removed. runc now creates a binary copy in all cases. (containerd#4392, containerd#2532) Signed-off-by: Samuel Karp <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is a rebase of #4419 to master, isolating it from #4415 and antecedants (#4399 and #4395). Original description follows:
Full credit to @darstahl and @crosbymichael, on whose shoulders I stood to implement this, as well as any other contributors to #2366 and #2287.
This reimplements the bare-minimum of #2366 to support Mount and Unmount under Windows, with the following approach:
bind
mount, exposed via symlink. We can't make these read-only usefully, AFAIK.windows-layer
mount, exposed by HCS as a volume and mounted from there. We support read-only mounts by creating a temporary scratch layer on top of the desired read-only layer. So you can write to the mount, but it won't affect the underlying data.In order to unmount a windows-layer without access to the original
[]mount.Mount
, we store the original layer path in an Alternative Data Stream on the mount-point. Happily, we do not require the parents list to tear down the stack.It's a shame, really. If HCS exposed an inverse for
hcsshim.GetLayerMountPath
, then we wouldn't need the ADS storage, as Windows exposes an API to get the Volume back from a Volume mount, and that would let me find the Layer in-question.I haven't brought in the tests from #2366 yet, or really tested this beyond
ctr image mount
and hammering on it with BuildKit to make progress towards moby/buildkit#616. I'm looking for directional feedback on this (and the whole stack, really) before I start trying to wrangle the tests into place.Note that I have not tested this rebase in isolation, it's a fallback measure in case issues are found with the other PRs, and so I can get a CI run to triage #4419 (comment) (Edit: Now resolved)
See also #4419 (comment) and #4419 (comment) which are applicable here.