Rootless overlay whiteouts throw error when stored #1709
Comments
@giuseppe Is this a problem with fuse-overlayfs? |
It is trying to use overlay without fuse-overlayfs. It works on Ubuntu kernels but something seems to be failing. @rlifshay could you try using fuse-overlayfs as we do by default on Fedora? |
If I use fuse-overlayfs it works without issue. I am hoping that this can be fixed so native overlay works. That way it theoretically has better performance and I also don't have to manually download and install fuse-overlayfs on Ubuntu. I was looking through some code a day or two ago in an attempt to debug this, and this looks like it might be somewhat related: |
I couldn't find a related issue before when I posted this, but I just now ran across this issue: containers/podman#2998 |
Sorry for the repeated posts. It appears as if the kernel overlayfs itself is the only one that can do mknod or set xattrs in a layer for whiteouts or opaque directories when running in rootless mode. Currently we try and directly add the whiteouts with mknod, which only works with root permissions. Possibly we could solve this by detecting this case and using a different strategy. Rather than making the nodes ourselves, we could just delete the corresponding files or directories in the correct layer to cause overlayfs to create the whiteout nodes itself, thus bypassing the permissions issue. |
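To make the permissions issue above concrete, here is a minimal sketch of a probe for it. It assumes the kernel overlayfs whiteout format (a character device with device number 0/0) and tries to create one with `mknod`. The names `create_whiteout`, `can_create_whiteouts`, and the `.whiteout-probe` file are hypothetical and are not taken from buildah or containers/storage.

```python
import os
import stat
import tempfile


def create_whiteout(path: str) -> None:
    """Create an overlayfs-style whiteout: a character device numbered 0/0."""
    os.mknod(path, mode=stat.S_IFCHR, device=os.makedev(0, 0))


def can_create_whiteouts(directory: str) -> bool:
    """Probe whether this process may create whiteout nodes in `directory`."""
    probe = os.path.join(directory, ".whiteout-probe")  # hypothetical name
    try:
        create_whiteout(probe)
    except OSError:
        # Typically EPERM, the "operation not permitted" reported in this issue.
        return False
    os.unlink(probe)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        if can_create_whiteouts(tmp):
            print("mknod whiteouts are permitted here")
        else:
            print("mknod whiteouts are not permitted; another strategy is needed")
```

A storage driver could run a check like this once at startup and fall back to a different strategy (or to fuse-overlayfs) when it returns `False`, rather than failing partway through applying a layer.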
We need to create the whiteouts in the same format the kernel (or the FUSE program) expects them to be, so we cannot tweak it to be in a different format. If we use native overlay we need to create the whiteout files either with I don't think the extra complexity is worth it to enable a custom Ubuntu kernel patch. If you still want to use it, could you open a PR? It will help to fully understand the costs and the complexity of supporting it. |
I would like to find a feasible way to solve this if possible, because even if it's only usable in Ubuntu (or possibly everywhere, see third option below), it enables a significant performance increase. I saw a nearly 20% improvement in build time for a large container when using native overlay vs fuse-overlayfs. I have several ideas below and would try implementing one of them myself, but I am not familiar enough with Go or the internals of these projects to be able to do so.
If none of these are feasible, buildah (and podman, etc.) should probably display a warning when using native overlayfs in rootless mode. In addition, it would be nice to have fuse-overlayfs included in the buildah/podman Ubuntu packages and enabled by default, rather than having to manually install it. |
What container are you trying to build? There should not be such a big difference, unless you are trying to do many parallel readdirs or copyups.
I don't think we should use setuid programs to circumvent kernel restrictions. |
It's a big container that installs WINE and some other things (https://github.com/rlifshay/dng-converter-ctr), although I would guess that even the smaller performance difference with a more typical container could add up in a CI environment or something.
I can definitely understand why someone might be hesitant to do that, especially if it was the default. However personally I think it would be alright if it was left up to the administrator to explicitly choose to enable and specify such a program. Also, what I was envisioning wouldn't be the same as giving unlimited access to mknod, a definite security risk (and the reason it is restricted to root). Instead, it would be a minimal program that would only make nodes usable as whiteouts and that's it, thus alleviating the security risk. It could even go beyond that, and verify the caller and/or paths to make sure that it was being used as expected. |
I would prefer the distributions and upstream kernel to make those decisions. If Ubuntu's kernel patch is blocking the creation of the whiteout device nodes, then you need to work with them to get it in. We don't want to add setuid programs to Podman to get around security restrictions, especially when dealing with a patch that even the upstream User Namespace team has not allowed into the upstream kernel. Please open an issue with Ubuntu to allow the creation of the device nodes. As far as performance issues go, I would prefer to fix fuse-overlayfs if possible to make it handle your workload better. |
It seems most of the cost in running apt is coming from I am working on it here: containers/fuse-overlayfs#88 @rlifshay, the PR seems to significantly improve your test script. Would it be possible for you to try it out and let me know? |
Yes it seems to help a lot, at least when I also enable the mount options you mention in the PR.
It seems as if after our discussions this has really turned into several separate issues:
|
@rlifshay thanks a lot for the tests. This is very helpful feedback. Do you also have a measure for native overlay (as root) on the same host?
@lsm5 could we add fuse-overlayfs as a dependency to the Ubuntu package? |
@giuseppe I ran it with native overlayfs in rootless mode. I was going to do native as root too, but I kept having internet issues. I will probably try again tomorrow and will update you if it's much different.
(I just changed my username from rlifshay if anyone is confused) |
I did some more testing with more stable conditions (mostly internet) and a lot less variance:
|
I believe this is fixed in master, reopen if I am mistaken. |
This still seems to be an issue, I'm getting the same error on
It works when using fuse-overlayfs. |
Seconding @tobwen's report, this also appears during a
$ buildah --log-level debug from python:3.9-slim
...
Writing manifest to image destination
Storing signatures
DEBU Applying tar in /mnt/ephemeral/containers/user-storage/overlay/9958376e618d1ca5e7d2d1ba0d48c6994913dee57315f6fda6ac98f387b01165/diff
DEBU error copying src image ["docker://python:3.9-slim"] to dest image ["docker.io/library/python:3.9-slim"] err: Error committing the finished image: error adding layer with blob "sha256:74f86becb84a3ee98802e154c3117092e23fffae5d97369eaf6cacd679362972": Error processing tar file(exit status 1): operation not permitted
DEBU error pulling image "docker://python:3.9-slim": Error committing the finished image: error adding layer with blob "sha256:74f86becb84a3ee98802e154c3117092e23fffae5d97369eaf6cacd679362972": Error processing tar file(exit status 1): operation not permitted
DEBU unable to pull and read image "docker.io/library/python:3.9-slim": Error committing the finished image: error adding layer with blob "sha256:74f86becb84a3ee98802e154c3117092e23fffae5d97369eaf6cacd679362972": Error processing tar file(exit status 1): operation not permitted
Error committing the finished image: error adding layer with blob "sha256:74f86becb84a3ee98802e154c3117092e23fffae5d97369eaf6cacd679362972": Error processing tar file(exit status 1): operation not permitted
DEBU shutting down the store
ERRO exit status 125 |
What kernel version? |
We have this problem in a CI job running on Kubernetes, on Ubuntu 20.04 nodes. Has this been resolved in Ubuntu 22.04, or is Ubuntu still using some problematic combination of kernel options? |
Description
Deleting existing files in lower layers causes errors when running buildah/podman in rootless mode and using the overlay storage driver. Deleting the files works fine, but it seems to have issues when it later tries to use the whiteout files and fails with operation not permitted. It also has the same issue when pulling a container where a file from a lower layer is deleted. All of these have no issues when using the vfs driver instead.
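For context on why pulling an image can hit the same error: in the OCI image layer format, a deletion is recorded in the layer tarball as an empty file named `.wh.<name>`, and an opaque directory as a `.wh..wh..opq` entry. When a layer is applied, those markers presumably have to be converted to the on-disk whiteout format the overlay driver expects, which is the step that fails here. The sketch below only lists such markers in a small in-memory demo layer; `find_whiteouts` and `make_demo_layer` are hypothetical helper names, and the file names in the demo layer are invented for illustration.

```python
import io
import posixpath
import tarfile

WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def find_whiteouts(layer: tarfile.TarFile):
    """Return (deleted_paths, opaque_dirs) recorded in one layer tarball."""
    deleted, opaque = [], []
    for member in layer.getmembers():
        parent, base = posixpath.split(member.name)
        if base == OPAQUE_MARKER:
            # The whole directory hides everything from lower layers.
            opaque.append(parent)
        elif base.startswith(WHITEOUT_PREFIX):
            # ".wh.motd" in "etc" means "etc/motd" was deleted.
            deleted.append(posixpath.join(parent, base[len(WHITEOUT_PREFIX):]))
    return deleted, opaque


def make_demo_layer() -> bytes:
    """Build a tiny layer tarball in memory (invented file names)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in ("etc/.wh.motd", "var/cache/.wh..wh..opq", "usr/bin/tool"):
            tar.addfile(tarfile.TarInfo(name))  # empty regular file
    return buf.getvalue()


if __name__ == "__main__":
    with tarfile.open(fileobj=io.BytesIO(make_demo_layer())) as layer:
        print(find_whiteouts(layer))  # (['etc/motd'], ['var/cache'])
```

Running a scan like this over a real layer blob is one way to confirm that a failing layer actually contains whiteout entries before digging into the storage driver.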
Steps to reproduce the issue:
You can see some relevant strace output here from pulling/importing another container with the same issue. You can see it seems to have issues when it tries to create the whiteout node:
buildah info