gRPC-FUSE reports inaccurate executable permissions #5509

Open
3 tasks done
myw opened this issue Mar 26, 2021 · 19 comments

Comments

@myw

myw commented Mar 26, 2021

  • I have tried with the latest version of Docker Desktop
  • I have tried disabling enabled experimental features
  • I have uploaded Diagnostics
  • Diagnostics ID: 61A7AFC6-411E-4184-B8AE-A79CA0239084/20210326003855

Summary

gRPC-FUSE volumes seem to report some permissions incorrectly. Namely, python2.7 treats non-executable files as executable when they are on a volume mounted via gRPC-FUSE. I present a minimal test case below.

Expected behavior

A host directory is mounted inside my container with a bind mount. When I test access to a non-executable file in that directory with Python, I expect it to tell me that the file is non-executable, i.e. if stat tells me file foo has mode 0644, os.access('foo', os.X_OK) should return False.

When I try this with gRPC-FUSE turned off, this is what happens.

Actual behavior

When gRPC-FUSE is enabled, os.access('foo', os.X_OK) returns True, even though the file has mode 0644.

Information

This is quite reproducible.

  • macOS Version: Catalina 10.15.7 (19H524)
  • Docker Desktop Version: 3.2.2 (61853)

A minimal test case to highlight the issue is described below. For the sake of brevity, I am not posting further detailed examples, but I have also verified that the erroneous behavior happens with a non-root user in the container. I have not tested with python3 or with other python2 base images.

Steps to reproduce the behavior

0. Control: Show that in a container without a bind mount, python correctly identifies a file with mode 0644 as non-executable.

bash-3.2$ docker run --rm python:2.7-slim-buster bash -c "
cd /tmp
rm -f testfile.tmp
touch testfile.tmp
stat --format='%a' testfile.tmp
python <<EOF
import os
print oct(os.stat('testfile.tmp').st_mode)
print os.access('testfile.tmp', os.X_OK)
EOF
"
644
0100644
False

Because the result of the Python expression is False, python correctly identifies that it does not have execute permissions on the file.
This is the expected behavior and is true regardless of whether or not "Use gRPC FUSE for file sharing" is enabled, because the file is not on a bind mount.

1. Expected Behavior: With "Use gRPC FUSE for file sharing" DISABLED, run the same code as above, but have the file be on a bind mount. The addition of --volume="$(pwd):/tmp" is the only change to the command.

bash-3.2$ docker run --rm --volume="$(pwd):/tmp" python:2.7-slim-buster bash -c "
cd /tmp
rm -f testfile.tmp
touch testfile.tmp
stat --format='%a' testfile.tmp
python <<EOF
import os
print oct(os.stat('testfile.tmp').st_mode)
print os.access('testfile.tmp', os.X_OK)
EOF
"
644
0100644
False

The result is the same as the control: the expected behavior.

2. Actual Behavior: Now ENABLE "Use gRPC FUSE for file sharing" and run the exact same code as in 1. above:

bash-3.2$ docker run --rm --volume="$(pwd):/tmp" python:2.7-slim-buster bash -c "
cd /tmp
rm -f testfile.tmp
touch testfile.tmp
stat --format='%a' testfile.tmp
python <<EOF
import os
print oct(os.stat('testfile.tmp').st_mode)
print os.access('testfile.tmp', os.X_OK)
EOF
"
644
0100644
True

Now, even though Python correctly sees the mode of the file, os.access incorrectly returns True. One consequence of this behavior is that nosetests ignores all files by default because it thinks they are executable.
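The comparison in the steps above can be turned into a small diagnostic. The following is a minimal Python 3 sketch, not part of the original report: the helper names (`mode_has_execute_bit`, `access_contradicts_mode`, `check_directory`) are hypothetical, and it only compares what `os.access` claims against the mode bits returned by `os.stat`.

```python
import os
import stat
import tempfile


def mode_has_execute_bit(path):
    """Return True if any of the user/group/other execute bits is set."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def access_contradicts_mode(path):
    """Return True if os.access claims X_OK although no execute bit is set."""
    return os.access(path, os.X_OK) and not mode_has_execute_bit(path)


def check_directory(directory):
    """Create a mode-0644 file in `directory` and report whether the
    access result contradicts the mode bits (hypothetical helper)."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        os.chmod(path, 0o644)
        return access_contradicts_mode(path)
    finally:
        os.remove(path)
```

Going by the outputs reported above, `check_directory('/tmp')` would presumably return True inside the container from step 2 and False in steps 0 and 1, but that is an expectation, not something verified here.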

Happy to provide additional information to help debug.

Thanks!

@myw myw changed the title gRPC-FUSE reports inaccurate executable permissions to Python gRPC-FUSE reports inaccurate executable permissions Mar 26, 2021
@myw
Author

myw commented Mar 26, 2021

Just made an even simpler reproducible test case entirely in bash:

bash-3.2$ docker run --rm --volume="$(pwd):/tmp" python:3 bash -c "
cd /tmp
rm -f testfile.tmp
touch testfile.tmp
stat --format='%a' testfile.tmp
[ -x testfile.tmp ] && echo 'access'"
644
access

@myw
Author

myw commented Mar 26, 2021

Further testing:
Some base images do not exhibit this behavior: alpine, busybox, and cirros all behave correctly when running the equivalent test in sh:

docker run --rm --volume="$(pwd):/tmp" cirros sh -c "
cd /tmp
rm -f testfile.tmp
touch testfile.tmp
stat -c '%a' testfile.tmp
[ -x testfile.tmp ] && echo 'access'"
644

Interestingly, python:alpine allows us to test both python and sh on the same OS. When doing that, the [ -x ] test in sh works correctly, but the os.access test in python fails.

Likely, the version of sh on the alpine, cirros, and busybox distros works as expected with gRPC-FUSE, but Python and/or bash/sh on other systems do not.

@myw
Author

myw commented Mar 26, 2021

More testing: opensuse/leap with sh: fails. opensuse/tumbleweed with sh: passes.

@normanmaurer

I see exactly the same problem when I try to compile netty :/

@myw
Author

myw commented Apr 1, 2021

Seems that the distributions that do not fail this test mostly use busybox, whose shell's access test function specifically mentions not "mak[ing] the mistake of telling root that any file is executable."

This makes me think that gRPC-FUSE is doing something where access to the file is being tested as root, which triggers a common edge-case behavior in the standard POSIX access system call. This logic does seem to have been resolved in osxfs, so there's probably a workable fix.
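To illustrate the kind of check the busybox comment describes, here is a minimal Python 3 sketch that decides executability from stat data alone instead of asking access(). It is loosely modeled on the quoted behavior and is not a copy of the busybox source; the function name and parameters are hypothetical.

```python
import stat


def stat_based_can_execute(st_mode, st_uid, st_gid, euid, groups):
    """Decide executability from mode bits and ownership only.

    Hypothetical helper: `groups` is the set of group IDs the caller
    belongs to. No access() call is made.
    """
    any_x = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if euid == 0:
        # Do not tell root that any file is executable: root may run a
        # file only if at least one execute bit is set.
        return bool(st_mode & any_x)
    if st_uid == euid:
        return bool(st_mode & stat.S_IXUSR)
    if st_gid in groups:
        return bool(st_mode & stat.S_IXGRP)
    return bool(st_mode & stat.S_IXOTH)
```

A check written this way depends only on what stat reports, which this thread shows to be correct on gRPC-FUSE mounts, so it may explain why shells using it are unaffected.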

@normanmaurer

I saw this failing on CentOS...

@myw
Author

myw commented Apr 28, 2021

Still exists as of 3.3.1.
Note that this does not depend on the user inside the docker container being root.

docker run -u nobody --rm --volume="$(pwd):/tmp" debian sh -c "
cd /tmp
rm -f testfile.tmp
touch testfile.tmp
stat -c '%a' testfile.tmp
[ -x testfile.tmp ] && echo 'access'"
644
access

@thaJeztah
Member

/cc @djs55

@docker-robott
Collaborator

Issues go stale after 90 days of inactivity.
Mark the issue as fresh with /remove-lifecycle stale comment.
Stale issues will be closed after an additional 30 days of inactivity.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle stale

@myw
Author

myw commented Jul 27, 2021 via email

@chrisvest

Noticed this as well on macOS Big Sur (not using the new virt. framework), Docker Desktop Version 4.0.0 (4.0.0.12) with CentOS 6.10 image.

@djs55
Contributor

djs55 commented Sep 29, 2021

I think this is the same as #5029 and has the same root cause as #5944 (comment) . Quoting from there:

We use Linux FUSE to mount the host filesystem. There are 2 permission models: https://elixir.bootlin.com/linux/latest/source/fs/fuse/dir.c#L1197 . We delegate the permission checks to the server (by not setting default_permissions) because we want to avoid the situation where a user has a writable file on the host but can't get Linux to write to it because Linux believes the file owner/group is different. In this mode access(path, W_OK) will invoke the fuse_access API https://elixir.bootlin.com/linux/latest/source/fs/fuse/dir.c#L1160 .

However, we also care about performance. Many filesystem operations performed by Linux will be prefixed by a fuse_access call, which doubles the number of RPCs to the host in these workloads. These access results are really just hints, as the real access check has to be done within the write / open / rm call, since the permissions can change between the calls. Therefore we also disable FUSE_ACCESS on the server side by returning ENOSYS so no_access is set here https://elixir.bootlin.com/linux/latest/source/fs/fuse/dir.c#L1168 . The result is that Linux assumes access returns success, then (hopefully) tries the real operation to see whether it succeeds or not. The host does the access control against the real file permissions.

So it's a combination of

  • wanting to prevent spurious user/group permission errors in Linux; and
  • wanting to improve performance

Together, these result in the inaccuracy of the access call. Fixing this is possible, but it would reduce performance.
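The mechanism described in this comment can be mimicked with a toy model. The Python 3 sketch below is not kernel code and not Docker's implementation: `ToyFuseClient` and the server callables are hypothetical stand-ins, used only to show how an ENOSYS reply to an access request turns every later access check into an unconditional success without further RPCs.

```python
import errno


class ToyFuseClient:
    """Toy model of the client-side access logic described above."""

    def __init__(self, server_access):
        # server_access(path, mask) returns 0 on success or an errno value.
        self.server_access = server_access
        self.no_access = False
        self.rpc_count = 0

    def access(self, path, mask):
        if self.no_access:
            # The server opted out earlier: assume success, send no RPC.
            return 0
        self.rpc_count += 1
        err = self.server_access(path, mask)
        if err == errno.ENOSYS:
            # Remember that the server does not implement access checks.
            self.no_access = True
            return 0
        return err


def server_without_access(path, mask):
    """Stand-in for a server that disables access checks via ENOSYS."""
    return errno.ENOSYS


def server_denying_execute(path, mask):
    """Stand-in for a server that answers access checks and denies them."""
    return errno.EACCES
```

In this model a client backed by `server_without_access` reports success for any path and mask after a single RPC, while one backed by `server_denying_execute` pays one RPC per check, which is the trade-off the comment describes.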

@myw
Author

myw commented Sep 29, 2021

@djs55 Fascinating and incredibly helpful context. Thank you!

I am not familiar with the filesystem management on that level, but your proposal for the root cause makes sense to me—I haven't observed any behavior that would contradict it.

I do think that this behavior is far enough outside the bounds of expectation that it should be possible to disable it without disabling all of gRPC-FUSE (maybe with a config-file-only setting?), which would provide most of the existing benefits and presumably still offer some performance benefit, even with the extra access calls. But whether or not it's worth it to work on a fix like that would depend on the performance impact tradeoff.

Conversely, would it somehow be possible to disable the fuse_access call only when we know it's coming as an access hint from the Linux filesystem that's about to be immediately followed by a write/open/rm call? That is, if we know it's something like python code making the call explicitly from userspace, rather than the filesystem itself checking, could we let the call go through? I doubt the answer is yes, but I think it would effectively resolve the issue with less performance impact.

Finally, for the sake of anyone else following this post, I do also want to share the two workarounds you mentioned in the rest of that comment that do not involve turning off gRPC-FUSE, which might be helpful in some use-cases:

… if you would like 100% native Linux access control checks, you can store your data in a "named volume" which resides inside the Linux filesystem. For example:

docker volume create my-code
docker run -v my-code:/mnt alpine ls /mnt

Another possibility is to use "dev environments" https://docs.docker.com/desktop/dev-environments/ which store the code in Linux (so 100% native filesystem semantics) while also allowing you to seamlessly access everything from your IDE (as well as push/pull the environment to share it with colleagues etc)

In addition to these workarounds, I'm wondering if there's any chance that using the new Big Sur virtualization framework could either resolve the issue, or otherwise improve performance to mitigate the impact of fixing it?

Thanks again for looking into this.

@djs55
Contributor

djs55 commented Sep 29, 2021

@myw thanks for the quick reply! For what it's worth, I'm not satisfied with the current state either. There are some improvements coming in macOS Monterey in the virtualization.framework which may help speed things up and improve the semantics of access: we're investigating those. We'll let you know if/when we have something interesting to try.

@GKTheOne

I came here after finding #5007.

This impacts the database initialisation scripts I want to use with postgres (& mysql) images.
I have some scripts that I want sourced by the initialisation instead of executed.
However, because of this issue, the entrypoint script tries to execute the scripts (because -x file test succeeds) which fails (with permission denied) because the execute bit is not actually set.
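An earlier comment in this thread notes that access results are only hints and that the real check happens in the operation itself. One way a script runner could follow that idea is to attempt the execution and fall back to sourcing when the kernel refuses. The following is a minimal Python 3 sketch under that assumption; `run_or_fall_back` and `demo` are hypothetical names and are not part of any postgres or mysql entrypoint.

```python
import os
import subprocess
import tempfile


def run_or_fall_back(path):
    """Try to execute `path`; return "source" if execution is denied.

    Hypothetical helper: it relies on the real exec attempt instead of
    a prior access()-style test such as `[ -x path ]`.
    """
    try:
        subprocess.run([path], check=True)
        return "executed"
    except PermissionError:
        # The exec attempt was refused, e.g. no execute bit is set.
        return "source"


def demo():
    """Create a script without any execute bit and report the branch taken."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "init.sh")
        with open(script, "w") as f:
            f.write("#!/bin/sh\nexit 0\n")
        os.chmod(script, 0o644)
        return run_or_fall_back(script)
```

The design choice here is to let the failed exec drive the decision, since that result comes from the real permission check and not from the access hint.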

@docker-robott
Collaborator

Issues go stale after 90 days of inactivity.
Mark the issue as fresh with /remove-lifecycle stale comment.
Stale issues will be closed after an additional 30 days of inactivity.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle stale

@myw
Author

myw commented Jan 24, 2022

/remove-lifecycle stale

@thaJeztah
Member

/lifecycle frozen

@razzed

razzed commented Oct 23, 2024

This still exists in 4.34.3 (170107) Engine 27.2.0 on Mac OS X 15.0.1 (24A348)

Surprised this bug has existed for so long: [ -x plainFile ] on a plain, non-executable file on a volume mounted inside a container returns true (exit 0).

The difference between the alpine and other containers is that alpine containers actually copy the volume so it is not shared and therefore behaves correctly. You can detect this by simply adding a file inside the container and seeing if the local copy changes. However, an ubuntu container reproduces this bug easily. A simple case from October 2024:

mkdir test
touch test/notx.md
docker run -v "$(pwd)/test:/root/test" -it ubuntu:latest
root@5948a905ab89:/# cd /root/test
root@5948a905ab89:~/test# ls -la
total 4
drwxr-xr-x 3 root root   96 Oct 23 14:10 .
drwx------ 1 root root 4096 Oct 23 14:11 ..
-rw-r--r-- 1 root root    0 Oct 23 14:10 notx.md
root@5948a905ab89:~/test# if [ -x notx.md ]; then echo "is executable"; else echo "works correctly"; fi
is executable
root@5948a905ab89:~/test#

Is there any reason this can't be fixed? Seems like a pretty major issue.

You can work around this by not using mounted volumes and copying your files into the target container instead, but this means you lose development speed.

8 participants