Container will not start when deployed to OpenShift 3.11 with chmod /dev/stdout /dev/stderr permission error #14138

Closed
ToniCipriani opened this issue Nov 22, 2020 · 14 comments
Labels: bug, stale

Comments

@ToniCipriani

ToniCipriani commented Nov 22, 2020

Description:

When deploying the Envoy container to OpenShift 3.11, the pod goes into a crash loop at startup with a permission denied error. The pod is trying to chown /dev/stderr and /dev/stdout. The issue seems to be caused by the chown command in docker-entrypoint.sh.

Repro steps:

Deploy demo config to OpenShift

Admin and Stats Output:

N/A
Config:

access_log:
- name: envoy.access_loggers.file
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: /dev/stdout

Logs:

chown: /dev/stdout: Permission denied
chown: /dev/stderr: Permission denied

ToniCipriani added the bug and triage labels Nov 22, 2020
@phlax
Member

phlax commented Nov 22, 2020

/assign phlax

@phlax
Member

phlax commented Nov 22, 2020

Hi @ToniCipriani, can you tell me which user you are starting the container as?

@ToniCipriani
Author

ToniCipriani commented Nov 22, 2020

On OpenShift, it runs as a randomly generated user ID. I haven't specified ENVOY_UID or ENVOY_GID.

/ $ ps aux
PID   USER     TIME  COMMAND
    1 10001100  0:32 envoy -c /etc/envoy/envoy.yaml
  126 10001100  0:00 /bin/sh
  134 10001100  0:00 /bin/sh
  146 10001100  0:00 ps aux
/ $ whoami
whoami: unknown uid 1000110000
/ $ 

@ToniCipriani
Author

I also tried downgrading to the 1.15 and 1.14 containers, and managed to get 1.14.5 working without the error. However, I don't think this is ideal.

@phlax
Member

phlax commented Nov 22, 2020

It needs to be started as root: the entrypoint script then drops privileges to the envoy user.

@phlax
Member

phlax commented Nov 22, 2020

@ToniCipriani
Author

ToniCipriani commented Nov 23, 2020

Running as root is not an option here; it will be flagged by the company's security team. I do see that the gateway-proxy of Gloo (which I believe is based on Envoy) has an option to run the container as a floating user. I think that is the issue here. Is it possible to configure Envoy to run that way?

Gloo provides support for running the gateway-proxy (i.e. Envoy) as an unprivileged container and without needing the NET_BIND_SERVICE capability (note that this means the proxy can not bind to ports below 1024).

https://docs.solo.io/gloo-edge/latest/installation/platform_configuration/cluster_setup/#openshift

Also, AFAIK it is not necessary to run the chown command on /dev/stdout and /dev/stderr for the container to write to them properly. I'm not aware of any other container that does this.

@ToniCipriani
Author

I also tried taking your advice and set ENVOY_UID/ENVOY_GID to 0. The container now starts, but no logs appear on /dev/stdout. Ideally we want these logs to be there so they can be picked up by ELK.

@phlax
Member

phlax commented Nov 23, 2020

Hi, to clarify the Envoy user ID situation:

The Envoy image has an entrypoint that drops privileges from root to the envoy user inside the container.

This means that, to use this container as shipped, it must be started as root; that is not optional.

If you do not want to start the container as root, you will need to change the entrypoint. That would work fine: you should be able to run as the envoy uid (101) and things will mostly work, although you may still have some issues with stdout/stderr.
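
A minimal sketch of what such a changed entrypoint could look like, assuming the image ships an `envoy` user and a `su-exec`-style helper for dropping privileges (both are assumptions here, not confirmed in this thread). The function name `entrypoint_mode` is hypothetical. It only decides whether to drop privileges or to exec directly, so that the chown on /dev/stdout and /dev/stderr would be attempted only when the container really starts as root:

```shell
#!/bin/sh
# Hypothetical helper: decide how the entrypoint should launch Envoy.
# $1 = uid the container was started as
# $2 = value of ENVOY_UID (may be empty)
entrypoint_mode() {
    current_uid="$1"
    envoy_uid="$2"
    if [ "$current_uid" = "0" ] && [ "$envoy_uid" != "0" ]; then
        # Started as root and not asked to stay root:
        # chown the log devices, then drop to the envoy user.
        echo "drop"
    else
        # Started as an arbitrary non-root uid (e.g. on OpenShift),
        # or explicitly asked to stay root: skip chown, exec directly.
        echo "direct"
    fi
}

# Sketch of how the result might be used. Left commented out so the
# snippet has no side effects; su-exec and the envoy user are assumptions.
# case "$(entrypoint_mode "$(id -u)" "${ENVOY_UID:-}")" in
#     drop)   chown envoy:envoy /dev/stdout /dev/stderr
#             exec su-exec envoy "$@" ;;
#     direct) exec "$@" ;;
# esac
```

With this shape, a randomly assigned uid such as 1000110000 would fall into the `direct` branch and never reach the chown that produces the "Permission denied" errors in the logs above.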

@phlax
Member

phlax commented Nov 23, 2020

And to be clear, I'm not suggesting setting ENVOY_UID, which would actually make the container run as root (edit: if set to 0).

@phlax
Member

phlax commented Nov 23, 2020

@ToniCipriani see #14141

zuercher removed the triage label Nov 24, 2020
@phlax
Member

phlax commented Nov 27, 2020

@ToniCipriani I think there is an underlying problem here: if you start the container as non-root, it doesn't work and doesn't tell you why.

It's not an uncommon pattern to drop root in the entrypoint (otherwise gosu and suexec would not exist).

It is also possible to run this container as non-root without the entrypoint (with some big caveats/limitations), but it's not currently set up to do that, which is why I opened #14141.

In the meantime, while that issue is considered/discussed/resolved, if you wish to run the container as non-root you will have to hack the entrypoint yourself.
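
One way a hacked entrypoint could also address the missing notification is to check, before starting Envoy, whether the current user can write to the log targets, and to say so if not. This is a sketch only: `check_log_target` is a hypothetical helper, not part of the Envoy image.

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user can write to a
# log target. Prints a diagnostic to stderr and returns 1 if it cannot.
check_log_target() {
    target="$1"
    if [ -w "$target" ]; then
        return 0
    fi
    echo "warning: uid $(id -u) cannot write to $target" >&2
    return 1
}

# Example: warn early instead of failing later without explanation.
for dev in /dev/stdout /dev/stderr; do
    check_log_target "$dev" || true
done
```

The loop only warns and carries on, since whether an unwritable log device should be fatal is a policy decision for whoever maintains the modified entrypoint.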

@github-actions

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions bot added the stale label Dec 27, 2020
@github-actions

github-actions bot commented Jan 3, 2021

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

github-actions bot closed this as completed Jan 3, 2021
3 participants