Init container runs as root #357

Closed
Zerpet opened this issue Sep 23, 2020 · 11 comments · Fixed by #731
Labels
sync-up Issue to discuss during sync-up

Comments

@Zerpet
Collaborator

Zerpet commented Sep 23, 2020

Creating a RabbitmqCluster in a Kubernetes cluster whose Pod Security Policies forbid running as root results in the Pods
not being created, because the init container tries to run as root.

Snippet of the events in the StatefulSet:

  Warning  FailedCreate  3s (x12 over 13s)  statefulset-controller  create Pod bunny-rabbitmq-server-0 in StatefulSet bunny-rabbitmq-server failed error: pods "bunny-rabbitmq-server-0" is forbidden: unable to validate against any pod security policy: [spec.initContainers[0].securityContext.runAsUser: Invalid value: 0: running with the root UID is forbidden spec.initContainers[0].securityContext.capabilities.add: Invalid value: "CHOWN": capability may not be added spec.initContainers[0].securityContext.capabilities.add: Invalid value: "FOWNER": capability may not be added]

We should verify whether there are any requirements to run as root in the init container and adapt the security context
accordingly. If it is strictly necessary to run as root in the init container, we should document this.

This was tested with Cluster Operator 0.46.0.
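
For context, a policy of the kind that produces these events might look like the sketch below. This is an assumption for illustration only: the policy name and the exact set of rules are hypothetical, not the policy used in the reported cluster. The `MustRunAsNonRoot` rule rejects `runAsUser: 0`, and because no `allowedCapabilities` are listed, adding `CHOWN` or `FOWNER` is refused.

```yaml
# Hypothetical restrictive PodSecurityPolicy (policy/v1beta1).
# The name and rule selection are assumptions, not taken from the reporter's cluster.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-example
spec:
  privileged: false
  allowPrivilegeEscalation: false
  # No allowedCapabilities listed, so containers may not add CHOWN or FOWNER.
  requiredDropCapabilities:
    - ALL
  runAsUser:
    rule: MustRunAsNonRoot   # rejects runAsUser: 0
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
    - configMap
    - secret
    - emptyDir
    - projected
    - persistentVolumeClaim
```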

@ChunyiLyu
Contributor

Running the init container as root was part of the PR that added OpenShift support. I believe what happened is that OpenShift PVCs have a different permission model, and we need to manually set the group owner of mnesia to rabbitmq-server 999. We should document this requirement in README.md and provide an updated PSP in the docs.
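
To make the discussion concrete, here is a rough sketch of what an init container that fixes the group owner of the mnesia directory could look like. Only the container name, the root UID, the `CHOWN`/`FOWNER` capabilities, and the `persistence` volume name appear elsewhere in this thread; the image, the command, and the mount path are assumptions and may not match the operator's actual manifest.

```yaml
# Pod spec fragment, illustrative only.
# Image, command and mount path are assumptions, not the operator's real values.
initContainers:
  - name: setup-container
    image: rabbitmq:3.8-management      # assumed image
    command:
      - sh
      - -c
      - chgrp 999 /var/lib/rabbitmq/mnesia/   # assumed command and path
    securityContext:
      runAsUser: 0                      # root, as reported in the StatefulSet events
      capabilities:
        add:
          - CHOWN
          - FOWNER
    volumeMounts:
      - name: persistence
        mountPath: /var/lib/rabbitmq/mnesia/
```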

@ChunyiLyu
Contributor

On second thought, we may consider making this an optional feature. I can see that asking users to always grant the operator a PSP with root access is problematic, and unnecessary for people not on OpenShift. Thoughts? @Zerpet @mkuratczyk

@Zerpet
Collaborator Author

Zerpet commented Sep 24, 2020

we need to manually set the group owner of mnesia to rabbitmq-server 999

I thought the Pod Security Context .spec.securityContext.fsGroup was doing that for us already?

Does the Pod Security Context work on OpenShift? If it doesn't, then perhaps the OpenShift equivalent needs to be configured?
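
For reference, a minimal sketch of the pod-level setting in question. The Pod name, image, claim name, and mount path are hypothetical. With `fsGroup` set, Kubernetes can change the group ownership of mounted volumes to that GID, but only for volume types that support ownership management.

```yaml
# Minimal sketch; names, image and paths are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: fsgroup-example
spec:
  securityContext:
    runAsUser: 999
    runAsGroup: 999
    fsGroup: 999            # group applied to supported volumes at mount time
  containers:
    - name: rabbitmq
      image: rabbitmq:3.8-management
      volumeMounts:
        - name: persistence
          mountPath: /var/lib/rabbitmq/mnesia/
  volumes:
    - name: persistence
      persistentVolumeClaim:
        claimName: persistence-example   # hypothetical PVC
```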

@ansd
Member

ansd commented Oct 1, 2020

I thought the Pod Security Context .spec.securityContext.fsGroup was doing that for us already?

The pod security context fsGroup is not setting the correct group ID for the persistent volume. I checked that on both an OpenShift cluster and a kind cluster.

The docs state that the volume ownership is only changed for some volume types.

(kubernetes/examples#260 is another example where fsGroup doesn't apply.)

@Zerpet
Collaborator Author

Zerpet commented Oct 6, 2020

Context

The fsGroup is not always applied; I suspect this might depend on the storage driver. We decided to keep the init container as it is right now because this guarantees that we work with every flavour of Kubernetes and across different storage drivers. If we add specific cases for one IaaS/driver, we might end up with many different cases for different IaaS/driver combinations, and it might become confusing what to expect from the init container in terms of privileges. We have decided to close this issue for now, until we hear user feedback or receive a request for this change in the init container.

@Zerpet Zerpet closed this as completed Oct 6, 2020
@nachiket-lab

Hi Team,

Not sure how to proceed now; this seems to be a design choice. I have a cluster that enforces PSP with the restricted policy. That means I cannot run any pod as the root user.

Events:
  Type     Reason            Age                From                    Message
  ----     ------            ----               ----                    -------
  Normal   SuccessfulCreate  21s                statefulset-controller  create Claim persistence-cdc-server-0 Pod cdc-server-0 in StatefulSet cdc-server success
  Warning  FailedCreate      0s (x13 over 21s)  statefulset-controller  create Pod cdc-server-0 in StatefulSet cdc-server failed error: pods "cdc-server-0" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.initContainers[0].securityContext.runAsUser: Invalid value: 0: running with the root UID is forbidden]

Does this mean I cannot use the operator to deploy the cluster? Is there no workaround for this?

@irperez

irperez commented Apr 22, 2021

I'm having the same issue as @NerdSec but on AKS. We have a policy that enforces no root users, a read-only root file system, and no privilege escalation. I need guidance.

@nachiket-lab

@irperez I ended up using the bitnami helm chart to get an HA deployment for rabbitmq working. It is working fine for me, and I have faced no issues so far.
This has left me wondering: if the helm chart can get the cluster up, what is the limitation for the operator, and why does the init container need root access?

@Zerpet
Collaborator Author

Zerpet commented Apr 26, 2021

Hey folks, thanks for raising this. We will revisit this and try to improve the situation.

@Zerpet Zerpet reopened this Apr 26, 2021
@Zerpet Zerpet added the sync-up Issue to discuss during sync-up label Jun 8, 2021
@coro
Contributor

coro commented Jun 16, 2021

Just to keep this up to date: we are looking into this, hence the linked PR. We're just double-checking that such a change doesn't affect OpenShift users.

In the meantime, one workaround would be to use the override feature to manually set the user of the initContainer to non-root:

apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: non-root
spec:
  override:
    statefulSet:
      spec:
        template:
          spec:
            containers: []
            initContainers:
            - name: setup-container
              securityContext:
                runAsUser: 999
                runAsGroup: 999

@coro coro closed this as completed in #731 Jun 17, 2021
@coro
Contributor

coro commented Jun 17, 2021

@NerdSec @irperez @raviranjanelastisys Just to keep you in the loop: we did some testing of this today. We were concerned about the effect this would have on OpenShift clusters; however, it seems that we are able to spin up clusters without root successfully, so we have merged the PR.

This is available in the latest commit to main, and will be in the next release of the operator. Please do let us know if this solves the problem you were facing!
