What are the minimum permissions needed in a pod security policy in order for an argo workflow to run successfully? #2239

Closed
4 tasks done
uipo78 opened this issue Feb 14, 2020 · 24 comments · Fixed by #4251
Labels
type/security Security related

Comments

@uipo78

uipo78 commented Feb 14, 2020

Summary

I want to know how to successfully run an argo workflow on a cluster with pod security policies enabled. What are the minimum permissions needed in a PSP in order for an argo workflow to run successfully?

Motivation

See the motivation below, formatted as a bug report:

Checklist:

  • I've included the version.
  • I've included reproduction steps.
  • I've included the workflow YAML.
  • I've included the logs.

What happened:
I submitted this workflow to a Kubernetes cluster with pod security policies enabled. Since that image runs as root by default, I added this to the manifest: spec.securityContext.runAsUser: 999. The pod associated with the first task in this workflow spun up and completed successfully; however, something went wrong in the wait container (see details below), suggesting that the pod security policy associated with this service account doesn't grant all the permissions it needs. The subsequent tasks in the DAG were never spun up.

What you expected to happen:

I expected the workflow to finish successfully.

How to reproduce it (as minimally and precisely as possible):

  1. kubectl apply the following pod security policy:
apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: argo-workflow
spec:
  allowPrivilegeEscalation: false
  fsGroup:
    ranges:
    - max: 65535
      min: 1
    rule: MustRunAs
  requiredDropCapabilities:
  - ALL
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    ranges:
    - max: 65535
      min: 1
    rule: MustRunAs
  volumes:
  - configMap
  - emptyDir
  - projected
  - secret
  - downwardAPI
  - persistentVolumeClaim
  - hostPath
  2. kubectl create sa my-argo-workflow
  3. kubectl apply the following role:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: my-argo-workflow
rules:
- apiGroups:
  - extensions
  resourceNames:
  - argo-workflow
  resources:
  - podsecuritypolicies
  verbs:
  - use
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - watch
  - patch
- apiGroups:
  - ""
  resources:
  - pods/log
  verbs:
  - get
  - watch
  4. kubectl create rolebinding my-argo-workflow --role my-argo-workflow --serviceaccount=$my_namespace:my-argo-workflow

  5. argo submit the following workflow:

# The following workflow executes a diamond workflow
# 
#   A
#  / \
# B   C
#  \ /
#   D
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: dag-diamond-
spec:
  serviceAccountName: my-argo-workflow
  securityContext:
    runAsUser: 999
  entrypoint: diamond
  templates:
  - name: diamond
    dag:
      tasks:
      - name: A
        template: echo
        arguments:
          parameters: [{name: message, value: A}]
      - name: B
        dependencies: [A]
        template: echo
        arguments:
          parameters: [{name: message, value: B}]
      - name: C
        dependencies: [A]
        template: echo
        arguments:
          parameters: [{name: message, value: C}]
      - name: D
        dependencies: [B, C]
        template: echo
        arguments:
          parameters: [{name: message, value: D}]

  - name: echo
    inputs:
      parameters:
      - name: message
    container:
      image: alpine:3.7
      command: [echo, "{{inputs.parameters.message}}"]

Environment:

  • Argo version:
$ argo version
argo: v2.4.3
  BuildDate: 2019-12-06T03:36:01Z
  GitCommit: cfe5f377bc3552fba90afe6db7a76edd92c753cd
  GitTreeState: clean
  GitTag: v2.4.3
  GoVersion: go1.11.5
  Compiler: gc
  Platform: linux/amd64
  • Kubernetes version:
$ kubectl version -o yaml
clientVersion:
  buildDate: "2019-11-13T11:23:11Z"
  compiler: gc
  gitCommit: b3cbbae08ec52a7fc73d334838e18d17e8512749
  gitTreeState: clean
  gitVersion: v1.16.3
  goVersion: go1.12.12
  major: "1"
  minor: "16"
  platform: linux/amd64
serverVersion:
  buildDate: "2020-01-31T20:09:49Z"
  compiler: gc
  gitCommit: c83d931fb9bece427bc63a02349755e0f8696d3e
  gitTreeState: clean
  gitVersion: v1.15.7
  goVersion: go1.12.12
  major: "1"
  minor: "15"
  platform: linux/amd64

Logs

  • workflow result:
argo get dag-diamond-$SOMETHING
Name:                dag-diamond-mrcz7
Namespace:           default
ServiceAccount:      my-argo-workflow
Status:              Error
Created:             Thu Feb 13 20:06:38 -0600 (5 minutes ago)
Started:             Thu Feb 13 20:06:38 -0600 (5 minutes ago)
Finished:            Thu Feb 13 20:06:42 -0600 (5 minutes ago)
Duration:            4 seconds

STEP                            PODNAME                       DURATION  MESSAGE
 ⚠ dag-diamond-mrcz7 (diamond)                                          
 └-⚠ A (echo)                   dag-diamond-mrcz7-1964845457  3s        failed to save outputs: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.39/containers/b6530f069d28557d60e0dc307f2a0fe246e000143070a9d4db212fca755e8c5b/wait?condition=: dial unix /var/run/docker.sock: connect: permission denied
  • pod logs:
kubectl logs <failedpodname> -c wait
time="2020-02-14T02:06:40Z" level=info msg="Creating a docker executor"
time="2020-02-14T02:06:40Z" level=info msg="Executor (version: v2.4.3, build_date: 2019-12-06T03:35:39Z) initialized (pod: default/dag-diamond-mrcz7-1964845457) with template:\n{\"name\":\"echo\",\"arguments\":{},\"inputs\":{\"parameters\":[{\"name\":\"message\",\"value\":\"A\"}]},\"outputs\":{},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"alpine:3.7\",\"command\":[\"echo\",\"A\"],\"resources\":{}}}"
time="2020-02-14T02:06:40Z" level=info msg="Waiting on main container"
time="2020-02-14T02:06:41Z" level=info msg="main container started with container ID: b6530f069d28557d60e0dc307f2a0fe246e000143070a9d4db212fca755e8c5b"
time="2020-02-14T02:06:41Z" level=info msg="Starting annotations monitor"
time="2020-02-14T02:06:41Z" level=info msg="docker wait b6530f069d28557d60e0dc307f2a0fe246e000143070a9d4db212fca755e8c5b"
time="2020-02-14T02:06:41Z" level=info msg="Starting deadline monitor"
time="2020-02-14T02:06:41Z" level=error msg="`docker wait b6530f069d28557d60e0dc307f2a0fe246e000143070a9d4db212fca755e8c5b` failed: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.39/containers/b6530f069d28557d60e0dc307f2a0fe246e000143070a9d4db212fca755e8c5b/wait?condition=: dial unix /var/run/docker.sock: connect: permission denied\n"
time="2020-02-14T02:06:41Z" level=warning msg="Failed to wait for container id 'b6530f069d28557d60e0dc307f2a0fe246e000143070a9d4db212fca755e8c5b': Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.39/containers/b6530f069d28557d60e0dc307f2a0fe246e000143070a9d4db212fca755e8c5b/wait?condition=: dial unix /var/run/docker.sock: connect: permission denied"
time="2020-02-14T02:06:41Z" level=error msg="executor error: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.39/containers/b6530f069d28557d60e0dc307f2a0fe246e000143070a9d4db212fca755e8c5b/wait?condition=: dial unix /var/run/docker.sock: connect: permission denied\ngithub.com/argoproj/argo/errors.New\n\t/go/src/github.com/argoproj/argo/errors/errors.go:49\ngithub.com/argoproj/argo/errors.InternalError\n\t/go/src/github.com/argoproj/argo/errors/errors.go:60\ngithub.com/argoproj/argo/workflow/common.RunCommand\n\t/go/src/github.com/argoproj/argo/workflow/common/util.go:406\ngithub.com/argoproj/argo/workflow/executor/docker.(*DockerExecutor).Wait\n\t/go/src/github.com/argoproj/argo/workflow/executor/docker/docker.go:95\ngithub.com/argoproj/argo/workflow/executor.(*WorkflowExecutor).Wait.func1\n\t/go/src/github.com/argoproj/argo/workflow/executor/executor.go:892\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\t/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:265\ngithub.com/argoproj/argo/workflow/executor.(*WorkflowExecutor).Wait\n\t/go/src/github.com/argoproj/argo/workflow/executor/executor.go:891\ngithub.com/argoproj/argo/cmd/argoexec/commands.waitContainer\n\t/go/src/github.com/argoproj/argo/cmd/argoexec/commands/wait.go:40\ngithub.com/argoproj/argo/cmd/argoexec/commands.NewWaitCommand.func1\n\t/go/src/github.com/argoproj/argo/cmd/argoexec/commands/wait.go:16\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/src/github.com/spf13/cobra/command.go:766\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/src/github.com/spf13/cobra/command.go:852\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/src/github.com/spf13/cobra/command.go:800\nmain.main\n\t/go/src/github.com/argoproj/argo/cmd/argoexec/main.go:17\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:201\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1333"
time="2020-02-14T02:06:41Z" level=info msg="No output parameters"
time="2020-02-14T02:06:41Z" level=info msg="No output artifacts"
time="2020-02-14T02:06:41Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2020-02-14T02:06:41Z" level=info msg="Killing sidecars"
time="2020-02-14T02:06:41Z" level=info msg="Annotations monitor stopped"
time="2020-02-14T02:06:41Z" level=info msg="Alloc=5095 TotalAlloc=11588 Sys=71102 NumGC=4 Goroutines=9"
time="2020-02-14T02:06:41Z" level=fatal msg="Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.39/containers/b6530f069d28557d60e0dc307f2a0fe246e000143070a9d4db212fca755e8c5b/wait?condition=: dial unix /var/run/docker.sock: connect: permission denied\ngithub.com/argoproj/argo/errors.New\n\t/go/src/github.com/argoproj/argo/errors/errors.go:49\ngithub.com/argoproj/argo/errors.InternalError\n\t/go/src/github.com/argoproj/argo/errors/errors.go:60\ngithub.com/argoproj/argo/workflow/common.RunCommand\n\t/go/src/github.com/argoproj/argo/workflow/common/util.go:406\ngithub.com/argoproj/argo/workflow/executor/docker.(*DockerExecutor).Wait\n\t/go/src/github.com/argoproj/argo/workflow/executor/docker/docker.go:95\ngithub.com/argoproj/argo/workflow/executor.(*WorkflowExecutor).Wait.func1\n\t/go/src/github.com/argoproj/argo/workflow/executor/executor.go:892\nk8s.io/apimachinery/pkg/util/wait.ExponentialBackoff\n\t/go/src/k8s.io/apimachinery/pkg/util/wait/wait.go:265\ngithub.com/argoproj/argo/workflow/executor.(*WorkflowExecutor).Wait\n\t/go/src/github.com/argoproj/argo/workflow/executor/executor.go:891\ngithub.com/argoproj/argo/cmd/argoexec/commands.waitContainer\n\t/go/src/github.com/argoproj/argo/cmd/argoexec/commands/wait.go:40\ngithub.com/argoproj/argo/cmd/argoexec/commands.NewWaitCommand.func1\n\t/go/src/github.com/argoproj/argo/cmd/argoexec/commands/wait.go:16\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/src/github.com/spf13/cobra/command.go:766\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/src/github.com/spf13/cobra/command.go:852\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/src/github.com/spf13/cobra/command.go:800\nmain.main\n\t/go/src/github.com/argoproj/argo/cmd/argoexec/main.go:17\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:201\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1333"

Message from the maintainers:

If you are impacted by this bug please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

@uipo78 uipo78 changed the title from "What are the minimum permissions needed in a PSP in order for an argo workflow to run successfully?" to "What are the minimum permissions needed in a pod security policy in order for an argo workflow to run successfully?" on Feb 14, 2020
@tghartland

I've just seen this as well after switching the node OS from Fedora Atomic to Fedora CoreOS (deployed by Openstack Magnum).

The solution we came up with was this

   entrypoint: whalesay
   templates:
   - name: whalesay
+    securityContext:
+      seLinuxOptions:
+        type: "spc_t"
     container:
       image: docker/whalesay:latest
       command: [cowsay]

(applied to the hello-world example), but I don't know if these are the minimal required permissions.

Can this be documented somewhere?

@uipo78
Author

uipo78 commented Feb 18, 2020

@tghartland That worked? When I tried what you posted, that securityContext wasn't propagated to the pod running whalesay at all.

@tghartland

Yes that fixed it for me.

There are two places that the security context could go, either in the spec of the workflow itself:

 apiVersion: argoproj.io/v1alpha1
 kind: Workflow
 metadata:
   generateName: hello-world-
 spec:
+  securityContext:
+    seLinuxOptions:
+      type: "spc_t"
   entrypoint: whalesay
   templates:
   - name: whalesay
     container:
       image: docker/whalesay:latest
       command: [cowsay]
       args: ["hello world"]

Or in the template for the pod:

 apiVersion: argoproj.io/v1alpha1
 kind: Workflow
 metadata:
   generateName: hello-world-
 spec:
   entrypoint: whalesay
   templates:
   - name: whalesay
+    securityContext:
+      seLinuxOptions:
+        type: "spc_t"
     container:
       image: docker/whalesay:latest
       command: [cowsay]
       args: ["hello world"]

Give both a try, maybe one will work better.

@simster7
Member

Closing, feel free to reopen if necessary

@uipo78
Author

uipo78 commented Feb 25, 2020

The original question remains unanswered.

@uipo78
Author

uipo78 commented Feb 25, 2020

I'm also not able to reopen this issue.

@simster7 simster7 reopened this Feb 25, 2020
@simster7
Member

simster7 commented Mar 8, 2020

@uipo78 Take a look at Workflow RBAC?

@uipo78
Author

uipo78 commented Mar 9, 2020

RBAC and pod security policies are two separate things.

@alexec
Contributor

alexec commented Mar 9, 2020

@sarabala1979 ?

@uipo78
Author

uipo78 commented May 6, 2020

I asked this partly because we plan to deploy Argo onto our PSP-enabled clusters. The other reason is that pod security policies call attention to potential risks that need to be addressed before deploying to production, which is particularly important for organizations with more stringent security requirements.

@stale

stale bot commented Jul 5, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jul 5, 2020
@ibexmonj

ibexmonj commented Jul 9, 2020

I came here looking for the same answers that @uipo78 seeks on PodSecurityPolicy. I am willing to help where I can, as I'm about to take Argo Workflows for a test drive.

@stale stale bot removed the wontfix label Jul 9, 2020
@athornton
Contributor

It's a little alarming that I need to turn a regular user pod into a super_privileged_container for Argo to be happy.

@athornton
Contributor

athornton commented Jul 14, 2020

Also didn't help:

        "securityContext": {
            "runAsGroup": 53127,
            "runAsUser": 53127,
            "seLinuxOptions": {
                "type": "spc_t"
            },
            "supplementalGroups": [
                1726,
                33489280,
                1363,
                1618,
                1710,
                1383,
                1592,
                1374,
                1709,
                1555,
                1708,
                1554,
                1707,
                1706,
                1705,
                1616,
                1704,
                1098,
                1703,
                1367,
                1702,
                1701,
                1301,
                1103,
                1002,
                1744
            ]
        },

@athornton
Contributor

My service account has a role which is a superset of what should be needed.

adam@ixitxachitl-wired:~/git/nublado/jupyterlab$ kubectl get role athornto-svcacct -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  annotations:
    argocd.argoproj.io/compare-options: IgnoreExtraneous
    argocd.argoproj.io/sync-options: Prune=false
  creationTimestamp: "2020-07-13T22:10:37Z"
  labels:
    argocd.argoproj.io/instance: nublado-users
  name: athornto-svcacct
  namespace: nublado-athornto
  resourceVersion: "10312967"
  selfLink: /apis/rbac.authorization.k8s.io/v1/namespaces/nublado-athornto/roles/athornto-svcacct
  uid: 07a2924b-c7be-44f7-b2c7-6bf0802f8fb3
rules:
- apiGroups:
  - argoproj.io
  resources:
  - workflows
  - workflows/finalizers
  verbs:
  - get
  - list
  - watch
  - update
  - patch
  - create
  - delete
- apiGroups:
  - argoproj.io
  resources:
  - workflowtemplates
  - workflowtemplates/finalizers
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - pods/exec
  - services
  - configmaps
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - update
  - patch
- apiGroups:
  - ""
  resources:
  - pods/log
  - serviceaccounts
  verbs:
  - get
  - list
  - watch

@alexec
Contributor

alexec commented Jul 14, 2020

Is this related to #3415 ?

@alexec
Contributor

alexec commented Jul 14, 2020

@jessesuen it seems reasonable to run pods without super-user privileges. The Docker executor is what requires them.

That is not the case with the other executors; perhaps you might try the PNS or K8SAPI executors?

https://argoproj.github.io/argo/workflow-executors/#process-namespace-sharing-pns

@athornton
Contributor

The k8sapi executor does the trick for me (scalability is not a huge concern at this point).
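
For anyone wanting to try the same switch, here is a minimal sketch of the controller setting. It assumes the controller runs in a namespace called argo and reads a ConfigMap named workflow-controller-configmap with a nested config key, as the v2.x releases discussed in this thread do. Check the names and key layout against your own install before applying.

```yaml
# Sketch only: namespace and ConfigMap name are assumptions about the install.
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
  namespace: argo            # assumption: controller namespace
data:
  config: |
    # Use the Kubernetes API instead of the Docker socket for
    # waiting on containers and collecting logs.
    containerRuntimeExecutor: k8sapi
```

The controller may need a restart to pick up the change. Workflow pods created afterwards should no longer depend on access to /var/run/docker.sock.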

@ibexmonj

I'm having the same issue here trying to get a workflow deployed in a PSP-enabled cluster.

The workflow is unable to mount the docker.sock hostPath. Can I have more details on the pros and cons of switching the containerRuntimeExecutor to k8sapi?

I see the docs mention "Least scalable since log retrieval and container operations are performed against the kubernetes api". Does this imply that every workflow step is an API call and, in turn, extra load on the API server?
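
If the Docker executor has to stay, the mount failure described here can in principle be addressed on the PSP side. The fragment below is a rough sketch, assuming the executor mounts /var/run/docker.sock as a hostPath volume. It narrows the blanket hostPath permission from the original report to that single path, using the PSP allowedHostPaths field.

```yaml
# Fragment of a PodSecurityPolicy spec, not a complete policy.
spec:
  volumes:
  - configMap
  - emptyDir
  - projected
  - secret
  - downwardAPI
  - persistentVolumeClaim
  - hostPath
  allowedHostPaths:
  # Only the Docker socket may be mounted from the host.
  - pathPrefix: /var/run/docker.sock
```

This only permits the mount. It does not give a non-root UID permission to talk to the socket, which is the "permission denied" error in the original report, so it is not a full answer to the minimum-permissions question.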

@alexec
Contributor

alexec commented Jul 27, 2020

> The workflow is unable to mount the docker.sock hostpath. Can i have more details around the pros and cons of switching the containerRuntimeExecutor to k8sapi?

https://argoproj.github.io/argo/workflow-executors/

@alexec
Contributor

alexec commented Jul 27, 2020

The core team runs the PNS executor successfully. If anyone has come up with a solution for using PSP, can you please share it?

@stale

stale bot commented Sep 26, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Sep 26, 2020
@alexec
Contributor

alexec commented Sep 28, 2020

This is a popular issue. Pinning and removing wontfix.

@stale stale bot removed the wontfix label Sep 28, 2020
@alexec alexec pinned this issue Sep 28, 2020
@alexec alexec unpinned this issue Sep 29, 2020
@alexec alexec added the type/security Security related label Oct 2, 2020
@alexec
Contributor

alexec commented Oct 9, 2020

#4251 should contain enough information on how to runAsNonRoot. TLDR: use the k8sapi executor.
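
As a rough illustration of that TLDR (not taken from #4251), a workflow that runs as non-root might look like the sketch below. It assumes the controller has already been switched to the k8sapi executor, and it reuses the service account and UID 999 from the original report. The generateName and the echoed message are made up for the example.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: non-root-echo-      # hypothetical name
spec:
  serviceAccountName: my-argo-workflow
  # Pod-level security context: applies to the main container and
  # to the executor containers Argo adds to the pod.
  securityContext:
    runAsNonRoot: true
    runAsUser: 999
  entrypoint: echo
  templates:
  - name: echo
    container:
      image: alpine:3.7
      command: [echo, "hello"]
```

The service account's role may need more verbs than the one in the original report, since the k8sapi executor works through the Kubernetes API. The executor documentation linked earlier in the thread is the place to confirm what is required.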

alexcapras pushed a commit to alexcapras/argo that referenced this issue Nov 12, 2020