feat(backend): Rootless and secure lightweight python component pipelines with k8sapi #4645
This change is necessary for a secure k8sapi executor and lightweight Python components. The k8sapi executor cannot extract artifacts from /tmp, but it can extract them from an emptyDir mounted at /outputs.
Hi @juliusvonkohout. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
One of the reasons
Can an emptyDir be mounted to
The proper place for such customizations would be DSL ( What do you think? |
The directory mount is done by Kubernetes, I think. That also explains why it works in my rootless containers and is writable by any container user. So there is no reason to use /tmp. We could also move /tmp/inputs to /inputs, I guess.
Yes, but k8sapi and PNS cannot extract artifacts if the volume is mounted over a directory from the base Docker image, like /tmp/... . It must be a new directory like /inputs or /outputs.
Well, I can also add it there, just tell me where exactly. And the most ideal place would be the Argo orchestrator: argoproj/argo#2679 It would be great to fix the issue there.
Well, this might take ages. If Argo implements it after some months, we also have to switch to the latest Argo. I also think that one emptyDir per artifact might be overkill. Just using one emptyDir at /outputs, and maybe even another at /inputs, is a fast, simple and clean solution for the current executors (PNS + k8sapi). If Argo really changes it and we have the latest officially released version in a year, then we can always reconsider. But I think this small change in exchange for secure and rootless containers is definitely sensible. |
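For readers following along, the "one emptyDir at /outputs" idea can be sketched on a plain Argo-style template dictionary. This is only an illustration under assumptions: the helper name `add_outputs_emptydir`, the volume name `outputs`, and the sample template are hypothetical and are not taken from this PR's diff.

```python
import copy


def add_outputs_emptydir(template, mount_path="/outputs", volume_name="outputs"):
    """Return a copy of a template dict with one emptyDir mounted at mount_path.

    Hypothetical helper: it works on plain dicts, not on KFP SDK objects.
    """
    patched = copy.deepcopy(template)

    # Declare the emptyDir volume once on the template.
    volumes = patched.setdefault("volumes", [])
    if not any(v.get("name") == volume_name for v in volumes):
        volumes.append({"name": volume_name, "emptyDir": {}})

    # Mount it into the main container at a path that is not in the base image.
    mounts = patched.setdefault("container", {}).setdefault("volumeMounts", [])
    if not any(m.get("name") == volume_name for m in mounts):
        mounts.append({"name": volume_name, "mountPath": mount_path})

    return patched


# Illustrative template; image and command are placeholders.
template = {
    "name": "train",
    "container": {"image": "python:3.7", "command": ["python", "train.py"]},
}
patched = add_outputs_emptydir(template)
```

The helper is idempotent, so applying it twice does not duplicate the volume or the mount.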
But that only works when something is mounted there, right? |
So yes, PNS without root but with capabilities needs that mount too. I also found out that a newer argoexec is needed for pipelines that output string/int/float instead of files or namedtuples e.g.
|
Just to make it clear: I was talking about the internal container user. Some container images have
That's really strange. There should not be any difference between |
The docker executor accesses the docker sock and is therefore insecure by design. So you should just run your images as root. You can configure the user of the image inside the workflow template. But if you really want this strange scenario, well then we should
With those three changes all of my python pipelines run secure and rootless with k8sapi. This should also work for PNS and Docker where the main container is rootless.
The bug is tracked in argoproj/argo-workflows#1445 and is fixed in newer argoexec images. I guess I just did not test namedtuple before, but with the newer argoexec everything is working. |
I thought the PNS has a way to do that: https://github.com/argoproj/argo/blob/220ac736c1297c566667d3fb621a9dadea955c76/workflow/executor/pns/pns.go#L175
The main container is not privileged though.
It's not my choice really. Some authors of official container images (e.g. Airflow or Jupyter) do that. They release images where the user is not a root. The KFP SDK can do nothing about that.
It's pretty strange that you got issues depending on how you produce the outputs. They're supposed to be produced the same way (as files) on the low level.
It might be possible to add emptyDir volumes around here:
template['outputs']['artifacts'] to learn the artifact paths.
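A rough sketch of that idea, assuming the compiled template is a plain dictionary whose output artifacts each carry a `path`. The function names, the `artifact-dir` volume prefix, and the sample paths are made up for illustration and do not come from the SDK.

```python
import posixpath


def artifact_mount_dirs(template):
    """Collect the parent directories of all output artifact paths."""
    artifacts = template.get("outputs", {}).get("artifacts", [])
    dirs = {posixpath.dirname(a["path"]) for a in artifacts if "path" in a}
    return sorted(dirs)


def emptydir_entries(template, prefix="artifact-dir"):
    """Build one emptyDir volume and one matching mount per artifact directory."""
    volumes, mounts = [], []
    for index, directory in enumerate(artifact_mount_dirs(template)):
        name = f"{prefix}-{index}"
        volumes.append({"name": name, "emptyDir": {}})
        mounts.append({"name": name, "mountPath": directory})
    return volumes, mounts


# Illustrative artifact paths only.
template = {
    "outputs": {
        "artifacts": [
            {"name": "model", "path": "/tmp/outputs/model/data"},
            {"name": "metrics", "path": "/tmp/outputs/metrics/data"},
        ]
    }
}
volumes, mounts = emptydir_entries(template)
```

This yields one volume per distinct artifact directory, which is the per-artifact variant discussed above rather than the single shared /outputs mount.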
|
https://github.com/argoproj/argo/blob/master/docs/workflow-executors.md says: PNS Cannot capture artifacts from a base layer which has a volume mounted under it
Yes, but the host is compromised by accessing docker.sock. If the host is compromised, what is the point of rootless containers?
An important note: you have to set the user inside the securitycontext via currently i am using
But you are right that if you run the strange scenario of a compromised host via docker.sock but rootless containers, then this will be a problem. So we just always add the emptyDir.
Yes, I think it was a mistake on my side. NamedTuples and OutputPath should be files. Regarding 'function() -> int:', it's not a file, I think. But it does not matter anymore since we need the newer argoexec image anyway.
Thank you, I will look into it. I will just add it unconditionally here
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: juliusvonkohout The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I have finally found the right place to insert the volume and volume mount, way earlier than in _op_to_template.py. I used /argo because later on we might be able to also move inputs to /argo/inputs. I still encounter a general rootless k8sapi issue:
which is a ResourceOp, only works if I click on retry, which is very strange. On the first try I get the following error. I have created a bug report: argoproj/argo-workflows#4367 |
I upgraded the workflow-controller and argoexec images to 2.11.6 and added the missing CRDs and serviceaccount permission for the new version from https://github.com/argoproj/argo/blob/master/manifests/install.yaml . I tested successfully with rootless k8sapi and normal pns. Would you mind running the CI/CD tests for verification? |
@Ark-kun Both use cases are covered by my approach. They are now able to write there, because now there is an emptyDir underneath. @numerology do you have any objections to running the automated tests? |
/ok-to-test |
@juliusvonkohout: The following tests failed, say
Alright, I looked at the test results and saw that the pipelines ran successfully, but of course the compiled output changed for container ops. So I do not see problems with my changes, but the expected reference output for some tests has to be updated because there is an additional volume and mount now. Should I do it in this pull request? Does someone else want to do it? |
@Ark-kun should i work on that? |
/cc @chensun @numerology |
This might make this pull request obsolete argoproj/argo-workflows#4766 |
7542f0e to 44d22a6
Is this still interesting, or do you want to update Argo to 3.1 and use the emissary executor instead? |
@Ark-kun @Bobgy the tests are written in such a way that I cannot satisfy them. They fail as soon as there is an additional mount, because they check container_dict["volume_mounts"][0] instead of looking up the volume mount with the right name. pipelines/sdk/python/tests/dsl/extensions/test_kubernetes.py Lines 21 to 35 in 54ac9a6
Should I change them, or is the focus on the emissary executor (Argo 3.1) for rootless pipelines? |
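A name-based lookup of the kind described here could look roughly like the sketch below. The helper `find_volume_mount`, the mount names, and the sample `container_dict` are hypothetical; the referenced test file is not reproduced here.

```python
def find_volume_mount(container_dict, name):
    """Return the volume mount called `name`, regardless of its list position."""
    for mount in container_dict.get("volume_mounts") or []:
        if mount.get("name") == name:
            return mount
    raise KeyError(f"no volume mount named {name!r}")


# Illustrative container dict: the extra emptyDir mount comes first, so a
# positional check on index 0 would look at the wrong entry.
container_dict = {
    "volume_mounts": [
        {"name": "outputs", "mount_path": "/argo/outputs"},
        {"name": "my-secret", "mount_path": "/secret/path"},
    ]
}

secret_mount = find_volume_mount(container_dict, "my-secret")
```

A test written against `find_volume_mount(...)` keeps passing when additional mounts are prepended or appended.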
Hi @juliusvonkohout, there were other reasons we want to upgrade to the Argo v3 emissary executor now. I think that will resolve your requirements too. Can you confirm? |
Yes, I confirm that this solves the requirements. I have created a new tracking issue for Argo 3.1 here: #5718 |
This change is necessary for a secure k8sapi executor and lightweight Python components. The k8sapi executor cannot extract artifacts from /tmp, but it can extract them from an emptyDir mounted at /outputs.
I have tested it with the k8sapi and PNS executors on Kubernetes 1.18. It also supports metrics, mlpipeline-ui-metadata, etc.
I can achieve the same with
Example pipeline snippet:
It would be amazing to add additional parameters to func_to_container_op() that automatically add an emptyDir and a specific user if desired. Please tell me if I should add this too. For me it seems necessary because, e.g., packages_to_install only works with the correct user.
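The "specific user" half of that suggestion might look something like the following on an Argo-style container template. This is a sketch under assumptions: `run_as_user` is a hypothetical helper, not an existing func_to_container_op() parameter, and the user ID 1000 is an arbitrary example.

```python
import copy


def run_as_user(template, user_id=1000):
    """Return a copy of a template dict whose container runs as a non-root user.

    Hypothetical helper operating on plain dicts, not on KFP SDK objects.
    """
    if user_id == 0:
        raise ValueError("user_id 0 is root; pick a non-root user ID")

    patched = copy.deepcopy(template)
    security_context = patched.setdefault("container", {}).setdefault(
        "securityContext", {}
    )
    # Standard Kubernetes container securityContext fields.
    security_context["runAsUser"] = user_id
    security_context["runAsNonRoot"] = True
    return patched


# Illustrative template; the image is a placeholder.
template = {"name": "component", "container": {"image": "python:3.7"}}
patched = run_as_user(template)
```

The chosen user still needs write access to wherever packages and outputs are written, which is where the emptyDir mount discussed in this PR comes in.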