Pipelines pods FailedMount #3407

Closed
sachua opened this issue Apr 1, 2020 · 13 comments

sachua commented Apr 1, 2020

I tried the example from https://github.com/kubeflow/examples/tree/master/pipelines/simple-notebook-pipeline to test the functionality of my Kubeflow Pipelines.

I was unable to create the pods for the pipelines.
Below are the event logs from the pods:

Events:                                                                                                                              
Type     Reason       Age                        From                                           Message                                                                                                                          
----     ------       ----                       ----                                           -------                                                                                                                                    
Normal   Scheduled    5m4s                       default-scheduler                              Successfully assigned kubeflow/calculation-pipeline-2jtc6-1659746977 to e9874f39-19a7-40ed-8a69-c9e1b6582c6f                               
Warning  FailedMount  <invalid> (x13 over 5m3s)  kubelet, e9874f39-19a7-40ed-8a69-c9e1b6582c6f  MountVolume.SetUp failed for volume "docker-sock" : hostPath type check failed: /var/run/docker.sock is not a socket file
Warning  FailedMount  <invalid> (x5 over 3m)     kubelet, e9874f39-19a7-40ed-8a69-c9e1b6582c6f  Unable to mount volumes for pod "calculation-pipeline-2jtc6-1659746977_kubeflow(5f41deaf-e082-4bbb-b01b-1662e8688d55)": timeout expired waiting for volumes to attach or mount for pod "kubeflow"/"calculation-pipeline-2jtc6-1659746977". list of unmounted volumes=[docker-sock]. list of unattached volumes=[podmetadata docker-sock mlpipeline-minio-artifact pipeline-runner-token-s4vdb]

Could it be due to the PVC requiring ReadWriteMany access?
I am using VMware PKS to deploy my Kubernetes cluster.

If my cluster does not support ReadWriteMany access, are there any suggestions on how I can solve this problem?

Ark-kun commented Apr 1, 2020

failed for volume "docker-sock" : hostPath type check failed: /var/run/docker.sock is not a socket file

This is likely caused by a non-Docker Kubernetes cluster.
There is a way to change the Argo executor from Docker to something else to fix the issue (but there might be additional issues). See #1654 and #561.
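For reference, a minimal sketch of how the executor can be switched, assuming a typical Kubeflow install where the setting lives in the workflow-controller-configmap in the kubeflow namespace as a containerRuntimeExecutor key (some installs nest it under a config key instead, so check your configmap first):

```python
# Sketch: point the Argo workflow controller at a different executor by patching
# its configmap. The configmap name, namespace, and key are assumptions for a
# typical Kubeflow install; adjust them to match your cluster.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running in-cluster
core = client.CoreV1Api()

core.patch_namespaced_config_map(
    name="workflow-controller-configmap",
    namespace="kubeflow",
    body={"data": {"containerRuntimeExecutor": "k8sapi"}},  # e.g. "k8sapi", "kubelet", or "pns"
)
```

Depending on the Argo version, the workflow-controller pod may need a restart to pick up the change.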

Ark-kun self-assigned this Apr 1, 2020

sachua commented Apr 2, 2020

@Ark-kun I checked that my Kubernetes cluster uses Docker as its container runtime.
Nevertheless, I tried changing the Argo executor to k8sapi and kubelet, and ended up with this error:
invalid spec: templates.add-op.outputs.artifacts.add-op-output: k8sapi executor does not support outputs from base image layer. must use emptyDir

I also tried running https://github.com/argoproj/argo/blob/master/examples/artifact-passing.yaml using KFP, but I received the same error:
invalid spec: templates.whalesay.outputs.artifacts.hello-art: k8sapi executor does not support outputs from base image layer. must use emptyDir

Changing the output directory to a directory that does not exist in the base image results in the same error as well.

Ark-kun commented Apr 4, 2020

I also tried running https://github.com/argoproj/argo/blob/master/examples/artifact-passing.yaml using KFP but i received the same error:

Can you try changing the executor back to Docker and submitting that Argo example just using `kubectl create -f https://github.com/argoproj/argo/blob/master/examples/artifact-passing.yaml`?

If it fails, then it looks like it's an upstream issue that's related to your cluster configuration. What kind of Kubernetes cluster are you using?

invalid spec: templates.whalesay.outputs.artifacts.hello-art: k8sapi executor does not support outputs from base image layer. must use emptyDir

If you really mount emptyDir: {} under the artifact file, this error will probably disappear. But it's not a good solution.

sachua commented Apr 8, 2020

@Ark-kun I get the same original error if I change the executor back to Docker and submit the Argo example.
My Kubernetes cluster is an on-prem, air-gapped cluster deployed using VMware PKS.

Ark-kun commented Apr 8, 2020

Can you please try to create an issue in the Argo repo (https://github.com/argoproj/argo/issues)? I think the 'must use emptyDir' error can be effectively solved on the Argo side, benefiting both Argo and KFP.

sachua commented Apr 14, 2020

@Ark-kun Indeed, when I mounted emptyDir: {} under the artifact file, I was able to get pipelines running. I will try to create an issue in the Argo repo. Thanks for your help!

sachua closed this as completed Apr 14, 2020

pvgbabu commented May 1, 2020

Hi @sachua, I am having the same problem using Kubeflow on PKS. Could you please elaborate on how you solved this issue? I appreciate all your help.

pvgbabu commented May 1, 2020

I am getting an error like:

invalid spec: templates.kale-marshal-volume.outputs.parameters.kale-marshal-volume-manifest: k8sapi executor does not support outputs from base image layer. must use emptyDir

sachua commented May 2, 2020

@pvgbabu If you are creating workflow pipelines by applying a YAML file, you will have to manually mount an emptyDir (see the docs).

If you are creating workflow pipelines using a Jupyter notebook (e.g. Kubeflow's example simple-pipeline), you will have to add a few lines of code when creating the operation arguments to add the emptyDir and mount it to the ContainerOp. See #1654 (comment).
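Roughly, the extra lines look like this. It is only a sketch of the idea; the op name, image, and output path are placeholders, and it assumes the KFP v1 ContainerOp API plus the kubernetes client models:

```python
import kfp.dsl as dsl
from kubernetes import client as k8s_client

def add_op() -> dsl.ContainerOp:
    op = dsl.ContainerOp(
        name="add",                # placeholder op name
        image="python:3.7",        # placeholder image
        command=["sh", "-c"],
        arguments=["echo $((2 + 3)) > /tmp/outputs/result"],
        file_outputs={"result": "/tmp/outputs/result"},
    )
    # Mount an emptyDir over the output directory so the artifact is not written
    # to the base image layer (which the k8sapi/kubelet executors cannot export).
    op.add_volume(k8s_client.V1Volume(
        name="outputs",
        empty_dir=k8s_client.V1EmptyDirVolumeSource()))
    op.container.add_volume_mount(k8s_client.V1VolumeMount(
        name="outputs",
        mount_path="/tmp/outputs"))
    return op
```

The important part is that every file_outputs path sits on the emptyDir mount rather than on the container's base image layer.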

Alternatively, I would suggest that you use the PNS executor instead, as that is what I settled on after trying all the available executors. I opened an issue in the Argo repo regarding the mounting of emptyDir, and it is currently not possible for the k8sapi and kubelet executors to support outputs from the base image layer: argoproj/argo-workflows#2679 (comment)

I was also able to find a concise pros-vs-cons comparison of each executor, which I hope will help you decide which executor to use: argoproj/argo-workflows#1256 (comment) (it seems that Argo uses PNS themselves).

Hope I was able to help!

pvgbabu commented May 6, 2020

@sachua Thanks for the information. My Kubeflow pipelines are generated using Kale. Is there any setting or config YAML to update in Kale so that it generates the change from #1654 (comment) automatically?

sachua commented May 8, 2020

@pvgbabu I have not tried Kale, but I would think it has a way to add a volume mount? Or you could edit the code generated by Kale.

I highly recommend you try the PNS executor first; it solves the emptyDir issue without you constantly needing to mount an emptyDir onto every container operation.

sachua commented May 18, 2020

@pvgbabu I just tried Kale; it works with the PNS executor, but I had to edit the pipeline YAML's access mode from ReadWriteMany to ReadWriteOnce, since PKS doesn't currently support ReadWriteMany.
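If you build the volume from the KFP SDK instead of editing the compiled YAML, the access mode can also be set at pipeline-definition time. A hypothetical sketch (I have not wired this into Kale itself; the names and size are placeholders):

```python
import kfp.dsl as dsl

# Request a ReadWriteOnce PVC instead of ReadWriteMany, since PKS does not
# offer ReadWriteMany here. resource_name and size are placeholders.
marshal_volume = dsl.VolumeOp(
    name="kale-marshal-volume",
    resource_name="kale-marshal-pvc",
    size="1Gi",
    modes=dsl.VOLUME_MODE_RWO,  # instead of dsl.VOLUME_MODE_RWM
)
```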

pvgbabu commented May 22, 2020

@sachua My Kale pipelines are working with the PNS executor. Thanks for all your help!
