This repository has been archived by the owner on Sep 19, 2022. It is now read-only.

MPI backend examples launch processes independently in each pod #89

Open

jwwandy opened this issue Oct 24, 2018 · 9 comments

Comments

@jwwandy
Contributor

jwwandy commented Oct 24, 2018

https://github.com/kubeflow/pytorch-operator/blob/master/examples/ddp/mnist/gpu/v1alpha2/job_mnist_DDP_GPU.yaml

When launching the MPI backend example above with ENTRYPOINT ["mpirun", "-n", "4", "--allow-run-as-root", "python", "-u", "/opt/pytorch_dist_mnist/mnist_ddp_gpu.py"] in the Dockerfile, I expected distributed training that launches 1 process on each pod (4 in total, with 1 master and 3 workers).

However, it seems that it launched 4 processes on each pod, and each pod trained independently.
Is there anything I misunderstood about this example?
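For reference, here is roughly what I assume the script's distributed setup looks like (the real mnist_ddp_gpu.py may differ); it shows where the rank and world size come from, which is why each pod ends up with its own ranks 0–3:

```python
# Rough sketch (not the actual mnist_ddp_gpu.py) of how I assume the example
# initializes torch.distributed with the MPI backend.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel


def main():
    # With backend="mpi", the rank and world size come from the MPI runtime
    # started by mpirun, not from env vars injected by the operator. So an
    # mpirun launched separately inside each pod creates its own ranks 0..3
    # instead of one global process group spanning the pods.
    dist.init_process_group(backend="mpi")
    print("rank {} of {}".format(dist.get_rank(), dist.get_world_size()))

    model = torch.nn.Linear(784, 10)
    ddp_model = DistributedDataParallel(model)
    # ... training loop using ddp_model ...


if __name__ == "__main__":
    main()
```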

@johnugeorge
Member

@Akado2009

@Akado2009
Contributor

@jwwandy, greetings. As far as I know, the -n option in mpirun specifies the number of copies of the process to run, not the number of containers/pods.

@jwwandy
Contributor Author

jwwandy commented Nov 20, 2018

@Akado2009 Nice to hear from you. Exactly as you mentioned, the -n option in mpirun specifies the number of copies of the process to run, and it should schedule them across the available MPI nodes (however many there are).

What I'm confused about is that the current examples seem to have no mechanism for Open MPI to discover the pods as a single MPI cluster and launch processes across them. Each pod functions as an independent MPI cluster launching its own group of processes.

If the examples are only meant to launch multiple processes independently on each pod, rather than do distributed training across pods, then the current example works well. However, I assume part of the point of Kubeflow is to do distributed training across multiple pods, which for now I can only achieve by adding SSH keys after the pods are created, as described in the Open MPI docs: https://www.open-mpi.org/faq/?category=rsh.

@Akado2009
Contributor

@jwwandy Sorry for the late response, I was busy with work. But yes, you're right: this example treats each pod as a separate Open MPI cluster.

I was thinking about making an upgraded version of this example that treats your k8s cluster as a single Open MPI cluster, so the job becomes truly distributed.

@jwwandy
Contributor Author

jwwandy commented Nov 20, 2018

@Akado2009 Thanks for making it clear.

Although my current workaround is quite dirty, using a shell script and the downward API to set up all of the SSH plumbing after pod creation, I think these steps could (and should) be done by the controller:

  1. Generate a private SSH key for each pod and broadcast the public key to all pods as an authorized key
  2. Add SSH known_hosts entries (or disable strict host key checking in ssh_config)
  3. Provide a hostfile (with the hostnames from the YAML) for the mpirun --hostfile option

Hope these short steps help someone; a rough sketch of step 3 is below.
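This only illustrates the idea; the job name, pod naming scheme, and file path are assumptions from my own setup, not anything the operator provides today:

```python
# Hypothetical helper (not part of pytorch-operator): write an Open MPI
# hostfile listing the master and worker pods of a PyTorchJob, assuming the
# pods are reachable as <job>-master-0 and <job>-worker-<i> via their services.
def write_hostfile(job_name, num_workers, slots_per_host=1,
                   path="/etc/mpi/hostfile"):
    hosts = ["{}-master-0".format(job_name)]
    hosts += ["{}-worker-{}".format(job_name, i) for i in range(num_workers)]
    with open(path, "w") as f:
        for host in hosts:
            f.write("{} slots={}\n".format(host, slots_per_host))


if __name__ == "__main__":
    # e.g. for a job named "pytorch-dist-mnist" with 3 workers:
    write_hostfile("pytorch-dist-mnist", num_workers=3)
    # Then launch mpirun once, from the master pod only, e.g.:
    #   mpirun -n 4 --hostfile /etc/mpi/hostfile --allow-run-as-root \
    #          python -u /opt/pytorch_dist_mnist/mnist_ddp_gpu.py
```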

@Akado2009
Contributor

@jwwandy Yes, I agree that it should be done by the controller.
Thank you for the workaround, I am going to try to implement this logic inside the controller :)

@ilchemla

ilchemla commented Jul 8, 2019

Any news about this issue?

@johnugeorge
Member

Can mpi-operator solve your issue? What is your use case?

@jtfogarty

/area operator
/kind feature
/priority p2
