Do we need paddlectl client once we have the kubernetes custom controller? #383

typhoonzero · 2017-10-11T07:01:14Z

Once we have a resource declared through a TPR/CRD, for example:

apiVersion: paddlepaddle.org/v1
kind: TrainingJob
metadata:
  name: job-1
spec:
  image: "paddlepaddle/paddlecloud-job"
  trainer:
    entrypoint: "python train.py"
    workspace: "/home/job-1/"
    min-instance: 3
    max-instance: 6
    resources:
      limits:
        alpha.kubernetes.io/nvidia-gpu: 1
        cpu: "800m"
        memory: "1Gi"
      requests:
        cpu: "500m"
        memory: "600Mi"
  pserver:
    min-instance: 3
    max-instance: 3
    resources:
      limits:
        cpu: "800m"
        memory: "1Gi"
      requests:
        cpu: "500m"
        memory: "600Mi"

Running kubectl create -f job.yaml is then equivalent to the current paddlectl submit -jobname xxx -gpu xxx ...

The only difference is that the paddlectl client can also upload and download training data files.

Yancey1989 · 2017-10-11T07:24:40Z

Cool! Maybe we can use kubectl instead of paddlectl? I have some thoughts on this:

Advantages
- Users can use kubectl features such as kubectl logs, kubectl get pods, etc. directly, so we don't need to implement them on the cloud server.
- We can use RBAC instead of Django admin to manage users.
Disadvantages
- kubectl uses YAML as its configuration format, so it's hard to drive with command-line parameters.

typhoonzero · 2017-10-11T07:35:36Z

Another disadvantage:
kubectl exposes too many Kubernetes details that users may never need.

Yancey1989 · 2017-10-11T07:59:39Z

One more suggestion: shall we rename the resource from TrainingJob to Paddle? That might make more sense.

gongweibao · 2017-10-11T08:07:15Z

Another disadvantage:
If the YAML is malformed, it's hard to find where the error is, which makes it inconvenient for users.
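
One way a client or server could soften this is to validate the parsed manifest and report the path of each offending field. The following is only a minimal sketch under assumptions: the function name validate_training_job and the set of checked fields are hypothetical, chosen to match the TrainingJob example at the top of this issue, and the manifest is assumed to be already parsed into a Python dict.

```python
def validate_training_job(manifest):
    """Return a list of human-readable errors, each prefixed with a field path."""
    errors = []

    def require(path, expected_type):
        # Walk the dotted path and report exactly where the lookup fails.
        node = manifest
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                errors.append(f"{path}: missing")
                return None
            node = node[key]
        if not isinstance(node, expected_type):
            errors.append(
                f"{path}: expected {expected_type.__name__}, "
                f"got {type(node).__name__}"
            )
            return None
        return node

    require("metadata.name", str)
    require("spec.image", str)
    for role in ("trainer", "pserver"):
        lo = require(f"spec.{role}.min-instance", int)
        hi = require(f"spec.{role}.max-instance", int)
        if lo is not None and hi is not None and lo > hi:
            errors.append(
                f"spec.{role}: min-instance {lo} exceeds max-instance {hi}"
            )
    return errors


# Hypothetical malformed job: trainer bounds are inverted and one value is a string.
bad_job = {
    "metadata": {"name": "job-1"},
    "spec": {
        "image": "paddlepaddle/paddlecloud-job",
        "trainer": {"min-instance": 6, "max-instance": 3},
        "pserver": {"min-instance": "3", "max-instance": 3},
    },
}
for message in validate_training_job(bad_job):
    print(message)
```

Each message names the field path, so the user does not have to hunt through the file for the mistake.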

typhoonzero · 2017-10-11T08:09:58Z

@Yancey1989 I think TrainingJob is more general, since it's not limited to Paddle training.

putcn · 2017-10-11T18:27:02Z

This is an interesting idea. 👍
My 2 cents: can we make paddlectl a kind of proxy to kubectl? That way we can filter out the features we don't want to expose to end users before the parameters actually reach kubectl, while keeping the same command pattern.

helinwang · 2017-10-11T18:41:16Z

Maybe our local command line can take the YAML as input, so we don't have to map the user's input to YAML again.

I am more inclined not to allow our users to use kubectl, since what we want to support is just a subset of kubectl (e.g., do we want to allow users to create arbitrary Pods?). Maybe we can use @putcn's idea: "make paddlectl kind of proxy to kubectl, so that we can do some filtering".
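
The filtering idea could be as small as an allow-list check on the manifest before anything is forwarded to Kubernetes. This is a rough sketch only: ALLOWED_KINDS, is_allowed, and filter_manifest are hypothetical names, and the manifest is assumed to be already parsed into a Python dict.

```python
# Allow-list of (apiVersion, kind) pairs that users may submit.
# Only the TrainingJob resource from this issue is permitted here.
ALLOWED_KINDS = {("paddlepaddle.org/v1", "TrainingJob")}


def is_allowed(manifest):
    """Return True only if the manifest's apiVersion/kind is on the allow-list."""
    key = (manifest.get("apiVersion"), manifest.get("kind"))
    return key in ALLOWED_KINDS


def filter_manifest(manifest):
    """Pass an allowed manifest through unchanged; reject everything else."""
    if not is_allowed(manifest):
        raise PermissionError(
            f"resource kind {manifest.get('kind')!r} "
            f"({manifest.get('apiVersion')!r}) may not be submitted"
        )
    return manifest


training_job = {"apiVersion": "paddlepaddle.org/v1", "kind": "TrainingJob"}
raw_pod = {"apiVersion": "v1", "kind": "Pod"}

print(is_allowed(training_job))
print(is_allowed(raw_pod))
```

With a check like this in the proxy, a user can still submit a TrainingJob but cannot create an arbitrary Pod.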

typhoonzero · 2017-10-12T07:09:34Z

I support @putcn's idea! Proxying and filtering is simple and easy!

Yancey1989 · 2017-10-12T07:27:05Z

From @helinwang

do we want to allow the user create any Pod

I don't think so; it's not safe and it's out of our control.

From @putcn

make paddlectl kind of proxy to kubectl, so that we can do some filtering

It's a good idea! We can use the cloud server as a proxy: paddlectl converts the command-line parameters to YAML, and the cloud server submits the YAML to Kubernetes.
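
As an illustration of the conversion step, here is a minimal sketch that maps flag-like arguments onto the TrainingJob structure shown at the top of this issue. The function name build_training_job and its parameter names are hypothetical and are not the actual paddlectl flags; the resource values are copied from the example manifest. The result is printed as JSON, which is also valid YAML, to keep the sketch within the standard library.

```python
import json


def build_training_job(name, image, entrypoint, workspace,
                       min_trainers, max_trainers, pservers, gpus=0):
    """Build a TrainingJob manifest (as a dict) from flag-like arguments."""
    trainer_limits = {"cpu": "800m", "memory": "1Gi"}
    if gpus > 0:
        # Only request GPUs when the user asked for them.
        trainer_limits["alpha.kubernetes.io/nvidia-gpu"] = gpus
    requests = {"cpu": "500m", "memory": "600Mi"}
    return {
        "apiVersion": "paddlepaddle.org/v1",
        "kind": "TrainingJob",
        "metadata": {"name": name},
        "spec": {
            "image": image,
            "trainer": {
                "entrypoint": entrypoint,
                "workspace": workspace,
                "min-instance": min_trainers,
                "max-instance": max_trainers,
                "resources": {
                    "limits": trainer_limits,
                    "requests": dict(requests),
                },
            },
            "pserver": {
                # The example uses a fixed number of parameter servers.
                "min-instance": pservers,
                "max-instance": pservers,
                "resources": {
                    "limits": {"cpu": "800m", "memory": "1Gi"},
                    "requests": dict(requests),
                },
            },
        },
    }


job = build_training_job(
    name="job-1",
    image="paddlepaddle/paddlecloud-job",
    entrypoint="python train.py",
    workspace="/home/job-1/",
    min_trainers=3,
    max_trainers=6,
    pservers=3,
    gpus=1,
)
manifest_text = json.dumps(job, indent=2)
print(manifest_text)
```

The cloud server would then receive manifest_text and submit it to Kubernetes on the user's behalf.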

Yancey1989 · 2017-10-12T09:01:59Z

Maybe I can develop this feature. How about pushing it to the controller branch, so that we can publish a complete feature (auto-scaling) when we merge into the develop branch?

helinwang · 2017-10-12T22:20:30Z

@Yancey1989 Sure, that would be awesome!

pineking · 2017-10-13T02:43:23Z

That's a great idea. I have one more question, @Yancey1989: why do we need the cloud server to submit the YAML to Kubernetes? Could paddlectl submit the YAML directly?

Yancey1989 · 2017-10-13T04:56:16Z

Hi @pineking, as described in design #378, PaddleCloud has its own account management, and RBAC in Kubernetes is too simple for it, so we cannot submit the YAML directly. I think this is the main reason.

pineking · 2017-10-13T05:01:15Z

@Yancey1989 , thanks, I will read the design.

helinwang · 2017-10-18T03:19:23Z

Today's discussion result:

We still need the server, since it knows about cloud storage. The command line will stay backward compatible (internally converting to YAML) and will also support submitting YAML directly. The client will send the YAML to the server.
Eventually the controller will start / scale / kill training jobs (for now the controller only scales jobs).

typhoonzero added the need be discussed label Oct 11, 2017

typhoonzero assigned Yancey1989, helinwang, gongweibao and putcn Oct 11, 2017

Yancey1989 mentioned this issue Oct 14, 2017

[DO NOT MERGE]Submit TrainingJob #390

Closed
