
Pass K8sPluginConfig to spark driver and executor pods #patch #271

Merged
merged 13 commits into flyteorg:master from fg91-spark-tolerations on Nov 1, 2022

Conversation

fg91
Member

@fg91 fg91 commented Jun 14, 2022

TL;DR

Currently, when running Spark tasks, some of the k8s plugin config values configured in the helm values (such as DefaultTolerations, NodeSelector, HostNetwork, SchedulerName, ...) are not carried over to the SparkApplication and thus not to the driver and executor pods.

In my specific case this is limiting because we run Flyte itself and the Spark operator on cheap nodes while giving workflows the ability to spin up high-powered nodes via default tolerations.

This PR fixes this issue.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

Comparing the custom SparkPodSpec with Flyte’s K8sPluginConfig shows that several configurations are already carried over to the SparkPodSpec of the driver and the executor.

This PR adds logic to also carry over the following configurations (a minimal sketch of the carry-over logic follows the list):

  • DefaultTolerations
  • SchedulerName
  • DefaultNodeSelector
  • EnableHostNetworkingPod
  • DefaultEnvVarsFromEnv
  • DefaultAffinity
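
A minimal sketch of this carry-over logic, assuming simplified stand-in types (the real field names live in Flyte's K8sPluginConfig and the Spark operator's SparkPodSpec, and the actual implementation is in go/tasks/plugins/k8s/spark/spark.go):

package spark

import v1 "k8s.io/api/core/v1"

// Abridged stand-in for Flyte's flytek8s K8sPluginConfig; only the
// fields this PR carries over are shown.
type k8sPluginConfig struct {
    DefaultTolerations      []v1.Toleration
    SchedulerName           string
    DefaultNodeSelector     map[string]string
    EnableHostNetworkingPod *bool
    DefaultAffinity         *v1.Affinity
}

// Abridged stand-in for the Spark operator's SparkPodSpec, which is
// used for both the driver and the executor.
type sparkPodSpec struct {
    Tolerations   []v1.Toleration
    SchedulerName *string
    NodeSelector  map[string]string
    HostNetwork   *bool
    Affinity      *v1.Affinity
}

// carryOverK8sPluginConfig copies the plugin-level defaults onto a
// driver or executor pod spec.
func carryOverK8sPluginConfig(cfg *k8sPluginConfig, spec *sparkPodSpec) {
    // Default tolerations are appended to whatever the task already set.
    spec.Tolerations = append(spec.Tolerations, cfg.DefaultTolerations...)
    if cfg.SchedulerName != "" {
        spec.SchedulerName = &cfg.SchedulerName
    }
    // Default node selector entries are merged into the pod's selector.
    if spec.NodeSelector == nil {
        spec.NodeSelector = map[string]string{}
    }
    for k, v := range cfg.DefaultNodeSelector {
        spec.NodeSelector[k] = v
    }
    spec.HostNetwork = cfg.EnableHostNetworkingPod
    // DeepCopy is nil-safe, so an unset default affinity stays nil.
    spec.Affinity = cfg.DefaultAffinity.DeepCopy()
}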

Follow-up issue

The Spark operator passes the tolerations from the SparkApplication along to the pods only if the operator itself is installed with the --set webhook.enable=true flag to activate its mutating admission webhook. I feel this should be documented somewhere. Should I make a PR to note this here, or would you recommend another place?

@welcome

welcome bot commented Jun 14, 2022

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if yours does, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

@fg91 fg91 force-pushed the fg91-spark-tolerations branch from 6b0c825 to 87988c9 on June 14, 2022 18:39
@kumare3
Contributor

kumare3 commented Jun 15, 2022

cc @hamersaw

@codecov

codecov bot commented Jun 15, 2022

Codecov Report

Merging #271 (7884de7) into master (902b902) will increase coverage by 0.39%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #271      +/-   ##
==========================================
+ Coverage   62.97%   63.37%   +0.39%     
==========================================
  Files         142      145       +3     
  Lines        8970     9324     +354     
==========================================
+ Hits         5649     5909     +260     
- Misses       2799     2872      +73     
- Partials      522      543      +21     
Flag Coverage Δ
unittests 62.79% <100.00%> (+0.46%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
go/tasks/pluginmachinery/flytek8s/pod_helper.go 79.13% <100.00%> (-2.16%) ⬇️
go/tasks/plugins/k8s/spark/spark.go 79.29% <100.00%> (+0.83%) ⬆️
go/tasks/plugins/array/awsbatch/executor.go 38.05% <0.00%> (-4.93%) ⬇️
go/tasks/plugins/k8s/pod/container.go 66.66% <0.00%> (-4.77%) ⬇️
go/tasks/plugins/k8s/pod/sidecar.go 78.02% <0.00%> (-2.20%) ⬇️
go/tasks/plugins/array/k8s/subtask.go 30.35% <0.00%> (-0.56%) ⬇️
...o/tasks/plugins/k8s/kfoperators/pytorch/pytorch.go 71.62% <0.00%> (-0.38%) ⬇️
...s/plugins/k8s/kfoperators/tensorflow/tensorflow.go 75.86% <0.00%> (-0.28%) ⬇️
go/tasks/pluginmachinery/encoding/encoder.go 100.00% <0.00%> (ø)
... and 11 more


@fg91 fg91 marked this pull request as draft June 19, 2022 18:55
@fg91 fg91 changed the title Pass default tolerations to spark driver and executor pods #patch Pass K8sPluginConfig to spark driver and executor pods #patch Jun 19, 2022
@fg91 fg91 force-pushed the fg91-spark-tolerations branch from 22b3438 to 59905ec on June 19, 2022 20:23
@hamersaw
Contributor

Filed an issue for this so we can track it better. Hopefully this will improve visibility.

@hamersaw
Contributor

@fg91 I think we discussed scoping this PR down to just applying the existing configuration? We can push the spark-on-k8s-operator upgrade once we have out-of-core plugins implemented (on the roadmap for the 1.3 release). Do you have any bandwidth to make these changes? Otherwise we may be able to contribute as well.

@fg91 fg91 force-pushed the fg91-spark-tolerations branch from 4816c17 to 46244c2 on October 12, 2022 18:05
Fabio Grätz added 4 commits October 12, 2022 20:56
@fg91 fg91 force-pushed the fg91-spark-tolerations branch from ec1146f to e5873ed on October 12, 2022 18:56
@fg91
Member Author

fg91 commented Oct 12, 2022

@hamersaw I removed the commits in which I tried to upgrade K8s and Spark-on-K8s. Then I compared the custom SparkPodSpec from the currently pinned version with Flyte’s K8sPluginConfig again and additionally carried over EnableHostNetworkingPod, DefaultEnvVarsFromEnv, and DefaultAffinity. (I adapted the PR description.)

I'm interested in your opinion on how you would treat:

  • DefaultCPURequest
  • DefaultMemoryRequest
  • GpuResourceName
  • ResourceTolerations

In principle one could map them to SparkPodSpec.Cores, SparkPodSpec.Memory, ...; however, I'm not convinced of this, for the following reason:

In a Spark task in Flyte, one typically configures the resources for the ephemeral Spark cluster this way (source):

from flytekit import task
from flytekitplugins.spark import Spark

@task(
    task_config=Spark(
        # this configuration is applied to the spark cluster
        spark_conf={
            "spark.driver.memory": "1000M",
            "spark.driver.cores": "1",
        }
    ),
)
def hello_spark(partitions: int) -> float:
    ...

Honoring DefaultCPURequest in addition to spark_conf feels like doing the following, which to my understanding doesn't make sense:

from flytekit import Resources, task
from flytekitplugins.spark import Spark

@task(
    task_config=Spark(
        # this configuration is applied to the spark cluster
        spark_conf={
            "spark.driver.memory": "1000M",
            "spark.driver.cores": "1",
        }
    ),
    requests=Resources(                  # <-- new
        mem="1G",
    ),
)
def hello_spark(partitions: int) -> float:
    ...

If you agree, I'd propose to not carry the above mentioned values over and mark this PR ready for review.

@fg91 fg91 force-pushed the fg91-spark-tolerations branch from 3afefec to ebdd8ac on October 12, 2022 19:33
@fg91
Member Author

fg91 commented Oct 12, 2022

@hamersaw can you please validate that the behaviour I documented in this commit is the desired one?

@fg91 fg91 marked this pull request as ready for review October 12, 2022 19:50
@fg91 fg91 requested a review from hamersaw October 12, 2022 19:50
@hamersaw
Contributor

hamersaw commented Oct 13, 2022

If you agree, I'd propose to not carry the above mentioned values over and mark this PR ready for review.

I totally agree. ResourceTolerations is interesting; presumably we could use the resources set in the spark_conf (e.g. spark.driver.memory), but this seems very error-prone. The disconnect between Flyte task resource requests and Spark resources (i.e. set heterogeneously between driver and executors) makes this a difficult problem. IMO the proposed implementation is the least error-prone.

@hamersaw
Contributor

hamersaw commented Oct 13, 2022

@hamersaw can you please validate that the behaviour I documented in this commit is the desired one?

It looks like in other k8s resources, setting the task to interruptible appends the interruptible tolerations and node selectors rather than overriding them. I think it makes sense to mimic this functionality here; thoughts?

@fg91 fg91 force-pushed the fg91-spark-tolerations branch from a1678d3 to 63fd182 on October 15, 2022 15:34
@fg91 fg91 force-pushed the fg91-spark-tolerations branch from fea047a to d843ed8 on October 15, 2022 15:53
@fg91
Member Author

fg91 commented Oct 15, 2022

It looks like in other k8s resources, setting the task to interruptible appends the interruptible tolerations and node selectors rather than overriding them. I think it makes sense to mimic this functionality here; thoughts?

  1. tolerations: I agree and this is also currently done here.

  2. nodeSelector: The link you posted doesn't tackle node selectors and I think appending wouldn't work:

    The docs say:

    Kubernetes only schedules the Pod onto nodes that have each of the labels you specify.

    When trying to schedule this pod

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
      labels:
        env: test
    spec:
      containers:
      - name: nginx
        image: nginx
        imagePullPolicy: IfNotPresent
      nodeSelector:
        foo: bar
        foofoo: barbar

    on a node that has only been labeled with kubectl label nodes k3d-sandbox-server-0 foo=bar, it remains pending:

    Warning  FailedScheduling  10s   default-scheduler  0/1 nodes are available: 1 node(s) didn't match Pod's node affinity/selector.

    I think in order to append, one would have to use affinities:

    The affinity/anti-affinity language is more expressive. nodeSelector only selects nodes with all the specified labels. Affinity/anti-affinity gives you more control over the selection logic.

    Here in K8sPluginConfig it says that InterruptibleNodeSelector is deprecated anyway and that one should use InterruptibleNodeSelectorRequirement instead.

    Neither InterruptibleNodeSelectorRequirement nor NonInterruptibleNodeSelectorRequirement is currently handled for Spark tasks. I added functionality to carry over the default affinity, and I'm now considering whether the correct behaviour would be to add the InterruptibleNodeSelectorRequirement or NonInterruptibleNodeSelectorRequirement to the nodeSelectorRequirements in the default affinity. This would be basically the same as done here, with the only disadvantage that SparkApplication doesn't use v1.PodSpec but a custom SparkPodSpec (see here).

    We could probably do this without duplicating code by putting most of the logic into a function that takes (interruptible bool, affinity *v1.Affinity, nodeSelectorRequirement *v1.NodeSelectorRequirement) as arguments and then using this function both in the existing function that adds the node selector requirement to a PodSpec and in a new function that adds it to a SparkPodSpec (a sketch follows at the end of this comment).

    Did I miss something in my interpretation of nodeSelector, and do you agree that this should be the handling of the NodeSelectorRequirements?
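
To make the proposal concrete, here is a minimal sketch of such a shared helper. The name and signature are illustrative assumptions, not the actual flyteplugins API; the real code would additionally pick between the interruptible and non-interruptible requirement based on the interruptible flag:

package spark

import v1 "k8s.io/api/core/v1"

// applyNodeSelectorRequirement appends req to every required node
// selector term of the affinity, creating intermediate structs as
// needed. Match expressions within a term are ANDed, so appending
// narrows the set of schedulable nodes; the terms themselves stay ORed.
func applyNodeSelectorRequirement(affinity *v1.Affinity, req *v1.NodeSelectorRequirement) {
    if affinity == nil || req == nil {
        return
    }
    if affinity.NodeAffinity == nil {
        affinity.NodeAffinity = &v1.NodeAffinity{}
    }
    if affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution == nil {
        affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution = &v1.NodeSelector{}
    }
    selector := affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution
    if len(selector.NodeSelectorTerms) == 0 {
        selector.NodeSelectorTerms = []v1.NodeSelectorTerm{{}}
    }
    for i := range selector.NodeSelectorTerms {
        term := &selector.NodeSelectorTerms[i]
        term.MatchExpressions = append(term.MatchExpressions, *req)
    }
}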

@hamersaw
Contributor

Did I miss something in my interpretation of nodeSelector and do you agree that this should be handling of the NodeSelectorRequirements?

Thanks for correcting me on the current NodeSelector application!

I absolutely agree. Let's go ahead with the updates you proposed. I think this is the last issue right?

@fg91 fg91 force-pushed the fg91-spark-tolerations branch from 7099d14 to 7884de7 on October 26, 2022 21:22
@fg91 fg91 requested a review from hamersaw October 26, 2022 21:27
@hamersaw hamersaw merged commit a87ca40 into flyteorg:master Nov 1, 2022
@welcome

welcome bot commented Nov 1, 2022

Congrats on merging your first pull request! 🎉

@fg91 fg91 deleted the fg91-spark-tolerations branch November 1, 2022 16:38
eapolinario pushed a commit that referenced this pull request Sep 6, 2023
* Pass default tolerations to spark driver and executor

Signed-off-by: fg91 <[email protected]>

* Test passing default tolerations to spark driver and executor

Signed-off-by: fg91 <[email protected]>

* Pass scheduler name to driver and executor SparkPodSpec

Signed-off-by: fg91 <[email protected]>

* Carry DefaultNodeSelector from k8s plugin config to SparkPodSpec

Signed-off-by: fg91 <[email protected]>

* Carry over EnableHostNetworkingPod

Signed-off-by: Fabio Grätz <[email protected]>

* Test carrying over of default env vars

Signed-off-by: Fabio Grätz <[email protected]>

* Carry over DefaultEnvVarsFromEnv

Signed-off-by: Fabio Grätz <[email protected]>

* Carry over DefaultAffinity

Signed-off-by: Fabio Grätz <[email protected]>

* Doc behaviour of default and interruptible NodeSelector and Tolerations

Signed-off-by: Fabio Grätz <[email protected]>

* Don't carry over default env vars from env and fix test

Signed-off-by: Fabio Grätz <[email protected]>

* Lint

Signed-off-by: Fabio Grätz <[email protected]>

* Apply node selector requirement to pod affinity

Signed-off-by: Fabio Grätz <[email protected]>

Signed-off-by: fg91 <[email protected]>
Signed-off-by: Fabio Grätz <[email protected]>
Co-authored-by: Fabio Grätz <[email protected]>