SparkKubernetesOperator not rendering template correctly #37261

Closed
1 of 2 tasks
zlosim opened this issue Feb 8, 2024 · 3 comments · Fixed by #37271
Labels
area:providers · kind:bug (This is a clearly a bug) · provider:cncf-kubernetes (Kubernetes provider related issues)

zlosim commented Feb 8, 2024

Apache Airflow Provider(s)

cncf-kubernetes

Versions of Apache Airflow Providers

7.14.0

Apache Airflow version

2.7.2

Operating System

aws

Deployment

Amazon (AWS) MWAA

Deployment details

No response

What happened

I'm trying to run SparkKubernetesOperator, passing the template as a Python dict via template_spec. In this dict I pass {{ds}} as one of the arguments for the app. When I check the Airflow UI, I can see it was rendered correctly:
[screenshot of the rendered template field in the Airflow UI]
But when I check the app in K8s directly, I can see it was not submitted with the rendered argument:

$ kubectl get sparkapplications.sparkoperator.k8s.io -n spark  xyz-ipc2kunh -o yaml

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  creationTimestamp: "2024-02-08T19:58:25Z"
  generation: 1
  name: xyz-ipc2kunh
  namespace: spark
  resourceVersion: "203894782"
  uid: 52cbdba9-0269-4dce-b7d1-d14e822dcde4
spec:
  arguments:
  - '{{ds}}'
  driver:
    coreLimit: 1200m
    cores: 1
    labels:
      version: 3.5.0
    memory: 1g
    serviceAccount: spark-operator-spark
  executor:
    cores: 1
    labels:
      version: 3.5.0
    memory: 8g
  hadoopConf: {}
  image: spark:3.5.0
  imagePullPolicy: Always
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.0.jar
  mainClass: org.apache.spark.examples.SparkPi
  mode: cluster
  restartPolicy:
    type: Never
  sparkConf:
    spark.jars.ivy: /tmp
    spark.kubernetes.authenticate.driver.serviceAccountName: zeppelin-server
    spark.kubernetes.driver.service.deleteOnTermination: "false"
  sparkVersion: 3.5.0
  timeToLiveSeconds: 1800
  type: Scala
status:
  applicationState:
    errorMessage: 'driver container failed with ExitCode: 1, Reason: Error'
    state: FAILED
  driverInfo:
    podName: xyz-ipc2kunh-driver
    webUIAddress: 10.100.194.61:0
    webUIPort: 4040
    webUIServiceName: xyz-ipc2kunh-ui-svc
  executionAttempts: 1
  executorState:
    spark-pi-4732d98d8a4cf813-exec-1: COMPLETED
  lastSubmissionAttemptTime: "2024-02-08T19:58:30Z"
  sparkApplicationId: spark-d87f2f6f028943c2910968a709719ec1
  submissionAttempts: 1
  submissionID: ca78be37-e059-4863-9259-1dfd2758aaa7
  terminationTime: "2024-02-08T19:58:43Z"
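
One way a mismatch like this can arise, offered only as an illustrative assumption and not as a confirmed diagnosis of the provider code: if an operator builds the object it submits from a templated field at construction time, later rendering updates the field the UI displays but not the copy built earlier. The sketch below uses a hypothetical `ToyOperator` and a toy `render` helper (plain string replacement standing in for Jinja); none of these names come from Airflow, and the date is an arbitrary example value.

```python
import copy


def render(value, context):
    """Toy stand-in for template rendering: recursively replace '{{name}}' markers."""
    if isinstance(value, str):
        for key, replacement in context.items():
            value = value.replace("{{" + key + "}}", replacement)
        return value
    if isinstance(value, list):
        return [render(item, context) for item in value]
    if isinstance(value, dict):
        return {k: render(v, context) for k, v in value.items()}
    return value


class ToyOperator:
    """Hypothetical operator for illustration only; this is not Airflow code."""

    def __init__(self, template_spec):
        self.template_spec = template_spec
        # Built eagerly, before any rendering has happened.
        self.body_built_in_init = copy.deepcopy(template_spec)

    def render_template_fields(self, context):
        # Rendering replaces the templated attribute only.
        self.template_spec = render(self.template_spec, context)

    def execute(self):
        # Building the body at execution time sees the rendered values.
        return copy.deepcopy(self.template_spec)


op = ToyOperator({"spec": {"arguments": ["{{ds}}"]}})
op.render_template_fields({"ds": "2024-01-01"})  # arbitrary example date

stale_args = op.body_built_in_init["spec"]["arguments"]
fresh_args = op.execute()["spec"]["arguments"]
print(stale_args)  # ['{{ds}}']
print(fresh_args)  # ['2024-01-01']
```

In this toy model, the attribute a UI would read (`template_spec`) is rendered, while the body prepared in the constructor still carries the raw `{{ds}}` marker.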

What you think should happen instead

Airflow should submit the Spark app with the rendered template.

How to reproduce

You can use this DAG and then check the Spark app it creates:

from datetime import datetime, timedelta
from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.spark_kubernetes import SparkKubernetesOperator


DAG_ID = "data-migration3"
spec = {'apiVersion': 'sparkoperator.k8s.io/v1beta2',
        'kind': 'SparkApplication',
        'metadata': {'namespace': 'spark'},
        'spec': {
            'arguments': ['{{ds}}'],
            'driver': {
                'coreLimit': '1200m',
                'cores': 1,
                'labels': {'version': '3.5.0'},
                'memory': '1g',
                'serviceAccount': 'spark-operator-spark',
            },
            'executor': {
                'cores': 1,
                'labels': {'version': '3.5.0'},
                'memory': '8g',
            },
            'hadoopConf': {},
            'image': 'spark:3.5.0',
            'imagePullPolicy': 'Always',
            'mainApplicationFile': 'local:///opt/spark/examples/jars/spark-examples_2.12-3.5.0.jar',
            'mainClass': 'org.apache.spark.examples.SparkPi',
            'mode': 'cluster',
            'restartPolicy': {'type': 'Never'},
            'sparkConf': {
                'spark.jars.ivy': '/tmp',
                'spark.kubernetes.authenticate.driver.serviceAccountName': 'zeppelin-server',
                'spark.kubernetes.driver.service.deleteOnTermination': 'false',
            },
            'sparkVersion': '3.5.0',
            'timeToLiveSeconds': 1800,
            'type': 'Scala'}}
with DAG(
        DAG_ID,
        default_args={"max_active_runs": 1},
        description="migrate data from backups to new tables",
        schedule="5 4 2 * *",
        start_date=datetime(2023, 11, 22),
        end_date=datetime(2023, 12, 22),
        catchup=True,
) as dag:
    SparkKubernetesOperator(
        base_container_name="spark-kubernetes-driver",
        retries=0,
        retry_exponential_backoff=True,
        max_retry_delay=timedelta(minutes=3),
        retry_delay=timedelta(seconds=10),
        depends_on_past=False,
        task_id="{}".format("xyz"),
        namespace="spark",
        delete_on_termination=False,
        reattach_on_restart=True,
        get_logs=True,
        log_events_on_failure=True,
        name="xyz",
        template_spec=spec,
        kubernetes_conn_id='dwh_eks',
        dag=dag
    )
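
To check the created Spark app without reading the YAML by eye, a small helper can scan a manifest for leftover Jinja markers. This is a minimal sketch: `find_unrendered` is a hypothetical name, and it assumes the manifest has already been loaded into plain Python dicts and lists (for example by parsing the `kubectl ... -o yaml` output with a YAML parser).

```python
import re

# Matches Jinja expression markers ({{ ... }}) and statement markers ({% ... %}).
JINJA_MARKER = re.compile(r"\{\{.*?\}\}|\{%.*?%\}")


def find_unrendered(obj, path=""):
    """Return (path, value) pairs for strings that still contain Jinja markers."""
    hits = []
    if isinstance(obj, str):
        if JINJA_MARKER.search(obj):
            hits.append((path, obj))
    elif isinstance(obj, dict):
        for key, value in obj.items():
            child = f"{path}.{key}" if path else str(key)
            hits.extend(find_unrendered(value, child))
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            hits.extend(find_unrendered(value, f"{path}[{index}]"))
    return hits


# Trimmed-down version of the manifest shown in this report.
manifest = {
    "kind": "SparkApplication",
    "spec": {"arguments": ["{{ds}}"], "image": "spark:3.5.0"},
}
leftovers = find_unrendered(manifest)
print(leftovers)  # [('spec.arguments[0]', '{{ds}}')]
```

An empty result means no template markers survived into the submitted object.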

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@zlosim zlosim added area:providers kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Feb 8, 2024

boring-cyborg bot commented Feb 8, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@hussein-awala hussein-awala added provider:cncf-kubernetes Kubernetes provider related issues and removed needs-triage label for new issues that we didn't triage yet labels Feb 8, 2024
hussein-awala (Member) commented

I can reproduce it in unit tests; it seems related to #22253.

nikhilkarve commented

How is this related to that issue, and what is the resolution?
