SparkSubmit Connection Extras can be overridden #36151

Merged
merged 11 commits into from
Dec 13, 2023

@pateash pateash commented Dec 10, 2023

closes: #35911

Description

Currently, some spark-submit arguments can only be supplied through the Spark Connection extras,
and there is no way to override them in SparkSubmitOperator or SparkSubmitHook, e.g.:
--queue: specifies the YARN queue to which the application should be submitted.
--deploy-mode: specifies the deploy mode (client or cluster).

More details: https://spark.apache.org/docs/3.2.0/running-on-yarn.html

Use case/motivation

These overrides are particularly useful in a multi-tenant environment where different users or groups have resources allocated in specific YARN queues, or where an individual spark-submit job needs a deploy mode different from the one set in the Spark Connection extras.
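
As a rough illustration of the multi-tenant case, the sketch below assembles a spark-submit argument list for two jobs that target different YARN queues and deploy modes. Only the `--queue` and `--deploy-mode` flags come from the description; the helper `build_submit_args`, the queue names, and the application file names are hypothetical, and this is not the hook's actual command builder.

```python
from __future__ import annotations


def build_submit_args(
    application: str,
    queue: str | None = None,
    deploy_mode: str | None = None,
) -> list[str]:
    """Assemble a spark-submit argument list, adding a flag only when its value is set.

    Hypothetical helper for illustration; not part of the Airflow provider.
    """
    args = ["spark-submit"]
    if queue:
        args += ["--queue", queue]
    if deploy_mode:
        args += ["--deploy-mode", deploy_mode]
    args.append(application)
    return args


# Two jobs that could share one connection but need different settings.
# Queue names and file names are made up for the example.
etl_args = build_submit_args("etl_job.py", queue="team_a", deploy_mode="cluster")
adhoc_args = build_submit_args("adhoc_job.py", queue="team_b", deploy_mode="client")

print(" ".join(etl_args))
print(" ".join(adhoc_args))
```

With per-task values like these, each job can pick its own queue and deploy mode without needing a separate connection per tenant.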


pateash commented Dec 10, 2023

cc @eladkal

@pateash pateash force-pushed the pateash/spark-operator-add-yarn-queue branch 4 times, most recently from 8337473 to 8c8d7bf Compare December 11, 2023 18:36
@@ -204,8 +213,8 @@ def _resolve_connection(self) -> dict[str, Any]:

             # Determine optional yarn queue from the extra field
             extra = conn.extra_dejson
-            conn_data["queue"] = extra.get("queue")
-            conn_data["deploy_mode"] = extra.get("deploy-mode")
+            conn_data["queue"] = self._queue if self._queue else extra.get("queue")
Contributor Author

Override the values if they are passed; otherwise, fetch them from the Spark Connection extras.
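
The precedence rule in this comment can be sketched as a small standalone function. The `queue` expression mirrors the line shown in the diff; the matching `deploy_mode` expression is an assumption, since that line is not visible in the hunk. The function name `resolve_spark_settings` and the sample values are hypothetical.

```python
from __future__ import annotations

from typing import Any


def resolve_spark_settings(
    extra: dict[str, Any],
    queue: str | None = None,
    deploy_mode: str | None = None,
) -> dict[str, Any]:
    """Prefer explicitly passed values; otherwise fall back to the connection extras.

    Hypothetical standalone version of the override logic, for illustration only.
    """
    return {
        # A truthy argument wins; None or "" falls through to the extras.
        "queue": queue if queue else extra.get("queue"),
        # Assumed to follow the same pattern as "queue" (not shown in the diff).
        "deploy_mode": deploy_mode if deploy_mode else extra.get("deploy-mode"),
    }


# Sample extras as they might appear on a Spark connection (made-up values).
extras = {"queue": "default", "deploy-mode": "client"}

from_connection = resolve_spark_settings(extras)
overridden = resolve_spark_settings(extras, queue="team_a", deploy_mode="cluster")

print(from_connection)
print(overridden)
```

One consequence of the truthiness check is that an empty string is treated the same as an omitted argument, so the value from the extras is used in that case.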

@pateash pateash force-pushed the pateash/spark-operator-add-yarn-queue branch from 8c8d7bf to b65d985 Compare December 11, 2023 18:38
@pateash pateash changed the title from "SparkSubmit Operator fixes" to "SparkSubmit Operator Connection Extras should be allowed to be overriden" Dec 11, 2023
@pateash pateash changed the title from "SparkSubmit Operator Connection Extras should be allowed to be overriden" to "SparkSubmit Connection Extras should be allowed to be overriden" Dec 11, 2023
@pateash pateash force-pushed the pateash/spark-operator-add-yarn-queue branch 2 times, most recently from ed36a39 to 681002a Compare December 12, 2023 04:16
@pateash pateash changed the title from "SparkSubmit Connection Extras should be allowed to be overriden" to "SparkSubmit Connection Extras can be overridden" Dec 12, 2023

pateash commented Dec 12, 2023

cc. @eladkal @hussein-awala

@pateash pateash force-pushed the pateash/spark-operator-add-yarn-queue branch 2 times, most recently from 4108233 to 38c45e6 Compare December 12, 2023 15:28

@josh-fell josh-fell left a comment

LGTM. Might be worth a rebase to see if that failing integration is flaky, an actual issue, or resolved by another commit.

@pateash pateash force-pushed the pateash/spark-operator-add-yarn-queue branch from 38c45e6 to f7bf58b Compare December 13, 2023 05:54
@pateash pateash force-pushed the pateash/spark-operator-add-yarn-queue branch from 377cf34 to f7bf58b Compare December 13, 2023 13:13
@pateash pateash force-pushed the pateash/spark-operator-add-yarn-queue branch from f7bf58b to 820b2fa Compare December 13, 2023 13:13
@josh-fell josh-fell merged commit 1b4a7ed into apache:main Dec 13, 2023
50 checks passed
@pateash pateash deleted the pateash/spark-operator-add-yarn-queue branch December 13, 2023 18:33
Successfully merging this pull request may close these issues.

Adding Support for Yarn queue and other extras in SparkSubmit Operator and Hook
3 participants