
[ZEPPELIN-5760] fix passing configs to spark in k8s #4398

Merged · 10 commits · Sep 29, 2022

Conversation

zlosim (Contributor) commented Jul 7, 2022

What is this PR for?

Some important Spark configs, including spark.jars.packages and spark.driver.extraJavaOptions, can't be passed via SparkConf on K8s in client mode.
This fix checks whether these are set in the interpreter configuration and passes them to the spark-submit command.
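For illustration, a minimal sketch of that idea with hypothetical names (the real change lives in Zeppelin's K8s launcher and, as discussed later in this thread, eventually switched to a generic ZEPPELIN_SPARK_CONF approach):

import java.util.Properties;

// Hypothetical sketch, not the actual Zeppelin code: forward selected
// interpreter properties onto the spark-submit command line, because applying
// them via SparkConf at runtime is too late once the driver is already running.
public class DriverOptionForwarder {
  static String appendDriverOptions(Properties props, String sparkSubmitOptions) {
    StringBuilder options = new StringBuilder(sparkSubmitOptions);
    String packages = props.getProperty("spark.jars.packages");
    if (packages != null && !packages.isEmpty()) {
      options.append(" --packages ").append(packages);
    }
    String driverJavaOpts = props.getProperty("spark.driver.extraJavaOptions");
    if (driverJavaOpts != null && !driverJavaOpts.isEmpty()) {
      // Assumption: in the actual fix this value travels separately via
      // SPARK_DRIVER_EXTRAJAVAOPTIONS_CONF, since interpreter.sh post-processes it.
      options.append(" --conf spark.driver.extraJavaOptions=").append(driverJavaOpts);
    }
    return options.toString();
  }
}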

What type of PR is it?

Bug Fix

What is the Jira issue?

https://issues.apache.org/jira/browse/ZEPPELIN-5760

How should this be tested?

Updated the existing test to cover this fix.

Questions:

  • Do the license files need to be updated? No
  • Are there breaking changes for older versions? No
  • Does this need documentation? No

Reamer (Contributor) commented Jul 10, 2022

I think most Spark configuration values can be included in the environment variable SPARK_SUBMIT_OPTIONS. Please note that the K8sProcess works for all Zeppelin interpreters.

zlosim (Contributor, Author) commented Jul 11, 2022

Hi @Reamer, I'm sorry, but I'm not sure I understand you correctly.
As you stated, Spark configs can be included in SPARK_SUBMIT_OPTIONS, but they were NOT; this fix specifically adds these configs to SPARK_SUBMIT_OPTIONS.
There are also two Spark configs that CAN'T be passed in SPARK_SUBMIT_OPTIONS because of some functionality in interpreter.sh; those are passed in SPARK_DRIVER_EXTRAJAVAOPTIONS_CONF instead.

Also, this fix changes behavior for Spark only.

Reamer (Contributor) commented Jul 11, 2022

Hi @zlosim
As you can see in the following lines, SPARK_SUBMIT_OPTIONS is concatenated with the spark configurations specifically for Kubernetes.

k8sEnv.put("SPARK_SUBMIT_OPTIONS",
    getEnv().getOrDefault("SPARK_SUBMIT_OPTIONS", "") + buildSparkSubmitOptions(userName));

Your configuration values like spark.driver.extraClassPath and spark.driver.extraLibraryPath should have no effect, because Zeppelin always starts Spark on Kubernetes in client mode.

You can also set SPARK_DRIVER_EXTRAJAVAOPTIONS_CONF directly in your Spark Zeppelin interpreter configuration if you want to override the (possibly empty) default setting.

In general, I recommend creating a config map that contains all Spark configurations and then including it in the containers via a custom interpreter-spec.yaml in your Spark Zeppelin interpreter.

zlosim (Contributor, Author) commented Jul 11, 2022

Hi @Reamer, thanks for the clarification.
Yes, one can set env vars and create a custom interpreter-spec, but that is not what this fix is trying to achieve, and it is not at feature parity with Spark-on-YARN mode. Let me give an example to better explain myself:
When using Zeppelin in a multi-tenant environment with Spark on a Hadoop cluster, each user can set their own dependencies and driver configurations via inline configuration or the interpreter settings page, and everything works as expected.
On the other hand, when setting dependencies or driver configs on K8s, these are ignored, and the only option is to set them via SPARK_SUBMIT_OPTIONS as you mentioned. BUT:

  • this sets one fixed value for every user, while sometimes users would like different sets of dependencies and driver configurations
  • as stated in the Zeppelin docs: "To be noticed, SPARK_SUBMIT_OPTIONS is deprecated and will be removed in future release."
  • in my opinion, with this fix we can give users the same experience with Spark running on K8s as on a Hadoop cluster or in standalone mode

> Your configuration values like spark.driver.extraClassPath and spark.driver.extraLibraryPath should have no effect, because Zeppelin always starts Spark on Kubernetes in client mode.

Yes, it is running in client mode, but the driver is not yet started, so there is still a chance to pass these values to the driver on startup so they can take effect. We are doing the same in SparkInterpreterLauncher.

> You can also set SPARK_DRIVER_EXTRAJAVAOPTIONS_CONF directly in your Spark Zeppelin interpreter configuration if you want to override the (possibly empty) default setting.

I'm sorry, I was not aware of that. I know we can set a few params with env vars, but I can't find how to set an env var from the interpreter configuration.

Reamer (Contributor) commented Jul 15, 2022

I understand your motivation. I just don't like the list of Spark configurations at this point, because it can become very long. Spark has a great many configuration options, and we can't map them all again here in Zeppelin. I am looking for a more generic approach.
@zjffdu If SPARK_SUBMIT_OPTIONS is deprecated, how should we fill this function instead?

zjffdu (Contributor) commented Jul 19, 2022

@zlosim SparkInterpreterLauncher supports all Spark configurations. It detects all the Spark configuration properties and concatenates them via the env variable ZEPPELIN_SPARK_CONF.
I think we can do a similar thing in K8sRemoteInterpreterProcess.

The reason we deprecated SPARK_SUBMIT_OPTIONS is that we want to encourage users to configure Spark in Zeppelin via one unified approach (just configure each property one by one); otherwise it may cause inconsistency (e.g., a Zeppelin admin may configure spark.driver.memory via SPARK_SUBMIT_OPTIONS in zeppelin-env.sh while another user also defines it via inline configuration, and it is hard for Zeppelin to detect this kind of conflict). So even ZEPPELIN_SPARK_CONF in SparkInterpreterLauncher is for internal usage; it is not exposed to Zeppelin users.
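A minimal sketch of that detection-and-concatenation step, assuming a '|'-delimited StringJoiner as the snippets later in this thread suggest (class and method names here are illustrative, not Zeppelin's exact code):

import java.util.Properties;
import java.util.StringJoiner;

// Illustrative sketch: collect every interpreter property that looks like a
// Spark configuration and join the resulting spark-submit arguments with '|',
// mirroring the ZEPPELIN_SPARK_CONF approach described above.
public class SparkConfCollector {
  static boolean isSparkConf(String key, String value) {
    // Assumption: a property counts as a Spark config when its key starts
    // with "spark." and it has a non-empty value.
    return key.startsWith("spark.") && value != null && !value.trim().isEmpty();
  }

  static String buildZeppelinSparkConf(Properties properties) {
    StringJoiner sparkConfSJ = new StringJoiner("|");
    for (String key : properties.stringPropertyNames()) {
      String propValue = properties.getProperty(key);
      if (isSparkConf(key, propValue)) {
        sparkConfSJ.add("--conf");
        sparkConfSJ.add(key + "=" + propValue);
      }
    }
    // e.g. "--conf|spark.jars.packages=...|--conf|spark.driver.memory=4g"
    return sparkConfSJ.toString();
  }
}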

zlosim (Contributor, Author) commented Jul 19, 2022

Hi @zjffdu, thanks for the response.
To be honest, I think this fix does exactly what you say you want to achieve. Right now, setting e.g. Spark dependencies via the Zeppelin interpreter settings does not work on K8s, because we try to set them at runtime, which is too late: the driver is already started. This PR picks up the settings a user has passed via the interpreter settings that must go directly on the spark-submit command when creating the K8s driver pod, and puts them in an env var so that the interpreter.sh script starts the driver correctly. For the end user this is exactly what we are trying to achieve: they set their settings via the Zeppelin interpreter settings and everything works as expected.

As for SPARK_SUBMIT_OPTIONS and the possibility of a Zeppelin admin altering configs: we can change how it is appended and ignore that value, effectively leaving SPARK_SUBMIT_OPTIONS unexposed to Zeppelin users, the same as ZEPPELIN_SPARK_CONF.

zjffdu (Contributor) commented Jul 19, 2022

@zlosim How about updating buildSparkSubmitOptions to match what we did in SparkInterpreterLauncher: just collect all the Spark properties (those starting with the prefix spark.) and concatenate them into ZEPPELIN_SPARK_CONF, then pass it to interpreter.sh, instead of collecting each individual Spark driver property explicitly.

zlosim (Contributor, Author) commented Jul 25, 2022

Hi @zjffdu, sorry for the late response. I rewrote it so we now use ZEPPELIN_SPARK_CONF. Please let me know if there is anything I can do better.

Reamer (Contributor) left a review comment:

I think I have found a bug. Have you tested the changes locally?

zlosim requested a review from Reamer on Aug 1, 2022.
zjffdu (Contributor) commented Aug 10, 2022

@zlosim Any update?

zlosim (Contributor, Author) commented Aug 11, 2022

Hi @zjffdu, I think this PR is ready to merge unless there are any other requests.

Reamer (Contributor) commented Aug 12, 2022

I would like to test this beforehand in my local K8s cluster, but I am still on vacation right now. Please have a little patience.

Reamer (Contributor) left a review comment:

I think I have found a mistake; let me check this. The purpose of the pipe is that the individual config values can be cleanly separated from one another. A separation within a config value itself is not desired.

if (isUserImpersonated() && !StringUtils.containsIgnoreCase(userName, "anonymous")) {
  // before: options.append(" --proxy-user ").append(userName);
  sparkConfSJ.add("--proxy-user");
  sparkConfSJ.add(userName);
Reamer (Contributor) commented on the diff:

Possible error. This creates "--proxy-user|user"; desired is "--proxy-user user" (the pipe should only separate whole options from each other, not a flag from its value).

zlosim (Contributor, Author) replied Sep 5, 2022:
Hi @Reamer, I'm not sure I can see the bug here. Right now the output of this snippet is --proxy-user|user, and the pipe is replaced with a space in interpreter.sh.
As far as I can see, the same code snippet can be found in SparkInterpreterLauncher.
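To make the separator semantics concrete, a small demo under the behavior described above (the pipe joins tokens in the env var, and interpreter.sh swaps each pipe for a space when assembling the spark-submit command; "alice" is a placeholder user name):

import java.util.StringJoiner;

// Demo of the assumed '|' round-trip discussed in this review thread.
public class SeparatorDemo {
  public static void main(String[] args) {
    StringJoiner sparkConfSJ = new StringJoiner("|");
    sparkConfSJ.add("--proxy-user");
    sparkConfSJ.add("alice");
    String joined = sparkConfSJ.toString();
    System.out.println(joined);                    // --proxy-user|alice
    System.out.println(joined.replace('|', ' '));  // --proxy-user alice
  }
}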

for (String key : properties.stringPropertyNames()) {
  String propValue = properties.getProperty(key);
  if (isSparkConf(key, propValue)) {
    sparkConfSJ.add("--conf");
Reamer (Contributor) commented on the diff:

Here the same error; desired is "--conf spark=123|--conf spark.1=123".

zlosim (Contributor, Author) replied:

Similar to the previous one: in SparkInterpreterLauncher the desired output is --conf|spark=123|--conf|spark.1=123, but I can change it to --conf spark=123|--conf spark.1=123 if that is really what we want; please let me know.

Reamer (Contributor) commented Sep 23, 2022

Hi @zlosim
Sorry for the late reply. I tried to test your branch, but I am currently getting a startup error because I have already fully transitioned my environment to Spark 3.3. Could you please rebase onto master?

zlosim (Contributor, Author) commented Sep 27, 2022

Hi @Reamer
no prob, should be rebased

Reamer (Contributor) left a review comment:

LGTM.
Thank you for your pull request; with this it is possible to remove the deprecated SPARK_SUBMIT_OPTIONS handling in the K8s launcher.
I have successfully tested your patch in my cluster.
As indicated, it might cause problems in the future that the pipe is also placed between "--conf" and its value. But since this is also done in the other start command, this should not block the merge.

Reamer merged commit 73f9650 into apache:master on Sep 29, 2022.
proceane pushed a commit to proceane/zeppelin that referenced this pull request on Feb 16, 2023:
* passing static arguments to spark-submit command so driver can pick them up

* fixed static names

* removed duplicate driver memory setting

* fixed driver extra java opts

* extend test

* use ZEPPELIN_SPARK_CONF env var to pass spark configurations

* fix import wildmark

* fix separator

* remove redundant concatenation

* - remove redundant concatenation
- fix tests