[DCOS-38138] Update Spark CLI for shell-escape fix #388
Conversation
manifest.json
Outdated
"hadoop_version": "2.6", | ||
"uri": "https://downloads.mesosphere.com/spark/assets/spark-2.2.1-2-bin-2.6.tgz" | ||
"hadoop_version": "2.7", | ||
"uri": "svt-dev.s3.amazonaws.com/spark/spark-2.2.1-bin-2.7.3-dcos-38138.tgz" |
Once the PR is in good shape and approved, I'll update the downloads.mesosphere URI to include the updated tgz.
tests/test_spark.py
Outdated
app_args="", | ||
expected_output="spark.executor.extraJavaOptions,-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3", | ||
service_name=service_name, | ||
args=["--conf spark.driver.extraJavaOptions='-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3'", |
This is giving me trouble. This same command works via the Spark CLI but not with the pytest tests.
Looking at the TeamCity output, the error seems to be thrown by the Parse
function called here: https://github.com/mesosphere/spark-build/blob/DCOS-38138-shell-escape/cli/dcos-spark/submit_builder.go#L411. It might be a good idea to check whether that is behaving as expected.
Also, it would be good to confirm that the following command, exactly as it is generated by the tests, works from the CLI:
dcos spark --name=spark run --verbose --submit-args="--conf spark.driver.extraJavaOptions='-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3' --conf spark.mesos.containerizer=mesos --class MultiConfs --conf spark.mesos.role=* --conf spark.driver.memory=2g http://infinity-artifacts-ci.s3.amazonaws.com/autodelete7d/spark/test-20180802-062841-eHc7mdB3GD1WgGtj/dcos-spark-scala-tests-assembly-0.2-SNAPSHOT.jar "
I can confirm that the exact same command works when submitted via the command line, and the stdout returns the expected args:
from CLI:
Translated spark-submit arguments: '--conf=spark.driver.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3, --conf=spark.mesos.containerizer=mesos, --class=MultiConfs, --conf=spark.mesos.role=*, --conf=spark.driver.memory=2g, http://infinity-artifacts-ci.s3.amazonaws.com/autodelete7d/spark/test-20180802-062841-eHc7mdB3GD1WgGtj/dcos-spark-scala-tests-assembly-0.2-SNAPSHOT.jar'
...
Run job succeeded. Submission id: driver-20180802152929-0002
stdout
1.448: [GC (Allocation Failure) [PSYoungGen: 64512K->8448K(75264K)] 64512K->8456K(247296K), 0.0073157 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
Running MultiConfs app. Printing out all config values:
(spark.driver.extraJavaOptions,-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3)
(spark.jars,file:/mnt/mesos/sandbox/dcos-spark-scala-tests-assembly-0.2-SNAPSHOT.jar)
...
The issue seems to be that going through CI requires another layer of Python string parsing, which causes the Go parsing to fail (my guess is that additional quotes are added).
Thanks for the work on this @samvantran. The CI failure is a little disconcerting, but I do think that it's something we could narrow down in unit tests here.
The error comes from the call to _, err := submit.Parse(submitArgs) after the call to cleanUpSubmitArgs.
I noted that the unit tests drop the quotes around the multi-args, but in the CI output these are still present, which seems unexpected.
cli/dcos-spark/submit_builder.go
Outdated
if boolVal.flagName == current[2:] {
sparkArgs = append(sparkArgs, current)
i++
continue LOOP
What if we pass --supervise false or something similar? The code doesn't seem to handle this case explicitly (it seems as if this will be joined in the default case, though).
You're right, false will get joined with --supervise and the job will fail. I just tried it and we get a parse error:
Error when parsing --submit-args: unknown long flag '--supervise false'
In this case, it might make sense to check for boolean flags and, if found, check the next argument. If it's a boolean value like true/false, we could just move the pointer forward and skip it (i+=2).
More generally though, there are a lot of ways to mess up confs. Does it make sense to have some kind of blacklist that guards against 'user error'?
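To make the skip idea concrete, here is a rough sketch (hypothetical helper name; it assumes the package's existing sparkVal type plus the standard strings and strconv packages), not a drop-in implementation:

func skipBooleanValue(boolVals []*sparkVal, args []string, i int) (bool, bool) {
	// Returns (isBool, skipNext): whether args[i] names a known boolean flag,
	// and whether args[i+1] is a bare true/false that should be skipped (i += 2).
	flag := strings.TrimPrefix(args[i], "--")
	for _, v := range boolVals {
		if v.flagName != flag {
			continue
		}
		if i+1 < len(args) {
			if _, err := strconv.ParseBool(args[i+1]); err == nil {
				return true, true // e.g. "--supervise true"
			}
		}
		return true, false // bare "--supervise"
	}
	return false, false
}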
func (suite *CliTestSuite) TestCleanUpSubmitArgsWithMulitpleValues() {
_, args := sparkSubmitArgSetup()
inputArgs := "--conf spark.driver.extraJavaOptions='-XX:+PrintGC -Dparam1=val1 -Dparam2=val2' main.py 100"
expected := "--conf=spark.driver.extraJavaOptions=-XX:+PrintGC -Dparam1=val1 -Dparam2=val2"
Here is the problem. The quoting is stripped from the argument. In the CI, it is most likely the second -XX that is failing, but I would expect the -D to fail here as well.
To confirm, we may want to include tests that call buildSubmitJson with the relevant arguments too.
Yeah, I've tested it on CI where even just setting extraJavaOptions='val1' results in an error like "Cannot read file val1".
I think this is Python doing its own kind of parsing (wrapping everything in single quotes?) before it gets to the Go parsing, and that causes the issue.
In commit 7c294c7 I added that same config passed to buildSubmitJson and it still works.
BTW, I might have missed it in the cleanupSubmitArgs logic: when do the single quotes get stripped off?
Not in our logic; it happens in shellwords.Parse():
args, err := shellwords.Parse(argsStr)
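For readers following along, here is a small standalone illustration of that call (my own example, assuming the shellwords identifier refers to github.com/mattn/go-shellwords):

package main

import (
	"fmt"

	shellwords "github.com/mattn/go-shellwords"
)

func main() {
	in := `--conf spark.driver.extraJavaOptions='-XX:+PrintGCDetails -Dparam3=val3' main.py 100`
	args, err := shellwords.Parse(in)
	if err != nil {
		panic(err)
	}
	for _, a := range args {
		fmt.Printf("%q\n", a)
	}
	// Expected under shell-like quoting rules: the single quotes keep the java
	// options together as one token, but the quotes themselves are gone:
	//   "--conf"
	//   "spark.driver.extraJavaOptions=-XX:+PrintGCDetails -Dparam3=val3"
	//   "main.py"
	//   "100"
}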
@elezar Thanks for taking a look! Yes, the quotes are dropped from the multi-args, which works when it gets to Kingpin parsing. The problem is that the CI tests use Python and there are a bunch of places where ...
Hm, surprisingly when I run ... (*edit: see below)
run test: ...
test output: ...
Thanks for fixing this, @samvantran ! The code is more readable now, too.
cli/dcos-spark/submit_builder.go
Outdated
@@ -280,75 +284,61 @@ func parseApplicationFile(args *sparkArgs) error {
return nil
}

// we use Kingpin to parse CLI commands and options
// spark-submit by convention uses '--arg val' while kingpin only supports --arg=val
// cleanUpSubmitArgs transforms the former to the latter
func cleanUpSubmitArgs(argsStr string, boolVals []*sparkVal) ([]string, []string) {
Maybe this function is misnamed - "cleanup" sounds like simply removing extra spaces or special characters.
Hm, how about renaming it to transformSubmitArgs?
I'm not convinced that this is even needed. I just tested the quota command (also defined using kingpin) and the output is as follows:
➜ dcos-spark git:(DCOS-38138-shell-escape) ✗ dcos --log-level debug spark quota create --cpus 10 foo
Using CLI config file: /Users/elezar/.dcos/clusters/c173c4f2-c398-43c6-ab72-d46b99cb326a/dcos.toml
HTTP Query (96 byte payload): POST https://mvdrf110s.scaletesting.mesosphe.re/mesos/quota
{"role":"foo","guarantee":[{"name":"cpus","type":"SCALAR","scalar":{"value":10}}],"force":false}
^CUser interrupted command with Ctrl-C
➜ dcos-spark git:(DCOS-38138-shell-escape) ✗ dcos --log-level debug spark quota create --cpus=10 foo
Using CLI config file: /Users/elezar/.dcos/clusters/c173c4f2-c398-43c6-ab72-d46b99cb326a/dcos.toml
HTTP Query (96 byte payload): POST https://mvdrf110s.scaletesting.mesosphe.re/mesos/quota
{"role":"foo","guarantee":[{"name":"cpus","type":"SCALAR","scalar":{"value":10}}],"force":false}
^CUser interrupted command with Ctrl-C
Note that Kingpin handles both the --arg=value and --arg value forms the same way. We could just drop this function entirely.
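For context, a minimal standalone Kingpin example (mine, not from this repo; it assumes gopkg.in/alecthomas/kingpin.v2) showing that both flag forms parse identically:

package main

import (
	"fmt"

	kingpin "gopkg.in/alecthomas/kingpin.v2"
)

func main() {
	app := kingpin.New("demo", "flag-form demo")
	cpus := app.Flag("cpus", "number of cpus").Int()

	// Both the space-separated and the '=' forms parse to the same value.
	for _, args := range [][]string{{"--cpus", "10"}, {"--cpus=10"}} {
		if _, err := app.Parse(args); err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%v -> cpus=%d\n", args, *cpus)
	}
}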
If this is true, it will be a major ...
*edit: thinking it over, maybe its original intent was to simply separate the spark-submit args from the application args, but it may have morphed since then.
It might be worth testing deleting the cleanUpSubmitArgs function and seeing if all the tests still pass (or maybe adding an --app-args flag since that's needed for the application itself... but that would be a breaking API change). I also have a feeling that it does not handle long strings of args.
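To make the discussion concrete, here is a simplified, hypothetical sketch of that separate-and-rewrite behaviour. It is illustrative only, not the repo's implementation, and boolean flags such as --supervise would still need the special handling discussed earlier:

package main

import (
	"fmt"
	"strings"
)

// isAppArtifact reports whether a token looks like the Spark application artifact.
func isAppArtifact(token string) bool {
	for _, ext := range []string{".jar", ".py", ".R"} {
		if strings.HasSuffix(token, ext) {
			return true
		}
	}
	return false
}

// transformArgs splits tokens into spark-submit flags and application args,
// rewriting "--name value" pairs as "--name=value" for Kingpin and gluing
// stray tokens (e.g. a java option split on a space) back onto the previous flag.
func transformArgs(tokens []string) ([]string, []string) {
	var sparkArgs, appArgs []string
	for i := 0; i < len(tokens); i++ {
		current := strings.TrimSpace(tokens[i])
		switch {
		// the application artifact ends the flag section; the rest belongs to the app
		case isAppArtifact(current):
			sparkArgs = append(sparkArgs, current)
			appArgs = append(appArgs, tokens[i+1:]...)
			return sparkArgs, appArgs
		// "--flag value" becomes "--flag=value"
		case strings.HasPrefix(current, "--") && i+1 < len(tokens) &&
			!strings.HasPrefix(tokens[i+1], "--") && !isAppArtifact(tokens[i+1]):
			sparkArgs = append(sparkArgs, current+"="+tokens[i+1])
			i++
		// bare flag, or a continuation of a value that was split on an unescaped space
		default:
			if strings.HasPrefix(current, "--") || len(sparkArgs) == 0 {
				sparkArgs = append(sparkArgs, current)
			} else {
				sparkArgs[len(sparkArgs)-1] += " " + current
			}
		}
	}
	return sparkArgs, appArgs
}

func main() {
	tokens := []string{
		"--conf", "spark.driver.extraJavaOptions=-XX:+PrintGCDetails", "-Dparam3=val3",
		"--class", "MultiConfs",
		"http://example.com/app.jar", "100",
	}
	sparkArgs, appArgs := transformArgs(tokens)
	fmt.Printf("%q\n%q\n", sparkArgs, appArgs)
	// ["--conf=spark.driver.extraJavaOptions=-XX:+PrintGCDetails -Dparam3=val3" "--class=MultiConfs" "http://example.com/app.jar"]
	// ["100"]
}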
cli/dcos-spark/submit_builder.go
Outdated
current := strings.TrimSpace(args[i])
switch {
// if current is a spark jar/app, we've processed all flags; add jar to sparkArgs and append the rest to appArgs
case strings.HasSuffix(current, ".jar") || strings.HasSuffix(current, ".r") || strings.HasSuffix(current, ".py"):
Don't know if it matters, but previously the check for an R app used a capital R: https://github.com/mesosphere/spark-build/blob/3026e88dae1df008705905ac38cfde164b660220/cli/dcos-spark/submit_builder.go#L251; here it's lower case.
Good catch! It probably does matter so I'll change it back
Ping.
In fact, how about having an isAppArtifact function or something? (With unit tests to ensure that we don't subtly change the behaviour.)
done 6b32e5
i += 1
default:
// if not a flag or jar, current is a continuation of the last arg and should not have been split
// eg extraJavaOptions="-Dparam1 -Dparam2" was parsed as [extraJavaOptions, -Dparam1, -Dparam2]
Oh, this isn't handled by shellwords? (It sounded like it might based on the README.)
No, unfortunately the library splits the string on non-escaped spaces, so this default case glues the string back together.
tests/test_spark.py
Outdated
app_args="", | ||
expected_output="spark.executor.extraJavaOptions,-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3", | ||
service_name=service_name, | ||
args=["--conf spark.driver.extraJavaOptions='-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3'", |
Could you add a few special characters here as well? Maybe a -Dparam3=\"val3 val4\" like in the unit test?
println("Running MultiConfs app. Printing out all config values:") | ||
val APPNAME = "MultiConfs App" | ||
val conf = new SparkConf().setAppName(APPNAME) | ||
conf.getAll.foreach(println) |
One additional check that tests the whole thing end-to-end: pass in -Dparam1=\"valA valB\" as one of the extraJavaOptions, and then inside this app, you can check the value of System.getProperty("param1") to make sure it was set correctly.
Good idea, I'll update.
I think this may be the culprit: sdk_cmd.py#L138. Here's what the arguments look like in the CI job: ...
Interestingly, two lines up in the CI build log, you see that the dcos command about to be run looks normal: ... which means somewhere in this call, something goes wrong.
I did some digging and it looks like ... Recommended approaches are to instead ...
But really, what that means is that making this Spark CLI fix work in CI requires passing a shell-escaped string through a whole slew of languages. Maybe calling ...
Since Python's subprocess.run is using the underlying /bin/sh, we want to put the submit-args in a file so that it doesn't do its own escaping before reaching our Go CLI.
No need for an additional level of indirection. It's better if it's all self-contained within the actual test since there are slight changes that are required in different areas of the utils.
9ae7045 to 9f86664 (Compare)
What an adventure... for about a week I tried testing running a dcos spark run command via subprocess.run with cat cmd, entire file, sleeps, syncs, and chmods, all in an attempt to satisfy CI, and then finally found out CI was misconfigured and those stubs never made it to the test cluster. Once the PR to fix all the master branch failures goes in, this will likely be all that's needed to run the test.
f86b73f to a3e5a9a (Compare)
- add code for checking booleans with test
- rename cleanupSubmitArgs -> transformSubmitArgs
a3e5a9a to 36faae7 (Compare)
Due to the recent discovery of CI misconfiguration, it looks like this test may not have been actually working/testing what we suspected. In CI it does not look like the --auth-token is being included in the CreateSubmissionRequest for Spark and thus is not triggered in the mesos-integration-tests. This needs to be fixed but should not be in this PR.
@elezar @susanxhuynh hallelujah! CI finally passes. Please review, hopefully one last time 🙏 Quick recap: ...
test.sh
Outdated
@@ -10,12 +10,13 @@
# Exit immediately on errors
set -e

timestamp="$(date +%y%m%d-%H%M%S)" |
PR #398 covers this file change
Thanks for keeping up with this @samvantran.
I have made some comments with regards to the boolean flags, and I also think it's important to create a ticket to better understand the need for the translation between kingpin and spark-submit args.
// collapse two or more spaces to one.
argsCompacted := collapseSpacesPattern.ReplaceAllString(argsStr, " ")
// we use Kingpin to parse CLI commands and options
// spark-submit by convention uses '--arg val' while kingpin only supports --arg=val
Could we create a ticket to address this? Kingpin supports both --arg val and --arg=val.
here you go: https://jira.mesosphere.com/browse/DCOS-41107
Great! Thanks.
cli/dcos-spark/submit_builder.go
Outdated
isBool = true
next := args[idx+1]
if _, err := strconv.ParseBool(next); err == nil {
skipNext = true
This is not correct. What if someone calls the function with --supervise false? That should have different behaviour than --supervise true, but this is not captured here.
Also, as far as I know, named return values have fallen out of favour in idiomatic Go.
Ah... that's true. It may make sense to remove it from the list in the false case, since that is default Spark behavior.
As for point 2, I didn't know. I was just operating off the Go reference and thought it was a nifty feature.
I did some googling and found this as the latest reference, but it was in favor of named return values: https://blog.minio.io/golang-internals-part-2-nice-benefits-of-named-return-values-1e95305c8687
Can you point me to resources that argue the opposite? *edit: looks like there's a bit of discussion here: golang/go#20859
Anyway, as I mentioned below, maybe we shouldn't cover the true/false case anyhow since that would be a user not following the spark-cli instructions.
I may have just been recalling a recording of a gophercon talk that I watched. In this case, I don't think it's the named return values so much as the naked return that is the issue. I stand corrected.
Here is a PR review that makes the changes to the Go standard library and style guide:
- https://go-review.googlesource.com/c/go/+/20024
- https://github.com/golang/go/wiki/CodeReviewComments#named-result-parameters
For clarity here:
Named return values should only be used on public funcs and methods when it contributes to the documentation. Named return values should not be used if they're only saving the programmer a few lines of code inside the body of the function, especially if that means there's stutter in the documentation or it was only there so the programmer could use a naked return statement. (Naked returns should not be used except in very small functions.)
With regards to:
Anyway, as I mentioned below maybe we shouldn't cover true/false case anyhow since that would be a user not following spark-cli instructions
Yes, that would be ideal. With that said, this is exactly the reason why we should use a library to parse the arguments instead of doing it ourselves. Since you've already created a ticket to address this, we should instead focus on getting this PR done before tackling that.
Thanks for sharing. That all sounds good to me. When I first started learning Go and saw the naked returns (in other repos), I was confused. It wasn't until I read Effective Go that I learned its purpose.
I'm glad the community values readability so I'm +1 with you on this.
assert.True(suite.T(), isBool, "isBool should be true since --supervise is a Boolean flag")
assert.True(suite.T(), skipNext, "skipNext should be true since 'true' is a Boolean value")

inputArgs = []string{"--supervise", "false", "--conf", "spark.mesos.containerizer=mesos"}
What about --supervise with NO args?
I also don't see a test for the equivalence of:
- --supervise and --supervise true
- (missing --supervise) and --supervise false
The test above is for --supervise with no args: https://github.com/mesosphere/spark-build/pull/388/files/0ca555a0a3b9b95302d9635191732b1c68948407#diff-1ed61d6b454a1d6565ff2def51ff2487R100
As for equivalence, this is trickier. I can test for true because we set it on this line: https://github.com/mesosphere/spark-build/pull/388/files#diff-4b81b743bcae14a20623a9e94b14a92bR455, but false is Spark's default config, so if there is no supervise flag, then Spark takes care of it when the driver starts up, which is outside the scope of CLI parsing.
This kind of goes back to an earlier question: should we even need to handle cases for --supervise false? This is not a typical use case, and submitting that flag value in a vanilla Spark job (not even including DCOS) would be strange.
I'm all for not supporting --bool-flag BOOL_VAL options, but then we need to change the handling of the bool args. The thing to confirm would then be that we correctly handle the different cases:
- --supervise should pass
- --supervise true should error
- --supervise false should error
- --supervise https://some.jar should pass
Are these all the test cases that we need to cover?
Yes, and the CLI already provides such errors:
$ dcos spark run --name=spark-test --submit-args="\
--supervise false \
--class org.apache.spark.examples.SparkPi \
https://s3-us-west-2.amazonaws.com/infinity-artifacts/spark/spark-examples_2.11-2.2.1.jar 100"
2018/08/30 16:17:20 Error when parsing --submit-args: unknown long flag '--supervise false'
$ dcos spark run --name=spark-test --submit-args="\
--supervise true \
--class org.apache.spark.examples.SparkPi \
https://s3-us-west-2.amazonaws.com/infinity-artifacts/spark/spark-examples_2.11-2.2.1.jar 100"
2018/08/30 16:17:26 Error when parsing --submit-args: unknown long flag '--supervise true'
$ dcos spark run --name=spark-tes --submit-args="\
--supervise \
--class org.apache.spark.examples.SparkPi \
https://s3-us-west-2.amazonaws.com/infinity-artifacts/spark/spark-examples_2.11-2.2.1.jar 100"
Using image 'mesosphere/spark-dev:7b2bb6c6a4e7d0a79d4e44e15076904d2adc6ab1' for the driver and the executors (from dispatcher: container.docker.image).
To disable this image on executors, set spark.mesos.executor.docker.forcePullImage=false
Run job succeeded. Submission id: driver-20180830231733-0005
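If we also want to pin these cases down at the unit-test level, a table-driven sketch could look roughly like the following. Note that parseSubmitArgs is a hypothetical helper standing in for the transformSubmitArgs + submit.Parse path; this is not the repo's actual test code.

// Sketch only: parseSubmitArgs is a hypothetical wrapper around
// transformSubmitArgs followed by the Kingpin submit.Parse call.
func (suite *CliTestSuite) TestSuperviseFlagForms() {
	cases := []struct {
		submitArgs string
		expectErr  bool
	}{
		{"--supervise --class Foo http://example.com/app.jar", false},
		{"--supervise true --class Foo http://example.com/app.jar", true},
		{"--supervise false --class Foo http://example.com/app.jar", true},
		{"--supervise http://example.com/app.jar", false},
	}
	for _, c := range cases {
		err := parseSubmitArgs(c.submitArgs)
		if c.expectErr {
			assert.Error(suite.T(), err, c.submitArgs)
		} else {
			assert.NoError(suite.T(), err, c.submitArgs)
		}
	}
}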
@@ -2,7 +2,7 @@
"spark_version": "2.2.1",
"default_spark_dist": {
"hadoop_version": "2.6",
"uri": "https://downloads.mesosphere.com/spark/assets/spark-2.2.1-2-bin-2.6.tgz"
"uri": "svt-dev.s3.amazonaws.com/spark/spark-2.2.1-bin-2.6.5-DCOS-38138.tgz"
I guess this will be updated once the next release is cut?
Yup, or do you think I should first merge the mesosphere/spark PR (and revert the --proxy-user commit) off custom-branch-2.2.1-X? I could then create a new tgz and upload it to downloads.mesosphere.com/spark/assets/spark-2.2.1-3-bin-2.6.tgz (note the -3) and change this PR. That way I wouldn't have to do it twice?
Hm, I don't think we should keep overwriting the "2.2.1-3" tgz with different versions. Until we actually make a release, how about something like "2.2.1-3-SNAPSHOT" to signal that it's not an officially released version, but a dev one that will keep changing up until the release.
SNAPSHOT sounds best
If a user submits a Spark job with --supervise true/false, they weren't actually following instructions and we shouldn't be covering for them.
cli/dcos-spark/submit_builder.go
Outdated
func checkIfBooleanFlag(boolVals []*sparkVal, args []string, idx int) (isBool, skipNext bool) {
// check if current is a boolean flag (eg --supervise) but skip any accompanying boolean values (eg --supervise true)
current := args[idx]
for _, boolVal := range boolVals {
Looking at this again this morning. We should break out of this loop as soon as we have found the matching flag, either by returning or by calling break.
here you go: d0aae3c
cli/dcos-spark/submit_builder.go
Outdated
)

var keyWhitespaceValPattern = regexp.MustCompile("(.+)\\s+(.+)")
var backslashNewlinePattern = regexp.MustCompile("\\s*\\\\s*\\n\\s+")
var collapseSpacesPattern = regexp.MustCompile(`[\s\p{Zs}]{2,}`)

// AcceptedSparkAppExtensions lists the accepted extensions for Spark application artifacts
var AcceptedSparkAppExtensions = []string{
Does this have to be public? Also, we could make the list local to the only function where it is used.
Sorry, had a Scala slip of the brain there; forgot that caps make Go code public. I was aiming to denote it as a constant 🙃
Fixed in 7b2bb6c
Thanks @samvantran! Glad we can put this to bed.
7c9f1a9 to 7b2bb6c (Compare)
This commit makes it possible to pass multi-argument configuration strings in the spark-cli.
Resolves DCOS-38138 and DCOS-38462
This PR builds off mesosphere/spark#33 and updates the Spark CLI to make it possible to pass multiple --conf args in a single string, e.g. ...
Previously, only one value was accepted in a typical --conf arg=val1, and the command would fail if spaces were included. An example spark run command: