[SPARK-20756][YARN] yarn-shuffle jar references unshaded guava and contains scala classes

## What changes were proposed in this pull request?
This change ensures that all references to guava from within the yarn shuffle jar point to the shaded guava classes already provided in the jar.

Also, it explicitly excludes scala classes from being added to the jar.

## How was this patch tested?
Ran unit tests on the module and they passed.
javap now returns the expected result: a reference to the shaded guava under `org/spark_project` (previously this referred to `com.google...`).
```
javap -cp common/network-yarn/target/scala-2.11/spark-2.3.0-SNAPSHOT-yarn-shuffle.jar -c org/apache/spark/network/yarn/YarnShuffleService | grep Lists
      57: invokestatic  #138                // Method org/spark_project/guava/collect/Lists.newArrayList:()Ljava/util/ArrayList;
```
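
The `grep Lists` check above covers a single guava class. A slightly broader check, sketched here under the assumption that unshaded references show up in `javap -c` output under the `com/google/common/` path, filters for any such reference. The helper name `find_unshaded_guava` is hypothetical and not part of the patch.

```shell
# Hypothetical helper (not part of the patch): reads `javap -c` output on
# stdin and prints every line that still references unshaded guava.
# Prints nothing when all references have been relocated.
find_unshaded_guava() {
  # `|| true` keeps the exit status at 0 when grep finds no match.
  grep -E 'com/google/common/' || true
}

# Usage, with the same jar and class as above:
#   javap -cp common/network-yarn/target/scala-2.11/spark-2.3.0-SNAPSHOT-yarn-shuffle.jar \
#     -c org/apache/spark/network/yarn/YarnShuffleService | find_unshaded_guava
```

Empty output from the helper would suggest the class no longer references `com.google.common` directly.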

Guava is still shaded in the jar:
```
jar -tf common/network-yarn/target/scala-2.11/spark-2.3.0-SNAPSHOT-yarn-shuffle.jar | grep guava | head
META-INF/maven/com.google.guava/
META-INF/maven/com.google.guava/guava/
META-INF/maven/com.google.guava/guava/pom.properties
META-INF/maven/com.google.guava/guava/pom.xml
org/spark_project/guava/
org/spark_project/guava/annotations/
org/spark_project/guava/annotations/Beta.class
org/spark_project/guava/annotations/GwtCompatible.class
org/spark_project/guava/annotations/GwtIncompatible.class
org/spark_project/guava/annotations/VisibleForTesting.class
```
(not sure if the above META-INF/* is a problem or not)
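
The exclusion of scala classes can be checked against the same `jar -tf` listing. This is a minimal sketch, assuming scala-library classes would appear under a top-level `scala/` directory in the jar; the helper name `count_scala_entries` is hypothetical.

```shell
# Hypothetical helper (not part of the patch): reads a `jar -tf` listing on
# stdin and prints how many entries are class files under a top-level
# scala/ directory.
count_scala_entries() {
  # grep -c prints 0 and exits non-zero when nothing matches; `|| true`
  # keeps the exit status at 0 in that case.
  grep -c -E '^scala/.*\.class$' || true
}

# Usage:
#   jar -tf common/network-yarn/target/scala-2.11/spark-2.3.0-SNAPSHOT-yarn-shuffle.jar \
#     | count_scala_entries
```

A count of 0 is what the `scala-library` exclusion is expected to produce.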

I took this jar, deployed it on a yarn cluster with shuffle service enabled, and made sure the YARN node managers came up. An application with a shuffle was run and it succeeded.

Author: Mark Grover <[email protected]>

Closes apache#17990 from markgrover/spark-20756.

(cherry picked from commit 3630911)
Signed-off-by: Marcelo Vanzin <[email protected]>
markgrover authored and Marcelo Vanzin committed May 22, 2017
1 parent c3a986b commit f5ef076
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion common/network-yarn/pom.xml
@@ -87,6 +87,9 @@
 <includes>
 <include>*:*</include>
 </includes>
+<excludes>
+<exclude>org.scala-lang:scala-library</exclude>
+</excludes>
 </artifactSet>
 <filters>
 <filter>
@@ -98,7 +101,7 @@
 </excludes>
 </filter>
 </filters>
-<relocations>
+<relocations combine.children="append">
 <relocation>
 <pattern>com.fasterxml.jackson</pattern>
 <shadedPattern>${spark.shade.packageName}.com.fasterxml.jackson</shadedPattern>
