Cross-compile all shims from JDK17 to JDK8 #3
Conversation
Eliminate Logging inheritance to prevent shimming of unshimmable API classes

Signed-off-by: Gera Shegalov <[email protected]>
-object ShimLoader extends Logging {
-  logDebug(s"ShimLoader object instance: $this loaded by ${getClass.getClassLoader}")
+object ShimLoader {
+  val log = org.slf4j.LoggerFactory.getLogger(getClass().getName().stripSuffix("$"))
So much better than what we were previously doing! I don't know why we even bothered with the Logging trait.
It is/was a convenient pattern, but Logging is an internal class that would now need shimming, and that resulted in a chicken-and-egg problem for shim loading.
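For reference, here is a minimal sketch of the pattern being discussed (a standalone example, not the actual ShimLoader body): the object owns a plain slf4j logger instead of mixing in Spark's internal Logging trait, so no Spark-internal class needs to be shimmed.

import org.slf4j.{Logger, LoggerFactory}

object StandaloneLoggingExample {
  // stripSuffix("$") drops the trailing '$' of the Scala object's runtime class name,
  // so the logger is named after the conventional class rather than "...Example$".
  private val log: Logger =
    LoggerFactory.getLogger(getClass.getName.stripSuffix("$"))

  def load(): Unit = {
    // Replaces the logDebug(...) call that previously came from the Logging trait.
    log.debug(s"object instance: $this loaded by ${getClass.getClassLoader}")
  }
}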
@@ -173,12 +173,12 @@ function verify_same_sha_for_unshimmed() {
   # TODO currently RapidsShuffleManager is "removed" from /spark* by construction in
   # dist pom.xml via ant. We could delegate this logic to this script
   # and make both simmpler
-  if [[ ! "$class_file_quoted" =~ (com/nvidia/spark/rapids/spark[34].*/.*ShuffleManager.class|org/apache/spark/sql/rapids/shims/spark[34].*/ProxyRapidsShuffleInternalManager.class) ]]; then
+  if [[ ! "$class_file_quoted" =~ com/nvidia/spark/rapids/spark[34].*/.*ShuffleManager.class ]]; then
I don't understand why we got rid of the ProxyRapidsShuffleInternalManager
Tech debt; it should have been done in NVIDIA#6030.
LGTM
Javac is failing
Signed-off-by: Gera Shegalov <[email protected]>
The failing checks are the JDK8 builds. I fixed some issues; you can fix the rest.
Merged commit 96e0843 into razajafri:SP-9259-POM-changes
…s] (#10994)

* POM changes for Spark 4.0.0
  Signed-off-by: Raza Jafri <[email protected]>
* validate buildver and scala versions
* more pom changes
* fixed the scala-2.12 comment
* more fixes for scala-2.13 pom
* addressed comments
* add in shim check to account for 400
* add 400 for premerge tests against jdk 17
* temporarily remove 400 from snapshotScala213
* fixed 2.13 pom
* Remove 400 from jdk17 as it will compile with Scala 2.12
* github workflow changes
* added quotes to pom-directory
* update version defs to include scala 213 jdk 17
* Cross-compile all shims from JDK17 to JDK8
  Eliminate Logging inheritance to prevent shimming of unshimmable API classes
  Signed-off-by: Gera Shegalov <[email protected]>
* dummy
* undo api pom change
  Signed-off-by: Gera Shegalov <[email protected]>
* Add preview1 to the allowed shim versions
  Signed-off-by: Gera Shegalov <[email protected]>
* Scala 2.13 to require JDK17
  Signed-off-by: Gera Shegalov <[email protected]>
* Removed unused import left over from razajafri#3
* Setup JAVA_HOME before caching
* Only upgrade the Scala plugin for Scala 2.13
* Regenerate Scala 2.13 poms
* Remove 330 from JDK17 builds for Scala 2.12
* Revert "Remove 330 from JDK17 builds for Scala 2.12"
  This reverts commit 1faabd4.
* Downgrade scala.plugin.version for cloudera
* Updated comment to include the issue
* Upgrading the scala.maven.plugin version to 4.9.1 which is the same as Spark 4.0.0
* Downgrade scala-maven-plugin for Cloudera
* revert mvn verify changes
* Avoid cache for JDK 17
* removed cache dep from scala 213
* Added Scala 2.13 specific checks
* Handle the change for UnaryPositive now extending RuntimeReplaceable
* Removing 330 from jdk17.buildvers as we only support Scala 2.13, and fixing the environment variable in version-defs.sh that we read for building against JDK17 with Scala 213
* Update Scala 2.13 poms
* fixed scala2.13 verify to actually use the scala2.13/pom.xml
* Added missing csv files
* Skip Opcode tests
  There is a bytecode incompatibility which is why we are skipping these until we add support for it. For details please see issues #11174 and #10203.
* upmerged and fixed the new compile error introduced
* addressed review comments
* Removed jdk17 cloudera check and moved it inside the 321, 330 and 332 cloudera profiles
* fixed upmerge conflicts
* reverted renaming of id
* Fixed HiveGenericUDFShim
* addressed review comments
* reverted the debugging code
* generated Scala 2.13 poms

---------

Signed-off-by: Raza Jafri <[email protected]>
Signed-off-by: Gera Shegalov <[email protected]>
Co-authored-by: Gera Shegalov <[email protected]>
Additional changes
Testing
Ran smoke tests in spark-shell to the tune of
Run 4.0.0-preview1 with JDK 17
$ TZ=UTC JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64 \
  ~/dist/spark-4.0.0-preview1-bin-hadoop3/bin/spark-shell \
  --master local-cluster[2,1,1024] \
  --jars scala2.13/dist/target/rapids-4-spark_2.13-24.08.0-SNAPSHOT-cuda11.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.explain=ALL \
  --conf spark.rapids.memory.gpu.allocSize=1536m
Run 3.3.0 with JDK 8 with the same jar
$ TZ=UTC JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \
  ~/dist/spark-3.3.0-bin-hadoop3-scala2.13/bin/spark-shell \
  --master local-cluster[2,1,1024] \
  --jars scala2.13/dist/target/rapids-4-spark_2.13-24.08.0-SNAPSHOT-cuda11.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.explain=ALL \
  --conf spark.rapids.memory.gpu.allocSize=1536m
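For context, a smoke test in these shells can be as simple as the hypothetical snippet below (not the exact queries used here): run a trivial aggregation and confirm, via explain() and the spark.rapids.sql.explain=ALL output, that the work is placed on the GPU.

// Hypothetical spark-shell smoke check; `spark` is the session predefined by spark-shell.
val df = spark.range(0, 1000000L).selectExpr("id % 10 AS k", "id AS v")
val agg = df.groupBy("k").count()
agg.explain()   // with spark.rapids.sql.explain=ALL the driver log also reports GPU placement
agg.show()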
Note that RapidsShuffleManager does not work yet because the plugin and ShuffleManager (SM) initialization order was switched in Spark 4.0, so the plugin cannot call initialize on a null SM.
Signed-off-by: Gera Shegalov <[email protected]>