[SPARK-44376][BUILD] Fix maven build using scala 2.13 and Java 11 or later #41943
Conversation
A quick question: previously, the output artifacts were runnable on JDK 8 regardless of which JDK version was used to build them. Is that still true after this change? cc @LuciferYang
It's my bad, I only tested Java 11 and 17 under SBT and missed the Maven scenario when upgrading to Scala 2.13.11.
@eejbyfeldt Do you know why SBT does not fail? spark/project/SparkBuild.scala Lines 352 to 369 in 6c02bd0
Friendly ping @dongjoon-hyun: should we keep using Scala 2.13.8 in Spark 3.5.0? WDYT? Also cc @srowen
The failures are not directly related to Scala 2.13.11. They are caused by the maven plugin we are using for running scalac: for Scala versions greater than or equal to 2.13.9, the scala-maven-plugin adds the `-release` option to the scalac arguments when it is not specified explicitly. Based on the example provided in scala/bug#12824, compiling with `-release` is what makes the `sun.*` accesses fail here.
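For contrast, on the SBT side the scalac flags are handed to zinc directly, so nothing injects a `-release` flag behind the scenes. A minimal sketch of that kind of setting (illustrative only, not the actual contents of SparkBuild.scala):

```scala
// build.sbt-style sketch (hypothetical, not the real SparkBuild.scala):
// SBT passes these flags to zinc/scalac verbatim, so no -release argument is
// appended automatically the way newer scala-maven-plugin versions do for Maven.
Compile / scalacOptions ++= Seq(
  "-unchecked",
  "-deprecation",
  "-target:jvm-1.8" // hardcoded bytecode target, mirroring what pom.xml does
)
```

This is why the same Scala 2.13.11 upgrade compiled fine under SBT but broke the Maven build.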
I think the issue is that you end up compiling against a later version of the JDK libraries, which is not necessarily compatible, even if emitting bytecode for a lower Java version. However, yeah, we've always accepted that and we test it in CI/CD to make sure it works.
There are still issues when compiling 2.13 with a newer Java like 11 without setting java.version to 11. When the java.version property is not set, the scala-maven-plugin will set `-release` from the default `java.version` (8), and compiling against the Java 8 API on a newer JDK then fails on the `sun.*` accesses.
I think we target Java 8 for release builds and then want that to run on Java 11+. Does that work, or did that already work?
I think this is ok now.
I think the current issue is that we may have to specify the Java version explicitly through the `java.version` property. I did the following experiments: building with Java 17 without overriding `java.version` fails with the errors described above, and we have to pass `-Djava.version=17` to build with Java 17 now (this is the current usage in the GitHub Actions workflow).
I think the idea is you build with Java 8, and test with 11/17 - does that work? Or that is certainly what we want: to have one release that works across all the Java versions.
I think this is ok.
Any opinions on how we should proceed with this? It would be nice to have this fixed in the 3.5 branch, as not being able to build with Java 11 or newer is a regression compared to previous Spark releases.
Does this suggestion work? If it works, we don't need to manually specify `java.version`.
Added it to this PR. It seems to work based on my testing locally. I did not add it immediately as it seemed like a change in behavior compared to now, and it was not clear to me that everyone agreed that was the desired solution.
I think the issue is you will target Java 17 bytecode if running on 17, when we want to target 8 in all cases.
If that is the case then the changes currently in this PR are not what we want. But are we really sure that this is something that is expected or used? Because as far as I can tell this is not something that actually worked in the past. If I take a Spark 3.4.1 build that I built using the v3.4.1 tag on Java 11 and then try to run it on Java 8, it does not work.
I think the problem here boils down to the fact that we previously used the scalac arg `-target:jvm-1.8`, so we were only specifying the bytecode target and not restricting which JDK APIs the code could compile against.
Right, targeting Java 8 bytecode is necessary but not sufficient to run on Java 8. I think that's why we build releases on Java 8. These releases should still work on later Java releases, at least that's what the CI jobs are trying to test.
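To make the "necessary but not sufficient" point concrete, here is a minimal hypothetical snippet (not from Spark) showing how Java 8 bytecode can still break on a Java 8 runtime when it was compiled against a newer JDK's class library:

```scala
// Hypothetical example: compiled on JDK 11+ with -target:jvm-1.8 (and no -release),
// scalac emits Java 8 compatible bytecode but resolves methods against the JDK 11
// class library, so this compiles without complaint.
object BytecodeVsApi {
  def firstNonBlank(values: Seq[String]): Option[String] =
    values.find(s => !s.isBlank) // java.lang.String#isBlank exists only since JDK 11
}
// On a Java 8 runtime the isBlank call fails with NoSuchMethodError, which is why
// release artifacts are built on Java 8 even though the bytecode target is 8.
```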
@eejbyfeldt can we change the scalac arguments along those lines? I have made the following changes based on your PR and then tested: both Scala 2.12 and Scala 2.13 build successfully with Java 17.
Hardcoding a `-release` value does not really solve it: running with `-release 8` on a newer JDK fails on the `sun.*` accesses, and based on the discussion in scala/bug#12643 I believe this is the expected behavior.
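As an illustration of why `-release 8` rejects code that `-target:jvm-1.8` accepts, here is a minimal sketch (hypothetical object name; it mimics the kind of `sun.nio.ch.DirectBuffer` usage Spark relies on):

```scala
import java.nio.ByteBuffer
import sun.nio.ch.DirectBuffer // JDK-internal API, not part of the exported Java 8/11 API surface

object DirectBufferAddress {
  // With -target:jvm-1.8 on JDK 11+ this compiles: scalac only constrains the emitted
  // bytecode version and still resolves sun.nio.ch against the running JDK's classes.
  // With -release 8 the compiler checks the code against the Java 8 public API
  // signatures, where sun.nio.ch is not visible, so the import fails to resolve.
  def addressOf(buf: ByteBuffer): Long =
    buf.asInstanceOf[DirectBuffer].address()
}
```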
@eejbyfeldt Thank you for your response, I don't have any further suggestions for now.
Can we set java.version to 8?
The problem is that this does not work with the current code, as building with `java.version` set to 8 on Java 11 or later still hits the `-release 8` / `sun.*` failures.
OK, got it. The problem is, I don't think it helps to target Java 11 here, as it will just make the release unusable on Java 8, right? Is this workaround possible? scala/bug#12643 (comment) Or else, just don't further upgrade Scala 2.13 until Java 8 support is dropped. That could reasonably happen in Spark 4.
I made the following attempt and tested it using Java 17: it seems that even if we upgrade to Java 11, we still need to keep the values of `java.version` and the corresponding scalac arguments in sync with the JDK used for the build.
Due to this issue, should we downgrade the Scala 2.13 version to 2.13.8 in branch-3.5? Also cc @dongjoon-hyun
Yeah, I also confirmed this fails by testing the 3.5.0 RC1 build. If you build with Java 8 but test on Java 17, it won't work. We want this combination to work. If 2.13.8 still works as before, I believe we should use that. (And I think we should drop Java 8 support in 4.0.)
I submitted a PR to test the `scala-maven-plugin` downgrade; its description follows.
### What changes were proposed in this pull request?
This pr downgrades `scala-maven-plugin` to version 4.7.1 to avoid it automatically adding the `-release` option as a Scala compilation argument.

### Why are the changes needed?
The `scala-maven-plugin` versions 4.7.2 and later will try to automatically append the `-release` option as a Scala compilation argument when it is not specified by the user:
1. 4.7.2 and 4.8.0: try to add the `-release` option for Scala versions 2.13.9 and higher.
2. 4.8.1: try to append the `-release` option for Scala versions 2.12.x/2.13.x/3.1.1, and append `-java-output-version` for Scala 3.1.2.

The addition of the `-release` option has caused the issues mentioned in SPARK-44376 (#41943) and #40442 (comment). This is because the `-release` option imposes stronger compilation restrictions than `-target`, ensuring not only the bytecode format but also that the APIs used in the code are compatible with the specified version of Java. However, many APIs in the `sun.*` packages are not `exports` in Java 11, 17, and 21, such as `sun.nio.ch.DirectBuffer`, `sun.util.calendar.ZoneInfo`, and `sun.nio.cs.StreamDecoder`, making them invisible when compiling across different versions.

For discussions within the Scala community, see scala/bug#12643, scala/bug#12824, and scala/bug#12866, but this is not a bug. I have also submitted an issue to the `scala-maven-plugin` community to discuss the possibility of adding additional settings to control the addition of the `-release` option: davidB/scala-maven-plugin#722.

For Apache Spark 4.0, in the short term, I suggest downgrading `scala-maven-plugin` to version 4.7.1 to avoid it automatically adding the `-release` option as a Scala compilation argument. In the long term, we should reduce the use of APIs that are not `exports`, for compatibility with the `-release` compilation option, since `-target` is already deprecated as of Scala 2.13.9.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manual check

Run `git revert 656bf36` to revert to using Scala 2.13.11 and run `dev/change-scala-version.sh 2.13` to change Scala to 2.13.

1. Run `build/mvn clean install -DskipTests -Pscala-2.13 -X` to check the Scala compilation arguments.
Before ``` [[DEBUG] [zinc] Running cached compiler 1992eaf4 for Scala compiler version 2.13.11 [DEBUG] [zinc] The Scala compiler is invoked with: -unchecked -deprecation -feature -explaintypes -target:jvm-1.8 -Wconf:cat=deprecation:wv,any:e -Wunused:imports -Wconf:cat=scaladoc:wv -Wconf:cat=lint-multiarg-infix:wv -Wconf:cat=other-nullary-override:wv -Wconf:cat=other-match-analysis&site=org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction.catalogFunction:wv -Wconf:cat=other-pure-statement&site=org.apache.spark.streaming.util.FileBasedWriteAheadLog.readAll.readFile:wv -Wconf:cat=other-pure-statement&site=org.apache.spark.scheduler.OutputCommitCoordinatorSuite.<local OutputCommitCoordinatorSuite>.futureAction:wv -Wconf:msg=^(?=.*?method|value|type|object|trait|inheritance)(?=.*?deprecated)(?=.*?since 2.13).+$:s -Wconf:msg=^(?=.*?Widening conversion from)(?=.*?is deprecated because it loses precision).+$:s -Wconf:msg=Auto-application to \`\(\)\` is deprecated:s -Wconf:msg=method with a single empty parameter list overrides method without any parameter list:s -Wconf:msg=method without a parameter list overrides a method with a single empty one:s -Wconf:cat=deprecation&msg=procedure syntax is deprecated:e -Wconf:cat=unchecked&msg=outer reference:s -Wconf:cat=unchecked&msg=eliminated by erasure:s -Wconf:msg=^(?=.*?a value of type)(?=.*?cannot also be).+$:s -Wconf:cat=unused-imports&src=org\/apache\/spark\/graphx\/impl\/VertexPartitionBase.scala:s -Wconf:cat=unused-imports&src=org\/apache\/spark\/graphx\/impl\/VertexPartitionBaseOps.scala:s -Wconf:msg=Implicit definition should have explicit type:s -release 8 -bootclasspath ... ``` After ``` [DEBUG] [zinc] Running cached compiler 72dd4888 for Scala compiler version 2.13.11 [DEBUG] [zinc] The Scala compiler is invoked with: -unchecked -deprecation -feature -explaintypes -target:jvm-1.8 -Wconf:cat=deprecation:wv,any:e -Wunused:imports -Wconf:cat=scaladoc:wv -Wconf:cat=lint-multiarg-infix:wv -Wconf:cat=other-nullary-override:wv -Wconf:cat=other-match-analysis&site=org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction.catalogFunction:wv -Wconf:cat=other-pure-statement&site=org.apache.spark.streaming.util.FileBasedWriteAheadLog.readAll.readFile:wv -Wconf:cat=other-pure-statement&site=org.apache.spark.scheduler.OutputCommitCoordinatorSuite.<local OutputCommitCoordinatorSuite>.futureAction:wv -Wconf:msg=^(?=.*?method|value|type|object|trait|inheritance)(?=.*?deprecated)(?=.*?since 2.13).+$:s -Wconf:msg=^(?=.*?Widening conversion from)(?=.*?is deprecated because it loses precision).+$:s -Wconf:msg=Auto-application to \`\(\)\` is deprecated:s -Wconf:msg=method with a single empty parameter list overrides method without any parameter list:s -Wconf:msg=method without a parameter list overrides a method with a single empty one:s -Wconf:cat=deprecation&msg=procedure syntax is deprecated:e -Wconf:cat=unchecked&msg=outer reference:s -Wconf:cat=unchecked&msg=eliminated by erasure:s -Wconf:msg=^(?=.*?a value of type)(?=.*?cannot also be).+$:s -Wconf:cat=unused-imports&src=org\/apache\/spark\/graphx\/impl\/VertexPartitionBase.scala:s -Wconf:cat=unused-imports&src=org\/apache\/spark\/graphx\/impl\/VertexPartitionBaseOps.scala:s -Wconf:msg=Implicit definition should have explicit type:s -target:8 -bootclasspath ... ``` After downgrading the version, the `-release` option should no longer appear in the compilation arguments. 2. Maven can build the project with Java 17 without the issue described in #41943. 
And after this pr, we can re-upgrade Scala 2.13 to Scala 2.13.11.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #42899 from LuciferYang/SPARK-45144.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Closing this as it is no longer relevant. In the 3.5 branch, Scala 2.13 was downgraded, and the update will be done in #42918.
### What changes were proposed in this pull request?
This PR aims to re-upgrade Scala to 2.13.11; after SPARK-45144 was merged, the build issues mentioned in #41943 should no longer exist.

- https://www.scala-lang.org/news/2.13.11

Additionally, this pr adds a new suppression rule for the warning message `Implicit definition should have explicit type`. This is a new compile check introduced by scala/scala#10083; we must fix it when we upgrade to use Scala 3.

### Why are the changes needed?
This release improves collections, adds support for JDK 20 and 21, and adds support for JDK 17 `sealed`:
- scala/scala#10363
- scala/scala#10184
- scala/scala#10397
- scala/scala#10348
- scala/scala#10105

There are 2 known issues in this version:
- scala/bug#12800
- scala/bug#12799

For the first one, there are no compilation warning messages related to `match may not be exhaustive` in the Spark compile log. For the second one, there is no use of `method.isAnnotationPresent(Deprecated.class)` in the Spark code; there is just https://github.com/apache/spark/blob/8c84d2c9349d7b607db949c2e114df781f23e438/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala#L130 in the Spark code, and I checked that `javax.annotation.Nonnull` does not have this issue. So I think these two issues will not affect Spark itself, but this doesn't mean they won't affect code written by end users.

The full release notes are as follows:
- https://github.com/scala/scala/releases/tag/v2.13.11

### Does this PR introduce _any_ user-facing change?
Yes, this is a Scala version change.

### How was this patch tested?
- Existing Test

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #42918 from LuciferYang/SPARK-40497-2.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
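For reference, the new `Implicit definition should have explicit type` check mentioned above fires on definitions like the first one in this sketch (hypothetical names, not Spark code):

```scala
object ImplicitTypeCheck {
  // Scala 2.13.11 warns here: the implicit's result type is inferred, not declared.
  implicit val inferredOrdering = Ordering.fromLessThan[Int](_ > _)

  // Declaring the type explicitly satisfies the new check (and eases Scala 3 migration).
  implicit val explicitOrdering: Ordering[String] = Ordering.String.reverse
}
```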
What changes were proposed in this pull request?
Drop the hardcoded `-target:jvm-1.8` value from the scalac arguments in pom.xml.

Why are the changes needed?
Building with maven is broken using 2.13 and Java 11 or later. It fails with one error if setting the `java.version` property, and with another if not.

This is caused by the fact that in pom.xml we hardcode that scalac should run with `-target:jvm-1.8` (regardless of the value of `java.version`). This was fine for scala 2.12.18 and scala 2.13.8, as the scala-maven-plugin would add the `-target` arg based on the `java.version` property (https://github.com/davidB/scala-maven-plugin/blob/4.8.0/src/main/java/scala_maven/ScalaMojoSupport.java#L629-L648); since that argument comes later on the command line, it took precedence over the value we hardcoded in maven, and everything worked as expected.

The problem comes in scala 2.13.11, where `-target` is deprecated and the scala-maven-plugin therefore uses the `-release` argument instead. The failure about not being able to access the `sun._` packages is expected behavior when using `-release 8` (see scala/bug#12643), but if one sets `-release 11` when using Java 11, the `sun._` accesses compile just fine.

Note: builds using scala 2.13 and Java 11 or later without setting `java.version` to the appropriate value will still fail.

Note2: the Java 8 builds still succeed, as the `rt.jar` is passed on the `-bootclasspath` when using Java 8.

Does this PR introduce any user-facing change?
Fixes the maven build when using scala 2.13 and Java 11 or later.

How was this patch tested?
Existing CI builds and manual builds locally.