[SPARK-50135][BUILD] Upgrade ZooKeeper to 3.9.3 #48771
Conversation
We should change to exclude it from ZooKeeper instead of
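The exclusion approach suggested above could be sketched roughly as follows in sbt. This is an illustrative assumption, not the PR's actual change (the original comment's inline code was not captured, and Spark declares this dependency in its Maven POMs rather than directly in sbt):

```scala
// Hypothetical sketch: exclude the Netty artifacts that ZooKeeper pulls in
// transitively, so the build's own pinned Netty version is the only one
// resolved. The ExclusionRule matches every module under the io.netty group.
libraryDependencies += ("org.apache.zookeeper" % "zookeeper" % "3.9.3")
  .excludeAll(ExclusionRule(organization = "io.netty"))
```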
However, I believe it would be better to upgrade Netty to 4.1.113, because ZooKeeper 3.9.3 was compiled against Netty 4.1.113. Downgrading it to 4.1.110 may introduce uncertainties and potential issues.
There is an issue with the version
I have verified that the compilation issue with
project/SparkBuild.scala (outdated)
    dependencyOverrides += "com.google.guava" % "guava" % guavaVersion,
    dependencyOverrides += "jline" % "jline" % "2.14.6",
    dependencyOverrides += "org.apache.avro" % "avro" % "1.11.3")
    dependencyOverrides ++= {
@LuciferYang @dongjoon-hyun
Do you want to keep this? Although the issue was fixed when Netty was upgraded to version 4.1.114, similar problems may still occur the next time other libraries depend on different Netty versions.
project/SparkBuild.scala (outdated)
    dependencyOverrides ++= {
      val nettyVersion = SbtPomKeys.effectivePom.value.getProperties.get(
        "netty.version").asInstanceOf[String]
      Seq(
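For context, the override block under discussion (since removed from the PR) could be sketched roughly as below. The Netty module list is an assumption for illustration, since the diff above is truncated; the idea is to pin sbt's resolved Netty artifacts to the `netty.version` property from the effective Maven POM so the two builds agree:

```scala
// Sketch only; the exact artifact list in the PR is not shown above.
// Reads netty.version from the effective POM and forces every listed
// Netty module to that version during sbt dependency resolution.
dependencyOverrides ++= {
  val nettyVersion = SbtPomKeys.effectivePom.value.getProperties.get(
    "netty.version").asInstanceOf[String]
  Seq(
    "io.netty" % "netty-all" % nettyVersion,
    "io.netty" % "netty-handler" % nettyVersion,
    "io.netty" % "netty-transport-native-epoll" % nettyVersion
  )
}
```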
I would personally prefer not to add this part of the configuration.
Yeah ...
Updated.
cc @HyukjinKwon and @zhengruifeng
I am fine as long as the tests work :-).
Let's remove that part and proceed to merge cleanly, @panbingkun .
@panbingkun can you also simply check with simple queries like
The whole PySpark Connect was broken before, including all pyspark-connect tests and the REPL.
The verification process based on
./build/sbt -Phadoop-3 -Pkubernetes -Pkinesis-asl -Phive-thriftserver -Pdocker-integration-tests -Pyarn -Phadoop-cloud -Pspark-ganglia-lgpl -Phive -Pjvm-profiler clean package
[info] Note: Some input files use or override a deprecated API.
[info] Note: Recompile with -Xlint:deprecation for details.
[warn] multiple main classes detected: run 'show discoveredMainClasses' to see the list
[success] Total time: 251 s (04:11), completed Nov 7, 2024, 9:23:18 AM
(pyspark) ➜ spark-community git:(SPARK-50135_FOLLOWUP) ✗ ls -1 assembly/target/scala-2.13/jars/netty-*1*
assembly/target/scala-2.13/jars/netty-all-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-buffer-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-http-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-http2-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-socks-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-common-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-handler-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-handler-proxy-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-resolver-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-classes-epoll-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-classes-kqueue-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-native-epoll-4.1.114.Final-linux-aarch_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-epoll-4.1.114.Final-linux-riscv64.jar
assembly/target/scala-2.13/jars/netty-transport-native-epoll-4.1.114.Final-linux-x86_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-kqueue-4.1.114.Final-osx-aarch_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-kqueue-4.1.114.Final-osx-x86_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-unix-common-4.1.114.Final.jar
(pyspark) ➜ spark-community git:(SPARK-50135_FOLLOWUP) ✗ ./python/run-tests --python-executables=python3 --testnames "pyspark.sql.tests.connect.test_connect_collection"
Running PySpark tests. Output is in /Users/panbingkun/Developer/spark/spark-community/python/unit-tests.log
Will test against the following Python executables: ['python3']
Will test the following Python tests: ['pyspark.sql.tests.connect.test_connect_collection']
python3 python_implementation is CPython
python3 version is: Python 3.9.19
Starting test(python3): pyspark.sql.tests.connect.test_connect_collection (temp output: /Users/panbingkun/Developer/spark/spark-community/python/target/2ed7ab5b-39f0-41f7-85c5-0fdcab0b0b2a/python3__pyspark.sql.tests.connect.test_connect_collection__61isowkl.log)
Finished test(python3): pyspark.sql.tests.connect.test_connect_collection (13s)
Tests passed in 13 seconds
(pyspark) ➜ spark-community git:(SPARK-50135_FOLLOWUP) ✗ ./bin/pyspark --remote "sc://127.0.0.1:8888"
Python 3.9.19 (main, Mar 21 2024, 12:07:41)
[Clang 14.0.6 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 4.0.0.dev0
/_/
Using Python version 3.9.19 (main, Mar 21 2024 12:07:41)
Client connected to the Spark Connect server at 127.0.0.1:8888
SparkSession available as 'spark'.
>>> spark.range(10).collect()
[Row(id=0), Row(id=1), Row(id=2), Row(id=3), Row(id=4), Row(id=5), Row(id=6), Row(id=7), Row(id=8), Row(id=9)]
./build/mvn -DskipTests -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pjvm-profiler -Pspark-ganglia-lgpl -Pkinesis-asl clean package -T4
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 15:16 min (Wall Clock)
[INFO] Finished at: 2024-11-07T09:43:02+08:00
[INFO] ------------------------------------------------------------------------
(pyspark) ➜ spark-community git:(SPARK-50135_FOLLOWUP) ✗ ls -1 assembly/target/scala-2.13/jars/netty-*1*
assembly/target/scala-2.13/jars/netty-all-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-buffer-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-http-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-http2-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-socks-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-common-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-handler-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-handler-proxy-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-resolver-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-classes-epoll-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-classes-kqueue-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-native-epoll-4.1.114.Final-linux-aarch_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-epoll-4.1.114.Final-linux-riscv64.jar
assembly/target/scala-2.13/jars/netty-transport-native-epoll-4.1.114.Final-linux-x86_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-kqueue-4.1.114.Final-osx-aarch_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-kqueue-4.1.114.Final-osx-x86_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-unix-common-4.1.114.Final.jar
(pyspark) ➜ spark-community git:(SPARK-50135_FOLLOWUP) ✗ ./python/run-tests --python-executables=python3 --testnames "pyspark.sql.tests.connect.test_connect_collection"
Running PySpark tests. Output is in /Users/panbingkun/Developer/spark/spark-community/python/unit-tests.log
Will test against the following Python executables: ['python3']
Will test the following Python tests: ['pyspark.sql.tests.connect.test_connect_collection']
python3 python_implementation is CPython
python3 version is: Python 3.9.19
Starting test(python3): pyspark.sql.tests.connect.test_connect_collection (temp output: /Users/panbingkun/Developer/spark/spark-community/python/target/7a50ed3b-267c-4891-903d-951aaf75f74c/python3__pyspark.sql.tests.connect.test_connect_collection__vqf968_3.log)
Finished test(python3): pyspark.sql.tests.connect.test_connect_collection (16s)
Tests passed in 16 seconds
(pyspark) ➜ spark-community git:(SPARK-50135_FOLLOWUP) ✗ ./bin/pyspark --remote "sc://127.0.0.1:8888"
Python 3.9.19 (main, Mar 21 2024, 12:07:41)
[Clang 14.0.6 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 4.0.0.dev0
/_/
Using Python version 3.9.19 (main, Mar 21 2024 12:07:41)
Client connected to the Spark Connect server at 127.0.0.1:8888
SparkSession available as 'spark'.
>>> spark.range(10).collect()
[Row(id=0), Row(id=1), Row(id=2), Row(id=3), Row(id=4), Row(id=5), Row(id=6), Row(id=7), Row(id=8), Row(id=9)]
Done, the verification process has been posted.
Done.
+1, LGTM. Thank you, @panbingkun and all.
Merged to master.
Thanks all @LuciferYang @zhengruifeng @HyukjinKwon @dongjoon-hyun !
What changes were proposed in this pull request?
This PR aims to upgrade ZooKeeper from 3.9.2 to 3.9.3. It also fixes potential issues with PR #48666.
Why are the changes needed?
The full release notes: https://zookeeper.apache.org/doc/r3.9.3/releasenotes.html
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
No.