
[SPARK-50135][BUILD] Upgrade ZooKeeper to 3.9.3 #48771

Closed
wants to merge 2 commits

Conversation

panbingkun (Contributor) commented Nov 6, 2024

What changes were proposed in this pull request?

This PR upgrades ZooKeeper from 3.9.2 to 3.9.3.
It also fixes potential issues with PR #48666.

Why are the changes needed?

The full release notes: https://zookeeper.apache.org/doc/r3.9.3/releasenotes.html

Does this PR introduce any user-facing change?

No.

How was this patch tested?

  • Passes GA (GitHub Actions).
  • Manual check:
./build/sbt -Phadoop-3 -Pkubernetes -Pkinesis-asl -Phive-thriftserver -Pdocker-integration-tests -Pyarn -Phadoop-cloud -Pspark-ganglia-lgpl -Phive -Pjvm-profiler clean package

[info] Note: Some input files use or override a deprecated API.
[info] Note: Recompile with -Xlint:deprecation for details.
[warn] multiple main classes detected: run 'show discoveredMainClasses' to see the list
[success] Total time: 272 s (04:32), completed Nov 6, 2024, 4:29:52 PM
(pyspark) ➜  spark-community git:(SPARK-50135_FOLLOWUP) ✗ ./python/run-tests --python-executables=python3 --testnames "pyspark.sql.tests.connect.test_connect_collection"
Running PySpark tests. Output is in /Users/panbingkun/Developer/spark/spark-community/python/unit-tests.log
Will test against the following Python executables: ['python3']
Will test the following Python tests: ['pyspark.sql.tests.connect.test_connect_collection']
python3 python_implementation is CPython
python3 version is: Python 3.9.19
Starting test(python3): pyspark.sql.tests.connect.test_connect_collection (temp output: /Users/panbingkun/Developer/spark/spark-community/python/target/097bd7e0-9311-4484-ae2d-c0f4c63fc6f9/python3__pyspark.sql.tests.connect.test_connect_collection__8dzaeio9.log)
Finished test(python3): pyspark.sql.tests.connect.test_connect_collection (14s)
Tests passed in 14 seconds

Was this patch authored or co-authored using generative AI tooling?

No.

LuciferYang (Contributor)

We should change to excluding

<exclusion>
  <groupId>io.netty</groupId>
  <artifactId>*</artifactId>
</exclusion>

from zookeeper, instead of

<exclusion>
  <groupId>org.jboss.netty</groupId>
  <artifactId>netty</artifactId>
</exclusion>

org.jboss.netty is the groupId of Netty 3.x.

However, I believe it would be better to upgrade Netty to 4.1.113, because ZooKeeper 3.9.3 was compiled against Netty 4.1.113. Downgrading it to use 4.1.110 may introduce uncertainties and potential issues.
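For illustration, the exclude-from-ZooKeeper approach would look roughly like this in a Maven POM (a sketch only; the ${zookeeper.version} property name and the surrounding dependency context are assumptions, not the exact Spark pom.xml):

```xml
<dependency>
  <groupId>org.apache.zookeeper</groupId>
  <artifactId>zookeeper</artifactId>
  <version>${zookeeper.version}</version>
  <exclusions>
    <!-- Exclude all Netty 4.x artifacts pulled in transitively,
         so the version pinned by the project itself wins. -->
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

With a wildcard io.netty exclusion, the Netty version actually shipped is whichever one the project declares directly, so a ZooKeeper upgrade can no longer drag in a mismatched Netty.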

panbingkun (Contributor, Author)

We should change to excluding

<exclusion>
  <groupId>io.netty</groupId>
  <artifactId>*</artifactId>
</exclusion>

from zookeeper, instead of

<exclusion>
  <groupId>org.jboss.netty</groupId>
  <artifactId>netty</artifactId>
</exclusion>

org.jboss.netty is the groupId of Netty 3.x.

However, I believe it would be better to upgrade Netty to 4.1.113, because ZooKeeper 3.9.3 was compiled against Netty 4.1.113. Downgrading it to use 4.1.110 may introduce uncertainties and potential issues.

Version 4.1.113 has an issue, but 4.1.114 appears to fix the problem, which had been present since 4.1.112. Let's upgrade Netty to 4.1.114 first.
#46945

panbingkun (Contributor, Author)

I have verified that the sbt compilation issue is resolved after upgrading the Netty version to 4.1.114. The unit test ./python/run-tests --python-executables=python3 --testnames "pyspark.sql.tests.connect.test_connect_collection" also passes.

@panbingkun panbingkun marked this pull request as ready for review November 6, 2024 10:44
@panbingkun panbingkun marked this pull request as draft November 6, 2024 10:44
dependencyOverrides += "com.google.guava" % "guava" % guavaVersion,
dependencyOverrides += "jline" % "jline" % "2.14.6",
dependencyOverrides += "org.apache.avro" % "avro" % "1.11.3")
dependencyOverrides ++= {
panbingkun (Contributor, Author) commented Nov 6, 2024

@LuciferYang @dongjoon-hyun
Do you want to keep this? Although the issue was fixed by upgrading Netty to 4.1.114, similar problems may recur whenever other libraries depend on different Netty versions.

@panbingkun panbingkun marked this pull request as ready for review November 6, 2024 11:09
dependencyOverrides ++= {
  val nettyVersion = SbtPomKeys.effectivePom.value.getProperties.get(
    "netty.version").asInstanceOf[String]
  Seq(
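The diff fragment above is truncated at Seq(. A sketch of how such an override block could continue, pinning Netty modules to the Maven-declared netty.version so sbt resolves the same version as Maven (the module list below is illustrative, not the actual PR content):

```scala
dependencyOverrides ++= {
  // Read netty.version from the effective Maven POM so sbt and Maven agree.
  val nettyVersion = SbtPomKeys.effectivePom.value.getProperties.get(
    "netty.version").asInstanceOf[String]
  Seq(
    "io.netty" % "netty-all"     % nettyVersion,
    "io.netty" % "netty-common"  % nettyVersion,
    "io.netty" % "netty-handler" % nettyVersion
  )
}
```

dependencyOverrides forces the resolved version without adding a direct dependency, so transitive Netty versions pulled in by ZooKeeper or other libraries are overridden uniformly.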
Contributor

I would personally prefer not to add this part of the configuration.

Member

Yeah ...

Contributor Author

Updated.

dongjoon-hyun (Member) left a comment

HyukjinKwon (Member)

I am fine as long as the tests work :-).

dongjoon-hyun (Member)

Let's remove that part and proceed to merge cleanly, @panbingkun .

zhengruifeng (Contributor) commented Nov 6, 2024

@panbingkun can you also simply check bin/pyspark --remote "local[*]"?

with simple queries like

In [1]: spark.range(10).collect()
24/11/06 10:50:00 WARN CheckAllocator: More than one DefaultAllocationManager on classpath. Choosing first found
Out[1]:
[Row(id=0),
 Row(id=1),
 Row(id=2),
 Row(id=3),
 Row(id=4),
 Row(id=5),
 Row(id=6),
 Row(id=7),
 Row(id=8),
 Row(id=9)]

The whole pyspark connect was broken before, including all pyspark-connect tests and the REPL.

panbingkun (Contributor, Author) commented Nov 7, 2024

The verification process based on SBT and Maven compilation is as follows:

  • Build with SBT:
./build/sbt -Phadoop-3 -Pkubernetes -Pkinesis-asl -Phive-thriftserver -Pdocker-integration-tests -Pyarn -Phadoop-cloud -Pspark-ganglia-lgpl -Phive -Pjvm-profiler clean package

[info] Note: Some input files use or override a deprecated API.
[info] Note: Recompile with -Xlint:deprecation for details.
[warn] multiple main classes detected: run 'show discoveredMainClasses' to see the list
[success] Total time: 251 s (04:11), completed Nov 7, 2024, 9:23:18 AM
(pyspark) ➜  spark-community git:(SPARK-50135_FOLLOWUP) ✗ ls -1 assembly/target/scala-2.13/jars/netty-*1*
assembly/target/scala-2.13/jars/netty-all-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-buffer-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-http-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-http2-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-socks-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-common-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-handler-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-handler-proxy-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-resolver-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-classes-epoll-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-classes-kqueue-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-native-epoll-4.1.114.Final-linux-aarch_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-epoll-4.1.114.Final-linux-riscv64.jar
assembly/target/scala-2.13/jars/netty-transport-native-epoll-4.1.114.Final-linux-x86_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-kqueue-4.1.114.Final-osx-aarch_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-kqueue-4.1.114.Final-osx-x86_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-unix-common-4.1.114.Final.jar
(pyspark) ➜  spark-community git:(SPARK-50135_FOLLOWUP) ✗ ./python/run-tests --python-executables=python3 --testnames "pyspark.sql.tests.connect.test_connect_collection"
Running PySpark tests. Output is in /Users/panbingkun/Developer/spark/spark-community/python/unit-tests.log
Will test against the following Python executables: ['python3']
Will test the following Python tests: ['pyspark.sql.tests.connect.test_connect_collection']
python3 python_implementation is CPython
python3 version is: Python 3.9.19
Starting test(python3): pyspark.sql.tests.connect.test_connect_collection (temp output: /Users/panbingkun/Developer/spark/spark-community/python/target/2ed7ab5b-39f0-41f7-85c5-0fdcab0b0b2a/python3__pyspark.sql.tests.connect.test_connect_collection__61isowkl.log)
Finished test(python3): pyspark.sql.tests.connect.test_connect_collection (13s)
Tests passed in 13 seconds
(pyspark) ➜  spark-community git:(SPARK-50135_FOLLOWUP) ✗  ./bin/pyspark --remote "sc://127.0.0.1:8888"
Python 3.9.19 (main, Mar 21 2024, 12:07:41)
[Clang 14.0.6 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 4.0.0.dev0
      /_/

Using Python version 3.9.19 (main, Mar 21 2024 12:07:41)
Client connected to the Spark Connect server at 127.0.0.1:8888
SparkSession available as 'spark'.
>>> spark.range(10).collect()
[Row(id=0), Row(id=1), Row(id=2), Row(id=3), Row(id=4), Row(id=5), Row(id=6), Row(id=7), Row(id=8), Row(id=9)]
  • Build with Maven:
./build/mvn -DskipTests -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pjvm-profiler -Pspark-ganglia-lgpl -Pkinesis-asl clean package -T4

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  15:16 min (Wall Clock)
[INFO] Finished at: 2024-11-07T09:43:02+08:00
[INFO] ------------------------------------------------------------------------
(pyspark) ➜  spark-community git:(SPARK-50135_FOLLOWUP) ✗  ls -1 assembly/target/scala-2.13/jars/netty-*1*
assembly/target/scala-2.13/jars/netty-all-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-buffer-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-http-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-http2-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-codec-socks-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-common-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-handler-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-handler-proxy-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-resolver-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-classes-epoll-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-classes-kqueue-4.1.114.Final.jar
assembly/target/scala-2.13/jars/netty-transport-native-epoll-4.1.114.Final-linux-aarch_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-epoll-4.1.114.Final-linux-riscv64.jar
assembly/target/scala-2.13/jars/netty-transport-native-epoll-4.1.114.Final-linux-x86_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-kqueue-4.1.114.Final-osx-aarch_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-kqueue-4.1.114.Final-osx-x86_64.jar
assembly/target/scala-2.13/jars/netty-transport-native-unix-common-4.1.114.Final.jar
(pyspark) ➜  spark-community git:(SPARK-50135_FOLLOWUP) ✗ ./python/run-tests --python-executables=python3 --testnames "pyspark.sql.tests.connect.test_connect_collection"
Running PySpark tests. Output is in /Users/panbingkun/Developer/spark/spark-community/python/unit-tests.log
Will test against the following Python executables: ['python3']
Will test the following Python tests: ['pyspark.sql.tests.connect.test_connect_collection']
python3 python_implementation is CPython
python3 version is: Python 3.9.19
Starting test(python3): pyspark.sql.tests.connect.test_connect_collection (temp output: /Users/panbingkun/Developer/spark/spark-community/python/target/7a50ed3b-267c-4891-903d-951aaf75f74c/python3__pyspark.sql.tests.connect.test_connect_collection__vqf968_3.log)
Finished test(python3): pyspark.sql.tests.connect.test_connect_collection (16s)
Tests passed in 16 seconds
(pyspark) ➜  spark-community git:(SPARK-50135_FOLLOWUP) ✗ ./bin/pyspark --remote "sc://127.0.0.1:8888"
Python 3.9.19 (main, Mar 21 2024, 12:07:41)
[Clang 14.0.6 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 4.0.0.dev0
      /_/

Using Python version 3.9.19 (main, Mar 21 2024 12:07:41)
Client connected to the Spark Connect server at 127.0.0.1:8888
SparkSession available as 'spark'.
>>> spark.range(10).collect()
[Row(id=0), Row(id=1), Row(id=2), Row(id=3), Row(id=4), Row(id=5), Row(id=6), Row(id=7), Row(id=8), Row(id=9)]

panbingkun (Contributor, Author)

@panbingkun can you also simply check bin/pyspark --remote "local[*]"?

with simple queries like

In [1]: spark.range(10).collect()
24/11/06 10:50:00 WARN CheckAllocator: More than one DefaultAllocationManager on classpath. Choosing first found
Out[1]:
[Row(id=0),
 Row(id=1),
 Row(id=2),
 Row(id=3),
 Row(id=4),
 Row(id=5),
 Row(id=6),
 Row(id=7),
 Row(id=8),
 Row(id=9)]

The whole pyspark connect was broken before, including all pyspark-connect tests and the REPL.

Done, the verification process has been posted.


dongjoon-hyun (Member) left a comment

+1, LGTM. Thank you, @panbingkun and all.
Merged to master.
