[SPARK-48867][BUILD] Upgrade okhttp to 4.12.0, okio to 3.9.0 and esdk-obs-java to 3.24.3 #47795

Closed
11 changes: 6 additions & 5 deletions dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -64,7 +64,7 @@ derbyshared/10.16.1.1//derbyshared-10.16.1.1.jar
 derbytools/10.16.1.1//derbytools-10.16.1.1.jar
 dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar
 error_prone_annotations/2.26.1//error_prone_annotations-2.26.1.jar
-esdk-obs-java/3.20.4.2//esdk-obs-java-3.20.4.2.jar
+esdk-obs-java/3.24.3//esdk-obs-java-3.24.3.jar
 failureaccess/1.0.2//failureaccess-1.0.2.jar
 flatbuffers-java/24.3.25//flatbuffers-java-24.3.25.jar
 gcs-connector/hadoop3-2.2.25/shaded/gcs-connector-hadoop3-2.2.25-shaded.jar
@@ -123,7 +123,6 @@ jakarta.ws.rs-api/3.0.0//jakarta.ws.rs-api-3.0.0.jar
 jakarta.xml.bind-api/2.3.2//jakarta.xml.bind-api-2.3.2.jar
 janino/3.1.9//janino-3.1.9.jar
 java-diff-utils/4.12//java-diff-utils-4.12.jar
-java-xmlbuilder/1.2//java-xmlbuilder-1.2.jar
 javassist/3.30.2-GA//javassist-3.30.2-GA.jar
 javax.jdo/3.2.0-m3//javax.jdo-3.2.0-m3.jar
 javax.servlet-api/4.0.1//javax.servlet-api-4.0.1.jar
@@ -158,6 +157,7 @@ json4s-scalap_2.13/4.0.7//json4s-scalap_2.13-4.0.7.jar
 jsr305/3.0.0//jsr305-3.0.0.jar
 jta/1.1//jta-1.1.jar
 jul-to-slf4j/2.0.16//jul-to-slf4j-2.0.16.jar
+kotlin-stdlib/2.0.10//kotlin-stdlib-2.0.10.jar
Member

it's a little bit crazy to pull another JVM lang runtime into Spark classpath by just consuming an HTTP library.

Contributor Author

@pan3793,

Yes, I agree that it is bad that we have to add kotlin-stdlib/2.0.10//kotlin-stdlib-2.0.10.jar as a dependency. I have just removed the other two extra dependencies because we do not need them. Related comment: #47795 (comment)

As I mentioned in the pull request's description, the kubernetes-client maintainers do not want to upgrade to okhttp 4.x because it is based on Kotlin; they recommend excluding 3.x. Related documentation:

https://github.com/fabric8io/kubernetes-client/blob/main/doc/KubernetesClientWithIPv6Clusters.md

Currently I do not see any other solution to resolve these CVEs: CVE-2021-0341, CVE-2023-0833

Replacing kubernetes-httpclient-okhttp with kubernetes-httpclient-vertx would avoid this problem:
fabric8io/kubernetes-client#2764

@pan3793 @roczei
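
A minimal sketch of the swap suggested in this comment, following the general exclusion-plus-replacement pattern for the fabric8 client. It assumes the aggregate `io.fabric8:kubernetes-client` artifact is the entry point and reuses Spark's existing `${kubernetes-client.version}` property. It is untested against Spark and does not cover test-scoped modules that may still depend on okhttp:

```xml
<!-- Hypothetical: drop the okhttp-based HTTP client that kubernetes-client pulls in by default -->
<dependency>
  <groupId>io.fabric8</groupId>
  <artifactId>kubernetes-client</artifactId>
  <version>${kubernetes-client.version}</version>
  <exclusions>
    <exclusion>
      <groupId>io.fabric8</groupId>
      <artifactId>kubernetes-httpclient-okhttp</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<!-- Hypothetical: add the Vert.x-based HTTP client implementation in its place -->
<dependency>
  <groupId>io.fabric8</groupId>
  <artifactId>kubernetes-httpclient-vertx</artifactId>
  <version>${kubernetes-client.version}</version>
</dependency>
```

With this arrangement okhttp, and therefore kotlin-stdlib, would presumably no longer be needed on the runtime classpath for the Kubernetes client, at the cost of bringing in Vert.x and its own transitive dependencies.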

Contributor

@melin the mockserver is not ported over.

Contributor

Hadoop uses the v3 mockwebserver in test dependencies only, which also seems to be the only place that Kotlin comes in.

> it's a little bit crazy to pull another JVM lang runtime into Spark classpath by just consuming an HTTP library

Mmm. Maybe they've found Kotlin a good language for coding; as more projects use it, it will become less unusual, instead becoming just another dependency pain point à la Guava.

Contributor Author

@roczei Aug 21, 2024

> the mockserver is not ported over.

@melin / @bjornjorgensen

Yes, I can confirm that as well.

Contributor

@roczei and @melin, here fabric8io/kubernetes-client#5632 is the POC for replacing the MockWebServer, but that PR is not merged.

Contributor Author

@bjornjorgensen,

Thanks!

 kryo-shaded/4.0.2//kryo-shaded-4.0.2.jar
 kubernetes-client-api/6.13.4//kubernetes-client-api-6.13.4.jar
 kubernetes-client/6.13.4//kubernetes-client-6.13.4.jar
@@ -194,7 +194,7 @@ log4j-api/2.24.1//log4j-api-2.24.1.jar
 log4j-core/2.24.1//log4j-core-2.24.1.jar
 log4j-layout-template-json/2.24.1//log4j-layout-template-json-2.24.1.jar
 log4j-slf4j2-impl/2.24.1//log4j-slf4j2-impl-2.24.1.jar
-logging-interceptor/3.12.12//logging-interceptor-3.12.12.jar
+logging-interceptor/4.12.0//logging-interceptor-4.12.0.jar
 lz4-java/1.8.0//lz4-java-1.8.0.jar
 metrics-core/4.2.28//metrics-core-4.2.28.jar
 metrics-graphite/4.2.28//metrics-graphite-4.2.28.jar
@@ -228,8 +228,9 @@ netty-transport-native-kqueue/4.1.114.Final/osx-x86_64/netty-transport-native-kq
 netty-transport-native-unix-common/4.1.114.Final//netty-transport-native-unix-common-4.1.114.Final.jar
 netty-transport/4.1.114.Final//netty-transport-4.1.114.Final.jar
 objenesis/3.3//objenesis-3.3.jar
-okhttp/3.12.12//okhttp-3.12.12.jar
-okio/1.17.6//okio-1.17.6.jar
+okhttp/4.12.0//okhttp-4.12.0.jar
+okio-jvm/3.9.0//okio-jvm-3.9.0.jar
+okio/3.9.0//okio-3.9.0.jar
 opencsv/2.3//opencsv-2.3.jar
 opentracing-api/0.33.0//opentracing-api-0.33.0.jar
 opentracing-noop/0.33.0//opentracing-noop-0.33.0.jar
35 changes: 35 additions & 0 deletions hadoop-cloud/pom.xml
@@ -171,6 +171,41 @@
           <groupId>org.apache.hadoop</groupId>
           <artifactId>hadoop-cos</artifactId>
         </exclusion>
+        <!--
+          HADOOP-19224 / SPARK-48867: com.huaweicloud:esdk-obs-java:jar:3.20.4.2 is
+          vulnerable due to okhttp 3.x (CVE-2023-0833, CVE-2021-0341),
+          it has to be upgraded to 3.24.3 which depends on okhttp 4.12.0
+        -->
+        <exclusion>
+          <groupId>com.huaweicloud</groupId>
+          <artifactId>esdk-obs-java</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
Contributor

will it actually work with this removal? if not best to stop trying to restore it and exclude all huaweicloud support with the release note/spark docs saying "explicitly import it"
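
For illustration, "explicitly import it" could look like the following in a downstream application's pom.xml. This is only a sketch: `hadoop.version` is an assumed property that would have to match the Hadoop version bundled with the Spark build, and the combination has not been verified against a real OBS bucket.

```xml
<!-- Hypothetical downstream pom.xml entry; hadoop.version is an assumed property -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-huaweicloud</artifactId>
  <version>${hadoop.version}</version>
</dependency>
```

Users taking this route would also own the choice of esdk-obs-java version that the connector resolves to.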

Contributor Author

Sorry for the late reply.

> will it actually work with this removal?

I do not have access to Huawei Cloud, so I could not test it with an OBS bucket. I found this documentation for Spark testing: https://support.huaweicloud.com/intl/en-us/devg-dli/dli_09_0205.html#section6

> I don't think apache spark have any tests for this, does hadoop have it

I did some research and found only Hadoop unit tests, but these are disabled by default. Related documentation:

https://github.com/apache/hadoop/blob/trunk/hadoop-cloud-storage-project/hadoop-huaweicloud/src/site/markdown/index.md#testing-the-hadoop-huaweicloud-module

A similar configuration has to be created by anyone who has such credentials:

$ cat src/test/resources/auth-keys.xml
<configuration>
  <property>
    <name>fs.contract.test.fs.obs</name>
    <value>obs://testobscontract</value>
  </property>

  <property>
    <name>fs.obs.access.key</name>
    <value>secret</value>
  </property>

  <property>
    <name>fs.obs.secret.key</name>
    <value>secret</value>
  </property>
</configuration>
$

Just to be on the safe side, I agree that we should do what @steveloughran suggested above:

> if not best to stop trying to restore it and exclude all huaweicloud support with the release note/spark docs saying "explicitly import it"

@pan3793 / @panbingkun / @dongjoon-hyun / @melin / @bjornjorgensen what is your opinion about this suggestion? If you agree as well, I would like to implement these:

  • Exclude the whole org.apache.hadoop:hadoop-huaweicloud artifact instead of com.huaweicloud:esdk-obs-java
  • Update the "Does this PR introduce any user-facing change" section to mention this user-facing change
  • Request that it be added to the release notes, but I do not know the proper way to do this for Apache Spark. Please let me know if you know the official process. Thanks!
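
The first bullet could be sketched as one more exclusion next to the existing `hadoop-cos` one in hadoop-cloud/pom.xml. This assumes the enclosing dependency is the one that brings in `hadoop-huaweicloud` transitively, which the hunk does not show:

```xml
<!-- Hypothetical replacement for the com.huaweicloud:esdk-obs-java exclusion -->
<exclusion>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-huaweicloud</artifactId>
</exclusion>
```

With the whole connector excluded, the separate esdk-obs-java dependency added by this change would presumably no longer be needed.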

+      <groupId>com.huaweicloud</groupId>
+      <artifactId>esdk-obs-java</artifactId>
+      <version>${esdk.obs.java.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib-jdk8</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.jetbrains.kotlin</groupId>
+      <artifactId>kotlin-stdlib</artifactId>
+      <version>${kotlin-stdlib.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains</groupId>
+          <artifactId>annotations</artifactId>
+        </exclusion>
       </exclusions>
     </dependency>
     <!--
10 changes: 9 additions & 1 deletion pom.xml
@@ -161,6 +161,12 @@
     <aws.java.sdk.v2.version>2.24.6</aws.java.sdk.v2.version>
     <!-- the producer is used in tests -->
     <aws.kinesis.producer.version>0.12.8</aws.kinesis.producer.version>
+    <!--
+      HADOOP-19224 / SPARK-48867: com.huaweicloud:esdk-obs-java:jar:3.20.4.2 is
+      vulnerable due to okhttp 3.x (CVE-2023-0833, CVE-2021-0341),
+      it has to be upgraded to 3.24.3 which depends on okhttp 4.12.0
+    -->
+    <esdk.obs.java.version>3.24.3</esdk.obs.java.version>
     <!-- Do not use 3.0.0: https://github.com/GoogleCloudDataproc/hadoop-connectors/issues/1114 -->
     <gcs-connector.version>hadoop3-2.2.25</gcs-connector.version>
     <!-- org.apache.httpcomponents/httpclient-->
@@ -237,7 +243,9 @@
     <!-- org.fusesource.leveldbjni will be used except on arm64 platform. -->
     <leveldbjni.group>org.fusesource.leveldbjni</leveldbjni.group>
     <kubernetes-client.version>6.13.4</kubernetes-client.version>
-    <okio.version>1.17.6</okio.version>
+    <okio.version>3.9.0</okio.version>
+    <okhttp.version>4.12.0</okhttp.version>
Contributor Author

@roczei Aug 19, 2024

I have uncompressed the kotlin-stdlib-2.0.10.jar file and all class files are part of the kotlin directory / package:

[roczei@roczei-MBP16 2.0.10]$ jar tf kotlin-stdlib-2.0.10.jar | head
META-INF/
META-INF/MANIFEST.MF
META-INF/kotlin-stdlib.kotlin_module
kotlin/
kotlin/ArrayIntrinsicsKt.class
kotlin/BuilderInference.class
kotlin/CharCodeJVMKt.class
kotlin/CharCodeKt.class
kotlin/CompareToKt.class
kotlin/ContextFunctionTypeParams.class
[roczei@roczei-MBP16 2.0.10]$ 

All Spark unit tests have passed in this separate pull request: roczei#4. Currently this is all I can validate; I hope it will be enough.

+    <kotlin-stdlib.version>2.0.10</kotlin-stdlib.version>

     <test.java.home>${java.home}</test.java.home>

47 changes: 47 additions & 0 deletions resource-managers/kubernetes/core/pom.xml
@@ -109,6 +109,53 @@
       <groupId>io.fabric8</groupId>
       <artifactId>kubernetes-httpclient-okhttp</artifactId>
       <version>${kubernetes-client.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>com.squareup.okhttp3</groupId>
+          <artifactId>okhttp</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>com.squareup.okhttp3</groupId>
+          <artifactId>logging-interceptor</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>com.squareup.okhttp3</groupId>
+      <artifactId>okhttp</artifactId>
+      <version>${okhttp.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib-jdk8</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>com.squareup.okhttp3</groupId>
+      <artifactId>logging-interceptor</artifactId>
+      <version>${okhttp.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib-jdk8</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.jetbrains.kotlin</groupId>
+      <artifactId>kotlin-stdlib</artifactId>
+      <version>${kotlin-stdlib.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains</groupId>
+          <artifactId>annotations</artifactId>
+        </exclusion>
+      </exclusions>
     </dependency>
     <dependency>
       <groupId>io.fabric8</groupId>
47 changes: 47 additions & 0 deletions resource-managers/kubernetes/integration-tests/pom.xml
@@ -67,6 +67,53 @@
       <groupId>io.fabric8</groupId>
       <artifactId>kubernetes-client</artifactId>
       <version>${kubernetes-client.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>com.squareup.okhttp3</groupId>
+          <artifactId>okhttp</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>com.squareup.okhttp3</groupId>
+          <artifactId>logging-interceptor</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>com.squareup.okhttp3</groupId>
+      <artifactId>okhttp</artifactId>
+      <version>${okhttp.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib-jdk8</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>com.squareup.okhttp3</groupId>
+      <artifactId>logging-interceptor</artifactId>
+      <version>${okhttp.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains.kotlin</groupId>
+          <artifactId>kotlin-stdlib-jdk8</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+    <dependency>
+      <groupId>org.jetbrains.kotlin</groupId>
+      <artifactId>kotlin-stdlib</artifactId>
+      <version>${kotlin-stdlib.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jetbrains</groupId>
+          <artifactId>annotations</artifactId>
+        </exclusion>
+      </exclusions>
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>