Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DO-NOT-MERGE][SPARK-23710] Upgrade built-in Hive to 2.3.4(Without hive-thriftserver) #23552

Closed
wants to merge 1 commit into from
Closed

[DO-NOT-MERGE][SPARK-23710] Upgrade built-in Hive to 2.3.4(Without hive-thriftserver) #23552

wants to merge 1 commit into from

Conversation

wangyum
Copy link
Member

@wangyum wangyum commented Jan 15, 2019

What changes were proposed in this pull request?

This is the first PR. Just to make it easier to review the changes except hive-thriftserver module.
Complete changes please go:

How was this patch tested?

unit tests and manual tests

This commit does not contains the module of hive-thriftserver as it has many changes.
@SparkQA
Copy link

SparkQA commented Jan 15, 2019

Test build #101272 has finished for PR 23552 at commit a4dc989.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it still necessary to have the hive-contrib-2.3.4.jar etc in the test resources, to make the tests work?

Could you summarize the compatibility changes in the "Docs text" field of the JIRA and add the release-notes tag? that would help everyone understand the implications of the change.

But yeah this seems like something we have to do.

core/pom.xml Show resolved Hide resolved
private def isSubDir(p1: Path, p2: Path, fs: FileSystem): Boolean = {
val path1 = fs.makeQualified(p1).toString
val path2 = fs.makeQualified(p2).toString
if (path1.startsWith(path2)) true else false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for if (...) true else false. Just return the value of the predicate

@@ -197,6 +201,13 @@ class OrcFileFormat extends FileFormat with DataSourceRegister with Serializable

case _ => false
}

def toKryo(sarg: SearchArgument): String = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private?

*/
private def castLiteralValue(value: Any, dataType: DataType): Any = dataType match {
case ByteType | ShortType | IntegerType | LongType =>
value.asInstanceOf[Number].longValue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trivial, but I'd use () or not consistently between here and line 110

@@ -192,6 +192,7 @@ private[hive] class IsolatedClientLoader(
(name.startsWith("com.google") && !name.startsWith("com.google.cloud")) ||
name.startsWith("java.lang.") ||
name.startsWith("java.net") ||
name.startsWith("org.apache.derby.") ||
Copy link
Member

@gatorsmile gatorsmile Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks Apache Livy, as @HyukjinKwon pointed out. #20944 (comment)

@@ -86,15 +86,17 @@
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.apache.hive</groupId>
Copy link
Member

@gatorsmile gatorsmile Jan 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not let sql/core depend on hive

<dependency>
<groupId>org.apache.orc</groupId>
<artifactId>orc-core</artifactId>
<classifier>${orc.classifier}</classifier>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our native orc upgrade should be independent on Hive ORC reader, as @dongjoon-hyun pointed out.

Copy link
Member

@gatorsmile gatorsmile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really doubt the value of the upgrade of Hive execution JARs based on the risk it brings.

@wangyum wangyum closed this Mar 14, 2019
@srowen
Copy link
Member

srowen commented Mar 14, 2019

@gatorsmile @wangyum I agree with the risks here; one important consideration too is that I don't think Hive 1.x will work at all with Java 9+. Still investigating but it looks like Hive 2.x might. That doesn't mean we have to merge this but may come up again as a tough call to make for 3.0.

@wangyum
Copy link
Member Author

wangyum commented Mar 14, 2019

@srowen We plan to upgrade the built-in Hive to 2.3.4 for hadoop-3.1. hadoop-2.7 still uses 1.2.1. more details: https://issues.apache.org/jira/browse/SPARK-23710

@srowen
Copy link
Member

srowen commented Mar 14, 2019

That could make sense, to only publish / test with Java 11 with Hadoop 3.x and Hive 2.x+

@wangyum wangyum deleted the SPARK-23710-without-thriftserver branch February 15, 2020 01:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants