-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DO-NOT-MERGE][SPARK-23710] Upgrade built-in Hive to 2.3.4(Without hive-thriftserver) #23552
Conversation
This commit does not contains the module of hive-thriftserver as it has many changes.
Test build #101272 has finished for PR 23552 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it still necessary to have the hive-contrib-2.3.4.jar etc in the test resources, to make the tests work?
Could you summarize the compatibility changes in the "Docs text" field of the JIRA and add the release-notes tag? that would help everyone understand the implications of the change.
But yeah this seems like something we have to do.
private def isSubDir(p1: Path, p2: Path, fs: FileSystem): Boolean = { | ||
val path1 = fs.makeQualified(p1).toString | ||
val path2 = fs.makeQualified(p2).toString | ||
if (path1.startsWith(path2)) true else false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for if (...) true else false
. Just return the value of the predicate
@@ -197,6 +201,13 @@ class OrcFileFormat extends FileFormat with DataSourceRegister with Serializable | |||
|
|||
case _ => false | |||
} | |||
|
|||
def toKryo(sarg: SearchArgument): String = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private?
*/ | ||
private def castLiteralValue(value: Any, dataType: DataType): Any = dataType match { | ||
case ByteType | ShortType | IntegerType | LongType => | ||
value.asInstanceOf[Number].longValue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trivial, but I'd use () or not consistently between here and line 110
@@ -192,6 +192,7 @@ private[hive] class IsolatedClientLoader( | |||
(name.startsWith("com.google") && !name.startsWith("com.google.cloud")) || | |||
name.startsWith("java.lang.") || | |||
name.startsWith("java.net") || | |||
name.startsWith("org.apache.derby.") || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This breaks Apache Livy, as @HyukjinKwon pointed out. #20944 (comment)
@@ -86,15 +86,17 @@ | |||
<scope>test</scope> | |||
</dependency> | |||
|
|||
<dependency> | |||
<groupId>org.apache.hive</groupId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not let sql/core depend on hive
<dependency> | ||
<groupId>org.apache.orc</groupId> | ||
<artifactId>orc-core</artifactId> | ||
<classifier>${orc.classifier}</classifier> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our native orc upgrade should be independent on Hive ORC reader, as @dongjoon-hyun pointed out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really doubt the value of the upgrade of Hive execution JARs based on the risk it brings.
@gatorsmile @wangyum I agree with the risks here; one important consideration too is that I don't think Hive 1.x will work at all with Java 9+. Still investigating but it looks like Hive 2.x might. That doesn't mean we have to merge this but may come up again as a tough call to make for 3.0. |
@srowen We plan to upgrade the built-in Hive to 2.3.4 for hadoop-3.1. hadoop-2.7 still uses 1.2.1. more details: https://issues.apache.org/jira/browse/SPARK-23710 |
That could make sense, to only publish / test with Java 11 with Hadoop 3.x and Hive 2.x+ |
What changes were proposed in this pull request?
This is the first PR. Just to make it easier to review the changes except
hive-thriftserver
module.Complete changes please go:
How was this patch tested?
unit tests and manual tests