[SPARK-23831][SQL] Add org.apache.derby to IsolatedClientLoader #20944

Closed
wants to merge 4 commits into from

Conversation

wangyum
Member

@wangyum wangyum commented Mar 30, 2018

What changes were proposed in this pull request?

Add org.apache.derby to the shared classes in IsolatedClientLoader; otherwise it may throw an exception:

...
[info] Cause: java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@2439ab23, see the next exception for details.
[info] at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
[info] at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
[info] at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
[info] at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
[info] at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
[info] at org.apache.derby.jdbc.InternalDriver$1.run(Unknown Source)
...

How was this patch tested?

unit tests and manual tests
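
For readers unfamiliar with the mechanism, the change amounts to adding one more package prefix to a name-based check. The following is a minimal sketch, not Spark's actual code: the object name SharedClassSketch is hypothetical, and the prefix list is limited to the entries visible in the diff hunks quoted in this thread (the real list is assumed to be longer).

```scala
// Hypothetical, simplified sketch of a prefix-based "shared class" check.
// It is NOT the real IsolatedClientLoader; only prefixes quoted in this
// thread are listed here.
object SharedClassSketch {
  private val sharedPrefixes: Seq[String] = Seq(
    "org.slf4j",
    "org.apache.log4j",         // log4j1.x
    "org.apache.logging.log4j", // log4j2
    "org.apache.derby.",        // the prefix proposed by this pull request
    "java.lang.",
    "java.net"
  )

  // A class is treated as shared when its fully qualified name starts
  // with one of the prefixes above, or with com.google but not
  // com.google.cloud.
  def isSharedClass(name: String): Boolean =
    sharedPrefixes.exists(p => name.startsWith(p)) ||
      (name.startsWith("com.google") && !name.startsWith("com.google.cloud"))
}
```

Under this sketch, SharedClassSketch.isSharedClass("org.apache.derby.jdbc.EmbeddedDriver") is true, while a Hive class such as org.apache.hadoop.hive.ql.metadata.Hive is not matched.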

@SparkQA

SparkQA commented Mar 30, 2018

Test build #88748 has finished for PR 20944 at commit 7d5cc71.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -188,6 +188,9 @@ private[hive] class IsolatedClientLoader(
(name.startsWith("com.google") && !name.startsWith("com.google.cloud")) ||
name.startsWith("java.lang.") ||
name.startsWith("java.net") ||
name.startsWith("com.sun.") ||
name.startsWith("sun.reflect.") ||
Member

These two lines above are really broad. Is this intentional?

Member Author

Yes. It doesn't matter whether we add these two lines, but I think it's best to add them. What do you think?

Member

Do not add them unless we have to.

@wangyum
Member Author

wangyum commented Apr 8, 2018

cc @jerryshao

@gatorsmile
Member

@wangyum What is the root cause?

@@ -188,6 +188,9 @@ private[hive] class IsolatedClientLoader(
(name.startsWith("com.google") && !name.startsWith("com.google.cloud")) ||
name.startsWith("java.lang.") ||
name.startsWith("java.net") ||
name.startsWith("com.sun.") ||
name.startsWith("sun.reflect.") ||
name.startsWith("org.apache.derby.") ||
Member

I'd also move this line up, next to the other org.apache.* entries above.

@jerryshao
Contributor

Please explain why this change is needed. If it is a unit test (UT) bug, why didn't it happen before?

@SparkQA

SparkQA commented Apr 9, 2018

Test build #89043 has finished for PR 20944 at commit 1c801f1.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Member Author

wangyum commented Apr 9, 2018

retest this please.

@SparkQA

SparkQA commented Apr 9, 2018

Test build #89046 has finished for PR 20944 at commit 1c801f1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 21, 2018

Test build #92170 has finished for PR 20944 at commit da36564.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 21, 2018

Test build #92171 has finished for PR 20944 at commit 7133d7a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@holdenk
Contributor

holdenk commented Jun 27, 2018

So I had this come up while I was testing Spark 2.1.3 RC2 on a machine with an existing YARN cluster with spark-testing-base. Haven't had the chance to dig into it fully.

@@ -182,6 +182,7 @@ private[hive] class IsolatedClientLoader(
name.startsWith("org.slf4j") ||
name.startsWith("org.apache.log4j") || // log4j1.x
name.startsWith("org.apache.logging.log4j") || // log4j2
name.startsWith("org.apache.derby.") ||
Contributor

So to clarify: this adds Derby to the "shared" classes (we might want to mention this above on L139). Presumably this fix is for cases where an existing Derby version on the classpath may differ, and when that happens we end up with strangeness in the metastore? cc @gatorsmile

Like I said, I've run into this issue so I'd like to see a fix :)
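
To make the notion of "shared" classes concrete, here is a minimal sketch of a class loader that hands shared names to a base loader and resolves everything else from its own URLs. The class name PrefixDelegatingLoader and its constructor parameters are hypothetical and only illustrate the delegation idea; this is not the loader Spark builds in IsolatedClientLoader.

```scala
import java.net.{URL, URLClassLoader}

// Hypothetical sketch, not Spark's implementation.
// `urls`     : where isolated (non-shared) classes would be loaded from.
// `base`     : the loader that owns the single, shared copy of a class.
// `isShared` : predicate on the fully qualified class name.
class PrefixDelegatingLoader(
    urls: Array[URL],
    base: ClassLoader,
    isShared: String => Boolean)
  extends URLClassLoader(urls, null) { // null parent: only bootstrap classes are inherited

  override def loadClass(name: String, resolve: Boolean): Class[_] = {
    if (isShared(name)) {
      // Shared: every loader built this way sees the same Class object.
      base.loadClass(name)
    } else {
      // Isolated: resolved via the bootstrap loader or this loader's own URLs.
      super.loadClass(name, resolve)
    }
  }
}
```

With a predicate such as _.startsWith("org.apache.derby."), two loaders built this way over the same base would both resolve Derby classes through that base loader, so only one copy of each Derby class would exist in the JVM. Whether that is the exact mechanism behind the failure discussed here is not established in this thread.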

@gatorsmile
Member

LGTM

Thanks! Merged to master

@asfgit asfgit closed this in a75571b Jul 13, 2018
@HyukjinKwon
Member

HyukjinKwon commented Nov 7, 2018

Sorry, why was this change required? I don't see that #20944 (comment) was addressed. Can you elaborate, please? Why do we make org.apache.derby shared?

Ideally, minor or maintenance versions of Derby can be bumped up, and they shouldn't be shared unless there's a strong reason to keep them shared, for instance, fixing a class resolution failure. How did you reproduce this, and why was no unit test added?

I found an actual issue while working on Apache Livy Spark 2.4 support. I am still investigating how it relates to the test failures, but at the very least I can see that this specific commit matters, since the Apache Livy unit tests pass without it.

Adding @vanzin and @mgaido91

@HyukjinKwon
Member

Please describe the manual tests and how they relate to an actual use case.

@HyukjinKwon
Member

Ping @wangyum. I'm going to revert this today if there's no response.

@wangyum
Member Author

wangyum commented Nov 8, 2018

Sorry @HyukjinKwon, it's difficult to reproduce. I am not sure whether it is caused by multithreading.
But you can verify it with:

test("SPARK-23831: Add org.apache.derby to IsolatedClientLoader") {
  // Create two metastore clients in the same JVM.
  val client1 = HiveUtils.newClientForMetadata(new SparkConf, new Configuration)
  val client2 = HiveUtils.newClientForMetadata(new SparkConf, new Configuration)
  assert(!client1.equals(client2))
}

@HyukjinKwon
Member

I understood the reproducer steps in JIRA, but how and why does it matter? Did it cause an actual problem in your production environment?

@HyukjinKwon
Member

HyukjinKwon commented Nov 8, 2018

To me, it sounds like we made a fix without being able to figure out exactly what's going on internally. It's okay if the issue is difficult to reproduce, as long as it can be reproduced in production; however, the problem is that this fix caused another problem.

I am asking these questions to see why, and how important, this fix was.

@wangyum
Member Author

wangyum commented Nov 8, 2018

This fix is for testing only; production deployments won't use Derby as their metastore database.

@HyukjinKwon
Member

Let's revert this then, if it was only targeted at fixing the test. We can bring it back later when it's needed. This caused a failure in a specific Livy case: restarting a Hive-enabled Spark session.

@gatorsmile, I will revert this, but I don't mind bringing it back if it actually fixes a use case. I know neither how exactly this fixes the case above nor how it causes the Livy case to fail; the one thing that is clear is that this specific commit is the cause.

Please revert my action if I missed an actual use case fixed by this; I am okay with that.

@wangyum
Member Author

wangyum commented Apr 8, 2019

@HyukjinKwon I could reproduce this issue:

build/sbt clean package -Phive -Phive-thriftserver
export SPARK_PREPEND_CLASSES=true
bin/spark-sql --conf spark.sql.hive.metastore.version=2.3.4 --conf spark.sql.hive.metastore.jars=maven -e "create table t1 as select 1 as c"

@HyukjinKwon
Member

So this isn't a test-only fix anymore? If so, let's bring it back. I will check it too, soon.

@HyukjinKwon
Member

Hm, but I still think #20944 (comment) is valid, actually. @wangyum, do we hit this issue when we use our binary releases, or did you hit it only in dev as described above?

@wangyum
Member Author

wangyum commented Apr 11, 2019

I hit this issue in two places:

  1. Specify spark.sql.hive.metastore.version when using Derby:
wget https://www-eu.apache.org/dist/spark/spark-2.4.1/spark-2.4.1-bin-hadoop2.7.tgz && tar -zxf spark-2.4.1-bin-hadoop2.7.tgz && cd spark-2.4.1-bin-hadoop2.7
bin/spark-sql --conf spark.sql.hive.metastore.version=2.0.0 --conf spark.sql.hive.metastore.jars=maven -e "create table t1 as select 1 as c"
  2. Unit testing when the built-in Hive is upgraded to 2.3 ([WIP][test-hadoop3.2] Test Hadoop 3.2 on jenkins #24044; remove this line and run):
build/sbt "hive/testOnly *.HiveExternalSessionCatalogSuite *.HiveExternalCatalogSuite *.MultiDatabaseSuite" -Phive -Phadoop-3.2

@HyukjinKwon
Member

I see. But can you check whether Hive 3.1 support and onward still works with this change? My impression is that we won't respect their Derby version anymore with this change.

@wangyum
Member Author

wangyum commented Apr 15, 2019

The root cause is that Hive has created more java.sql.Connection instances since HIVE-10632 (Hive 2.1.0).

7 participants