
Hive sync error with <class org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl not com.uber.hoodie.org.apache.hadoop_hive.metastore.MetaStoreFilterHoo> #533

Closed
louisliu318 opened this issue Dec 14, 2018 · 10 comments

@louisliu318

Environment:
spark-2.3.2
hadoop-2.7.3
hive-1.2.1

Error:
I am using the Spark datasource API to insert data into a Hoodie table and sync it to Hive.
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: class org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl not com.uber.hoodie.org.apache.hadoop_hive.metastore.MetaStoreFilterHook
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
	at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStoreClient.loadFilterHooks(HiveMetaStoreClient.java:240)
	at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:192)
	at com.uber.hoodie.org.apache.hadoop_hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:181)
	at com.uber.hoodie.hive.HoodieHiveClient.<init>(HoodieHiveClient.java:102)
	at com.uber.hoodie.hive.HiveSyncTool.<init>(HiveSyncTool.java:61)
	at com.uber.hoodie.HoodieSparkSqlWriter$.syncHive(HoodieSparkSqlWriter.scala:246)
	at com.uber.hoodie.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:179)
	at com.uber.hoodie.DefaultSource.createRelation(DefaultSource.scala:106)
	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:656)
	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:656)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:656)
	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:273)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:267)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:225)
	at com.lianjia.dtarch.databus.hudi.HudiBatchSync.execute(HudiBatchSync.java:85)
	at com.lianjia.dtarch.databus.hudi.HudiBatchSync.main(HudiBatchSync.java:63)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: class org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl not com.uber.hoodie.org.apache.hadoop_hive.metastore.MetaStoreFilterHook
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2221)
	... 39 more
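For readers hitting this: the root cause is visible in the first frame. The shaded HiveMetaStoreClient asks Configuration.getClass(...) for a class implementing the relocated interface com.uber.hoodie.org.apache.hadoop_hive.metastore.MetaStoreFilterHook, but the configuration supplies the unshaded org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl, which only implements the original (non-relocated) interface. A minimal, self-contained Java sketch of that subtype check (illustrative stand-in class names, not the real Hadoop or Hudi code):

```java
// Illustrative sketch only: mimics the check inside Hadoop's
// Configuration.getClass(...) that produces the "class X not Y" error.
// The interface stands in for the *shaded* MetaStoreFilterHook in the bundle;
// the class stands in for the *unshaded* default hook named in the config.
interface MetaStoreFilterHook {}

class DefaultMetaStoreFilterHookImpl {}  // does NOT implement the shaded interface

public class ShadingMismatch {
    // Simplified version of Configuration.getClass(name, defaultValue, xface):
    // reject any configured class that is not a subtype of the expected interface.
    public static <T> Class<? extends T> getConfClass(Class<?> candidate, Class<T> expected) {
        if (!expected.isAssignableFrom(candidate)) {
            throw new RuntimeException("class " + candidate.getName()
                    + " not " + expected.getName());
        }
        return candidate.asSubclass(expected);
    }

    public static void main(String[] args) {
        try {
            getConfClass(DefaultMetaStoreFilterHookImpl.class, MetaStoreFilterHook.class);
        } catch (RuntimeException e) {
            // prints: class DefaultMetaStoreFilterHookImpl not MetaStoreFilterHook
            System.out.println(e.getMessage());
        }
    }
}
```

Because relocation rewrites the interface's fully-qualified name inside the bundle, the unshaded implementation can never satisfy the `isAssignableFrom` test, no matter which jars are on the classpath.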

@louisliu318
Author

After removing the relocations in maven-shade-plugin, the error is gone. Maybe we need some workaround in HoodieSparkSqlWriter.syncHive(), where we could set some Hive configurations with the com.uber.hoodie prefix, or set the configurations in hive-site.xml.
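Along the hive-site.xml line suggested above, one hypothetical workaround (untested; it assumes the bundle's shaded HiveMetaStoreClient still reads the standard `hive.metastore.filter.hook` key) would be to point the hook at the relocated default implementation shipped inside the bundle:

```xml
<!-- hive-site.xml: hypothetical workaround, pointing the metastore filter hook
     at the relocated (shaded) default implementation inside the hoodie bundle -->
<property>
  <name>hive.metastore.filter.hook</name>
  <value>com.uber.hoodie.org.apache.hadoop_hive.metastore.DefaultMetaStoreFilterHookImpl</value>
</property>
```

This only papers over the mismatch for this one setting, though; any other Hive class configured by fully-qualified name would hit the same shading problem.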

@vinothchandar
Member

Hive 1.x... Are you using the correct Spark bundle?

From the quickstart:

To work with older versions of Hive (pre Hive-1.2.1), use

$ mvn clean install -DskipTests -DskipITs -Dhive11

@bvaradar for context

@louisliu318
Author

@vinothchandar I'm using Hive-1.2.1, not Hive-1.1.1. In packaging/hoodie-spark-bundle/pom.xml, the -Dhive12 profile points to Hive version 1.2.1.

@bvaradar
Contributor

@louisliu318: Thanks for filing this ticket. Yes, with Hive-1.2.1 the Maven profile is hive12 (the default). When I tested a similar setup, I did not encounter this issue. This is caused by shading some of the Hive jars and including them in the bundle (a bit of magic arrived at by trial and error).

It is not clear from your comment whether you solved this by disabling shading.

Can you also try shading the hive-metastore jar? Add this relocation to the shading section of the hoodie-spark pom.xml:

            <relocation>
              <pattern>org.apache.hadoop.hive.metastore.</pattern>
              <shadedPattern>com.uber.hoodie.org.apache.hadoop_hive.metastore.</shadedPattern>
            </relocation>

Let me know if this solves the problem.

@louisliu318
Author

I solved the problem by commenting out the relocations in the shading section of the hoodie-spark pom.xml.

@vinothchandar
Member

Can you put your changes into a PR? Balaji and I discussed this further. He tested Hive 1.2 as part of the dockerized setup, and things worked for him. Ideally we need to test this across all Hive versions before making a call; the Hive jar versioning is very sensitive, and changes made for one version often end up causing side effects for others.

@vinothchandar
Member

@louisliu318 ping again, to see if you can share the changes with us...

@bvaradar bvaradar self-assigned this Jan 17, 2019
@louisliu318
Author

louisliu318 commented Feb 13, 2019

@vinothchandar In my environment, I commented out the following Hive relocations in packaging\hoodie-spark-bundle\hoodie-spark-bundle.iml:

            <relocation>
              <pattern>org.apache.hive.jdbc.</pattern>
              <shadedPattern>com.uber.hoodie.org.apache.hive.jdbc.</shadedPattern>
            </relocation>
            <relocation>
              <pattern>org.apache.hadoop.hive.metastore.</pattern>
              <shadedPattern>com.uber.hoodie.org.apache.hadoop_hive.metastore.</shadedPattern>
            </relocation>
            <relocation>
              <pattern>org.apache.hive.common.</pattern>
              <shadedPattern>com.uber.hoodie.org.apache.hive.common.</shadedPattern>
            </relocation>
            <relocation>
              <pattern>org.apache.hadoop.hive.common.</pattern>
              <shadedPattern>com.uber.hoodie.org.apache.hadoop_hive.common.</shadedPattern>
            </relocation>
            <relocation>
              <pattern>org.apache.hadoop.hive.conf.</pattern>
              <shadedPattern>com.uber.hoodie.org.apache.hadoop_hive.conf.</shadedPattern>
            </relocation>
            <relocation>
              <pattern>org.apache.hive.service.</pattern>
              <shadedPattern>com.uber.hoodie.org.apache.hive.service.</shadedPattern>
            </relocation>
            <relocation>
              <pattern>org.apache.hadoop.hive.service.</pattern>
              <shadedPattern>com.uber.hoodie.org.apache.hadoop_hive.service.</shadedPattern>
            </relocation>

@bvaradar bvaradar self-assigned this Apr 9, 2019
@bvaradar
Contributor

bvaradar commented Apr 9, 2019

This should be fixed by #633

@n3nash
Contributor

n3nash commented Apr 10, 2019

Closing this ticket in favor of #633, which fixes the underlying issue.
