
[Bug report] In certain environments, failed to insert data into a Hive table in spark-sql #3181

Closed
danhuawang opened this issue Apr 24, 2024 · 4 comments
Assignees: FANNG1
Labels: bug (Something isn't working)

Comments

@danhuawang
Contributor

Version

main branch

Describe what's wrong

spark-sql (default)> use cc2;
use cc2
Time taken: 0.059 seconds
spark-sql ()> CREATE DATABASE db2;
CREATE DATABASE db2
Time taken: 0.135 seconds
spark-sql ()> use db2;
use db2
Time taken: 0.04 seconds
spark-sql (db2)> CREATE TABLE hive_students (id INT, name STRING);
CREATE TABLE hive_students (id INT, name STRING)
Time taken: 0.252 seconds
spark-sql (db2)> INSERT INTO hive_students VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO hive_students VALUES (1, 'Alice'), (2, 'Bob')
24/04/24 11:59:23 WARN ObjectStore: Failed to get database db2, returning NoSuchObjectException
[SCHEMA_NOT_FOUND] The schema `db2` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
To tolerate the error on drop use DROP SCHEMA IF EXISTS.
spark-sql (db2)> show tables;
show tables
hive_students
Time taken: 0.22 seconds, Fetched 1 row(s)

Error message and/or stacktrace

spark-sql (default)> use spark_catalog;
use spark_catalog
Time taken: 0.054 seconds
spark-sql (default)> CREATE DATABASE db;
CREATE DATABASE db
24/04/24 11:56:14 WARN ObjectStore: Failed to get database db, returning NoSuchObjectException
24/04/24 11:56:14 WARN ObjectStore: Failed to get database db, returning NoSuchObjectException
24/04/24 11:56:15 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
24/04/24 11:56:15 WARN ObjectStore: Failed to get database db, returning NoSuchObjectException
24/04/24 11:56:15 ERROR log: Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=ubuntu, access=WRITE, inode="/user/hive/warehouse-hive/db.db":root:hdfs:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1728)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1712)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1695)
        at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3896)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:984)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:622)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
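
For context, the exception says user ubuntu lacks WRITE access to a directory owned by root:hdfs. A quick read-only check of that ownership (using the NameNode address from the reproduce steps below):

# Inspect the owner and mode of the warehouse directory named in the
# AccessControlException above; expect root:hdfs with drwxr-xr-x.
hdfs dfs -ls hdfs://18.183.104.49:9000/user/hive/warehouse-hive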

How to reproduce

  1. Configure the Spark session to use the Gravitino spark connector.
ubuntu@ip-172-31-33-70:/opt/spark$ ./bin/spark-sql -v \
  --conf spark.plugins="com.datastrato.gravitino.spark.connector.plugin.GravitinoSparkPlugin" \
  --conf spark.sql.gravitino.uri=http://3.115.106.59:8090 \
  --conf spark.sql.gravitino.metalake=test \
  --conf spark.sql.warehouse.dir=hdfs://18.183.104.49:9000/user/hive/warehouse-hive
  2. Execute the Spark SQL queries.
-- use the hive catalog
USE hive;
CREATE DATABASE db;
USE db;
CREATE TABLE hive_students (id INT, name STRING);
INSERT INTO hive_students VALUES (1, 'Alice'), (2, 'Bob');

Additional context

No response

danhuawang added the bug (Something isn't working) label on Apr 24, 2024
FANNG1 self-assigned this on Apr 29, 2024
FANNG1 added this to the Gravitino June Release milestone on Apr 29, 2024
@FANNG1
Contributor

FANNG1 commented May 6, 2024

@danhuawang, it's mainly caused by using spark-sql without setting spark.sql.hive.metastore.jars explicitly. I created an issue in apache/kyuubi#6362 and will open a new PR in #3270 to allow setting spark.sql.hive.metastore.jars to different values. In the meantime, you could continue testing the Spark connector with spark-shell instead of spark-sql.
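
For reference, a minimal sketch of what a spark-sql launch with the metastore jars pinned explicitly might look like; the Hive version (2.3.9) and jar directory are placeholders for whatever the deployment actually ships:

# Same connector settings as the reproduce steps, plus the Hive metastore
# jars pinned explicitly. spark.sql.hive.metastore.jars=path makes Spark
# load the jars listed in spark.sql.hive.metastore.jars.path instead of
# the builtin ones; adjust the version and path to your environment.
./bin/spark-sql \
  --conf spark.plugins="com.datastrato.gravitino.spark.connector.plugin.GravitinoSparkPlugin" \
  --conf spark.sql.gravitino.uri=http://3.115.106.59:8090 \
  --conf spark.sql.gravitino.metalake=test \
  --conf spark.sql.hive.metastore.version=2.3.9 \
  --conf spark.sql.hive.metastore.jars=path \
  --conf spark.sql.hive.metastore.jars.path="file:///opt/hive/lib/*"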

@FANNG1
Contributor

FANNG1 commented Jun 1, 2024

@danhuawang, you could download the Hive jars to your machine and set the corresponding catalog properties like below to use spark-sql:

{
    "name": "hive",
    "type": "RELATIONAL",
    "comment": "comment",
    "provider": "hive",
    "properties": {
        "metastore.uris": "thrift://localhost:9083",
        "spark.sql.hive.metastore.jars":"path",
        "spark.sql.hive.metastore.jars.path":"file:///Users/fanng/deploy/hive/lib/*"
    }
}
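
For illustration, a sketch of applying that configuration when creating the catalog through Gravitino's REST API; the server address and metalake name ("test") are taken from the reproduce steps above, so substitute your own:

# POST the catalog definition to the metalake's catalogs endpoint;
# verify the endpoint path against your Gravitino version.
curl -X POST -H "Content-Type: application/json" \
  -d '{"name":"hive","type":"RELATIONAL","comment":"comment","provider":"hive","properties":{"metastore.uris":"thrift://localhost:9083","spark.sql.hive.metastore.jars":"path","spark.sql.hive.metastore.jars.path":"file:///Users/fanng/deploy/hive/lib/*"}}' \
  http://3.115.106.59:8090/api/metalakes/test/catalogs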

@FANNG1
Contributor

FANNG1 commented Aug 1, 2024

Seems we could close it. @danhuawang, WDYT?

jerryshao added the 0.6.0 label on Aug 1, 2024
@danhuawang
Contributor Author

> Seems we could close it. @danhuawang, WDYT?

@FANNG1 Sure. We can close it.

FANNG1 closed this as completed on Aug 1, 2024