[SUPPORT] DISTRIBUTE BY is not supported(line 59:undefined, pos 0) when using hudi-0.11.1 & spark-3.2.1 #6156
Labels: feature-enquiry, priority:minor, spark-sql
Tips before filing an issue
Have you gone through our FAQs?
Join the mailing list to engage in conversations and get faster support at [email protected].
If you have triaged this as a bug, then file an issue directly.
Describe the problem you faced
With hudi-0.11.1 and spark-3.2.1, enabling `org.apache.spark.sql.hudi.HoodieSparkSessionExtension` causes plain Spark SQL queries that use `DISTRIBUTE BY` to fail with a ParseException, even when the query does not touch any Hudi table.
To Reproduce
Steps to reproduce the behavior:
Without the conf `spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension`, start a spark-sql application with:

`/opt/apache/SPARK/SPARK-CURRENT/bin/spark-sql --num-executors 5 --queue=root.bi --conf spark.executor.cores=3 --conf spark.driver.memory=2G --conf spark.executor.memory=5G --conf spark.executor.memoryOverhead=2G`
![image](https://user-images.githubusercontent.com/98273236/180110031-ef64c9ad-2921-4e05-bec9-44d66322d24f.png)
-------------------[sparksql]---------------------------
select 1 distribute by rand()
-------------------[sparksql]---------------------------
The SQL execution result is shown in the screenshot above; the query succeeds.
But when the conf `spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension` is added and another application is started with:

`/opt/apache/SPARK/SPARK-CURRENT/bin/spark-sql --num-executors 5 --queue=root.bi --conf spark.executor.cores=3 --conf spark.driver.memory=2G --conf spark.executor.memory=5G --conf spark.executor.memoryOverhead=2G --conf spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension`
![image](https://user-images.githubusercontent.com/98273236/180110348-e5800aac-b04d-49a3-a5c6-ea9e606b3c46.png)
-------------------[sparksql]---------------------------
select 1 distribute by rand()
-------------------[sparksql]---------------------------
Error operating EXECUTE_STATEMENT: org.apache.spark.sql.catalyst.parser.ParseException: DISTRIBUTE BY is not supported(line 1:undefined, pos 9)
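Until the parser limitation is fixed, one possible workaround (my own suggestion, not confirmed by the Hudi project) is to replace `DISTRIBUTE BY rand()` with Spark's `REPARTITION` hint, which is written as a comment and may therefore get through the extended parser:

```sql
-- Sketch of a possible workaround (untested against the Hudi parser).
-- REPARTITION(n) with no columns round-robins rows across n partitions,
-- which gives a similar spread to DISTRIBUTE BY rand().
-- `my_table` is a placeholder name.
SELECT /*+ REPARTITION(200) */ * FROM my_table;
```

Whether the hint survives Hudi's SQL parser would need to be verified in a session with the extension enabled.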
This makes it impossible for me to use `DISTRIBUTE BY` even on non-Hudi tables.
Expected behavior
Queries using `DISTRIBUTE BY` should parse and run normally when `HoodieSparkSessionExtension` is enabled, just as they do in vanilla Spark SQL.
Environment Description
Hudi version :0.11.1
Spark version :3.2.1
Hive version :2.1.1-cdh6.3.2
Hadoop version :3.0.0-cdh6.3.2
Storage (HDFS/S3/GCS..) :HDFS
Running on Docker? (yes/no) :no
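To confirm that the failing session is really running with the Hudi extension (and that this conf is what changes the parser behavior), the active value can be checked from inside the spark-sql shell; `SET key;` printing the current value is standard Spark SQL behavior:

```sql
-- Show the current value of spark.sql.extensions in this session
SET spark.sql.extensions;
```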