
[SUPPORT] DISTRIBUTE BY is not supported(line 59:undefined, pos 0) when using hudi-0.11.1 & spark-3.2.1 #6156

Closed
jiezi2026 opened this issue Jul 21, 2022 · 2 comments
Labels: feature-enquiry (feature enquiries/requests or great improvement ideas), priority:minor (everything else; usability gaps; questions; feature requests), spark-sql

Comments

@jiezi2026

Tips before filing an issue

  • Have you gone through our FAQs?

  • Join the mailing list to engage in conversations and get faster support at [email protected].

  • If you have triaged this as a bug, then file an issue directly.

Describe the problem you faced


To Reproduce

Steps to reproduce the behavior:

Without the conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension', start a spark-sql session with "/opt/apache/SPARK/SPARK-CURRENT/bin/spark-sql --num-executors 5 --queue=root.bi --conf spark.executor.cores=3 --conf spark.driver.memory=2G --conf spark.executor.memory=5G --conf spark.executor.memoryOverhead=2G"
-------------------[sparksql]---------------------------
select 1 distribute by rand()
-------------------[sparksql]---------------------------
The SQL executes successfully:
[screenshot omitted: query result]

But when the conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' is added and another session is started with "/opt/apache/SPARK/SPARK-CURRENT/bin/spark-sql --num-executors 5 --queue=root.bi --conf spark.executor.cores=3 --conf spark.driver.memory=2G --conf spark.executor.memory=5G --conf spark.executor.memoryOverhead=2G --conf spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension", the same query fails:
-------------------[sparksql]---------------------------
select 1 distribute by rand()
-------------------[sparksql]---------------------------
Error operating EXECUTE_STATEMENT: org.apache.spark.sql.catalyst.parser.ParseException: DISTRIBUTE BY is not supported(line 1:undefined, pos 9)

This makes it impossible for me to use DISTRIBUTE BY even on non-Hudi tables.
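The failure mode above is what you get when a session-extension parser rejects every statement outside its own grammar instead of delegating to Spark's built-in parser. As a language-agnostic illustration (this is not Hudi's or Spark's actual code; all class and method names here are made up), the fallback-parser pattern the fix needs looks roughly like:

```python
# Hypothetical sketch of a delegating parser: the extension tries its own
# narrow dialect first and, on failure, falls back to the host parser so
# that plain Spark SQL such as DISTRIBUTE BY keeps working.

class ParseError(Exception):
    pass

class HostParser:
    """Stands in for Spark's built-in parser, which accepts DISTRIBUTE BY."""
    def parse(self, sql):
        return ("host-plan", sql)

class ExtensionParser:
    """Stands in for an extension parser with a narrower grammar."""
    def __init__(self, delegate):
        self.delegate = delegate

    def _parse_extension_dialect(self, sql):
        # Toy grammar: only understands a made-up MERGE-like statement.
        if sql.strip().upper().startswith("MERGE"):
            return ("extension-plan", sql)
        raise ParseError(f"not supported: {sql!r}")

    def parse(self, sql):
        try:
            return self._parse_extension_dialect(sql)
        except ParseError:
            # Delegate instead of failing, so the host dialect still works.
            return self.delegate.parse(sql)

parser = ExtensionParser(HostParser())
print(parser.parse("select 1 distribute by rand()"))  # falls back to the host parser
print(parser.parse("MERGE INTO t USING s ON ..."))    # handled by the extension
```

A parser that raises instead of delegating in the `except` branch reproduces exactly the "DISTRIBUTE BY is not supported" behavior reported here.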

Expected behavior

DISTRIBUTE BY should keep working on non-Hudi queries even when the Hudi session extension is enabled.

Environment Description

  • Hudi version: 0.11.1

  • Spark version: 3.2.1

  • Hive version: 2.1.1-cdh6.3.2

  • Hadoop version: 3.0.0-cdh6.3.2

  • Storage (HDFS/S3/GCS..): HDFS

  • Running on Docker? (yes/no): no

Additional context


Stacktrace


@xushiyan added labels priority:minor, feature-enquiry, spark-sql on Jul 21, 2022
@KnightChess (Contributor)
#6033 will fix it

@nsivabalan (Contributor)
Closing this out since the PR has landed. Thanks @KnightChess
