INSERT, DELETE, UPDATE, MERGE query fails when merging into Iceberg table with non-lowercase partitioning column #16622

arunb2w · 2023-03-19T05:34:31Z

I get an internal error (NullPointerException) when running the MERGE statement below in Trino, using the Iceberg connector with the Glue catalog, against a partitioned table. On a non-partitioned table it works fine.
trino version - 403
connector - iceberg
Stacktrace:

java.lang.NullPointerException: undefined
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:889)
	at com.google.common.collect.ImmutableList$Builder.add(ImmutableList.java:813)
	at java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
	at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1845)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitMerge(StatementAnalyzer.java:3372)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitMerge(StatementAnalyzer.java:468)
	at io.trino.sql.tree.Merge.accept(Merge.java:100)
	at io.trino.sql.tree.AstVisitor.process(AstVisitor.java:27)
	at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:485)
	at io.trino.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:447)
	at io.trino.sql.analyzer.Analyzer.analyze(Analyzer.java:79)
	at io.trino.sql.analyzer.Analyzer.analyze(Analyzer.java:71)
	at io.trino.execution.SqlQueryExecution.analyze(SqlQueryExecution.java:267)
	at io.trino.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:204)
	at io.trino.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:856)
	at io.trino.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:138)
	at io.trino.$gen.Trino_403_amzn_0____20230313_135431_2.call(Unknown Source)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)

The same issue occurs after upgrading to version 410.
On further analysis, the root cause appears to be the case of the partitioning column name: when the column used for partitioning is in upper case, the query throws an NPE, whereas creating the table with the partitioning column in lower case avoids the problem.
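
To make the suspected mechanism concrete, here is a minimal Python sketch of how a case-sensitive name lookup can produce a null entry of the kind the stack trace complains about. This is an illustration only, not Trino's code: the dictionary, the handle strings, and both function names are hypothetical, and it assumes the engine exposes lowercase column names while the partition spec keeps the original upper-case name.

```python
# Hypothetical illustration only; this is not Trino's code.
# Assumption: the engine exposes column names in lowercase, while the
# partition spec keeps the original (upper-case) source column name.
engine_columns = {"orderkey": "handle-0", "partn_column": "handle-1"}
partition_source_columns = ["PARTN_COLUMN"]


def resolve_exact(names, columns):
    """Case-sensitive lookup: a name with different casing resolves to None."""
    return [columns.get(name) for name in names]


def resolve_case_insensitive(names, columns):
    """Normalize to lowercase before the lookup so casing no longer matters."""
    return [columns.get(name.lower()) for name in names]


# The exact lookup yields a None entry; adding such a null to a Guava
# ImmutableList.Builder is what raises NullPointerException in Java.
broken = resolve_exact(partition_source_columns, engine_columns)
fixed = resolve_case_insensitive(partition_source_columns, engine_columns)
```

Whether the actual fix normalizes names this way is not stated in this thread; the sketch only shows why casing alone can be enough to trigger a null.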

Steps to reproduce:

1. Load a TPCH dataset into a Spark DataFrame, which will be used to create the Iceberg table.
2. Change the case of the column you want to partition by, using df = df.withColumn("PARTN_COLUMN", col("partn_column")). No fancy functions, just change the column name to upper case.
3. Create the Iceberg table partitioned by this upper-case column: df.writeTo("maintbl").using("iceberg").partitionedBy("PARTN_COLUMN").createOrReplace()
4. Run a MERGE query that uses this partitioning column in the join condition:
   merge into maintbl t using join_tbl s on (t.PARTN_COLUMN = s.join_column) when matched then update ...

The query then fails with the same NPE, whereas it works fine if the partitioning column name is in lower case.
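
As a workaround on affected versions, the column names can be lowercased before the table is written. A minimal sketch follows; the helper name lowercase_column_names is hypothetical, and the commented usage assumes an existing Spark DataFrame named df as in the steps above.

```python
from collections import Counter


def lowercase_column_names(names):
    """Return the names in lowercase, refusing to merge distinct columns."""
    lowered = [name.lower() for name in names]
    # Two names that differ only by case would collide after lowercasing.
    duplicates = sorted(n for n, count in Counter(lowered).items() if count > 1)
    if duplicates:
        raise ValueError(f"columns collide after lowercasing: {duplicates}")
    return lowered


# Usage with PySpark (assumes an existing DataFrame `df`, as in the steps above):
#   df = df.toDF(*lowercase_column_names(df.columns))
#   df.writeTo("maintbl").using("iceberg").partitionedBy("partn_column").createOrReplace()
```

The collision check is there because silently lowercasing two columns that differ only by case would produce an ambiguous schema.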

findepi · 2023-03-20T10:00:12Z

cc @djsstarburst

ebyhr · 2023-03-24T07:34:45Z

I confirmed INSERT, DELETE and UPDATE queries also fail. Going to send a PR.

findepi changed the title ~~Partitioned table merge issue using iceberg connector~~ MERGE query fails when merging into Iceberg table with non-lowercase partitioning column Mar 20, 2023

findepi added the bug Something isn't working label Mar 20, 2023

ebyhr self-assigned this Mar 24, 2023

ebyhr changed the title ~~MERGE query fails when merging into Iceberg table with non-lowercase partitioning column~~ INSERT, DELETE, UPDATE, MERGE query fails when merging into Iceberg table with non-lowercase partitioning column Mar 24, 2023

ebyhr mentioned this issue Mar 24, 2023

Fix failure when partition column contains uppercase in Iceberg #16713

Merged

ebyhr closed this as completed in #16713 Mar 24, 2023
