Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't optimize Iceberg table with hour-s range. #24793

Open
alex-art-repos opened this issue Jan 24, 2025 · 0 comments
Open

Can't optimize Iceberg table with hour-s range. #24793

alex-art-repos opened this issue Jan 24, 2025 · 0 comments

Comments

@alex-art-repos
Copy link

When run query

ALTER TABLE ice.v.t EXECUTE optimize
    where
    "timestamp" >= timestamp '2025-01-19 00:00:00'
    and "timestamp"  < timestamp '2025-01-19 12:00:00'

I got

java.lang.IllegalStateException: Unexpected FilterNode found in plan; probably connector was not able to handle provided WHERE expression
	at io.trino.sql.planner.sanity.TableExecuteStructureValidator.lambda$validate$1(TableExecuteStructureValidator.java:53)
	at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:423)
	at io.trino.sql.planner.sanity.TableExecuteStructureValidator.validate(TableExecuteStructureValidator.java:51)
	at io.trino.sql.planner.sanity.PlanSanityChecker.lambda$validate$0(PlanSanityChecker.java:107)
	at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:423)
	at io.trino.sql.planner.sanity.PlanSanityChecker.validate(PlanSanityChecker.java:107)
	at io.trino.sql.planner.sanity.PlanSanityChecker.validateFinalPlan(PlanSanityChecker.java:78)
	at io.trino.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:280)
	at io.trino.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:244)
	at io.trino.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:239)
	at io.trino.execution.SqlQueryExecution.doPlanQuery(SqlQueryExecution.java:503)
	at io.trino.execution.SqlQueryExecution.planQuery(SqlQueryExecution.java:482)
	at io.trino.execution.SqlQueryExecution.start(SqlQueryExecution.java:420)
	at io.trino.execution.SqlQueryManager.createQuery(SqlQueryManager.java:272)
	at io.trino.dispatcher.LocalDispatchQuery.startExecution(LocalDispatchQuery.java:150)
	at io.trino.dispatcher.LocalDispatchQuery.lambda$waitForMinimumWorkers$2(LocalDispatchQuery.java:134)
	at io.airlift.concurrent.MoreFutures.lambda$addSuccessCallback$12(MoreFutures.java:570)
	at io.airlift.concurrent.MoreFutures$3.onSuccess(MoreFutures.java:545)
	at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1139)
	at io.trino.$gen.Trino_468____20250124_082131_2.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1575)
	Suppressed: java.lang.Exception: Current plan:
                Output[columnNames = [rows]]
                │   Layout: [rows:bigint]
                └─ TableCommit[target = ice:schemaTableName:v.t, procedureId:OPTIMIZE, procedureHandle:{IcebergOptimizeHandle[snapshotId=Optional[8819777825440361650], schemaAsJson={"type":"struct","schema-id":0,"fields":[{"id":1,"name":"c_i","required":true,"type":"long"},{"id":2,"name":"timestamp","required":true,"type":"timestamptz"},{"id":3,"name":"s_m","required":false,"type":"string"},{"id":4,"name":"s_i","required":false,"type":"string"},{"id":5,"name":"s_im","required":false,"type":"string"},{"id":6,"name":"s_mc","required":false,"type":"int"},{"id":7,"name":"s_mn","required":false,"type":"int"},{"id":8,"name":"lc","required":false,"type":"int"},{"id":9,"name":"cl","required":false,"type":"long"},{"id":10,"name":"tc","required":false,"type":"int"},{"id":11,"name":"ci","required":false,"type":"long"},{"id":12,"name":"t_s","required":false,"type":"int"},{"id":13,"name":"sw_id","required":true,"type":"string"},{"id":14,"name":"o_l","required":false,"type":"int"}]}, partitionSpecAsJson={"spec-id":0,"fields":[{"name":"timestamp_day","transform":"day","source-id":2,"field-id":1000}]}, tableColumns=[1:c_i:bigint, 2:timestamp:timestamp(6) with time zone, 3:s_m:varchar, 4:s_i:varchar, 5:s_im:varchar, 6:s_mc:integer, 7:s_mn:integer, 8:lc:integer, 9:cl:bigint, 10:tc:integer, 11:ci:bigint, 12:t_s:integer, 13:sw_id:varchar, 14:o_l:integer], sortOrder=[TrinoSortField[sourceColumnId=2, sortOrder=ASC NULLS FIRST]], fileFormat=ORC, tableStorageProperties={write.orc.bloom.filter.fpp=0.05, write.format.default=ORC, write.orc.bloom.filter.columns=s_m,s_i,s_im, write.parquet.compression-codec=zstd}, maxScannedFileSize=100MB, retriesEnabled=false]}]
                   │   Layout: [rows:bigint]
                   └─ LocalExchange[partitioning = SINGLE]
                      │   Layout: [partialrows:bigint, fragment:varbinary]
                      └─ RemoteExchange[type = GATHER]
                         │   Layout: [partialrows:bigint, fragment:varbinary]
                         └─ TableExecute[]
                            │   Layout: [partialrows:bigint, fragment:varbinary]
                            │   c_i := c_i
                            │   timestamp := timestamp
                            │   s_m := s_m
                            │   s_i := s_i
                            │   s_im := s_im
                            │   s_mc := s_mc
                            │   s_mn := s_mn
                            │   lc := lc
                            │   cl := cl
                            │   tc := tc
                            │   ci := ci
                            │   t_s := t_s
                            │   sw_id := sw_id
                            │   o_l := o_l
                            └─ LocalExchange[partitioning = ice:IcebergPartitioningHandle[update=false, partitionFunctions=[IcebergPartitionFunction[transform=DAY, dataPath=[0], type=timestamp(6) with time zone, size=OptionalInt.empty]]], arguments = [timestamp::timestamp(6) with time zone]]
                               │   Layout: [c_i:bigint, timestamp:timestamp(6) with time zone, s_m:varchar, s_i:varchar, s_im:varchar, s_mc:integer, s_mn:integer, lc:integer, cl:bigint, tc:integer, ci:bigint, t_s:integer, sw_id:varchar, o_l:integer]
                               └─ RemoteExchange[type = REPARTITION]
                                  │   Layout: [c_i:bigint, timestamp:timestamp(6) with time zone, s_m:varchar, s_i:varchar, s_im:varchar, s_mc:integer, s_mn:integer, lc:integer, cl:bigint, tc:integer, ci:bigint, t_s:integer, sw_id:varchar, o_l:integer]
                                  └─ ScanFilter[table = ice:v.t$data@8819777825440361650, filterPredicate = ((timestamp >= timestamp(6) with time zone '2025-01-19 00:00:00.000000 UTC') AND (timestamp < timestamp(6) with time zone '2025-01-19 12:00:00.000000 UTC'))]
                                         Layout: [c_i:bigint, timestamp:timestamp(6) with time zone, s_m:varchar, s_i:varchar, s_im:varchar, s_mc:integer, s_mn:integer, lc:integer, cl:bigint, tc:integer, ci:bigint, t_s:integer, sw_id:varchar, o_l:integer]
                                         t_s := 12:t_s:integer
                                         timestamp := 2:timestamp:timestamp(6) with time zone
                                         tc := 10:tc:integer
                                         s_m := 3:s_m:varchar
                                         ci := 11:ci:bigint
                                         s_im := 5:s_im:varchar
                                         s_mc := 6:s_mc:integer
                                         c_i := 1:c_i:bigint
                                         sw_id := 13:sw_id:varchar
                                         s_i := 4:s_i:varchar
                                         o_l := 14:o_l:integer
                                         s_mn := 7:s_mn:integer
                                         lc := 8:lc:integer
                                         cl := 9:cl:bigint

		at io.trino.sql.planner.sanity.PlanSanityChecker.validate(PlanSanityChecker.java:120)
		... 17 more

But query

ALTER TABLE ice.v.t EXECUTE optimize
    where
    "timestamp" >= timestamp '2025-01-19 00:00:00'
    and "timestamp"  < timestamp '2025-01-20 00:00:00';

was succesfully started.

Table definition

CREATE TABLE ice.v.t (
   c_id bigint NOT NULL,
   timestamp timestamp(6) with time zone NOT NULL,
   s_m varchar,
   s_i varchar,
   s_im varchar,
   s_mc integer,
   s_mn integer,
   lc integer,
   cl bigint,
   tc integer,
   ci bigint,
   t_s integer,
   sw_id varchar NOT NULL,
   o_l integer
)
WITH (
   format = 'ORC',
   format_version = 2,
   location = 's3a://trino/v/t-b11111111111111a00c88de3509bb9c',
   orc_bloom_filter_columns = ARRAY['s_m','s_i','s_im'],
   orc_bloom_filter_fpp = 5E-2,
   partitioning = ARRAY['day(timestamp)'],
   sorted_by = ARRAY['timestamp ASC NULLS FIRST']
)

Environment:

  • Trino 468 (Docker)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

1 participant