Issue: For certain INSERT INTO ... SELECT queries, the data section of the resulting Parquet files is corrupted, although the footer is still readable (verified via a Trino SELECT and parquet-tools).
Trino version: 367
Connector: Iceberg
Compression: GZIP
Example query:
INSERT INTO insert_table
SELECT RANK() OVER (ORDER BY coalesce(rank_dim_1.rank, 0), coalesce(rank_dim_2.rank, 0), coalesce(rank_dim_3.rank, 0),
                             coalesce(rank_dim_4.rank, 0), coalesce(rank_dim_5.rank, 0)),
       rl.id AS line_item_id
FROM main_table rl
LEFT OUTER JOIN
    (SELECT DENSE_RANK() OVER (ORDER BY dim_1) AS rank, dim_1
     FROM (SELECT DISTINCT LOWER(dim_1) AS dim_1 FROM main_table)) rank_dim_1
    ON (rank_dim_1.dim_1 = LOWER(rl.dim_1))
LEFT OUTER JOIN
    (SELECT DENSE_RANK() OVER (ORDER BY dim_2) AS rank, dim_2
     FROM (SELECT DISTINCT LOWER(dim_2) AS dim_2 FROM main_table)) rank_dim_2
    ON (rank_dim_2.dim_2 = LOWER(rl.dim_2))
LEFT OUTER JOIN
    (SELECT DENSE_RANK() OVER (ORDER BY dim_3) AS rank, dim_3
     FROM (SELECT DISTINCT LOWER(dim_3) AS dim_3 FROM main_table)) rank_dim_3
    ON (rank_dim_3.dim_3 = LOWER(rl.dim_3))
LEFT OUTER JOIN
    (SELECT DENSE_RANK() OVER (ORDER BY dim_4) AS rank, dim_4
     FROM (SELECT DISTINCT LOWER(dim_4) AS dim_4 FROM main_table)) rank_dim_4
    ON (rank_dim_4.dim_4 = LOWER(rl.dim_4))
LEFT OUTER JOIN
    (SELECT DENSE_RANK() OVER (ORDER BY dim_5) AS rank, dim_5
     FROM (SELECT DISTINCT LOWER(dim_5) AS dim_5 FROM main_table)) rank_dim_5
    ON (rank_dim_5.dim_5 = LOWER(rl.dim_5))
Number of rows in main_table: approx. 63M
Unfortunately, I cannot share any data to reproduce this, because it is company-internal and fairly large. Any ideas about what could cause this issue and how to work around it would be highly appreciated.
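For anyone triaging a similar file, here is a minimal sketch of what "the footer is readable" means at the byte level. It assumes the standard unencrypted Parquet layout: a `PAR1` magic at the start, and at the end a 4-byte little-endian footer length followed by `PAR1`. The function name `check_parquet_framing` is hypothetical and is not part of Trino or parquet-tools. It only checks the framing, not the data pages, so a file with corrupted data pages like the ones described above could still pass; validating the pages themselves requires a full reader such as parquet-tools.

```python
import struct

MAGIC = b"PAR1"  # magic bytes of an unencrypted Parquet file


def check_parquet_framing(path):
    """Return the footer length if the head/tail magic and footer size look sane.

    Hypothetical helper for illustration: it does NOT decode the footer
    metadata or any data page.
    """
    with open(path, "rb") as f:
        f.seek(0, 2)                 # jump to end to learn the file size
        size = f.tell()
        if size < 12:                # 4 (magic) + 4 (footer length) + 4 (magic)
            raise ValueError("file too small to be Parquet")
        f.seek(0)
        head = f.read(4)
        f.seek(size - 8)
        tail = f.read(8)             # footer length + trailing magic

    if head != MAGIC or tail[4:] != MAGIC:
        raise ValueError("missing PAR1 magic")

    footer_len = struct.unpack("<I", tail[:4])[0]
    if footer_len + 12 > size:
        raise ValueError("footer length exceeds file size")
    return footer_len
```

Calling `check_parquet_framing("some_file.parquet")` on a downloaded data file (the path is a placeholder) either returns the footer size in bytes or raises `ValueError`.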