Iceberg snapshot queries use the latest schema of the table #12743

Closed
Tracked by #1324
findinpath opened this issue Jun 8, 2022 · 2 comments · Fixed by #12786
Labels
bug Something isn't working

Comments

@findinpath
Contributor

trino> use iceberg.default;
USE
trino:default> create table test1 (x integer);
CREATE TABLE
trino:default> insert into test1 values (1);
INSERT: 1 row

Query 20220608_151154_00004_jjwsa, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
4.50 [0 rows, 0B] [0 rows/s, 0B/s]

trino:default> alter table test1 add column y integer;
ADD COLUMN
trino:default> insert into test1 values (2,2);
INSERT: 1 row

Query 20220608_151228_00006_jjwsa, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
2.79 [0 rows, 0B] [0 rows/s, 0B/s]

trino:default> select * from "test1$snapshots";
             committed_at              |     snapshot_id     |      parent_id      | operation |                              >
---------------------------------------+---------------------+---------------------+-----------+------------------------------>
 2022-06-08 17:11:46.960 Europe/Vienna | 2042731164093364991 |                NULL | append    | hdfs://hadoop-master:9000/use>
 2022-06-08 17:11:57.613 Europe/Vienna | 6495047863464771572 | 2042731164093364991 | append    | hdfs://hadoop-master:9000/use>
 2022-06-08 17:12:30.035 Europe/Vienna | 2562510531947863916 | 6495047863464771572 | append    | hdfs://hadoop-master:9000/use>
(3 rows)

Query 20220608_151319_00007_jjwsa, FINISHED, 1 node
Splits: 9 total, 9 done (100.00%)
0.52 [3 rows, 1.5KB] [5 rows/s, 2.89KB/s]

trino:default> select * from "test1$data@6495047863464771572";
 x |  y   
---+------
 1 | NULL 
(1 row)

Query 20220608_151348_00008_jjwsa, FINISHED, 1 node
Splits: 9 total, 9 done (100.00%)
0.89 [1 rows, 234B] [1 rows/s, 264B/s]

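Expected result (snapshot 6495047863464771572 was committed before column y was added, so only x should be returned):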

trino:default> select * from "test1$data@6495047863464771572";
 x 
---
 1 
(1 row)
@findinpath findinpath added the bug Something isn't working label Jun 8, 2022
@findepi findepi mentioned this issue Jun 8, 2022
93 tasks
@findepi
Member

findepi commented Jun 8, 2022

cc @phd3

@findinpath
Contributor Author

When the table schema is retrieved for an Iceberg table:

TableSchema tableSchema = metadata.getTableSchema(session, tableHandle.get());

the field io.trino.plugin.iceberg.IcebergTableHandle#snapshotId is not taken into account, so the latest table schema is returned instead of the schema at the requested snapshot.

Useful code snippet to retrieve the schema of the table at a given snapshot:

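// Snapshot#schemaId() returns a nullable Integer, so fall back to the current schema when it is absent.
Integer schemaId = table.snapshot(snapshotIdYouWant).schemaId();
Schema schema = (schemaId != null) ? table.schemas().get(schemaId) : table.schema();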
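
To make the difference concrete, here is a minimal, self-contained sketch of the lookup using the test1 table from the report. It does not use the Iceberg or Trino APIs: the maps are hypothetical stand-ins for table metadata, and the schema IDs 0 and 1 are assumed for illustration (the snapshot IDs are the ones shown in the "test1$snapshots" output above).

```java
import java.util.List;
import java.util.Map;

public class SnapshotSchemaSketch {
    // Hypothetical stand-in for table.schemas(): schema ID -> column names.
    // The IDs 0 and 1 are assumptions, not values taken from the issue.
    static final Map<Integer, List<String>> SCHEMAS = Map.of(
            0, List.of("x"),
            1, List.of("x", "y"));

    // Assumed ID of the current (latest) schema, after "add column y".
    static final int CURRENT_SCHEMA_ID = 1;

    // Hypothetical stand-in for snapshot -> schema ID, using the snapshot IDs from the report.
    static final Map<Long, Integer> SNAPSHOT_SCHEMA_IDS = Map.of(
            2042731164093364991L, 0,
            6495047863464771572L, 0,
            2562510531947863916L, 1);

    // What the report describes as the current behaviour: the snapshot ID is ignored.
    static List<String> latestColumns() {
        return SCHEMAS.get(CURRENT_SCHEMA_ID);
    }

    // What the report expects: resolve the schema through the snapshot's schema ID.
    static List<String> columnsAt(long snapshotId) {
        Integer schemaId = SNAPSHOT_SCHEMA_IDS.get(snapshotId);
        if (schemaId == null) {
            throw new IllegalArgumentException("Unknown snapshot: " + snapshotId);
        }
        return SCHEMAS.get(schemaId);
    }

    public static void main(String[] args) {
        System.out.println(latestColumns());                 // prints [x, y]
        System.out.println(columnsAt(6495047863464771572L)); // prints [x]
    }
}
```

With this model, querying "test1$data@6495047863464771572" through columnsAt yields only x, matching the expected result, while latestColumns reproduces the reported x, y output.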

2 participants