Trino changed the semantics of timestamps; see issue #37 for details. In Trino, the "timestamp" type refers to a point in time measured in seconds from 1970-01-01 00:00:00 and is not affected by the session's time zone.
In Presto 332, when reading ORC files, the values returned by the ORC reader match the values written by the ORC writer, provided the storage time zone matches the file's time zone and is not UTC. Timestamp values are encoded using three fields:
Seconds since 2015-01-01 in the file's time zone
Nanoseconds
File time zone in the stripe footer
It appears that the file's time zone only affects the encoding and decoding process; the original value remains unchanged.
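The three-field encoding described above can be sketched in a few lines. This is a simplified illustration, not the actual ORC writer or reader code: the class and method names (`OrcTimestampSketch`, `encodeSeconds`, `decode`) are hypothetical, `java.time` is used in place of whatever the real implementation uses, and the real nanosecond encoding is more compact than a plain integer. It assumes the base is 2015-01-01 00:00:00 interpreted in the file's time zone.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.ZoneId;

// Hypothetical sketch of the seconds/nanoseconds/file-zone encoding.
final class OrcTimestampSketch {
    // Assumed base: 2015-01-01 00:00:00 as wall-clock time in the file's zone.
    static final LocalDateTime ORC_EPOCH = LocalDateTime.of(2015, 1, 1, 0, 0, 0);

    // Seconds between the base and the value, both interpreted in the file's zone.
    static long encodeSeconds(LocalDateTime value, ZoneId fileZone) {
        return Duration.between(ORC_EPOCH.atZone(fileZone), value.atZone(fileZone))
                .getSeconds();
    }

    // Reverses encodeSeconds using the same file zone.
    static LocalDateTime decode(long seconds, int nanos, ZoneId fileZone) {
        return ORC_EPOCH.atZone(fileZone)
                .plusSeconds(seconds)
                .plusNanos(nanos)
                .toLocalDateTime();
    }
}
```

Encoding and decoding with the same zone returns the value that was written, which is the sense in which the file's time zone does not change the original value.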
However, in Trino 424, when reading legacy files, the ORC reader still decodes using the same three fields as mentioned above. After decoding, the value is then converted using fileDateTimeZone.convertUTCToLocal:
File: io.trino.orc.reader.TimestampColumnReader.java, Line 408
This means that the original value written by the ORC writer is altered by the reader in Trino 424, leading to incorrect timestamp semantics. Here's an example:
Session Time Zone: UTC
ORC File/Storage Time Zone: Asia/Shanghai
Original value: 2020-01-01 00:00:00+00:00
Presto 332 read: 2020-01-01 00:00:00+00:00
Trino 424 read: 2020-01-01 08:00:00
Desired Trino 424 result: 2020-01-01 00:00:00
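The 8-hour difference in this example can be reproduced with a small sketch. It assumes that `fileDateTimeZone.convertUTCToLocal` behaves like Joda-Time's `DateTimeZone.convertUTCToLocal`, that is, it adds the zone's UTC offset to a millisecond value. The class and helper names below are hypothetical, and `java.time` stands in for the types used in the Trino reader.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical sketch of the shift applied after decoding.
final class ConvertUtcToLocalSketch {
    // Adds the zone's UTC offset at that instant to the millisecond value
    // (assumed equivalent of convertUTCToLocal).
    static long convertUtcToLocal(long epochMillis, ZoneId zone) {
        int offsetSeconds = zone.getRules()
                .getOffset(Instant.ofEpochMilli(epochMillis))
                .getTotalSeconds();
        return epochMillis + offsetSeconds * 1000L;
    }

    // Renders a millisecond value as a wall-clock timestamp without any zone shift.
    static LocalDateTime render(long epochMillis) {
        return LocalDateTime.ofEpochSecond(Math.floorDiv(epochMillis, 1000L), 0, ZoneOffset.UTC);
    }
}
```

With the decoded value 2020-01-01 00:00:00 (1577836800000 ms) and the file zone Asia/Shanghai, `render(convertUtcToLocal(1577836800000L, ZoneId.of("Asia/Shanghai")))` gives 2020-01-01T08:00, matching the Trino 424 result above; skipping the conversion gives the desired 2020-01-01T00:00.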