Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When querying TIMESTAMP WITH TIMEZONE data the values should be presented to user in session timezone #13157

Closed
Sartan4455 opened this issue Jul 12, 2022 · 5 comments
Labels
enhancement New feature or request

Comments

@Sartan4455
Copy link

running docker version 389

timestamps in timestamp with timezone(timestamp(6) with time zone).

I see this in #37 closed which mentions : When querying TIMESTAMP WITH TIMEZONE data the values are presented to user in session timezone

Expected Behavior:

When querying TIMESTAMP WITH TIMEZONE data the values are presented to user in session timezone

Actual Behavior

Results are show in imported timezone(UTC) regardless of client timezone

Examples:

With the timezone of the client set to UTC

select current_timezone();
 _col0 
-------
 UTC   

select start_time from sessions where id = 1 limit 1;
           start_time           
--------------------------------
 2022-07-12 16:22:20.118428 UTC

Client Timezone set to America/Denver

select current_timezone();
     _col0      
----------------
 America/Denver 

select start_time from sessions where id = 1 limit 1;
           start_time           
--------------------------------
 2022-07-12 16:22:20.118428 UTC

Result in UTC not America/Denver.

@findinpath
Copy link
Contributor

findinpath commented Jul 13, 2022

Reproduction scenario on the product test environment:

testing/bin/ptl env up --environment 'singlenode'  --config 'config-default'
USE iceberg.default;

CREATE TABLE t1(v timestamp(6) with time zone);

SET TIME ZONE 'Europe/Vienna';

INSERT INTO t1 VALUES TIMESTAMP '2022-07-13 08:48:27.617 Europe/Vienna';

SELECT v FROM t1;
-- 2022-07-13 06:48:27.617000 UTC

SELECT v AT TIME ZONE 'Europe/Vienna' FROM t1;
-- 2022-07-13 08:48:27.617000 Europe/Vienna

SELECT v AT TIME ZONE 'Asia/Tokyo' FROM t1;
-- 2022-07-13 15:48:27.617000 Asia/Tokyo 

Given that the user set the session time zone to be 'Europe/Vienna' I find it logical as well that the results would be delivered at the session time zone and not at UTC.

Investigated the issue through debugging Int96ArrayBlock constructor and noticed that the value 6789918341599232 corresponding to the use case above is being read with UTC time zone from the data source.
Applying AT TIME ZONE 'Europe/Vienna' on the value read initially, transforms the value to 6789918341601332 (with the time zone Europe/Vienna).
Naively I'd say that applying a rewrite to wrap with AtTimeZone the timestamp with time zone value read from persistence would be a possible solution. See package io.trino.sql.planner.iterative.rule for reference.

@findinpath findinpath added the enhancement New feature or request label Jul 13, 2022
@Sartan4455
Copy link
Author

More evidence of this discrepancy, in my mind.

SELECT v, now() as now FROM t1;
               v                |                  now                  
--------------------------------+---------------------------------------
 2022-07-13 06:48:27.617000 UTC | 2022-07-13 17:53:56.697 Europe/Vienna

@martint
Copy link
Member

martint commented Jul 13, 2022

The session time zone has no bearing on how timestamp(p) with time zone values are treated, but the session time zone is often used for constructing a value of such type (e.g., to figure out which time zone to "attach" to a timestamp to make a timestamp with time zone). The session time zone is a "default" time zone to be used for these kinds of operations.

From the SQL spec:

For the convenience of users, whenever a datetime value with time zone is to be implicitly derived from one without (for example, in a simple assignment operation), SQL assumes the value without time zone to be local, subtracts the current default time zone displacement of the SQL-session from it to give UTC, and associates that time zone displacement with the result.

and

An SQL-session has an original time zone displacement and a current default time zone displacement, which are values of data type INTERVAL HOUR TO MINUTE. Both the original time zone displacement and the current default time zone displacement are initially set to the same implementation-defined value. The current default time zone displacement can subsequently be changed by successful execution of a <set local time zone statement>. The original time zone displacement cannot be changed. It is also possible to set the current default time zone displacement to equal the value of the original time zone displacement.

Also changing the time zone of a timestamp(p) with time zone implicitly would cause a loss of information -- it would be impossible for a caller to determine what was the original time zone associated with the value.

The mechanism prescribed by the SQL standard to produce that behavior is to use the AT LOCAL clause. For example:

SELECT value AT LOCAL 
FROM t

Unfortunately, Trino doesn't yet support that syntax.

@Sartan4455
Copy link
Author

Sartan4455 commented Jul 13, 2022

Also changing the time zone of a timestamp(p) with time zone implicitly would cause a loss of information -- it would be impossible for a caller to determine what was the original time zone associated with the value.

Is knowing the original disk time zone required? As I understood it postgres, as that's what I know well, store all time data in UTC and then does conversions as needed based on client settings. Thus postgres already lost the "original" timezone. As a user or client to the database I'd expect all timestamps with timezones to be displayed in session timezone unless explicitly requested.

If I insert MDT timestamp data I'd expect the database to store all that in UTC and lose the original MDT. I'd expect results to firstly show the data with the current client timezone and secondly show it overridden with a conversion request(AT TIME ZONE, etc).

From Postgres docs:

For timestamp with time zone, the internally stored value is always in UTC (Universal Coordinated Time, traditionally known as Greenwich Mean Time, GMT). An input value that has an explicit time zone specified is converted to UTC using the appropriate offset for that time zone. If no time zone is stated in the input string, then it is assumed to be in the time zone indicated by the system's TimeZone parameter, and is converted to UTC using the offset for the timezone zone.

When a timestamp with time zone value is output, it is always converted from UTC to the current timezone zone, and displayed as local time in that zone. To see the time in another time zone, either change timezone or use the AT TIME ZONE construct (see Section 9.9.4).

I can see both sides. It is more useful to me, and my team, to have essentially a global 'AT TIMEZONE' implied when using SET TIMEZONE or X-Trino-Time-Zone rather than every query have explicit conversions.

We do have a running gag that Trino is NOT postgres. :)

@findepi
Copy link
Member

findepi commented Jul 19, 2022

Is knowing the original disk time zone required? As I understood it postgres, as that's what I know well, store all time data in UTC and then does conversions as needed based on client settings. Thus postgres already lost the "original" timezone.

Correct. PostgreSQL's timestamptz stores point in time only

As a user or client to the database I'd expect all timestamps with timezones to be displayed in session timezone unless explicitly requested.

That would create representability problems. Session time zone cannot unambiguously represent certain points in time when there is a backward DST change (or some other, irregular timezone offset adjustment).

See eg #5488 for a case where we explicitly switched a connector from using session zone to fixed UTC zone.

Let me close this issue as answered.
We can continue conversation here, or on #dev channel on our slack.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

No branches or pull requests

4 participants