TIMESTAMP behaviour does not match sql standard #7122
Comments
Indeed. We've been talking about this for a while in our team. It was due to a misinterpretation of the spec. We need to fix it, but haven't had the time to figure out what needs to be updated and what the impact would be.
We are happy to do the work. We just have to settle how to handle the backward incompatibility this change implies.
Ideally, yes, but it depends on how ugly it gets.
We will look into that and try to figure out an implementation path.
@losipiuk I agree with you that Timestamp w/o TZ in Presto is broken. I do NOT agree that Timestamp w/ TZ should behave like Instant. I believe it should also have an associated time zone. (In other words, I believe Timestamp w/ TZ is implemented correctly today.) Below is an excerpt of something I wrote early last year that summarizes the current behavior and my understanding. To summarize how things work today:
The way I understand it:
Here is the reason I believe the first understanding is inconsistent. I can think of only one possible interpretation for the other 3 concepts:
Note the inconsistency between the interpretations of Timestamp w/o TZ and Time w/o TZ if we adopt the first interpretation of Timestamp w/o TZ (Instant vs LocalTime). Under the second interpretation, the two are consistent (LocalDateTime vs LocalTime). I went to the SQL spec for the definitive answer:
I believe these two rules prove that the SQL spec agrees with my interpretation. Let's consider the cast from Timestamp w/o TZ to Timestamp w/ TZ:
Under the first interpretation, these two casts should yield equal results. Under the second interpretation, they would produce different results. The rule in the SQL spec produces two different results. Lastly, a side note from me. Both interpretations can produce results that depend on the user session time zone:
Under the SQL spec, a cast from Timestamp w/o TZ to Timestamp w/ TZ can produce different results depending on the user time zone. As a result, I guess this cast probably should NOT have been implicit.
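To illustrate the session-dependence of that cast, here is a minimal Python sketch, not Presto code. It assumes the second interpretation (Timestamp w/o TZ is a wall-clock reading) and models the cast as attaching the session zone to that reading. The zone names, the fixed offsets, and the helper `cast_to_timestamp_with_tz` are hypothetical and exist only for this example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical session time zones, modeled as fixed offsets to keep the
# sketch independent of any time zone database.
SESSION_WARSAW = timezone(timedelta(hours=1), "+01:00")
SESSION_LOS_ANGELES = timezone(timedelta(hours=-8), "-08:00")


def cast_to_timestamp_with_tz(local_value, session_zone):
    """Model of the cast: read the wall-clock value as local time in the
    session zone. This is an assumption of the sketch, not Presto's code."""
    return local_value.replace(tzinfo=session_zone)


# A Timestamp w/o TZ value: a naive wall-clock reading.
local_value = datetime(2016, 1, 1, 12, 0, 0)

in_warsaw = cast_to_timestamp_with_tz(local_value, SESSION_WARSAW)
in_los_angeles = cast_to_timestamp_with_tz(local_value, SESSION_LOS_ANGELES)

# Same wall-clock fields, but two different points in time.
print(in_warsaw.isoformat())       # 2016-01-01T12:00:00+01:00
print(in_los_angeles.isoformat())  # 2016-01-01T12:00:00-08:00
print(in_warsaw == in_los_angeles)  # False
print(in_los_angeles - in_warsaw)   # 9:00:00
```

The same input value maps to instants nine hours apart, which is why an implicit cast here can silently change query results between sessions.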
@losipiuk, @dain, and I reached agreement:
I'm currently working on that.
As part of this project, we will need to identify any backward incompatible behavior and document it.
This is a flag for switching between the legacy TIME/TIMESTAMP semantics and the new semantics aligned with ANSI SQL. See: prestodb#7122
Summary: Pull Request resolved: #1230 The discussion in the [github issue](prestodb/presto#7122 (comment)) says "Extracting hour from 2016-01-01 12:00:00 <TZ> should return 12 no matter what <TZ> is put in template.". This diff implements that behavior. Reviewed By: kagamiori Differential Revision: D34944701 fbshipit-source-id: 4f9bad2de01432dce0ae158f84462a994ff5ddcc
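The quoted rule can be sketched in Python as follows. This is an illustration of the stated behavior only, not the code from that diff; the zone offsets and the two helper functions are hypothetical names chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical zones standing in for <TZ>, modeled as fixed offsets.
ZONES = [
    timezone.utc,
    timezone(timedelta(hours=1)),
    timezone(timedelta(hours=-8)),
]


def extract_hour(value):
    """Return the hour field as written, in the value's own zone."""
    return value.hour


def extract_hour_normalized(value):
    """Contrasting behavior: normalize to UTC first, then read the hour."""
    return value.astimezone(timezone.utc).hour


values = [datetime(2016, 1, 1, 12, 0, 0, tzinfo=zone) for zone in ZONES]

print([extract_hour(v) for v in values])             # [12, 12, 12]
print([extract_hour_normalized(v) for v in values])  # [12, 11, 20]
```

Reading the field in the value's own zone gives 12 every time; normalizing to a fixed zone first makes the answer depend on which zone was attached.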
It seems that the meaning of the TIMESTAMP and TIMESTAMP WITH TIME ZONE datatypes in Presto is not at all what the SQL standard specifies (or what other databases do).

This is my understanding of the SQL 2003 standard (4.6.2 Datetimes):

TIMESTAMP WITH TIME ZONE represents an absolute point in time. Databases typically store it internally as seconds since the epoch in some fixed time zone (usually UTC). When querying TIMESTAMP WITH TIME ZONE data, the values are presented to the user in the session time zone (the session time zone is used for presentation purposes only).

TIMESTAMP does not represent a specific point in time, but rather a reading of a wall clock plus calendar. Selecting values from a TIMESTAMP column should return the same result set no matter what the client's session time zone is.

Presto semantics are different:

TIMESTAMP seems to do what TIMESTAMP WITH TIME ZONE should.

TIMESTAMP WITH TIME ZONE attaches explicit time zone information to each value stored in a table. The SQL standard does not define a type like that, and it does not seem very practical. Assuming that values selected from TIMESTAMP WITH TIME ZONE are presented to the user in the session time zone anyway, the per-row time zone information can be stripped away and all values can be stored in some arbitrary fixed time zone (e.g. UTC).

Please comment on the semantics. They seem wrong. Why was this choice made? It is hard to believe that it was not intentional.
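A minimal Python sketch of the two behaviors described above, as this post understands the standard. It is an illustration, not Presto code: an aware `datetime` stands in for an absolute point in time, a naive `datetime` stands in for a wall-clock reading, and the session zones and the two `render_*` helpers are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical session time zones, modeled as fixed offsets.
SESSION_WARSAW = timezone(timedelta(hours=1))
SESSION_LOS_ANGELES = timezone(timedelta(hours=-8))

FORMAT = "%Y-%m-%d %H:%M:%S"


def render_instant(value, session_zone):
    """Point-in-time value: stored in UTC, shown in the session zone."""
    return value.astimezone(session_zone).strftime(FORMAT)


def render_wall_clock(value, session_zone):
    """Wall-clock value: the session zone is deliberately ignored."""
    return value.strftime(FORMAT)


instant = datetime(2016, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
wall_clock = datetime(2016, 1, 1, 12, 0, 0)

for zone in (SESSION_WARSAW, SESSION_LOS_ANGELES):
    print(render_instant(instant, zone), "|", render_wall_clock(wall_clock, zone))
# 2016-01-01 13:00:00 | 2016-01-01 12:00:00
# 2016-01-01 04:00:00 | 2016-01-01 12:00:00
```

Under this reading, only the point-in-time column changes its displayed value with the session zone; the wall-clock column returns the same text for every client.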
cc: @dain, @martint, @fiedukow, @cawallin
Edit: roadmap: #10326