-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TIME/TIMESTAMP W/O TIME ZONE semantics fix (#7122) #7378
Conversation
Awesome! I'm very excited about this change.
|
ISOChronology chronology = getChronology(session.getTimeZoneKey()); | ||
return utcMillis - chronology.getZone().getOffset(utcMillis); | ||
// date is encoded as milliseconds at midnight in UTC | ||
// convert to midnight if the session timezone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/if/in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I might need to reduce verbosity... first cmt digested.
public SqlTimeWithTimeZone(long timeWithTimeZone) | ||
{ | ||
millisUtc = unpackMillisUtc(timeWithTimeZone); | ||
if (millisUtc > MAX_MILLIS_TIME) { | ||
throw new RuntimeException("millisUtc must be time in millisecond that past since midnight UTC."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- s/past/passed/ ?
- I wouldn't use trailing dot in the msg
- consider including the
millisUtc
value in msg
@@ -16,30 +16,64 @@ | |||
import com.fasterxml.jackson.annotation.JsonValue; | |||
|
|||
import java.time.Instant; | |||
import java.time.LocalTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cmt msg
s/Instance.atZone/Instant.ofEpochMilli(..).atZone
s/will always use TZ/\0 offset/
s/chaning/changing
s/missunderstanding/misunderstanding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmt msg cont:
- remove "ID" from the msg, i.e.: "Fix TZ issue in TIME WITH TZ for TZ
IDchanging offset over time"
import java.time.format.DateTimeFormatter; | ||
import java.util.Objects; | ||
import java.util.TimeZone; | ||
|
||
import static com.facebook.presto.spi.type.DateTimeEncoding.unpackMillisUtc; | ||
import static com.facebook.presto.spi.type.DateTimeEncoding.unpackZoneKey; | ||
|
||
/** | ||
* This type represent instant of time in some day and timezone, where timezone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
represents
public final class SqlTimeWithTimeZone | ||
{ | ||
private static final DateTimeFormatter formatter = DateTimeFormatter.ofPattern("HH:mm:ss.SSS VV"); | ||
|
||
/** | ||
* As we dont have negative timestamps and max value is midnight in timezone that is ~(UTC -12:00), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"midnight" might be interpreted as time 00:00 where 23:59:59.(9) should probably be understood.
Isn't max value in ~ UTC +12
(~easternmost time zone)?
s/dont/don't
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With UTC +12, 23:00:00 millisUTC equals 1000*60*60*(23-12) = x
With UTC -12, 23:00:00 millisUTC equals 1000*60*60*(23+12) = y
y > x
, or am I wrong at some point?
Also, the part about negative millisUtc was wrong. I removed that.
private static final long MAX_MILLIS_TIME = 2 * 24 * 60 * 60 * 1000; | ||
|
||
/** | ||
* Milliseconds relative to midnight UTC of the given time (millisUtc < MAX_MILLIS_TIME). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is MIN_MILLIS_TIME ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
~(-MAX_MILLIS_TIME)
, do you think we should double sanity checks here and check that as well? I thought that MAX is enough sanity as representing dates before 1970-01-01 in timestamp is rare and I didn't seen this constraint broken in Presto codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8bd8ff87b4e3db7423bfd92f2ed65f504beafd60
private final long millisUtc; | ||
private final TimeZoneKey timeZoneKey; | ||
|
||
/** | ||
* FIXME This is used only to determine last of TZ shifts for TIME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me neither ;)
private final long millisUtc; | ||
private final TimeZoneKey timeZoneKey; | ||
|
||
/** | ||
* FIXME This is used only to determine last of TZ shifts for TIME | ||
* This is workaround to avoid chaning semantic of millisUTC being part of SPI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changing
* to +5:45, without usage of this timestamp, we would use TZ shift valid | ||
* on 1970-01-01, which is invalid today. | ||
*/ | ||
private final long timezoneOffsetReferenceTimestampUtc = System.currentTimeMillis(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When SqlTimeWithTimeZone
objects are instantiated? Does keeping this on a field give any more stable results than consulting System.currentTimeMillis
whenever current time zone offset is to be queried?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really was only thinking about making this more static (like having singleton for all Presto that will just provide that value). Just to avoid potential change in some weird scenarios.
Think about that, if we would call System.currentTimeMillis in toString, we would have DTO with toString (potentially) morphing over time.
I'm strongly against making this calculation later and more "dynamic", I would rather talk about making it earlier and more "static".
// FIXME This is hack that we need to use as the timezone interpretation depends on date (not only on time) | ||
// Additionally we cannot just grab current TZ offset from timezoneOffsetReferenceTimestampUtc and use it | ||
// as our formatting library will fail to display expected TZ symbol. | ||
long currentMillisOfDay = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"currentMillisOfDayInZone" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what zone? This is UTC, right?
|
||
Instant utcInstantInCurrentDay = Instant.ofEpochMilli(timeMillisUtcInCurrentDay); | ||
ZonedDateTime zonedDateTime = utcInstantInCurrentDay.atZone(ZoneId.of(timeZoneKey.getId())); | ||
return zonedDateTime.format(formatter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow, looks correct! Do we have a test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots and lots of test, sadly mostly indirect :(
I think we could add some testing to that, especially checking if this is aligned with TIME WITH TIME ZONE to VARCHAR cast, which is expected but not guaranteed by the code.
I'll add that to TODO for later or escalating to further tickets - depending on circumstances.
long currentMillisOfDay = | ||
LocalTime.from( | ||
Instant.ofEpochMilli(timezoneOffsetReferenceTimestampUtc) | ||
.atZone(ZoneId.of(TimeZoneKey.UTC_KEY.getId()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ZoneOffset.UTC
could be used here?
LocalTime.from( | ||
Instant.ofEpochMilli(timezoneOffsetReferenceTimestampUtc) | ||
.atZone(ZoneId.of(TimeZoneKey.UTC_KEY.getId()))) | ||
.toNanoOfDay() / 1_000_000L; // nanos to millis |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you do here in those several lines, seems equivalent to "timezoneOffsetReferenceTimestampUtc % (86400 * 1000)". See "Java Time-Scale" @ https://docs.oracle.com/javase/8/docs/api/java/time/Instant.html
Or, as you wrote in DTF.currentTime
: UTC_CHRONOLOGY.millisOfDay().get(timezoneOffsetReferenceTimestampUtc)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
timezoneOffsetReferenceTimestampUtc % (86400 * 1000)
- I know that is correct, but this code is hard to understand without weird magic constants.
As for other proposition this would require to to introduce Joda to SPI and as far as we understand we avoid introducing libraries to SPI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sure, Joda dep is not cool (especially that I'd like to see Joda go away in some time:), so what about:
/*
* Equivalent to using Joda: UTC_CHRONOLOGY.millisOfDay().get(timezoneOffsetReferenceTimestampUtc)
*/
long millisInADay = 86400 * 1000; // or TimeUnit.DAYS.toMillis(1) if you wish
currentMillisOfDay = timezoneOffsetReferenceTimestampUtc % millisInADay;
?
By naming this constant (and adding an explanatory comment), you deprive it of any magic it had.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently, Joda-based code can be transliterated to java.time this way:
currentMillisOfDay = ChronoField.MILLI_OF_DAY.getFrom(
Instant.ofEpochMilli(timezoneOffsetReferenceTimestampUtc).atZone(ZoneOffset.UTC))
no constants involved this time.
// but the start of the day is relative to the current time zone. | ||
long millis = getChronology(session.getTimeZoneKey()).millisOfDay().get(session.getStartTime()); | ||
// We do all calculation in UTC, as session.getStartTime() is in UTC | ||
// and we need to have UTC milis for packDateTimeWithZone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: millis
@@ -132,7 +131,8 @@ public static long currentTimestamp(ConnectorSession session) | |||
@SqlType(StandardTypes.TIMESTAMP) | |||
public static long localTimestamp(ConnectorSession session) | |||
{ | |||
return session.getStartTime(); | |||
ISOChronology localChronology = getChronology(session.getTimeZoneKey()); | |||
return localChronology.getZone().convertUTCToLocal(session.getStartTime()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope you're aware that testLocalTimestamp
doesn't pass at this commit. I assume your strategy is not to be green at every step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I just messed with that to much while rebasing I will check that.
@@ -119,8 +120,13 @@ public void testCurrentDateTimezone() | |||
throws Exception | |||
{ | |||
TimeZoneKey kievTimeZoneKey = getTimeZoneKey("Europe/Kiev"); | |||
for (long instant = new DateTime(2000, 6, 15, 0, 0).getMillis(); instant < new DateTime(2016, 6, 15, 0, 0).getMillis(); instant += TimeUnit.HOURS.toMillis(1)) { | |||
TimeZoneKey montrealTimeZoneKey = getTimeZoneKey("America/Montreal"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please comment why adding Montreal here increases cases coverage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't comment in commit message enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure it is. I missed that
TimeZoneKey montrealTimeZoneKey = getTimeZoneKey("America/Montreal"); | ||
// We expect UTC millis later on so we have to use UTC chronology | ||
for (long instant = ISOChronology.getInstanceUTC().getDateTimeMillis(2000, 6, 15, 0, 0, 0, 0); | ||
instant < ISOChronology.getInstanceUTC().getDateTimeMillis(2016, 6, 15, 0, 0, 0, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While at it... Test method is usually executed once in JVM lifetime. Can we expect JVM to pull ISOChronology.getInstanceUTC().getDateTimeMillis(2016, 6, 15, 0, 0, 0, 0)
"constant" before the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we try to optimise tests to work better with JVM now?
// We expect UTC millis later on so we have to use UTC chronology | ||
for (long instant = ISOChronology.getInstanceUTC().getDateTimeMillis(2000, 6, 15, 0, 0, 0, 0); | ||
instant < ISOChronology.getInstanceUTC().getDateTimeMillis(2016, 6, 15, 0, 0, 0, 0); | ||
instant += TimeUnit.HOURS.toMillis(1)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would our testing be better if we checked not only whole (UTC) hours? Like with instant += 57 minutes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made it 53 as primes are fancier :P
@@ -69,6 +69,7 @@ | |||
public static final String REORDER_WINDOWS = "reorder_windows"; | |||
public static final String ITERATIVE_OPTIMIZER = "iterative_optimizer_enabled"; | |||
public static final String EXCHANGE_COMPRESSION = "exchange_compression"; | |||
public static final String LEGACY_TIMESTAMP = "legacy_timestamp"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmt msg s/beeing/being/
false), | ||
booleanSessionProperty( | ||
LEGACY_TIMESTAMP, | ||
"Use legacy TIMESTAMP, TIME & DATE semantic", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
semantics
public static boolean isLegacyTimestamp(Session session) | ||
{ | ||
return session.getSystemProperty(LEGACY_TIMESTAMP, Boolean.class); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add \n
@@ -134,7 +134,7 @@ public void testCurrentDateTimezone() | |||
private static void assertCurrentDateAtInstant(TimeZoneKey timeZoneKey, long instant) | |||
{ | |||
long expectedDays = epochDaysInZone(timeZoneKey, instant); | |||
long dateTimeCalculation = currentDate(new TestingConnectorSession("test", timeZoneKey, US, instant, ImmutableList.of(), ImmutableMap.of())); | |||
long dateTimeCalculation = currentDate(new TestingConnectorSession("test", timeZoneKey, US, instant, ImmutableList.of(), ImmutableMap.of(), true)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider assigning the true
to a variable first, so that it has a name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice suggestion but as it's just a test so I think it's better to keep it short.
TimeZoneKey timestampZone, | ||
ConnectorSession session) | ||
{ | ||
if (session.isLegacyTimestamp()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really reviewing this yet, but...
"Introduce sqlTime(stamp)Of method to avoid direct constructor calls" cmt msg suggests a refactor and here you seem to add logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope that new msg is fine for you ;)
8c69bf8
to
5a351fe
Compare
|
||
Instant utcInstantInCurrentDay = Instant.ofEpochMilli(timeMillisUtcInCurrentDay); | ||
ZonedDateTime zonedDateTime = utcInstantInCurrentDay.atZone(ZoneId.of(timeZoneKey.getId())); | ||
return zonedDateTime.format(formatter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing. While range of allowed values is > 24h (48h? infinite?), toString()
output only within 24h. That means two different values (and different by 24h or more) will be indistinguishable. :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope it's more clear after 8bd8ff87b4e3db7423bfd92f2ed65f504beafd60 :-)
@@ -19,30 +19,47 @@ | |||
import java.time.ZoneId; | |||
import java.time.format.DateTimeFormatter; | |||
import java.util.Objects; | |||
import java.util.Optional; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmt msg ("Introduce new TIME/TIMESTAMP semantic to SqlTypes")
- semantics
SqlTypes
orSqlTime
or "SQL types" ?
{ | ||
return sessionTimeZoneKey; | ||
} | ||
|
||
@Deprecated | ||
private boolean isDeprecatedTimestamp() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isDeprecatedTime
stamp?, usesDeprecatedTimeSemantics
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this name is not perfect in this (and many other) contexts.
I would still keep this name aligned across the code as it will be easier to remove all those functions and variables when we decide to drop backward compatibility.
@@ -19,30 +19,47 @@ | |||
import java.time.ZoneId; | |||
import java.time.format.DateTimeFormatter; | |||
import java.util.Objects; | |||
import java.util.Optional; | |||
|
|||
import static com.facebook.presto.spi.type.TimeZoneKey.UTC_KEY; | |||
|
|||
public final class SqlTime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While at it... consider adding a javadoc, like there is in SqlTimeWithTimeZone
. E.g. you might likewise explain range of possible values for the millisUtc
field.
* is known and day is not. | ||
*/ | ||
public final class SqlTimeWithTimeZone | ||
{ | ||
private static final DateTimeFormatter formatter = DateTimeFormatter.ofPattern("HH:mm:ss.SSS VV"); | ||
|
||
/** | ||
* As we dont have negative timestamps and max value is midnight in timezone that is ~(UTC -12:00), | ||
* just not to care about precision we do use ~(UTC -24:00). | ||
* We know that max allowed value is midnight in timezone that is ~(UTC -12:00) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still... "midnight" means 00:00:00 or ~23:59:59 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8bd8ff87b4e3db7423bfd92f2ed65f504beafd60
return Instant.ofEpochMilli(millisUtc).atZone(ZoneId.of(sessionTimeZoneKey.get().getId())).format(formatter); | ||
} | ||
else { | ||
// FIXME this should be exact equivalent of TIMESTAMP to VARCHAR cast |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FIXME?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its just a note at this point.
} | ||
else { | ||
// FIXME this should be exact equivalent of TIMESTAMP to VARCHAR cast | ||
return Instant.ofEpochMilli(millisUtc).atZone(ZoneId.of(UTC_KEY.getId())).format(formatter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this works, Instant.ofEpochMilli(millisUtc)
creates an instant in time, which is "over interpretation" of the millisUtc
which stores "millis of the day". Would
LocalTime.ofNanoOfDay(...).format(formatter)
do?
(If not, use ZoneOffset.UTC
...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About this over interpretation.
Semantically, TIMESTAMP W/O TZ (as well as TIME W/O TZ) is just set of integers YEAR (1), MONTH (2), DAY (3), HOUR (4), MINUTE (5), SECOND (6), MILLIS (7).
Our representation encodes that fields into single long that can be interpreted in following way: "If you interpret that long as Instance (epoch in millis), than UTC wall time of this instance (1-7) gives you exact value of (1-7) fields.". This kind of encoding is weird but as far as I understand should work perfectly fine, as each value of such long maps to exactly one UTC wall time and each wall time maps to exactly one value of this representation. Also maths on that is easy for fields (4-7) and it's hard for (1-3) but can be easily done using Instance class.
This is why I think it's perfectly fine (in fact is is the only valid way for fields (1-3)) to use this value as Instance in UTC. For fields (4-7), there is choice to make, and we can use Instance logic or simple duration logic (including basic division/modulo operations) as we know that each day contains exactly 1000*60*60*24
milliseconds.
I use both approaches. When working directly on long value I use TimeUnits for converting values (4-7). I use Instance for everything else as we did use that before and I think that having less changes is better in such huge refactor.
I think that LocalTime is used only in some tests at this point, as you helped me with eliminating that from SqlTimeWithTimeZone.toString(), so I would just not add yet another class to understand our codebase (even though it's class closest to our semantic).
Maybe it is worth considering to drop usage of Instance in all TIME/TIMESTAMP W/O TZ contexts. This is huge change though.
Please let me know if you agree with that approach, as it will (should be) visible across all changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about leap seconds. Isn't modulo approach broken if we take them into consideration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://docs.oracle.com/javase/8/docs/api/java/time/Instant.html
Also SQL standard does leave leap second support to implementation so using Java approach seems fine for us
c5bedea
to
adaef11
Compare
return new SqlTimestamp(new DateTime(year, monthOfYear, dayOfMonth, hourOfDay, minuteOfHour, secondOfMinute, millisOfSecond, baseZone).getMillis(), timestampZone); | ||
} | ||
else { | ||
return new SqlTimestamp(new DateTime(year, monthOfYear, dayOfMonth, hourOfDay, minuteOfHour, secondOfMinute, millisOfSecond, getDateTimeZone(UTC_KEY)).getMillis()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/getDateTimeZone(UTC_KEY)
/DateTimeZone.UTC
/ ?
@@ -548,4 +552,52 @@ else if (parsePos < 0) { | |||
return ~bestInvalidPos; | |||
} | |||
} | |||
|
|||
public static SqlTimestamp sqlTimestampOf(int year, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is used only in tests. Could this be in tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be in presto-main/testing as those are helpers for different modules tests as well.
@@ -734,6 +734,11 @@ public ConnectorPageSource createPageSource(Session session, Split split, List<C | |||
} | |||
} | |||
|
|||
public Session getSession() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor, but I'd put this method higher in the class, not between static inner classes
assertFunction("TIME '03:04'", TIME, new SqlTime(new DateTime(1970, 1, 1, 3, 4, 0, 0, DATE_TIME_ZONE).getMillis(), TIME_ZONE_KEY)); | ||
assertFunction("TIME '03:04:05.321'", TIME, sqlTimeOf(1970, 1, 1, 3, 4, 5, 321, DATE_TIME_ZONE, TIME_ZONE_KEY, functionAssertions.getSession().toConnectorSession())); | ||
assertFunction("TIME '03:04:05'", TIME, sqlTimeOf(1970, 1, 1, 3, 4, 5, 0, DATE_TIME_ZONE, TIME_ZONE_KEY, functionAssertions.getSession().toConnectorSession())); | ||
assertFunction("TIME '03:04'", TIME, sqlTimeOf(1970, 1, 1, 3, 4, 0, 0, DATE_TIME_ZONE, TIME_ZONE_KEY, functionAssertions.getSession().toConnectorSession())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps you could have extracted functionAssertions.getSession().toConnectorSession()
to a method for readability.
EDIT just as you did in TestTimestamp.java
@@ -67,22 +68,27 @@ private void assertFunction(String projection, Type expectedType, Object expecte | |||
functionAssertions.assertFunction(projection, expectedType, expected); | |||
} | |||
|
|||
private ConnectorSession connectorSession() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
assertFunction("cast('\n\t 2001-1-22 03:04:05.321 \t\n' as timestamp)", | ||
TIMESTAMP, | ||
new SqlTimestamp(new DateTime(2001, 1, 22, 3, 4, 5, 321, DATE_TIME_ZONE).getMillis(), TIME_ZONE_KEY)); | ||
assertFunction("cast('2001-1-22 03:04:05.321' as timestamp)", TIMESTAMP, sqlTimestampOf(2001, 1, 22, 3, 4, 5, 321, DATE_TIME_ZONE, TIME_ZONE_KEY, connectorSession())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You seems to have unnecessarily changed formatting here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you mean avoiding line break I think it's fine
int secondOfMinute, | ||
int millisOfSecond, | ||
DateTimeZone baseZone, | ||
TimeZoneKey timestampZone, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When timestampZone ≠ session.getTimeZoneKey
? Similarly, when baseZone ≠ DateTimeZoneIndex.getDateTimeZone(timestampZone)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eg. In old semantic when literal WITH TZ is casted to TIMESTAMP W/O TZ and TZ of literal is different than session TZ.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm not lost, this was about ...#sqlTimestampOf(int, int, int, int, int, int, int, org.joda.time.DateTimeZone, com.facebook.presto.spi.type.TimeZoneKey, com.facebook.presto.spi.ConnectorSession)
- this is test-only (now)
- i looked briefly through usages, and it seems
timestampZone = session.getTimeZoneKey
in all cases.. didn't check for sure, though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestTimestampWithTimeZone.java:
assertFunction("cast(TIMESTAMP '2001-1-22 03:04:05.321 +07:09' as timestamp)",
TIMESTAMP,
sqlTimestampOf(2001, 1, 22, 3, 4, 5, 321, WEIRD_ZONE, session.getTimeZoneKey(), session.toConnectorSession()));
|
||
public static SqlTime sqlTimeOf(int year, | ||
int monthOfYear, | ||
int dayOfMonth, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need year, monthOfYear, dayOfMonth
to construct SqlTime
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Not) surprisingly, the first args to this method are always 1970, 1, 1
:)
int secondOfMinute, | ||
int millisOfSecond, | ||
DateTimeZone baseZone, | ||
TimeZoneKey timestampZone, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same q as for sqlTimestampOf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
answered
@@ -678,7 +679,6 @@ private static SqlDate sqlDate(int year, int month, int day) | |||
|
|||
private static SqlTimestamp sqlTimestamp(int year, int month, int day, int hour, int minute, int second) | |||
{ | |||
DateTime dateTime = new DateTime(year, month, day, hour, minute, second, 0, UTC); | |||
return new SqlTimestamp(dateTime.getMillis(), UTC_KEY); | |||
return sqlTimestampOf(year, month, day, hour, minute, second, 0, UTC, UTC_KEY, SESSION); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See how this private static SqlTimestamp sqlTimestamp
method is used. Probably a better fix for it would be
to detach it completely from SqlTimestamp
, like
long timestampUtc(...) {
return new DateTime(year, month, day, hour, minute, second, 0, UTC).getMillis()
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that way, but it would be unrelated change. Especially I don't want to mix that with this commit content.
@@ -207,7 +208,14 @@ private void testRoundTripNumeric(Iterable<? extends Number> values) | |||
tester.testRoundTrip( | |||
javaTimestampObjectInspector, | |||
writeValues.stream() | |||
.map(timestamp -> new SqlTimestamp(timestamp, UTC_KEY)) | |||
.map(timestamp -> { | |||
if (SESSION.isLegacyTimestamp()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know SESSION
is has UTC
time zone key, so you could simply sqlTimestampOf(timestamp, SESSION)
, right?
@@ -159,7 +160,14 @@ public void testTimestampSequence() | |||
TIMESTAMP, | |||
intsBetween(-31_234, 31_234).stream() | |||
.filter(i -> i % 19 == 0) | |||
.map(timestamp -> new SqlTimestamp(timestamp, UTC_KEY)) | |||
.map(timestamp -> { | |||
if (SESSION.isLegacyTimestamp()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know SESSION
is has UTC
time zone key, so you could simply sqlTimestampOf(timestamp, SESSION)
, right?
@@ -141,7 +142,12 @@ private static SqlDate sqlDate(DateTime from) | |||
|
|||
private static SqlTimestamp toTimestamp(DateTime dateTime) | |||
{ | |||
return new SqlTimestamp(dateTime.getMillis(), SESSION.getTimeZoneKey()); | |||
if (isLegacyTimestamp(SESSION)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return sqlTimestampOf(dateTime.getMillis(), SESSION)
?
return getChronology(session.getTimeZoneKey()).monthOfYear().add(right, left); | ||
} | ||
else { | ||
return MONTH_OF_YEAR_UTC.add(right, left); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You didn't change it, I know, but the variables here are quite intuitively named.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean right & left? I think they are perfect as well ;-)
@@ -276,8 +276,14 @@ public static long timestampWithTimeZoneMinusIntervalYearToMonth(@SqlType(Standa | |||
return updateMillisUtc(dateTimeWithTimeZone, left); | |||
} | |||
|
|||
@Deprecated | |||
public static int modulo24Hour(ISOChronology chronology, long millis) | |||
{ | |||
return chronology.millisOfDay().get(millis) - chronology.getZone().getOffset(millis); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You didn't change anything here, I know, but why oh why the module method isn't modulo....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is modulo - TZ shift to get into SqlTime long range and semantic.
// convert to midnight if the session timezone | ||
ISOChronology chronology = getChronology(session.getTimeZoneKey()); | ||
return utcMillis - chronology.getZone().getOffset(utcMillis); | ||
// date is encoded as milliseconds at midnight in UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TimeUnit.DAYS.toMillis(value)
suggests that "date is encoded as epoch days ...", right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This commit is from legacy implementation and it regards utcMillis variable in this context. It was quite clear to me what author meant when I was reading that first time, so I thought this comment is good enough in this legacy context.
This is true that DATE value is expressed in epoch days.
import static io.airlift.slice.Slices.utf8Slice; | ||
|
||
public final class TimeOperators | ||
{ | ||
private static final long REFERENCE_TIMESTAMP_UTC = System.currentTimeMillis(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zonk. This cannot be const :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this will be globally solved in separate commit.
LocalTime.from( | ||
Instant.ofEpochMilli(REFERENCE_TIMESTAMP_UTC) | ||
.atZone(ZoneId.of(TimeZoneKey.UTC_KEY.getId()))) | ||
.toNanoOfDay() / 1_000_000L; // nanos to millis |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be simplified, just the same as in the first commit.
|
||
// This cast does treat TIME as wall time in session TZ. This means that in order to get | ||
// its UTC representation we need to shift the value by the offset of TZ. | ||
return packDateTimeWithZone(value - localChronology.getZone().getOffset(timeMillisUtcInCurrentDay), session.getTimeZoneKey()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fail to understand why return packDateTimeWithZone(value, session.getTimeZoneKey());
wasn't good. I think you must fix me before I go any further
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I fixed myself myself.
|
||
// This cast does treat TIMESTAMP as wall time in session TZ. This means that in order to get | ||
// its UTC representation we need to shift the value by the offset of TZ. | ||
return packDateTimeWithZone(modulo24Hour(value - localChronology.getZone().getOffset(value)), session.getTimeZoneKey()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this equivalent to casting to TIMESTAMP_WITH_TZ first and that casting to TIME_WITH_TZ by truncating y,d,m info? Do we have tests covering it is always so? Perhaps even regardless of legacy switch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getOffset(value)
-- same comment as in castToTimestampWithTimeZone
:
localChronology.getZone()#getOffset
expectsmillisUtc
and you give it amillisUtcShiftedToOtherTimeZone
as input. If I'm correct, that means withintimezone's (avg) offset
ofDST change
you will be giving incorrect result.There seems to be
DateTimeZone#getOffsetFromLocal
for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've talked about that it will be fixed according to the standard (using same "hack" as in TIME).
timeZoneKey = unpackZoneKey(timeWithTimeZone); | ||
|
||
validateMillisRange(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you did this(unpackMillisUtc(timeWithTimeZone), unpackZoneKey(timeWithTimeZone))
you would be able to retain input validation directly in the (other) ctor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, on the other hand that way it's easier to get rid of legacy code later. I did such choice in multiple places.
Instant.ofEpochMilli(timezoneOffsetReferenceTimestampUtc) | ||
.atZone(ZoneOffset.UTC)) | ||
.toNanoOfDay() / 1_000_000L; // nanos to millis | ||
.toNanoOfDay()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a reminder: i still think that this can be written simply as
currentMillisOfDay = ChronoField.MILLI_OF_DAY.getFrom(
Instant.ofEpochMilli(timezoneOffsetReferenceTimestampUtc).atZone(ZoneOffset.UTC))
{ | ||
ZoneId id = ZoneId.of(timeZoneKey.getId()); | ||
ZoneOffset offset = id.getRules().getOffset(getTimeInstantInCurrentDay()); | ||
return millisUtc + TimeUnit.SECONDS.toMillis(offset.getTotalSeconds()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or
ChronoField.MILLI_OF_DAY.getFrom(getTimeInstantInCurrentDay().atZone(id))
?
Note: there should be parallelism between toString()
and getLocalMillis()
as they both effectively work out what current local time is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why they both use getTimeInstantInCurrentDay(), the implementation is forked afterwards.
|
||
// This cast does treat TIMESTAMP as wall time in session TZ. This means that in order to get | ||
// its UTC representation we need to shift the value by the offset of TZ. | ||
return packDateTimeWithZone(value - localChronology.getZone().getOffset(value), session.getTimeZoneKey()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
localChronology.getZone()#getOffset
expects millisUtc
and you give it a millisUtcShiftedToOtherTimeZone
as input. If I'm correct, that means within timezone's (avg) offset
of DST change
you will be giving incorrect result.
There seems to be DateTimeZone#getOffsetFromLocal
for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've talked about that it will be fixed according to the standard (using same "hack" as in TIME).
long date = chronology.dayOfYear().roundFloor(value); | ||
// date is currently midnight in timezone of the session | ||
// convert to UTC | ||
long millis = date + chronology.getZone().getOffset(date); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm dying to know what is the point to query TZ's offset in the first one third of the first second of January 1, 1970.
(ACK this is not yours)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the question (if there is any).
|
||
// This cast does treat TIMESTAMP as wall time in session TZ. This means that in order to get | ||
// its UTC representation we need to shift the value by the offset of TZ. | ||
return packDateTimeWithZone(modulo24Hour(value - localChronology.getZone().getOffset(value)), session.getTimeZoneKey()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getOffset(value)
-- same comment as in castToTimestampWithTimeZone
:
localChronology.getZone()#getOffset
expectsmillisUtc
and you give it amillisUtcShiftedToOtherTimeZone
as input. If I'm correct, that means withintimezone's (avg) offset
ofDST change
you will be giving incorrect result.There seems to be
DateTimeZone#getOffsetFromLocal
for that.
long timeMillisUtcInCurrentDay = REFERENCE_TIMESTAMP_UTC - currentMillisOfDay + unpackMillisUtc(value); | ||
|
||
ISOChronology chronology = getChronology(unpackZoneKey(value)); | ||
return unpackMillisUtc(value) + chronology.getZone().getOffset(timeMillisUtcInCurrentDay); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unpackMillisUtc(value)
is twice, could be assigned to a variable- we talked about fixed offset for zone for the query duration..
getOffset(start of the query)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix that globally as we talked about that.
LocalTime.from( | ||
Instant.ofEpochMilli(REFERENCE_TIMESTAMP_UTC) | ||
.atZone(ZoneId.of(TimeZoneKey.UTC_KEY.getId()))) | ||
.toNanoOfDay() / 1_000_000L; // nanos to millis |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simplify
@Test | ||
public void testTimeZoneGapIsNotApplied() | ||
{ | ||
// Behavior is different in legacy mode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this "different" is easily understood in the context of code review, but later, if looking at this code, one could probably have problem digesting it
@@ -34,6 +34,7 @@ | |||
import java.util.Locale; | |||
|
|||
import static com.facebook.presto.operator.scalar.QuarterOfYearDateTimeField.QUARTER_OF_YEAR; | |||
import static com.facebook.presto.spi.StandardErrorCode.FUNCTION_NOT_FOUND; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmt msg ("Remove to_iso8601 function for new TIMESTAMP semantics"): s/(complet)(ly)/\1e\2/g
} | ||
else { | ||
// FIXME: Simply remove this function when legacyTimestamp flag will be dropped | ||
throw new PrestoException(FUNCTION_NOT_FOUND, "to_iso8601(timestamp) does not exist"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe https://en.wikipedia.org/wiki/ISO_8601 is not authoritative, but quoting
If no UTC relation information is given with a time representation, the time is assumed to be in local time.
So, there is ISO 8601 format suitable for timestamp, it just does not contain the tz info. Thus, an alternative to disabling the method would be changing its semantics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have serious doubts about wikipedia reliablity based on other sources including Java libraries implementations. I've requested access to ISO 8601 to confirm that, but for now leaving in "as is" state.
@@ -79,6 +80,18 @@ protected void assertDecimalFunction(String statement, SqlDecimal expectedResult | |||
expectedResult); | |||
} | |||
|
|||
protected void assertThrowsPrestoException(String projection, StandardErrorCode errorCode, String message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the other place you called the very similar method assert*Function*ThrowsPrestoException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key to this mystery is in your comment:
In the other place
{ | ||
try { | ||
evaluateInvalid(projection); | ||
fail("Expected to throw an INVALID_FUNCTION_ARGUMENT exception with message " + message); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/INVALID_FUNCTION_ARGUMENT/errorCode.toErrorCode()/
?
VARCHAR casted to TIMESTAMP cannot contain TIMEZONE in new semantic
To avoid problem with statically read function list, the excepion FUNCTION_NOT_FOUND will be thrown when isLegacyTimestamp = false. This is temporary workaround, function should be removed completely when backward compatibility flag will be dropped.
AT TIME ZONE does not take into account the fact, that TIME WITH TIME ZONE does not represent real millisecond UTC in millisUtc field. In fact, this field contains millisUtc assuming offset of TIME ZONE that was valid on 1970-01-01. Such representation allows to simply represent local time with time zone id. That means that TIME WITH TIME ZONE that represents eg. '10:00:00.000 Asia/Kathmandu' will always represent this exact value. However mapping of such value to other TZ (including UTC) may differ over time. Eg. Asia/Kathmandu switched time zone offset on 1986 from +5:30 to +5:45. Result of query like: `SELECT time_with_tz_column FROM table;` Will always be the same, however: `SELECT time_with_tz_column AT TIME ZONE 'UTC' FROM table;` Will yail differnet value in 1980 and 2000 after changes from this commit. This is done to use current offset of TZ as function that stucked in 1970-01-01 offsets seems useless. This is not perfect solution and is not fully aligned with standard, but standard behavior cannot be achieved with current TIME WITH TIME ZONE representation, as we are not able to read TZ offset from TIME WITH TIME ZONE itselve (at least not in all cases).
Continued here: #7480 |
This is first peek at #7122 solution, full documentation will be provided.
Commits are kept small to show all changes, but as always squashes are welcome.
First three commits are kinda separate from PR, if you'll insist I can separate that, but I would say it has to be fixed anyway.