Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SQL Server DATETIMEOFFSET for old dates #19627

Merged

Conversation

adamjshook
Copy link
Member

Description

The value returned via the microsoft.sql.DateTimeOffset when converted to an OffsetDateTime is changed for old dates due to an issue in the JDBC driver (see linked issue for example).

This changes retrieving datetimeoffset types from SQL Server to use getString instead of getObject and OffsetDateTime.

Predicate pushdown is now disabled for this type due to test failures with IS NOT DISTINCT FROM predicates.

Additional context and related issues

#16559
microsoft/mssql-jdbc#2246

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(X) Release notes are required, with the following suggested text:

# SQL Server Connector
* Fixed incorrect results for old DATETIMEOFFSET values ({issue}`16559`)

@cla-bot cla-bot bot added the cla-signed label Nov 3, 2023
@adamjshook adamjshook requested review from ebyhr and hashhar November 3, 2023 17:39
@adamjshook
Copy link
Member Author

Is there an (easy) way to disable predicate pushdown for a specific predicate? I've been unable to grok the Domain and what means what.

Maybe instead of disabling it based on predicate type, we instead look at the value within the Domain and disable if it is old but enable if it is past the threshold? (need to determine the specific date, the 1582 date in the linked mssql-java issue would be a good place to start).

@adamjshook adamjshook force-pushed the adamjshook/sqlserver-datetime-offset branch from 094fc62 to 289b893 Compare November 3, 2023 19:30
@ebyhr ebyhr requested review from wendigo and removed request for ebyhr November 6, 2023 02:18
@@ -928,7 +939,8 @@ private static ObjectReadFunction longTimestampWithTimeZoneReadFunction()
return ObjectReadFunction.of(
LongTimestampWithTimeZone.class,
(resultSet, columnIndex) -> {
OffsetDateTime offsetDateTime = resultSet.getObject(columnIndex, OffsetDateTime.class);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this causing some performance penalty?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to perform this conversion only for historical dates ? I'm not sure if adding additional if would cause additional penalties ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe so, we can get the MSSQL DateTimeOffset object and if it is before 1583 we can do the String conversion. Presumably it'd more performant that always doing the string parsing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a check to skip the date parsing if it is before 1583.

@hashhar
Copy link
Member

hashhar commented Nov 16, 2023

for conditional pushdown see https://github.com/hashhar/trino/blob/b8e44b702b8d4f6e1cc987a2175e430a3d05f373/plugin/trino-postgresql/src/main/java/io/trino/plugin/postgresql/PostgreSqlClient.java#L254

the logic can be something like:

  • if the domain is a single value or discrete set (lookup the methods on the Domain class) then we can get the values using getSingleValue or getDiscreteValues etc.
    • check if the value is problematic and in that case return DISABLE_PUSHDOWN.apply(session, domain) and in others return FULL_PUSHDOWN.apply(session, domain).

cc: @Praveen2112 IIRC you implemented some pushdown logic in past which looked at actual values in the domain or do I misremember.

@adamjshook adamjshook force-pushed the adamjshook/sqlserver-datetime-offset branch 2 times, most recently from ab1450a to 77a137b Compare December 4, 2023 16:58
@adamjshook
Copy link
Member Author

Updated predicate pushdown controller to disable pushdown for values prior to the year 1583. Ready for another look @hashhar @wendigo.

@adamjshook adamjshook requested a review from kokosing December 4, 2023 18:26
@hashhar hashhar requested a review from Praveen2112 December 11, 2023 09:10
private static final PredicatePushdownController SQLSERVER_DATE_TIME_PUSHDOWN = (session, domain) -> {
Domain simplifiedDomain = domain.simplify(getDomainCompactionThreshold(session));
for (Range range : simplifiedDomain.getValues().getRanges().getOrderedRanges()) {
Range disableRange = range.getType().getJavaType().equals(LongTimestampWithTimeZone.class)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: should we keep these two ranges as constants, instead of LONG_DATETIMEOFFSET_DISABLE_VALUE and SHORT_DATETIMEOFFSET_DISABLE_VALUE?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nineinchnick I had tried that, but the issue with having the Range as constants is you need to create them like below, but these class types cannot be imported because they aren't public in io.trino.spi.type. Please let me know if there is another way to create them.

Range.lessThan(LongTimestampWithTimeZoneType.class, LONG_DATETIMEOFFSET_DISABLE_VALUE)
Range.lessThan(ShortTimestampWithTimeZoneType.class, SHORT_DATETIMEOFFSET_DISABLE_VALUE);

Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly good to me but @Praveen2112 has better knowledge of the Domains API so I'd appreciate a second look.

@@ -216,6 +220,13 @@ public class SqlServerClient

private static final DateTimeFormatter DATE_FORMATTER = DateTimeFormatter.ofPattern("uuuu-MM-dd");

private static final DateTimeFormatter DATE_TIME_OFFSET_FORMATTER = new DateTimeFormatterBuilder()
.appendPattern("yyyy-MM-dd HH:mm:ss")
.appendFraction(NANO_OF_SECOND, 0, 7, true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.appendFraction(NANO_OF_SECOND, 0, 7, true)
.appendFraction(NANO_OF_SECOND, 0, MAX_SUPPORTED_TEMPORAL_PRECISION, true)

Comment on lines 967 to 968
String stringValue = resultSet.getString(columnIndex);
OffsetDateTime offsetDateTime = ZonedDateTime.from(DATE_TIME_OFFSET_FORMATTER.parse(stringValue)).toOffsetDateTime();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this change and the one above in shortTimestampWithTimeZoneReadFunction strictly required in this PR?
I'm unable to understand the significance probably.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the issue is the driver changes from their internal DateTimeOffset class to the java.time.OffsetDateTime, the date shifts from what is stored. This instead gets the string representation of the date and we parse it to OffsetDateTime.

https://github.com/microsoft/mssql-jdbc/blob/main/src/main/java/microsoft/sql/DateTimeOffset.java#L229-L234

See microsoft/mssql-jdbc#2246 for examples and their discussion. The root cause of this seems to be due to skipping days/dates when the Gregorian calendar switched. The MSSQL issue is closed, likely nothing that is going to be fixed because maybe it isn't broken? Dates are fun.

@Praveen2112
Copy link
Member

Apologies for the delay. I'll take a look at the PR soon

@adamjshook adamjshook force-pushed the adamjshook/sqlserver-datetime-offset branch 2 times, most recently from 5667923 to 1913373 Compare December 12, 2023 23:23
@@ -928,7 +939,8 @@ private static ObjectReadFunction longTimestampWithTimeZoneReadFunction()
return ObjectReadFunction.of(
LongTimestampWithTimeZone.class,
(resultSet, columnIndex) -> {
OffsetDateTime offsetDateTime = resultSet.getObject(columnIndex, OffsetDateTime.class);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to perform this conversion only for historical dates ? I'm not sure if adding additional if would cause additional penalties ?

@@ -836,6 +837,61 @@ private void testSqlServerDatetimeOffset(ZoneId sessionZone)
.execute(getQueryRunner(), session, sqlServerCreateAndInsert("test_sqlserver_datetimeoffset"));
}

@Test
public void testSqlServerDatetimeOffsetOldDates()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Instead of old we could use historicalDates as old is pretty relative

"'1900-01-01 00:00:00.1234567+00:00'");

try (TestTable table = new TestTable(onRemoteDatabase(), "test_sqlserver_datetimeoffset_old_date_range_query", "(col0 datetimeoffset(7))", dateTimeOffsetValues)) {
assertQuery("SELECT count(*) FROM " + table.getName(), "SELECT 9");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also add an assertion to check if the filter is being pushed down or not ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added checks asserting that the filter is pushed down or not

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addiing additional assertion - Can we merge both of them ? like

        assertThat(query(SELECT count(*) FROM " + table.getName()))
                .matches("SELECT 9")
                .isFullyPushedDown();

assertQuery("SELECT count(*) FROM " + table.getName() + " WHERE col0 <= TIMESTAMP '1582-12-31 23:59:59.9999999+00:00'", "SELECT 3");
assertQuery("SELECT count(*) FROM " + table.getName() + " WHERE col0 >= TIMESTAMP '1583-01-01 00:00:00+00:00'", "SELECT 6");
assertQuery("SELECT count(*) FROM " + table.getName() + " WHERE col0 IN (TIMESTAMP '1582-12-31 23:59:59.9999999+00:00', TIMESTAMP '1583-01-01 00:00:00+00:00')", "SELECT 2");
assertQuery("SELECT count(*) FROM " + table.getName() + " WHERE col0 IN (TIMESTAMP '1583-01-01 00:00:00+00:00', TIMESTAMP '1600-01-01 00:00:00.1234567+00:00')", "SELECT 2");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about additional test cases for NOT IN and BETWEEN

Copy link
Member Author

@adamjshook adamjshook Dec 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added additional test cases

@adamjshook adamjshook force-pushed the adamjshook/sqlserver-datetime-offset branch from 1913373 to 1e9fe12 Compare December 18, 2023 16:10
Copy link
Member

@Praveen2112 Praveen2112 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM

"'1900-01-01 00:00:00.1234567+00:00'");

try (TestTable table = new TestTable(onRemoteDatabase(), "test_sqlserver_datetimeoffset_old_date_range_query", "(col0 datetimeoffset(7))", dateTimeOffsetValues)) {
assertQuery("SELECT count(*) FROM " + table.getName(), "SELECT 9");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addiing additional assertion - Can we merge both of them ? like

        assertThat(query(SELECT count(*) FROM " + table.getName()))
                .matches("SELECT 9")
                .isFullyPushedDown();

return DISABLE_PUSHDOWN.apply(session, domain);
}
return FULL_PUSHDOWN.apply(session, simplifiedDomain);
};

// Dates prior to the Gregorian calendar switch in 1582 can cause incorrect results when pushed down,
// so we disable predicate push down when the domain contains values prior to 1583
private static final Instant DISABLE_DATETIMEOFFSET_PUSHDOWN_IF_BEFORE = Instant.parse("1583-01-01T00:00:00Z");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable name gives an assumption to be like a boolean field or some function which returns boolean

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed them to GREGORIAN_SWITCH_INSTANT and GREGORIAN_SWITCH_DATETIMEOFFSET, open to other suggestions.

assertThat(query("SELECT count(*) FROM " + table.getName()))
.isFullyPushedDown();

assertThat(query("SELECT * FROM " + table.getName() + " WHERE col0 <= TIMESTAMP '1582-12-31 23:59:59.9999999+00:00'"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also have an additional test case for WHERE col0 <= TIMESTAMP '1990-01-01 00:00:00+00:00

The value returned via the microsoft.sql.DateTimeOffset when
converted to an OffsetDateTime is changed for old dates due
to an issue in the JDBC driver.

This changes retrieving datetimeoffset types from SQL Server
to use getString instead of getObject and OffsetDateTime.

Predicate pushdown is now disabled for this type due to test
failures with IS NOT DISTINCT FROM predicates if the value
is before the year 1583.
@adamjshook adamjshook force-pushed the adamjshook/sqlserver-datetime-offset branch from 1e9fe12 to 72376f5 Compare December 20, 2023 14:06
@adamjshook
Copy link
Member Author

@Praveen2112 Updated per your last batch of comments. Please take another look when you can, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants