Treat precision as NANOSECONDS for timestamp to be coerced #18003

Praveen2112
Member

Description

Timestamp columns that need to be coerced are now read with NANOSECONDS precision,
irrespective of the precision configured or specified as a session property.

Additional context and related issues

Depends on #17900.

Release notes

( ) This is not user-visible or docs only and no release notes are required.
(x) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Treat precision as NANOSECONDS for timestamp to be coerced.

@cla-bot cla-bot bot added the cla-signed label Jun 22, 2023
@github-actions github-actions bot added hive Hive connector tests:hive labels Jun 22, 2023
@Praveen2112 Praveen2112 force-pushed the praveen/nano_second_precision_for_coerced_timestamp_column branch from 0c657ad to b941d64 Compare June 22, 2023 08:03
@@ -48,29 +48,6 @@ public final class TimestampCoercer

private TimestampCoercer() {}

public static class ShortTimestampToVarcharCoercer
Contributor

@vlad-lyutenko vlad-lyutenko Jun 23, 2023

I am a little bit lost: why is it safe to remove it here?
Is it because of this change in HivePageSource?

if (fromType instanceof TimestampType timestampType && toType instanceof VarcharType varcharType) {
    // --deleted-- if (timestampType.isShort()) {
    // --deleted--     return Optional.of(new ShortTimestampToVarcharCoercer(timestampType, varcharType));
    // --deleted-- }
    return Optional.of(new LongTimestampToVarcharCoercer(timestampType, varcharType));
}

Member Author

Previously we respected the configured timestamp precision, so we needed coercers for both representations: the timestamp could be read as a long (short timestamp) or as a LongTimestamp. With this change the timestamp column is always read with NANOSECONDS precision, so ShortTimestampToVarcharCoercer is no longer needed.
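
To make the argument above concrete, here is a minimal, dependency-free sketch. It assumes the usual Trino convention that a timestamp with precision up to 6 fits in a single long of epoch microseconds ("short"), while a higher precision needs an additional picoseconds-of-microsecond field ("long"). The class `TimestampRepresentationSketch`, its methods, and the UTC formatting are hypothetical stand-ins for illustration, not the actual coercer code from this PR.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; this is not Trino's TimestampType or coercer.
final class TimestampRepresentationSketch
{
    // Assumption: precisions 0..6 use the "short" (single long) representation.
    static final int MAX_SHORT_PRECISION = 6;
    static final int NANOSECONDS_PRECISION = 9;

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSSSSSSSS");

    private TimestampRepresentationSketch() {}

    static boolean isShort(int precision)
    {
        return precision <= MAX_SHORT_PRECISION;
    }

    // Formats a "long" timestamp (epoch micros plus picos of the micro) with 9 fractional digits.
    static String toVarchar(long epochMicros, int picosOfMicro)
    {
        long epochSecond = Math.floorDiv(epochMicros, 1_000_000L);
        int microsOfSecond = (int) Math.floorMod(epochMicros, 1_000_000L);
        // 1 microsecond = 1_000 nanoseconds, 1 nanosecond = 1_000 picoseconds
        int nanosOfSecond = microsOfSecond * 1_000 + picosOfMicro / 1_000;
        return LocalDateTime.ofEpochSecond(epochSecond, nanosOfSecond, ZoneOffset.UTC).format(FORMAT);
    }
}
```

In this sketch `isShort(NANOSECONDS_PRECISION)` is always false, so once the read precision is fixed at nanoseconds only the "long" branch can ever be taken. For example, `toVarchar(1_000_000L, 123_000)` yields `1970-01-01 00:00:01.000000123`.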

@Praveen2112 Praveen2112 force-pushed the praveen/nano_second_precision_for_coerced_timestamp_column branch from b941d64 to e520eb3 Compare June 28, 2023 11:56
@Praveen2112 Praveen2112 marked this pull request as ready for review June 28, 2023 11:56
@Praveen2112 Praveen2112 force-pushed the praveen/nano_second_precision_for_coerced_timestamp_column branch from e520eb3 to c34eeea Compare July 4, 2023 07:38
@Praveen2112
Member Author

@krvikash Thanks for the review. Addressed the comments

@Praveen2112 Praveen2112 force-pushed the praveen/nano_second_precision_for_coerced_timestamp_column branch 2 times, most recently from eb18007 to 6be4816 Compare July 4, 2023 11:54
@findepi findepi force-pushed the praveen/nano_second_precision_for_coerced_timestamp_column branch from 6be4816 to a2cf304 Compare July 4, 2023 12:40
@findepi
Member

findepi commented Jul 4, 2023

squashed to make reviewable

@findepi
Member

findepi commented Jul 4, 2023

skimmed. looks directionally correct

@Praveen2112 Praveen2112 force-pushed the praveen/nano_second_precision_for_coerced_timestamp_column branch 2 times, most recently from 656b77a to d3bd7e2 Compare July 4, 2023 17:11
@@ -366,7 +365,7 @@ else if (column.getBaseHiveColumnIndex() < fileColumns.size()) {
Type readType = column.getType();
if (orcColumn != null) {
int sourceIndex = fileReadColumns.size();
Optional<TypeCoercer<?, ?>> coercer = createCoercer(orcColumn.getColumnType(), readType, getTimestampPrecision(session));
Contributor

Just a question: are we no longer relying on the session?

Member Author

Previously we relied on the session to determine the timestamp precision, so that we could read the timestamp column with that precision. Now we read it as NANOSECONDS irrespective of the precision configured.
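
The before/after behavior described here can be sketched as follows. Everything in this block is hypothetical: the `Precision` enum, the two `readPrecision...` methods, and the `truncateNanos` helper are illustrative names, not Trino's API. Truncation is used only to show which fractional digits survive at each precision; it makes no claim about whether the real reader rounds or truncates.

```java
// Hypothetical illustration only; names do not correspond to Trino classes.
final class CoercionPrecisionSketch
{
    enum Precision
    {
        MILLISECONDS(3),
        MICROSECONDS(6),
        NANOSECONDS(9);

        final int digits;

        Precision(int digits)
        {
            this.digits = digits;
        }
    }

    private CoercionPrecisionSketch() {}

    // Before: the session-configured precision decided how a coerced column was read.
    static Precision readPrecisionBefore(Precision sessionPrecision)
    {
        return sessionPrecision;
    }

    // After: the session value is ignored for columns that are going to be coerced.
    static Precision readPrecisionAfter(Precision sessionPrecision)
    {
        return Precision.NANOSECONDS;
    }

    // Keeps only the fractional digits representable at the given precision.
    static int truncateNanos(int nanosOfSecond, Precision precision)
    {
        int scale = 1;
        for (int i = precision.digits; i < 9; i++) {
            scale *= 10;
        }
        return nanosOfSecond / scale * scale;
    }
}
```

With a session precision of `MILLISECONDS`, `truncateNanos(123_456_789, readPrecisionBefore(MILLISECONDS))` keeps only `123_000_000`, whereas the value obtained through `readPrecisionAfter` retains all nine digits for the coercer to work with.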

Contributor

@krvikash krvikash left a comment

LGTM

@Praveen2112 Praveen2112 force-pushed the praveen/nano_second_precision_for_coerced_timestamp_column branch from d3bd7e2 to e864bce Compare July 6, 2023 07:49
@Praveen2112
Member Author

@krvikash AC

@findepi findepi requested a review from raunaqmorarka July 10, 2023 12:29
@Praveen2112 Praveen2112 merged commit dbd3e0f into trinodb:master Jul 14, 2023
@github-actions github-actions bot added this to the 423 milestone Jul 14, 2023
Labels
cla-signed hive Hive connector
6 participants