Add support for varchar to timestamp coercion in hive tables #18014

Merged

Praveen2112
Member

@Praveen2112 Praveen2112 commented Jun 22, 2023

Description

Allows coercing a varchar type to timestamp in Hive tables.

Additional context and related issues

Depends on #18004

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Hive
* Add support for varchar to timestamp coercion in hive tables 

@cla-bot cla-bot bot added the cla-signed label Jun 22, 2023
@github-actions github-actions bot added hive Hive connector tests:hive labels Jun 22, 2023
@Praveen2112 Praveen2112 force-pushed the praveen/varchar_to_timestamp_coercion_2 branch from 27cf204 to 9385cbc Compare July 21, 2023 13:16
@Praveen2112 Praveen2112 marked this pull request as ready for review July 21, 2023 13:35
@Praveen2112 Praveen2112 force-pushed the praveen/varchar_to_timestamp_coercion_2 branch from 9385cbc to 6a087ce Compare July 25, 2023 13:14
@Praveen2112 Praveen2112 force-pushed the praveen/varchar_to_timestamp_coercion_2 branch 2 times, most recently from ef81e4a to 21edd63 Compare July 26, 2023 05:31
Member

@atanasenko atanasenko left a comment

PR description says "varchar type to string"

Contributor

@krvikash krvikash left a comment

LGTM

LocalDateTime dateTime = LOCAL_DATE_TIME.parse(value.toStringUtf8(), LocalDateTime::from);
long epochSecond = dateTime.toEpochSecond(UTC);
if (epochSecond < START_OF_MODERN_ERA_SECONDS) {
    throw new TrinoException(HIVE_INVALID_TIMESTAMP_COERCION, "Coercion on historical dates is not supported");
Contributor

Why do we throw an exception here? Are we not able to adjust the date using GregorianCalendar and month - 1?

Member Author

@Praveen2112 Praveen2112 Jul 26, 2023

Hive coerces timestamps in different ways across versions (2 and 3) and file formats (ORC/Parquet/RC): for some it uses java.sql.Timestamp and for others it uses a different approach, so we are restricting coercion for these dates.

Contributor

So we cannot predict how such a date is stored in the underlying data source (for example, whether it is shifted by -1 month or not), and therefore we simply reject such cases.

Member Author

It is not about how the value is stored, but about the way Hive coerces from String to Timestamp.
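
For readers following this thread, here is a minimal standalone sketch of the check under discussion. It is not the code from this PR: the class name, the `toEpochMicros` helper, the accepted text format (`yyyy-MM-dd HH:mm:ss[.fraction]`), and the 1900-01-01 cutoff used for `START_OF_MODERN_ERA_SECONDS` are all assumptions made for illustration, and it throws `IllegalArgumentException` in place of `TrinoException` so that it compiles without Trino on the classpath.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;

public class VarcharToTimestampSketch {
    // Assumed input format: ISO date, a single space, ISO time with optional fraction.
    static final DateTimeFormatter LOCAL_DATE_TIME = new DateTimeFormatterBuilder()
            .append(DateTimeFormatter.ISO_LOCAL_DATE)
            .appendLiteral(' ')
            .append(DateTimeFormatter.ISO_LOCAL_TIME)
            .toFormatter();

    // Assumed cutoff for illustration only; the real constant lives in the Trino code base.
    static final long START_OF_MODERN_ERA_SECONDS =
            LocalDate.of(1900, 1, 1).atStartOfDay().toEpochSecond(ZoneOffset.UTC);

    // Hypothetical helper: parses the text and returns microseconds since the epoch.
    static long toEpochMicros(String value) {
        LocalDateTime dateTime = LOCAL_DATE_TIME.parse(value, LocalDateTime::from);
        long epochSecond = dateTime.toEpochSecond(ZoneOffset.UTC);
        if (epochSecond < START_OF_MODERN_ERA_SECONDS) {
            // Reject rather than guess which calendar adjustment Hive would have applied.
            throw new IllegalArgumentException("Coercion on historical dates is not supported");
        }
        // Combine whole seconds with the sub-second part, truncated to microseconds.
        return epochSecond * 1_000_000L + dateTime.getNano() / 1_000;
    }

    public static void main(String[] args) {
        System.out.println(toEpochMicros("2023-07-26 13:42:00.123456"));
        try {
            toEpochMicros("1899-12-31 23:59:59");
        }
        catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting values before the cutoff, instead of attempting a calendar adjustment, keeps the result independent of which Hive version or file format produced the data, which is the reasoning given above.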

- Static import of utf8Slice
@Praveen2112 Praveen2112 force-pushed the praveen/varchar_to_timestamp_coercion_2 branch from 21edd63 to ed788a8 Compare July 26, 2023 13:42
@Praveen2112
Member Author

@krvikash AC

@Praveen2112
Member Author

@raunaqmorarka Thanks for the review. AC

@Praveen2112 Praveen2112 force-pushed the praveen/varchar_to_timestamp_coercion_2 branch from ed788a8 to 7d80d8d Compare July 26, 2023 16:23
@Praveen2112 Praveen2112 changed the title Add support for varchar to timestamp coercer in hive tables Add support for varchar to timestamp coercion in hive tables Jul 27, 2023
@Praveen2112 Praveen2112 force-pushed the praveen/varchar_to_timestamp_coercion_2 branch from 7d80d8d to 614ab46 Compare July 27, 2023 05:43
@Praveen2112 Praveen2112 merged commit 84a3466 into trinodb:master Jul 27, 2023
@github-actions github-actions bot added this to the 423 milestone Jul 27, 2023
Labels
cla-signed hive Hive connector
6 participants