Introduce TrinoToClickHouseWriteChecker in ClickHouse #11148

Merged

Member

@tangjiangling tangjiangling commented Feb 22, 2022

Description

Different versions of ClickHouse may support different min/max values
for the same data type, as shown in the table below:

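| version | column type | min value           | max value            |
|---------|-------------|---------------------|----------------------|
| any     | UInt8       | 0                   | 255                  |
| any     | UInt16      | 0                   | 65535                |
| any     | UInt32      | 0                   | 4294967295           |
| any     | UInt64      | 0                   | 18446744073709551615 |
| < 21.4  | Date        | 1970-01-01          | 2106-02-07           |
| < 21.4  | DateTime    | 1970-01-01 00:00:00 | 2106-02-06 06:28:15  |
| >= 21.4 | Date        | 1970-01-01          | 2149-06-06           |
| >= 21.4 | DateTime    | 1970-01-01 00:00:00 | 2106-02-07 06:28:15  |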

When a value written to ClickHouse is out of range, ClickHouse stores
an incorrect result. This PR therefore introduces
TrinoToClickHouseWriteChecker, which checks the range of each written
value so that ClickHouse never stores an incorrect value.
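
The check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code merged in this PR: the class name `DateWriteCheckSketch`, the helper `isAtLeast21_4`, the `(major, minor)` version parameters, and the error message wording are all hypothetical. Only the Date bounds are taken from the table.

```java
import java.time.LocalDate;

// Hypothetical sketch of a version-dependent range check for Date values.
final class DateWriteCheckSketch
{
    // Bounds copied from the table in the PR description.
    private static final LocalDate MIN_DATE = LocalDate.parse("1970-01-01");
    private static final LocalDate MAX_DATE_BEFORE_21_4 = LocalDate.parse("2106-02-07");
    private static final LocalDate MAX_DATE_SINCE_21_4 = LocalDate.parse("2149-06-06");

    private DateWriteCheckSketch() {}

    // Hypothetical helper: true when the server version is 21.4 or newer.
    static boolean isAtLeast21_4(int major, int minor)
    {
        return major > 21 || (major == 21 && minor >= 4);
    }

    // Rejects the value before it is written instead of letting an
    // out-of-range value reach ClickHouse.
    static void validateDate(int major, int minor, LocalDate value)
    {
        LocalDate max = isAtLeast21_4(major, minor) ? MAX_DATE_SINCE_21_4 : MAX_DATE_BEFORE_21_4;
        if (value.isBefore(MIN_DATE) || value.isAfter(max)) {
            // Illustrative message format only (assumption).
            throw new IllegalArgumentException(
                    String.format("Date must be between %s and %s in ClickHouse: %s", MIN_DATE, max, value));
        }
    }
}
```

For example, `validateDate(21, 3, LocalDate.parse("2120-01-01"))` throws, while the same value passes with version 21.4, because only the newer range extends to 2149-06-06.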

Introducing TrinoToClickHouseWriteChecker also prepares for
supporting DateTime[(timezone)] and DateTime64(precision, [timezone]).
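
Several maxima in the table above line up with fixed storage widths. The sketch below reproduces them arithmetically. It assumes that Date is stored as a 16-bit unsigned day count since 1970-01-01 and DateTime as a 32-bit unsigned second count since the epoch in UTC. The PR itself does not state this, and the class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical sketch: derives range maxima from storage widths.
final class ClickHouseBoundsSketch
{
    private ClickHouseBoundsSketch() {}

    // Largest value of an unsigned integer that is `bits` wide: 2^bits - 1.
    static BigInteger maxUnsigned(int bits)
    {
        return BigInteger.ONE.shiftLeft(bits).subtract(BigInteger.ONE);
    }

    // Assumption: Date is a 16-bit unsigned count of days since 1970-01-01.
    static LocalDate maxDateFor16BitDays()
    {
        return LocalDate.ofEpochDay(maxUnsigned(16).longValueExact());
    }

    // Assumption: DateTime is a 32-bit unsigned count of seconds since the epoch (UTC).
    static LocalDateTime maxDateTimeFor32BitSeconds()
    {
        return LocalDateTime.ofEpochSecond(maxUnsigned(32).longValueExact(), 0, ZoneOffset.UTC);
    }
}
```

Under these assumptions, `maxUnsigned(64)` gives the UInt64 maximum 18446744073709551615, `maxDateFor16BitDays()` gives 2149-06-06, and `maxDateTimeFor32BitSeconds()` gives 2106-02-07T06:28:15, matching the ">= 21.4" rows. The smaller "< 21.4" maxima are not derived here.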

Fixes part of #11116
Related #10537

Release notes

(x) Release notes entries required with the following suggested text:

# ClickHouse
* Extend the range of ClickHouse `Date` and `DateTime` values. ({issue}`11116`)

@cla-bot cla-bot bot added the cla-signed label Feb 22, 2022
@tangjiangling tangjiangling force-pushed the add-support-for-datetime64-in-clickhouse branch 3 times, most recently from a0c5748 to 44ed9f4 Compare February 23, 2022 20:21
@tangjiangling tangjiangling removed the WIP label Feb 23, 2022
@tangjiangling tangjiangling changed the title [WIP] Add support for DateTime64 in ClickHouse Add support for DateTime64 in ClickHouse Feb 23, 2022
@tangjiangling tangjiangling marked this pull request as ready for review February 23, 2022 20:23
@github-actions github-actions bot added the docs label Feb 23, 2022
Member

@ebyhr ebyhr left a comment

Thanks. Left some initial comments.

@tangjiangling tangjiangling force-pushed the add-support-for-datetime64-in-clickhouse branch from 32f4f1c to 8c2f64b Compare February 25, 2022 07:17
@tangjiangling
Member Author

Updated, ping @ebyhr

Member

@ebyhr ebyhr left a comment

Please rebase on upstream to resolve conflicts.

@tangjiangling tangjiangling force-pushed the add-support-for-datetime64-in-clickhouse branch 2 times, most recently from 51316c0 to f774c1a Compare February 26, 2022 01:48
@tangjiangling
Member Author

Updated.

@tangjiangling tangjiangling force-pushed the add-support-for-datetime64-in-clickhouse branch from f774c1a to 885f3ec Compare February 26, 2022 20:05
@tangjiangling
Member Author

Minor fixes, PTAL @ebyhr

@tangjiangling
Member Author

tangjiangling commented Sep 6, 2022

> How did we initially discover the issue that writes are losing values in ClickHouse?

AFAIK it was @ebyhr who discovered this behavior in the first place (see #10055 #11490).

Then I found out that different versions of ClickHouse support different min-max values for date types (inspired by #10055, see #11116).
(DateTime* also has the same issue)

> Can you extract last 2 commits into separate PR?

Sure, will do.

@tangjiangling tangjiangling force-pushed the add-support-for-datetime64-in-clickhouse branch from 4cb5f17 to e55d54e Compare September 8, 2022 16:23
@tangjiangling tangjiangling changed the title Add support for DateTime64 in ClickHouse Introduce TrinoToClickHouseWriteChecker in ClickHouse Sep 8, 2022
@tangjiangling
Member Author

tangjiangling commented Sep 8, 2022

@hashhar PTAL at the updated PR (title, description, RN, addressed comments).

> Can you extract last 2 commits into separate PR?

Once this PR is merged, I'll file follow-up PRs.

@hashhar
Member

hashhar commented Sep 8, 2022

Thanks, I'll take another look tomorrow.

Member

@hashhar hashhar left a comment

LGTM. Please squash the fixup commits.

Different versions of ClickHouse may support different min/max values
for the same data type, you can refer to the table below:

| version | column type | min value           | max value            |
|---------|-------------|---------------------|----------------------|
| any     | UInt8       | 0                   | 255                  |
| any     | UInt16      | 0                   | 65535                |
| any     | UInt32      | 0                   | 4294967295           |
| any     | UInt64      | 0                   | 18446744073709551615 |
| < 21.4  | Date        | 1970-01-01          | 2106-02-07           |
| < 21.4  | DateTime    | 1970-01-01 00:00:00 | 2106-02-06 06:28:15  |
| >= 21.4 | Date        | 1970-01-01          | 2149-06-06           |
| >= 21.4 | DateTime    | 1970-01-01 00:00:00 | 2106-02-07 06:28:15  |

And when the value written to ClickHouse is out of range, ClickHouse
will store the incorrect result, so we introduced
`TrinoToClickHouseWriteChecker` to check the range of the written value
to prevent ClickHouse from storing the incorrect value.

Introducing `TrinoToClickHouseWriteChecker` is also a preparation for
supporting `DateTime[timezone]` and `DateTime64(precision, [timezone])`.

The next several commits will use `TrinoToClickHouseWriteChecker` to
verify the values written to ClickHouse.

This is a preparatory commit to use `TrinoToClickHouseWriteChecker` to
validate Date.

This is a preparatory commit to use `TrinoToClickHouseWriteChecker` to
validate DateTime.

@tangjiangling tangjiangling force-pushed the add-support-for-datetime64-in-clickhouse branch from 5403be1 to 16968ba Compare September 10, 2022 11:23
@tangjiangling
Member Author

> Please squash the fixup commits.

done

@hashhar hashhar merged commit a02bb52 into trinodb:master Sep 10, 2022
@github-actions github-actions bot added this to the 396 milestone Sep 10, 2022
@tangjiangling tangjiangling deleted the add-support-for-datetime64-in-clickhouse branch September 10, 2022 13:58
@colebow
Member

colebow commented Sep 13, 2022

@tangjiangling Do we want a release note for this change? And is there a plan/need to write docs related to this as a follow-up?

cc @hashhar

@hashhar
Member

hashhar commented Sep 13, 2022

An RN entry is suggested in the PR description. It might be useful to document the supported range of values, but it's not urgent: the values vary with the ClickHouse version being used, and a useful error message with the relevant information is shown in those cases.

@colebow
Member

colebow commented Sep 13, 2022

Oops, somehow missed the release note in the PR description. And that sounds good to me.

@zhicwu
Contributor

zhicwu commented Oct 18, 2022

FYI, the date range was changed again in a newer version; see ClickHouse/clickhouse-java#1103.

@tangjiangling
Member Author

Thanks for the reminder, we already knew that (see #11148 (comment)) and I will continue this work.

Labels: cla-signed, docs, needs-docs
5 participants