-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support writing timestamp with time zone
type on partitioned column in Delta
#18353
Conversation
Co-Authored-By: alberic <[email protected]>
@@ -334,6 +338,10 @@ public static Object getField(DateTimeZone localZone, Type type, Block block, in | |||
if (type instanceof TimestampType timestampType) { | |||
return getHiveTimestamp(localZone, timestampType, block, position); | |||
} | |||
if (type instanceof TimestampWithTimeZoneType) { | |||
checkArgument(type.equals(TIMESTAMP_TZ_MILLIS)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good functionally from my perspective.
Code wise, I'm concerned about mixing Delta Lake specific code within HiveWriteUtils
.
Do you see anything against having DeltaLakeWriteUtils
with a slight handling for timestamp tz and call HiveWriterUtils.getField
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. We call here from io.trino.plugin.deltalake.AbstractDeltaLakePageSink#createPartitionValues
.
For example i am not convinced this is the appropriate check for Delta
trino/plugin/trino-hive/src/main/java/io/trino/plugin/hive/util/HiveWriteUtils.java
Lines 285 to 289 in 431f3e0
if (!CharMatcher.inRange((char) 0x20, (char) 0x7E).matchesAllOf(valueString)) { | |
throw new TrinoException(HIVE_INVALID_PARTITION_VALUE, | |
"Hive partition keys can only contain printable ASCII characters (0x20 - 0x7E). Invalid value: " + | |
base16().withSeparator(" ", 2).encode(valueString.getBytes(UTF_8))); | |
} |
in delta context the value gets used to
- make partition-like path
- write to addEntry json (right?)
so the logic could indeed be Delta-specific.
AFAIAC, this can go follow-up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will send a follow-up PR.
@@ -334,6 +338,10 @@ public static Object getField(DateTimeZone localZone, Type type, Block block, in | |||
if (type instanceof TimestampType timestampType) { | |||
return getHiveTimestamp(localZone, timestampType, block, position); | |||
} | |||
if (type instanceof TimestampWithTimeZoneType) { | |||
checkArgument(type.equals(TIMESTAMP_TZ_MILLIS)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. We call here from io.trino.plugin.deltalake.AbstractDeltaLakePageSink#createPartitionValues
.
For example i am not convinced this is the appropriate check for Delta
trino/plugin/trino-hive/src/main/java/io/trino/plugin/hive/util/HiveWriteUtils.java
Lines 285 to 289 in 431f3e0
if (!CharMatcher.inRange((char) 0x20, (char) 0x7E).matchesAllOf(valueString)) { | |
throw new TrinoException(HIVE_INVALID_PARTITION_VALUE, | |
"Hive partition keys can only contain printable ASCII characters (0x20 - 0x7E). Invalid value: " + | |
base16().withSeparator(" ", 2).encode(valueString.getBytes(UTF_8))); | |
} |
in delta context the value gets used to
- make partition-like path
- write to addEntry json (right?)
so the logic could indeed be Delta-specific.
AFAIAC, this can go follow-up
"(null, null, null, null, 4.0, null, null)"); | ||
|
||
assertThat((String) computeScalar("SELECT \"$path\" FROM " + tableName + " WHERE id = 1")) | ||
.contains("/part=__HIVE_DEFAULT_PARTITION__/"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is "HIVE_DEFAULT_PARTITION" also the Delta's convention for paths for null part key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's verified in a product test:
Lines 132 to 134 in 68ad854
assertThat(onTrino().executeQuery("SELECT \"$path\" FROM delta.default." + tableName + " WHERE a_number IS NULL").column(1)) | |
.hasSize(2) | |
.allMatch(path -> ((String) path).contains("/a_number=__HIVE_DEFAULT_PARTITION__/")); |
Description
Fix #16822
Release notes
(x) Release notes are required, with the following suggested text: