Fix insert to Clickhouse TimestampWithTimeZone #23785

Merged
@@ -74,7 +74,6 @@
import io.trino.spi.type.Decimals;
import io.trino.spi.type.Int128;
import io.trino.spi.type.StandardTypes;
-import io.trino.spi.type.TimeZoneKey;
import io.trino.spi.type.Type;
import io.trino.spi.type.TypeManager;
import io.trino.spi.type.TypeSignature;
@@ -104,6 +103,7 @@
import java.util.Map.Entry;
import java.util.Optional;
import java.util.OptionalLong;
+import java.util.TimeZone;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BiFunction;
@@ -167,7 +167,6 @@
import static io.trino.spi.type.BooleanType.BOOLEAN;
import static io.trino.spi.type.DateTimeEncoding.packDateTimeWithZone;
import static io.trino.spi.type.DateTimeEncoding.unpackMillisUtc;
-import static io.trino.spi.type.DateTimeEncoding.unpackZoneKey;
import static io.trino.spi.type.DateType.DATE;
import static io.trino.spi.type.DecimalType.createDecimalType;
import static io.trino.spi.type.DoubleType.DOUBLE;
@@ -733,7 +732,7 @@ public Optional<ColumnMapping> toColumnMapping(ConnectorSession session, Connect
return Optional.of(ColumnMapping.longMapping(
TIMESTAMP_TZ_SECONDS,
shortTimestampWithTimeZoneReadFunction(),
-shortTimestampWithTimeZoneWriteFunction()));
+shortTimestampWithTimeZoneWriteFunction(column.getTimeZone())));
}
}

@@ -915,12 +914,12 @@ private static LongReadFunction shortTimestampWithTimeZoneReadFunction()
};
}

-private static LongWriteFunction shortTimestampWithTimeZoneWriteFunction()
+private static LongWriteFunction shortTimestampWithTimeZoneWriteFunction(TimeZone columnTimeZone)
{
return (statement, index, value) -> {
long millisUtc = unpackMillisUtc(value);
-TimeZoneKey timeZoneKey = unpackZoneKey(value);
-statement.setObject(index, Instant.ofEpochMilli(millisUtc).atZone(timeZoneKey.getZoneId()));
+// Clickhouse JDBC driver inserts datetime as string value as yyyy-MM-dd HH:mm:ss and zone from the Column metadata would be used.
+statement.setObject(index, Instant.ofEpochMilli(millisUtc).atZone(columnTimeZone.toZoneId()));
Member

Do we need to perform this translation? setObject receives a ZonedDateTime which, when handled by ClickhouseValues (as part of InputBasedPreparedStatement -> ClickHousePreparedStatement), would be converted to UTC and written.

Contributor Author

Other options I tried didn't work for me. Do you have something concrete in mind?
https://clickhouse.com/docs/en/sql-reference/data-types/datetime#examples

  • When inserting datetime as an integer, it is treated as Unix Timestamp (UTC). 1546300800 represents '2019-01-01 00:00:00' UTC. However, as timestamp column has Asia/Istanbul (UTC+3) timezone specified, when outputting as string the value will be shown as '2019-01-01 03:00:00'
  • When inserting string value as datetime, it is treated as being in column timezone. '2019-01-01 00:00:00' will be treated as being in Asia/Istanbul timezone and saved as 1546290000.
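The two rules quoted above can be checked with plain `java.time` arithmetic. The following is a minimal sketch that only models the documented semantics; it does not talk to ClickHouse, and the class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

final class DateTimeInsertSemantics
{
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    private DateTimeInsertSemantics() {}

    // Integer insert: the number is epoch seconds (UTC); the column zone only affects how it is rendered.
    static String renderEpochSeconds(long epochSeconds, ZoneId columnZone)
    {
        return FORMAT.format(Instant.ofEpochSecond(epochSeconds).atZone(columnZone));
    }

    // String insert: the text is read as a wall-clock time in the column zone.
    static long storedEpochSeconds(String text, ZoneId columnZone)
    {
        return LocalDateTime.parse(text, FORMAT).atZone(columnZone).toEpochSecond();
    }

    public static void main(String[] args)
    {
        ZoneId istanbul = ZoneId.of("Asia/Istanbul");
        System.out.println(renderEpochSeconds(1546300800L, istanbul)); // 2019-01-01 03:00:00
        System.out.println(storedEpochSeconds("2019-01-01 00:00:00", istanbul)); // 1546290000
    }
}
```

The same wall-clock text therefore maps to different instants depending on the column zone, which is why the zone used to render the value before binding matters.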

Member

I meant: do we need this translation? Clickhouse's PreparedStatement implementation currently performs it. That is, statement.setObject(index, Instant.ofEpochMilli(millisUtc).atZone(timeZoneKey.getZoneId())) internally operates on the ZonedDateTime to fetch the UTC time and writes it as a BigDecimal, so I think this conversion to columnTimeZone may not be needed. We could call statement.setObject(index, Instant.ofEpochMilli(millisUtc)) instead, which could save us some CPU cycles.

Contributor Author

It does not work.

Member

Does it fail, or something else?

Contributor Author

Yes, tests fail.

Contributor Author

@ssheikin Oct 16, 2024

Stacktrace

io.trino.testing.QueryFailedException: Code: 62. DB::Exception: Syntax error: failed at position 98 (':'): :56Z. Expected one of: token, DoubleColon. (SYNTAX_ERROR) (version 23.8.12.13 (official build))


	at io.trino.testing.AbstractTestingTrinoClient.execute(AbstractTestingTrinoClient.java:134)
	at io.trino.testing.DistributedQueryRunner.executeInternal(DistributedQueryRunner.java:565)
	at io.trino.testing.DistributedQueryRunner.execute(DistributedQueryRunner.java:548)
	at io.trino.testing.datatype.SqlDataTypeTest.verifyPredicate(SqlDataTypeTest.java:129)
	at io.trino.testing.datatype.SqlDataTypeTest.execute(SqlDataTypeTest.java:91)
	at io.trino.plugin.clickhouse.BaseClickHouseTypeMapping.testClickHouseDateTimeWithTimeZone(BaseClickHouseTypeMapping.java:1014)
	at io.trino.plugin.clickhouse.BaseClickHouseTypeMapping.testRepeated(BaseClickHouseTypeMapping.java:1202)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:212)
	at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:194)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:212)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:212)
	at java.base/java.util.stream.IntPipeline$1$1.accept(IntPipeline.java:180)
	at java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:104)
	at java.base/java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:712)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:556)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:546)
	at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265)
	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:611)
	at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:291)
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1709)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:556)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:546)
	at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265)
	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:611)
	at java.base/java.util.concurrent.ForkJoinTask.doExec$$$capture(ForkJoinTask.java:507)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1489)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:2071)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2033)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
	Suppressed: java.lang.Exception: SQL: SELECT 'all found' FROM tpch.test_timestamp_with_time_zoned17ore1fpd WHERE col_0 IS NOT DISTINCT FROM TIMESTAMP '2024-01-01 05:19:56 +05:45'
		at io.trino.testing.DistributedQueryRunner.executeInternal(DistributedQueryRunner.java:572)
		... 35 more
Caused by: io.trino.spi.TrinoException: Code: 62. DB::Exception: Syntax error: failed at position 98 (':'): :56Z. Expected one of: token, DoubleColon. (SYNTAX_ERROR) (version 23.8.12.13 (official build))

	at io.trino.plugin.jdbc.JdbcRecordCursor.handleSqlException(JdbcRecordCursor.java:312)
	at io.trino.plugin.jdbc.JdbcRecordCursor.advanceNextPosition(JdbcRecordCursor.java:187)
	at io.trino.$gen.CursorProcessor_20241016_080419_108.process(Unknown Source)
	at io.trino.operator.ScanFilterAndProjectOperator$RecordCursorToPages.process(ScanFilterAndProjectOperator.java:329)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:423)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:261)
	at io.trino.operator.WorkProcessorUtils$YieldingProcess.process(WorkProcessorUtils.java:181)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:423)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:261)
	at io.trino.operator.WorkProcessorUtils$BlockingProcess.process(WorkProcessorUtils.java:207)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:423)
	at io.trino.operator.WorkProcessorUtils.lambda$flatten$6(WorkProcessorUtils.java:317)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:359)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:423)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:346)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:423)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:261)
	at io.trino.operator.WorkProcessorUtils.lambda$processStateMonitor$2(WorkProcessorUtils.java:240)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:423)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:261)
	at io.trino.operator.WorkProcessorUtils.lambda$finishWhen$3(WorkProcessorUtils.java:255)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:423)
	at io.trino.operator.WorkProcessorSourceOperatorAdapter.getOutput(WorkProcessorSourceOperatorAdapter.java:133)
	at io.trino.operator.Driver.processInternal(Driver.java:403)
	at io.trino.operator.Driver.lambda$process$8(Driver.java:306)
	at io.trino.operator.Driver.tryWithLock(Driver.java:709)
	at io.trino.operator.Driver.process(Driver.java:298)
	at io.trino.operator.Driver.processForDuration(Driver.java:269)
	at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:890)
	at io.trino.execution.executor.dedicated.SplitProcessor.run(SplitProcessor.java:77)
	at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.lambda$run$0(TaskEntry.java:201)
	at io.trino.$gen.Trino_testversion____20241016_080414_36.run(Unknown Source)
	at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.run(TaskEntry.java:202)
	at io.trino.execution.executor.scheduler.FairScheduler.runTask(FairScheduler.java:172)
	at io.trino.execution.executor.scheduler.FairScheduler.lambda$submit$0(FairScheduler.java:159)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1570)
Caused by: java.sql.SQLException: Code: 62. DB::Exception: Syntax error: failed at position 98 (':'): :56Z. Expected one of: token, DoubleColon. (SYNTAX_ERROR) (version 23.8.12.13 (official build))

	at io.trino.plugin.jdbc.JdbcRecordCursor.advanceNextPosition(JdbcRecordCursor.java:167)
	... 40 more
Caused by: java.util.concurrent.ExecutionException: java.sql.BatchUpdateException: Code: 62. DB::Exception: Syntax error: failed at position 98 (':'): :56Z. Expected one of: token, DoubleColon. (SYNTAX_ERROR) (version 23.8.12.13 (official build))

	at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:596)
	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:555)
	at com.google.common.util.concurrent.FluentFuture$TrustedFuture.get(FluentFuture.java:91)
	at io.trino.plugin.jdbc.JdbcRecordCursor.advanceNextPosition(JdbcRecordCursor.java:163)
	... 40 more
Caused by: java.sql.BatchUpdateException: Code: 62. DB::Exception: Syntax error: failed at position 98 (':'): :56Z. Expected one of: token, DoubleColon. (SYNTAX_ERROR) (version 23.8.12.13 (official build))

	at com.clickhouse.jdbc.SqlExceptionUtils.batchUpdateError(SqlExceptionUtils.java:107)
	at com.clickhouse.jdbc.internal.SqlBasedPreparedStatement.executeAny(SqlBasedPreparedStatement.java:223)
	at com.clickhouse.jdbc.internal.SqlBasedPreparedStatement.executeQuery(SqlBasedPreparedStatement.java:286)
	at io.opentelemetry.instrumentation.jdbc.internal.OpenTelemetryStatement.wrapCall(OpenTelemetryStatement.java:304)
	at io.opentelemetry.instrumentation.jdbc.internal.OpenTelemetryPreparedStatement.executeQuery(OpenTelemetryPreparedStatement.java:64)
	at io.trino.plugin.jdbc.JdbcRecordCursor.lambda$advanceNextPosition$1(JdbcRecordCursor.java:158)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
	at com.google.common.util.concurrent.DirectExecutorService.execute(DirectExecutorService.java:51)
	at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:145)
	at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:79)
	at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:37)
	at io.trino.plugin.jdbc.JdbcRecordCursor.advanceNextPosition(JdbcRecordCursor.java:156)
	... 40 more


Member

What if we tried this approach:

When inserting datetime as an integer, it is treated as Unix Timestamp (UTC). 1546300800 represents '2019-01-01 00:00:00' UTC. However, as timestamp column has Asia/Istanbul (UTC+3) timezone specified, when outputting as string the value will be shown as '2019-01-01 03:00:00'

We already have millisUtc; we would need to convert it to seconds and insert that. I think we could implement it as a follow-up once we add support for timestamps with higher precision as well.

Can we add this as a code comment?

Clickhouse JDBC driver inserts datetime as string value as yyyy-MM-dd HH:mm:ss and zone from the Column metadata would be used.

Otherwise it would be a bit confusing why we need to convert the value into a specific time zone, as it would be the same as Instant.ofEpochMilli(millisUtc).atZone(columnTimeZone.toZoneId())
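The epoch-seconds follow-up proposed in this comment could start from a conversion like the one below. This is a sketch only: the class and method names are hypothetical, and whether the ClickHouse JDBC driver accepts an integer bound this way for a DateTime column is not confirmed in this thread.

```java
final class EpochSecondsSketch
{
    private EpochSecondsSketch() {}

    // Converts epoch milliseconds to whole epoch seconds.
    // Math.floorDiv rounds toward negative infinity, so instants before 1970
    // land in the second that contains them instead of rounding toward zero.
    static long toEpochSeconds(long millisUtc)
    {
        return Math.floorDiv(millisUtc, 1000L);
    }

    public static void main(String[] args)
    {
        System.out.println(toEpochSeconds(1546300800000L)); // 1546300800
        System.out.println(toEpochSeconds(-1L)); // -1
    }
}
```

Binding the resulting long would sidestep the wall-clock string interpretation entirely, at the cost of dropping sub-second precision, which is presumably why it is tied to the higher-precision timestamp work.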

};
}

@@ -1003,6 +1003,12 @@ public void testClickHouseDateTimeWithTimeZone()
Session session = Session.builder(getSession())
.setTimeZoneKey(TimeZoneKey.getTimeZoneKey(sessionZone.getId()))
.build();
+SqlDataTypeTest.create()
+.addRoundTrip("DateTime('Asia/Kathmandu')", "timestamp '2024-01-01 12:34:56'", TIMESTAMP_TZ_SECONDS, "TIMESTAMP '2024-01-01 05:19:56 +05:45'")
+.addRoundTrip("DateTime('Asia/Kathmandu')", "timestamp '2024-01-01 12:34:56 Asia/Kathmandu'", TIMESTAMP_TZ_SECONDS, "TIMESTAMP '2024-01-01 12:34:56 +05:45'")
+.addRoundTrip("DateTime('Asia/Kathmandu')", "timestamp '2024-01-01 12:34:56 +00:00'", TIMESTAMP_TZ_SECONDS, "TIMESTAMP '2024-01-01 18:19:56 +05:45'")
+.addRoundTrip("DateTime('Asia/Kathmandu')", "timestamp '2024-01-01 12:34:56 -01:00'", TIMESTAMP_TZ_SECONDS, "TIMESTAMP '2024-01-01 19:19:56 +05:45'")
+.execute(getQueryRunner(), session, clickhouseCreateAndTrinoInsert("tpch.test_timestamp_with_time_zone"));
Comment on lines +1006 to +1011
Contributor Author

@Praveen2112 without the change tests fail with:

java.lang.AssertionError: [Rows for query [SELECT * FROM tpch.test_timestamp_with_time_zonevvs0lcvaz4]] 
Expecting actual:
  (2024-01-01T12:34:56+05:45, 2024-01-01T12:34:56+05:45, 2024-01-01T12:34:56+05:45, 2024-01-01T12:34:56+05:45)
to contain exactly in any order:
  [(2024-01-01T05:19:56+05:45, 2024-01-01T12:34:56+05:45, 2024-01-01T18:19:56+05:45, 2024-01-01T19:19:56+05:45)]
elements not found:
  (2024-01-01T05:19:56+05:45, 2024-01-01T12:34:56+05:45, 2024-01-01T18:19:56+05:45, 2024-01-01T19:19:56+05:45)
and elements not expected:
  (2024-01-01T12:34:56+05:45, 2024-01-01T12:34:56+05:45, 2024-01-01T12:34:56+05:45, 2024-01-01T12:34:56+05:45)
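The failure above, where every row comes back as 12:34:56 +05:45, is consistent with the value being written as a zone-less wall-clock string and then read in the column's zone. A minimal sketch of that model follows; it assumes the string-based insert path described in the code comment of this PR, and the class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

final class WallClockInsertSketch
{
    private static final DateTimeFormatter WALL_CLOCK = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    private WallClockInsertSketch() {}

    // Models the assumed insert path: the bound value is rendered without any zone,
    // and the resulting text is then interpreted in the column's time zone.
    static ZonedDateTime storedAs(ZonedDateTime bound, ZoneId columnZone)
    {
        String text = WALL_CLOCK.format(bound);
        return LocalDateTime.parse(text, WALL_CLOCK).atZone(columnZone);
    }

    public static void main(String[] args)
    {
        ZoneId columnZone = ZoneId.of("Asia/Kathmandu");
        ZonedDateTime value = ZonedDateTime.parse("2024-01-01T12:34:56Z");

        // Bound in the value's own zone: the wall clock is kept, so the instant shifts.
        System.out.println(storedAs(value, columnZone));
        // Bound in the column zone: the wall clock is adjusted, so the instant is preserved.
        System.out.println(storedAs(value.withZoneSameInstant(columnZone), columnZone));
    }
}
```

Under this model, converting to the column zone before binding is what keeps the stored instant equal to the inserted one, matching the expected 18:19:56 +05:45 for the `+00:00` input.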

Member

How about trinoCreateAsSelect and trinoCreateAndInsert, since this could affect both INSERT and CTAS operations?

Contributor Author

@ssheikin Oct 31, 2024

Clickhouse does not support CREATE TABLE with TimestampWithTimeZone yet, so only inserts into columns created by Clickhouse are tested.
Commit message updated.


dateTimeWithTimeZoneTest(clickhouseDateTimeInputTypeFactory("datetime"))
.execute(getQueryRunner(), session, clickhouseCreateAndInsert("tpch.datetime_tz"));