Flaky TestDeltaLakeGcsConnectorSmokeTest.testOptimizeUsingForcedPartitioning #19943

Closed
ebyhr opened this issue Nov 28, 2023 · 1 comment · Fixed by #20003

ebyhr commented Nov 28, 2023

Error:  Tests run: 105, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 826.4 s <<< FAILURE! -- in io.trino.plugin.deltalake.TestDeltaLakeGcsConnectorSmokeTest
Error:  io.trino.plugin.deltalake.TestDeltaLakeGcsConnectorSmokeTest.testOptimizeUsingForcedPartitioning -- Time elapsed: 11.18 s <<< ERROR!
io.trino.testing.QueryFailedException: Failed to write Delta Lake transaction log entry
	at io.trino.testing.AbstractTestingTrinoClient.execute(AbstractTestingTrinoClient.java:133)
	at io.trino.testing.DistributedQueryRunner.executeWithQueryId(DistributedQueryRunner.java:505)
	at io.trino.testing.QueryAssertions.assertDistributedUpdate(QueryAssertions.java:107)
	at io.trino.testing.QueryAssertions.assertUpdate(QueryAssertions.java:61)
	at io.trino.testing.AbstractTestQueryFramework.assertUpdate(AbstractTestQueryFramework.java:424)
	at io.trino.testing.AbstractTestQueryFramework.assertUpdate(AbstractTestQueryFramework.java:419)
	at io.trino.plugin.deltalake.BaseDeltaLakeConnectorSmokeTest.testOptimizeUsingForcedPartitioning(BaseDeltaLakeConnectorSmokeTest.java:1961)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:194)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
	Suppressed: java.lang.Exception: SQL: INSERT INTO test_optimize_partitioned_table_13kkuabawe VALUES ('one', 6, 'test2', 9)
		at io.trino.testing.DistributedQueryRunner.executeWithQueryId(DistributedQueryRunner.java:509)
		... 12 more
Caused by: io.trino.spi.TrinoException: Failed to write Delta Lake transaction log entry
	at io.trino.plugin.deltalake.DeltaLakeMetadata.finishInsert(DeltaLakeMetadata.java:1912)
	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorMetadata.finishInsert(ClassLoaderSafeConnectorMetadata.java:617)
	at io.trino.tracing.TracingConnectorMetadata.finishInsert(TracingConnectorMetadata.java:697)
	at io.trino.metadata.MetadataManager.finishInsert(MetadataManager.java:1140)
	at io.trino.tracing.TracingMetadata.finishInsert(TracingMetadata.java:694)
	at io.trino.sql.planner.LocalExecutionPlanner.lambda$createTableFinisher$4(LocalExecutionPlanner.java:4164)
	at io.trino.operator.TableFinishOperator.getOutput(TableFinishOperator.java:319)
	at io.trino.operator.Driver.processInternal(Driver.java:395)
	at io.trino.operator.Driver.lambda$process$8(Driver.java:298)
	at io.trino.operator.Driver.tryWithLock(Driver.java:694)
	at io.trino.operator.Driver.process(Driver.java:290)
	at io.trino.operator.Driver.processForDuration(Driver.java:261)
	at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:887)
	at io.trino.execution.executor.timesharing.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
	at io.trino.execution.executor.timesharing.TimeSharingTaskExecutor$TaskRunner.run(TimeSharingTaskExecutor.java:565)
	at io.trino.$gen.Trino_testversion____20231128_181115_1233.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: io.trino.spi.TrinoException: Failed to store statistics with table location: gs://trino-ci-test/test-delta-lake-integration-smoke-test-gu5i5s6vxd/test_optimize_partitioned_table_13kkuabawe
	at io.trino.plugin.deltalake.statistics.MetaDirStatisticsAccess.updateExtendedStatistics(MetaDirStatisticsAccess.java:109)
	at io.trino.plugin.deltalake.statistics.CachingExtendedStatisticsAccess.updateExtendedStatistics(CachingExtendedStatisticsAccess.java:74)
	at io.trino.plugin.deltalake.DeltaLakeMetadata.updateTableStatistics(DeltaLakeMetadata.java:3350)
	at io.trino.plugin.deltalake.DeltaLakeMetadata.finishInsert(DeltaLakeMetadata.java:1895)
	... 18 more
Caused by: java.io.IOException: Error writing file: gs://trino-ci-test/test-delta-lake-integration-smoke-test-gu5i5s6vxd/test_optimize_partitioned_table_13kkuabawe/_delta_log/_trino_meta/extended_stats.json
	at io.trino.filesystem.gcs.GcsUtils.handleGcsException(GcsUtils.java:39)
	at io.trino.filesystem.gcs.GcsOutputFile.createOutputStream(GcsOutputFile.java:94)
	at io.trino.filesystem.gcs.GcsOutputFile.createOrOverwrite(GcsOutputFile.java:63)
	at io.trino.filesystem.TrinoOutputFile.createOrOverwrite(TrinoOutputFile.java:35)
	at io.trino.filesystem.tracing.TracingOutputFile.lambda$createOrOverwrite$1(TracingOutputFile.java:57)
	at io.trino.filesystem.tracing.Tracing.withTracing(Tracing.java:47)
	at io.trino.filesystem.tracing.TracingOutputFile.createOrOverwrite(TracingOutputFile.java:57)
	at io.trino.plugin.deltalake.statistics.MetaDirStatisticsAccess.updateExtendedStatistics(MetaDirStatisticsAccess.java:98)
	... 21 more
Caused by: com.google.cloud.storage.StorageException: The object trino-ci-test/test-delta-lake-integration-smoke-test-gu5i5s6vxd/test_optimize_partitioned_table_13kkuabawe/_delta_log/_trino_meta/extended_stats.json exceeded the rate limit for object mutation operations (create, update, and delete). Please reduce your request rate. See https://cloud.google.com/storage/docs/gcs429.
	at com.google.cloud.storage.StorageException.translate(StorageException.java:170)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:313)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.create(HttpStorageRpc.java:392)
	at com.google.cloud.storage.StorageImpl.lambda$internalCreate$2(StorageImpl.java:213)
	at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:103)
	at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
	at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
	at com.google.cloud.storage.Retrying.run(Retrying.java:65)
	at com.google.cloud.storage.StorageImpl.run(StorageImpl.java:1514)
	at com.google.cloud.storage.StorageImpl.internalCreate(StorageImpl.java:210)
	at com.google.cloud.storage.StorageImpl.create(StorageImpl.java:142)
	at io.trino.filesystem.gcs.GcsOutputFile.createOutputStream(GcsOutputFile.java:84)
	... 27 more
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 429 Too Many Requests
POST https://storage.googleapis.com/upload/storage/v1/b/trino-ci-test/o?projection=full&uploadType=multipart
{
  "code" : 429,
  "errors" : [ {
    "domain" : "usageLimits",
    "message" : "The object trino-ci-test/test-delta-lake-integration-smoke-test-gu5i5s6vxd/test_optimize_partitioned_table_13kkuabawe/_delta_log/_trino_meta/extended_stats.json exceeded the rate limit for object mutation operations (create, update, and delete). Please reduce your request rate. See https://cloud.google.com/storage/docs/gcs429.",
    "reason" : "rateLimitExceeded"
  } ],
  "message" : "The object trino-ci-test/test-delta-lake-integration-smoke-test-gu5i5s6vxd/test_optimize_partitioned_table_13kkuabawe/_delta_log/_trino_meta/extended_stats.json exceeded the rate limit for object mutation operations (create, update, and delete). Please reduce your request rate. See https://cloud.google.com/storage/docs/gcs429."
}
	at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
	at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
	at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:570)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:493)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:603)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.create(HttpStorageRpc.java:389)
	... 36 more

https://github.com/trinodb/trino/actions/runs/7022450870/job/19107239986

electrum commented Nov 29, 2023

According to https://cloud.google.com/storage/docs/objects#immutability:

> Note that there is a once-per-second limit for rapidly replacing the same object. Replacing the same object more frequently might result in 429 Too Many Requests errors. You should design your application to upload data for a particular object no more than once per second and handle occasional 429 Too Many Requests errors using an exponential backoff retry strategy.

Maybe we need to adjust the retry policy? I'm surprised that the client doesn't handle this appropriately by default.
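
For illustration only, the exponential backoff strategy that the GCS documentation recommends could be sketched as below. This is not the change made in #20003 and not part of Trino or the Google client library. The class `BackoffRetry`, the method `callWithBackoff`, and all of its parameters are hypothetical names chosen for this sketch, and the retry predicate is left to the caller.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Predicate;

// Hypothetical helper: retries an action with exponential backoff and full jitter.
final class BackoffRetry
{
    private BackoffRetry() {}

    static <T> T callWithBackoff(
            Callable<T> action,
            Predicate<Exception> isRetryable,
            int maxAttempts,
            long initialDelayMillis,
            long maxDelayMillis)
            throws Exception
    {
        if (maxAttempts < 1 || initialDelayMillis < 0 || maxDelayMillis < initialDelayMillis) {
            throw new IllegalArgumentException("invalid retry parameters");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            }
            catch (Exception e) {
                // Give up on the last attempt or on errors the caller does not consider transient
                if (attempt >= maxAttempts || !isRetryable.test(e)) {
                    throw e;
                }
                // Full jitter: sleep a random time in [0, delay] so concurrent writers spread out
                Thread.sleep(ThreadLocalRandom.current().nextLong(delay + 1));
                // Double the upper bound for the next attempt, capped at maxDelayMillis
                delay = Math.min(maxDelayMillis, delay * 2);
            }
        }
    }
}
```

Given the once-per-second limit quoted above, an initial delay of around one second seems a reasonable starting point for writes to `extended_stats.json`. The predicate would need to recognize the 429 response, for example by checking the status code on the `StorageException` shown in the stack trace; how that is best wired into the filesystem layer is an open question here.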
