Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Writes to Delta table fail when location ends with two slashes #17966

Closed
findepi opened this issue Jun 20, 2023 · 4 comments · Fixed by #18177
Closed

Writes to Delta table fail when location ends with two slashes #17966

findepi opened this issue Jun 20, 2023 · 4 comments · Fixed by #18177
Assignees
Labels
bug Something isn't working

Comments

@findepi
Copy link
Member

findepi commented Jun 20, 2023

io.trino.testing.QueryFailedException: Location does not have parent: s3://trino-ci-test/test_glue_s3_0cm35ldrm5/two_trailing_slashes/test_basic_operations_i542auhzr7//

	at io.trino.testing.AbstractTestingTrinoClient.execute(AbstractTestingTrinoClient.java:122)
	at io.trino.testing.DistributedQueryRunner.executeWithQueryId(DistributedQueryRunner.java:493)
	at io.trino.testing.QueryAssertions.assertDistributedUpdate(QueryAssertions.java:107)
	at io.trino.testing.QueryAssertions.assertUpdate(QueryAssertions.java:63)
	at io.trino.testing.AbstractTestQueryFramework.assertUpdate(AbstractTestQueryFramework.java:410)
	at io.trino.testing.AbstractTestQueryFramework.assertUpdate(AbstractTestQueryFramework.java:405)
	at io.trino.plugin.hive.BaseS3AndGlueMetastoreTest.testBasicOperationsWithProvidedTableLocation(BaseS3AndGlueMetastoreTest.java:106)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:104)
	at org.testng.internal.Invoker.invokeMethod(Invoker.java:645)
	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:851)
	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1177)
	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:129)
	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:112)
	at org.testng.TestRunner.privateRun(TestRunner.java:756)
	at org.testng.TestRunner.run(TestRunner.java:610)
	at org.testng.SuiteRunner.runTest(SuiteRunner.java:387)
	at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:382)
	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:340)
	at org.testng.SuiteRunner.run(SuiteRunner.java:289)
	at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
	at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
	at org.testng.TestNG.runSuitesSequentially(TestNG.java:1293)
	at org.testng.TestNG.runSuitesLocally(TestNG.java:1218)
	at org.testng.TestNG.runSuites(TestNG.java:1133)
	at org.testng.TestNG.run(TestNG.java:1104)
	at com.intellij.rt.testng.IDEARemoteTestNG.run(IDEARemoteTestNG.java:66)
	at com.intellij.rt.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:109)
	Suppressed: java.lang.Exception: SQL: INSERT INTO test_basic_operations_i542auhzr7 VALUES ('str4', 4)
		at io.trino.testing.DistributedQueryRunner.executeWithQueryId(DistributedQueryRunner.java:497)
		... 29 more
Caused by: java.lang.IllegalArgumentException: Location does not have parent: s3://trino-ci-test/test_glue_s3_0cm35ldrm5/two_trailing_slashes/test_basic_operations_i542auhzr7//
	at io.trino.filesystem.Locations.getParent(Locations.java:51)
	at io.trino.plugin.deltalake.DeltaLakeMetadata.allowWrite(DeltaLakeMetadata.java:1906)
	at io.trino.plugin.deltalake.DeltaLakeMetadata.checkWriteAllowed(DeltaLakeMetadata.java:1894)
	at io.trino.plugin.deltalake.DeltaLakeMetadata.beginInsert(DeltaLakeMetadata.java:1421)
	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorMetadata.beginInsert(ClassLoaderSafeConnectorMetadata.java:519)
	at io.trino.tracing.TracingConnectorMetadata.beginInsert(TracingConnectorMetadata.java:589)
	at io.trino.metadata.MetadataManager.beginInsert(MetadataManager.java:941)
	at io.trino.tracing.TracingMetadata.beginInsert(TracingMetadata.java:598)
	at io.trino.sql.planner.optimizations.BeginTableWrite$Rewriter.createWriterTarget(BeginTableWrite.java:253)
	at io.trino.sql.planner.optimizations.BeginTableWrite$Rewriter.visitTableFinish(BeginTableWrite.java:188)
	at io.trino.sql.planner.optimizations.BeginTableWrite$Rewriter.visitTableFinish(BeginTableWrite.java:106)
	at io.trino.sql.planner.plan.TableFinishNode.accept(TableFinishNode.java:106)
	at io.trino.sql.planner.plan.SimplePlanRewriter$RewriteContext.rewrite(SimplePlanRewriter.java:81)
	at io.trino.sql.planner.plan.SimplePlanRewriter$RewriteContext.lambda$defaultRewrite$0(SimplePlanRewriter.java:72)
	at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422)
	at io.trino.sql.planner.plan.SimplePlanRewriter$RewriteContext.defaultRewrite(SimplePlanRewriter.java:72)
	at io.trino.sql.planner.plan.SimplePlanRewriter.visitPlan(SimplePlanRewriter.java:37)
	at io.trino.sql.planner.plan.SimplePlanRewriter.visitPlan(SimplePlanRewriter.java:21)
	at io.trino.sql.planner.plan.PlanVisitor.visitOutput(PlanVisitor.java:49)
	at io.trino.sql.planner.plan.OutputNode.accept(OutputNode.java:83)
	at io.trino.sql.planner.plan.SimplePlanRewriter.rewriteWith(SimplePlanRewriter.java:31)
	at io.trino.sql.planner.optimizations.BeginTableWrite.optimize(BeginTableWrite.java:91)
	at io.trino.sql.planner.LogicalPlanner.runOptimizer(LogicalPlanner.java:300)
	at io.trino.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:270)
	at io.trino.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:239)
	at io.trino.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:234)
	at io.trino.execution.SqlQueryExecution.doPlanQuery(SqlQueryExecution.java:483)
	at io.trino.execution.SqlQueryExecution.planQuery(SqlQueryExecution.java:463)
	at io.trino.execution.SqlQueryExecution.start(SqlQueryExecution.java:401)
	at io.trino.execution.SqlQueryManager.createQuery(SqlQueryManager.java:257)
	at io.trino.dispatcher.LocalDispatchQuery.startExecution(LocalDispatchQuery.java:145)
	at io.trino.dispatcher.LocalDispatchQuery.lambda$waitForMinimumWorkers$2(LocalDispatchQuery.java:129)
	at io.airlift.concurrent.MoreFutures.lambda$addSuccessCallback$12(MoreFutures.java:569)
	at io.airlift.concurrent.MoreFutures$3.onSuccess(MoreFutures.java:544)
	at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1133)
	at io.trino.$gen.Trino_testversion____20230620_094117_75.run(Unknown Source)

repro

steps being added in #17964

@findepi
Copy link
Member Author

findepi commented Jun 21, 2023

#17980 fixed some problems but some remain.
search for "#17966" in the code.

@findepi
Copy link
Member Author

findepi commented Jul 6, 2023

search for "#17966" in the code.

existing test coverage shows that the problem occurs only when table is created with explicit location ending with //.
in iceberg, we simply strip trailing slashes during table create. I am fine doing same for Delta.

We can also reject two trailing slashes on CREATE (why would anyone want them?) and keep supporting one trailing slash (without stripping).

@findinpath findinpath self-assigned this Jul 6, 2023
@findinpath
Copy link
Contributor

@findepi
Copy link
Member Author

findepi commented Jul 6, 2023

Currently after removing to unblock the test

if (locationPattern == TWO_TRAILING_SLASHES && !partitioned && getClass().getName().contains(".deltalake.")) {

the test fails at
Location targetLocation = sourceLocation.parentDirectory().appendPath(session.getQueryId() + "_" + randomUUID());
String targetRelativePath = relativePath(tablePath, targetLocation.toString());

this operation fails

tablePath = 'foo//'
sourceLocation = 'foo//source-file'
sourceLocation.parentDirectory() = 'foo/'
targetLocation = 'foo/target-file'  // not under `foo//` anymore

the easy solution would be to workaround Location#parentDirectory controversy (#18155) by replacing parentDirectory().appendPath(...) with replaceFileName(...) (or sibling(...)). The new method would replace portion of the path after last slash with new content, thus faithfully preserving however many slashes there are.

The new operation would be equivalent to

sourceLocation.toString().replaceAll("[^/]+$", "") + "....")

eg

Location targetLocation = Location.of(sourceLocation.toString().replaceAll("[^/]+$", "") + session.getQueryId() + "_" + randomUUID());

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment