
Hive cannot read ORC ACID table updated by Trino twice #8448

Conversation

homar
Member

@homar homar commented Jul 1, 2021

Fixes #8268
The problem was caused by multiple rows having
the same (writeId, bucket, rowId). To fix this,
row IDs must be unique across writers. To achieve
that, each writer gets a disjoint ID range via
the split assigned to it.
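The fix described above, giving each writer a disjoint row-ID range so no two writers can emit the same (writeId, bucket, rowId) triple, can be sketched as follows. All class and method names here are illustrative, not the actual Trino code:

```java
// Sketch: each writer draws row IDs only from its own disjoint range of the
// 64-bit ID space, so IDs are unique across writers by construction.
// Hypothetical names; not the Trino implementation.
public class WriterRowIdRange {
    private final long firstRowId; // inclusive start of this writer's range
    private final long maxRows;    // how many IDs the writer may consume
    private long nextRowId;

    public WriterRowIdRange(long firstRowId, long maxRows) {
        this.firstRowId = firstRowId;
        this.maxRows = maxRows;
        this.nextRowId = firstRowId;
    }

    // Returns the next unique row ID, failing once the range is exhausted.
    public long next() {
        if (nextRowId - firstRowId >= maxRows) {
            throw new IllegalStateException("row ID range exhausted for this writer");
        }
        return nextRowId++;
    }
}
```

Two writers constructed with non-overlapping (firstRowId, maxRows) ranges can then never collide, which is the invariant Hive's ACID reader relies on.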

@cla-bot cla-bot bot added the cla-signed label Jul 1, 2021
@homar homar requested a review from losipiuk July 1, 2021 14:37
@losipiuk
Member

losipiuk commented Jul 1, 2021

Add "Fixes: issue_id" to PR desc.

Comment on lines 1667 to 1672
onTrino().executeQuery("INSERT INTO test_double_update VALUES(1, 'x');");
onTrino().executeQuery("INSERT INTO test_double_update VALUES(2, 'y');");
onTrino().executeQuery("UPDATE test_double_update SET column2 = 'xy1';");
onTrino().executeQuery("UPDATE test_double_update SET column2 = 'xy2';");
assertThat(onHive().executeQuery("SELECT * FROM test_double_update;")).containsOnly(row(1, "xy2"), row(2, "xy2"));
onHive().executeQuery("DROP TABLE IF EXISTS test_double_update");
Member

drop ; from query strings

Member Author

ok

Member

Can we check that it reads correctly in both Trino and Hive?

Member Author

sure, will do

@@ -75,7 +77,8 @@ public HiveSplit(
@JsonProperty("bucketConversion") Optional<BucketConversion> bucketConversion,
@JsonProperty("bucketValidation") Optional<BucketValidation> bucketValidation,
@JsonProperty("s3SelectPushdownEnabled") boolean s3SelectPushdownEnabled,
@JsonProperty("acidInfo") Optional<AcidInfo> acidInfo)
@JsonProperty("acidInfo") Optional<AcidInfo> acidInfo,
@JsonProperty("startingRowId") OptionalLong initialRowId)
Member

keep JSON key and parameter name consistent

Member Author

ok

@@ -362,6 +364,7 @@ else if (maxSplitBytes * 2 >= remainingBlockBytes) {
splitBytes = internalSplit.getEnd() - internalSplit.getStart();
}

OptionalLong initialRowId = OptionalLong.of((Long.MAX_VALUE / getBufferedInternalSplitCount()) * currentSplitIndex);
Member

Using getBufferedInternalSplitCount() for slicing the ID space between splits is not correct. It is not the total number of splits there will be for the table, but how many are currently buffered for execution.
Splits are dynamically loaded in the background by BackgroundHiveSplitLoader, and the value returned by getBufferedInternalSplitCount() will change over time. It also decreases as splits are taken off the queue for execution.

I do not think we can do any better than what @electrum suggested in #8268 (comment):

Allocate unique row IDs across writers by assigning ranges in the splits. For example, we could give each writer a large range of 2^42, which would allow both a huge number of rows and splits.

We need to statically decide how many splits we allow, and how many rows we allow for each split.
E.g., if we give each split 2^42 possible values, we can have:

  • 4,194,304 splits
  • 4,398,046,511,104 rows per split

That should be enough. Still, we should check at runtime that we do not exceed either the number of splits or the number of rows generated for a given split, and fail the query nicely if that happens.
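The sizing arithmetic quoted above can be checked directly. This is only a verification of the numbers in the comment, not Trino code:

```java
// Verify the ID-space partitioning arithmetic from the discussion:
// 2^42 row IDs per split leaves room for 2^64 / 2^42 = 2^22 splits,
// together covering the whole (unsigned) 64-bit row-ID space.
public class IdSpaceMath {
    public static final long ROWS_PER_SPLIT = 1L << 42; // 4,398,046,511,104
    public static final long MAX_SPLITS = 1L << 22;     // 4,194,304

    public static void main(String[] args) {
        System.out.println(ROWS_PER_SPLIT); // prints 4398046511104
        System.out.println(MAX_SPLITS);     // prints 4194304
    }
}
```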

Member Author

Ok thanks for the in depth analysis, I will rework this slightly.

@homar homar force-pushed the hive-cannot-read-orc-acid-table-updated-by-trino-twice branch 3 times, most recently from d0d60ee to 8bc3ee8 (July 2, 2021 15:28)
Comment on lines 72 to 73
private static final long MAX_NUMBER_OF_SPLITS = 4194304; //we want to have 2^42 different ids per split so this is the maximum possible number of splits (2^64/2^42)
private static final long MAX_NUMBER_OF_ROWS_PER_SPLIT = 4398046511104L;
Member

Suggested change
private static final long MAX_NUMBER_OF_SPLITS = 4194304; //we want to have 2^42 different ids per split so this is the maximum possible number of splits (2^64/2^42)
private static final long MAX_NUMBER_OF_ROWS_PER_SPLIT = 4398046511104L;
// We partition the rowId space between splits, assigning each split 2^42 ids.
// Since we need to encode the split number in the id, this allows at most 2^22 splits per query
private static final long MAX_NUMBER_OF_ROWS_PER_SPLIT = 1L << 42;
private static final long MAX_NUMBER_OF_SPLITS = 1L << 22;

Member Author

ok

@@ -362,6 +367,9 @@ else if (maxSplitBytes * 2 >= remainingBlockBytes) {
splitBytes = internalSplit.getEnd() - internalSplit.getStart();
}

long currentSplitNumber = numberOfProcessedSplits.getAndIncrement();
checkState(currentSplitNumber < MAX_NUMBER_OF_SPLITS, "Number of splits is higher than maximum possible number of splits");
OptionalLong initialRowId = OptionalLong.of(currentSplitNumber * MAX_NUMBER_OF_ROWS_PER_SPLIT);
Member

I would prefer currentSplitNumber << 42.

Also please extract 42 as PER_SPLIT_ROW_ID_BITS and 64 - PER_SPLIT_ROW_ID_BITS as SPLIT_ID_BITS
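The suggested shift-based form with the extracted constants might look like this (a sketch using the constant names proposed above, not the merged code):

```java
// Sketch of the shift-based row-ID layout suggested in the review:
// the low 42 bits address rows within a split, the high 22 bits the split.
public class SplitRowIds {
    static final int PER_SPLIT_ROW_ID_BITS = 42;
    static final int SPLIT_ID_BITS = 64 - PER_SPLIT_ROW_ID_BITS;
    static final long MAX_NUMBER_OF_SPLITS = 1L << SPLIT_ID_BITS;

    // First row ID of the given split: equivalent to splitNumber * 2^42,
    // but expressed as a shift as the reviewer prefers.
    static long initialRowId(long splitNumber) {
        return splitNumber << PER_SPLIT_ROW_ID_BITS;
    }
}
```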

Member Author

ok

Comment on lines 94 to 95
OptionalLong initialRowId,
OptionalLong maxNumberOfRowsPerSplit)
Member

why optional

Member Author

good point, thanks

}
checkState(maxNumberOfRowsPerSplit > insertRowCounter, "Trying to insert too many rows in a single split");
Member

Add maxNumberOfRowsPerSplit to error message. Also make it throw TrinoException with GENERIC_INSUFFICIENT_RESOURCES error code
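The requested check could look roughly like this. In Trino itself it would throw `TrinoException` with the `GENERIC_INSUFFICIENT_RESOURCES` error code; this standalone sketch approximates that with a plain runtime exception:

```java
// Sketch: fail once a writer has consumed its per-split row-ID budget,
// including the limit in the error message as requested.
// Stand-in for: throw new TrinoException(GENERIC_INSUFFICIENT_RESOURCES, ...)
public class RowLimitCheck {
    static void checkRowLimit(long insertRowCounter, long maxNumberOfRowsPerSplit) {
        if (insertRowCounter >= maxNumberOfRowsPerSplit) {
            throw new IllegalStateException(String.format(
                    "Trying to insert too many rows in a single split; the limit is %d",
                    maxNumberOfRowsPerSplit));
        }
    }
}
```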

Member Author

ok

@@ -362,6 +367,9 @@ else if (maxSplitBytes * 2 >= remainingBlockBytes) {
splitBytes = internalSplit.getEnd() - internalSplit.getStart();
}

long currentSplitNumber = numberOfProcessedSplits.getAndIncrement();
checkState(currentSplitNumber < MAX_NUMBER_OF_SPLITS, "Number of splits is higher than maximum possible number of splits");
Member

Throw TrinoException with GENERIC_INSUFFICIENT_RESOURCES error code

Member Author

ok

@homar homar force-pushed the hive-cannot-read-orc-acid-table-updated-by-trino-twice branch 3 times, most recently from 883d1b3 to e0b84f1 (July 5, 2021 14:54)
@homar
Member Author

homar commented Jul 5, 2021

@losipiuk comments addressed

Comment on lines 76 to 73
// We partition the rowId space between splits, assigning each split 2^42 ids.
// Since we need to encode the split number in the id, this allows at most 2^22 splits per query
Member

nit: move the comment to the top of the section which defines the 4 constants

Member Author

sure

@homar homar force-pushed the hive-cannot-read-orc-acid-table-updated-by-trino-twice branch from e0b84f1 to 7b83bf6 (July 6, 2021 07:43)
Comment on lines 377 to 379
if (currentSplitNumber >= MAX_NUMBER_OF_SPLITS) {
throw new TrinoException(GENERIC_INSUFFICIENT_RESOURCES, format("Number of splits is higher than maximum possible number of splits %d", MAX_NUMBER_OF_SPLITS));
}
Member

Actually, I have second thoughts about that.
I did not think about this before, but I do not like the fact that the current code limits the number of splits that can be processed by a query, even when we are not doing an UPDATE and HiveUpdatablePageSource is not being constructed.
What I think would be better is to just record splitNumber in HiveSplit here (instead of initialRowId and maxNumberOfRowsPerSplit).

Then we can do the validation and compute initialRowId and maxNumberOfRowsPerSplit in HivePageSourceProvider, where HiveUpdatablePageSource is created.
Then we can make splitNumber in HiveSplit non-optional.
And we should move the constant definitions to HivePageSourceProvider.

WDYT @homar ?
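The proposed restructuring, carrying only a split number in the split and deriving the row-ID range lazily when the updatable page source is created, might be sketched like this (illustrative names only, not the merged code):

```java
// Sketch: the split only records its sequence number; the row-ID range is
// validated and derived only when an UPDATE actually needs it, so plain
// read queries are never constrained by the split-count limit.
public class LazyRowIdRange {
    static final int PER_SPLIT_ROW_ID_BITS = 42;
    static final long MAX_NUMBER_OF_SPLITS = 1L << (64 - PER_SPLIT_ROW_ID_BITS);

    // Called from the page-source provider when constructing the updatable source;
    // in Trino the failure would be a TrinoException(GENERIC_INSUFFICIENT_RESOURCES, ...).
    static long initialRowIdFor(long splitNumber) {
        if (splitNumber >= MAX_NUMBER_OF_SPLITS) {
            throw new IllegalStateException(
                    "Number of splits is higher than maximum possible number of splits "
                            + MAX_NUMBER_OF_SPLITS);
        }
        return splitNumber << PER_SPLIT_ROW_ID_BITS;
    }
}
```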

Member Author

sounds really good, thanks

@homar homar force-pushed the hive-cannot-read-orc-acid-table-updated-by-trino-twice branch 2 times, most recently from 736ddc6 to 776ed3b (July 6, 2021 08:41)
@homar
Member Author

homar commented Jul 6, 2021

@losipiuk I addressed all the comments

Fixes trinodb#8268
The problem was caused by multiple rows having
the same (writeId, bucket, rowId). To fix this,
row IDs must be unique across writers. To achieve
that, each writer gets a disjoint ID range via
the split assigned to it.
@homar homar force-pushed the hive-cannot-read-orc-acid-table-updated-by-trino-twice branch from 776ed3b to 10768f1 (July 6, 2021 10:13)
Member

@losipiuk losipiuk left a comment

Thx

@losipiuk
Member

losipiuk commented Jul 6, 2021

CI: #8478

@losipiuk losipiuk merged commit fcd6b8e into trinodb:master Jul 6, 2021
@losipiuk losipiuk mentioned this pull request Jul 6, 2021
@losipiuk losipiuk added this to the 360 milestone Jul 6, 2021

long currentSplitNumber = hiveSplit.getSplitNumber();
if (currentSplitNumber >= MAX_NUMBER_OF_SPLITS) {
throw new TrinoException(GENERIC_INSUFFICIENT_RESOURCES, format("Number of splits is higher than maximum possible number of splits %d", MAX_NUMBER_OF_SPLITS));
Member

Provide actual value of currentSplitNumber in the exception message as well

Member

Not sure that is super beneficial. Most of the time it will be == MAX_NUMBER_OF_SPLITS, as we generate split numbers from a sequence and throw as soon as we cross the boundary.

4 participants