Fix Iceberg read when a batch doesn't have matching delete positions
Co-authored-by: Ying Su <[email protected]>
Naveen Kumar Mahadevuni and yingsu00 committed Jul 18, 2024
1 parent 49b57ca commit 1f44741
Showing 2 changed files with 5 additions and 1 deletion.
4 changes: 3 additions & 1 deletion velox/connectors/hive/iceberg/PositionalDeleteFileReader.cpp
@@ -231,7 +231,9 @@ void PositionalDeleteFileReader::updateDeleteBitmap(
   // The size of deleteBitmapBuffer should cover the largest position among all delete files.
   deleteBitmapBuffer->setSize(std::max(
       (uint64_t)deleteBitmapBuffer->size(),
-      deletePositionsOffset_ == 0
+      deletePositionsOffset_ == 0 ||
+          (deletePositionsOffset_ < deletePositionsVector->size() &&
+           deletePositions[deletePositionsOffset_] > rowNumberUpperBound)
           ? 0
           : bits::nbytes(
               deletePositions[deletePositionsOffset_ - 1] + 1 - offset)));
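
The guarded size computation above can be sketched as a standalone function. This is a minimal sketch mirroring the diff's logic, not Velox's actual API; `updatedBitmapSize` and the local `nbytes` are hypothetical names, and `nbytes` only approximates `velox::bits::nbytes`:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Bytes needed to store `bits` bits (mirrors what velox::bits::nbytes computes).
uint64_t nbytes(uint64_t bits) {
  return (bits + 7) / 8;
}

// Sketch of the fixed size computation. A batch contributes zero bytes when
// no delete positions have been consumed yet, or when the next pending delete
// position lies beyond the batch's last row (rowNumberUpperBound), i.e. the
// batch has no matching delete positions. Otherwise the bitmap must cover the
// last consumed position.
uint64_t updatedBitmapSize(
    uint64_t currentSize,
    const std::vector<int64_t>& deletePositions,
    size_t deletePositionsOffset,
    int64_t rowNumberUpperBound,
    int64_t offset) {
  const bool noMatchingDeletes = deletePositionsOffset == 0 ||
      (deletePositionsOffset < deletePositions.size() &&
       deletePositions[deletePositionsOffset] > rowNumberUpperBound);
  const uint64_t batchBytes = noMatchingDeletes
      ? 0
      : nbytes(deletePositions[deletePositionsOffset - 1] + 1 - offset);
  return std::max(currentSize, batchBytes);
}
```

For example, with delete positions `{5, 100}`, one position consumed, and a batch upper bound of 50, the pending position 100 falls past the batch, so the size is left unchanged; before this fix, that case would still have grown the bitmap.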
2 changes: 2 additions & 0 deletions velox/connectors/hive/iceberg/tests/IcebergReadTest.cpp
@@ -538,6 +538,8 @@ TEST_F(HiveIcebergTest, singleBaseFileMultiplePositionalDeleteFiles) {
   // Delete the first and last row in each batch (10000 rows per batch).
   assertSingleBaseFileMultipleDeleteFiles({{0}, {9999}, {10000}, {19999}});
 
+  assertSingleBaseFileMultipleDeleteFiles({{500, 21000}});
+
   assertSingleBaseFileMultipleDeleteFiles(
       {makeRandomIncreasingValues(0, 10000),
        makeRandomIncreasingValues(10000, 20000),
