[BUG] Limit size of buffer read by batched multi-source JSON lines reader to be at most INT_MAX bytes
#17058
Labels
bug
Describe the bug
Since PR #16687 added reallocate-and-retry logic for byte range reading when the initial buffer size estimate fails, the total buffer size read per batch can exceed 2GiB.
Steps/Code to reproduce bug
diff file:
Expected behavior
The size of the buffer returned by `get_record_range_raw_input` should not exceed 2GiB.
Proposed solution: stop at the last line end before the 2GiB limit and adjust the batch offsets for the remaining batches.
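A minimal sketch of the proposed approach, written against a plain in-memory buffer rather than the actual reader internals. The helper names `clamp_to_line_end` and `make_batches` are hypothetical and are not part of libcudf. The sketch also assumes that a single line longer than the limit should be reported as an error, which the issue does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper (not libcudf API): returns how many leading bytes of
// `buffer` to keep so that the kept prefix is at most `max_bytes` long and
// ends on a line boundary.
std::size_t clamp_to_line_end(std::string_view buffer, std::size_t max_bytes)
{
  if (buffer.size() <= max_bytes) { return buffer.size(); }
  // Look for the last '\n' among the first `max_bytes` bytes.
  auto const pos = buffer.substr(0, max_bytes).rfind('\n');
  if (pos == std::string_view::npos) {
    // Assumption: a line that cannot fit in one batch is treated as an error.
    throw std::runtime_error("a single line exceeds the batch size limit");
  }
  return pos + 1;  // keep the newline itself
}

// Hypothetical helper: splits `data` into (offset, size) batches, each at most
// `max_bytes` long and each ending on a line boundary (except possibly the
// last batch, if the input has no trailing newline).
std::vector<std::pair<std::size_t, std::size_t>> make_batches(std::string_view data,
                                                              std::size_t max_bytes)
{
  std::vector<std::pair<std::size_t, std::size_t>> batches;
  std::size_t offset = 0;
  while (offset < data.size()) {
    auto const kept = clamp_to_line_end(data.substr(offset), max_bytes);
    assert(kept > 0);  // every iteration makes progress
    batches.emplace_back(offset, kept);
    offset += kept;  // the next batch starts right after the last complete line
  }
  return batches;
}
```

With a 12-byte input of three 4-byte lines and a 10-byte limit, the first batch is cut back to 8 bytes (two complete lines) and the second batch starts at offset 8. In the reader itself the limit would presumably be `std::numeric_limits<int>::max()`, matching the INT_MAX bound in the title.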