Commit
79020: kv/bulk: chunk SSTs to row boundaries r=dt a=dt

Previously an sstable might end due to size at /table/i/rowX/col/Y, if some, but not all, families for rowX fit in that file. This is OK as far as KV and SQL are concerned, since after we add the next file, which will start with rowX/colZ, the row is complete from the point of view of any scan.

However it does mean that if, after adding this file, we determine that we need to split before adding the next file, that split, as it must be at a row boundary, will be at rowX, not rowX/colZ. This too is OK, but has the slight downside of meaning that when we scatter the new RHS, starting at rowX, we have to move the colY family KV we just added in the prior file. While it is typically a trivial amount of data, it does make the RHS non-empty and thus require _some_ cost to move.

This changes the size-based limit that triggers a file flush to wait for the next row boundary after the size is exceeded, so that SST bounds now also fall on row bounds, and thus on the bounds of any future range split.

This is particularly relevant in conjunction with #78218.

Release note: none.

Co-authored-by: David Taylor <[email protected]>
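
The flush rule described in this message can be illustrated with a minimal sketch. It is not the code from this change: the `KV` type, its fields, and `chunkAtRowBoundaries` are hypothetical names invented for the illustration, and real keys are encoded bytes rather than separate row and family strings. The sketch only shows the idea that once the size limit is exceeded, the file is held open until the next KV belongs to a different row.

```go
package main

// KV is a hypothetical stand-in for one column-family key/value pair.
// In this sketch a key is modelled as a row prefix plus a family suffix.
type KV struct {
	Row    string // row prefix, e.g. "rowX"
	Family string // column family, e.g. "colY"
	Size   int    // encoded size in bytes
}

// chunkAtRowBoundaries groups sorted KVs into files. A file is flushed only
// when its size has reached limit AND the next KV starts a new row, so no
// row's families are ever divided between two files.
func chunkAtRowBoundaries(kvs []KV, limit int) [][]KV {
	var files [][]KV
	var cur []KV
	size := 0
	for _, kv := range kvs {
		atRowBoundary := len(cur) > 0 && kv.Row != cur[len(cur)-1].Row
		if size >= limit && atRowBoundary {
			files = append(files, cur)
			cur, size = nil, 0
		}
		cur = append(cur, kv)
		size += kv.Size
	}
	if len(cur) > 0 {
		files = append(files, cur) // flush the final, possibly small, file
	}
	return files
}
```

With a limit of 100 bytes and 40-byte KVs for row1/col1, row1/col2, row2/col1, row2/col2 and row3/col1, the limit is crossed at row2/col1, but the first file stays open until row2/col2 has been added. The second file then starts at row3, so a split at that boundary leaves an empty RHS to scatter.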