After #2436, we can keep the number of legacy files around 100. But GC may still generate a HUGE PageFile with lots of pages as it keeps pushing forward. I have split the migrated entries into smaller batches to control the memory peak while running `DataCompactor::mergeValidPages`. However, if some pages have not been updated for a long time, we need to rewrite them into another PageFile with a higher "compact_sequence" so that they do not stop the LegacyCompactor from compacting the meta part, and doing so causes high read/write amplification during this GC.
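As a rough illustration of the batching idea, here is a minimal sketch (assumptions: `MigrateEntry`, `migrateInBatches`, and the byte budget are hypothetical names for illustration, not the actual `DataCompactor` API). Valid entries are flushed batch by batch so the buffered page data stays under a fixed limit instead of materializing the whole PageFile in memory:

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical sketch: migrate valid page entries in bounded batches so the
// buffered page data never exceeds `max_batch_bytes`.
struct MigrateEntry
{
    size_t page_id;
    size_t size; // bytes of page data for this entry
};

void migrateInBatches(
    const std::vector<MigrateEntry> & entries,
    size_t max_batch_bytes,
    const std::function<void(const std::vector<MigrateEntry> &)> & flush_batch)
{
    std::vector<MigrateEntry> batch;
    size_t batch_bytes = 0;
    for (const auto & entry : entries)
    {
        if (batch_bytes + entry.size > max_batch_bytes && !batch.empty())
        {
            flush_batch(batch); // read + write this batch, then release its memory
            batch.clear();
            batch_bytes = 0;
        }
        batch.push_back(entry);
        batch_bytes += entry.size;
    }
    if (!batch.empty())
        flush_batch(batch);
}
```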
Assume that:

- There is a PageFile (generated by `DataCompactor`, naming it `pf1_0`) whose size is 2GiB, and all pages on it belong to a WriteBatch with sequence=100. The pages on `pf1_0` have not been updated for a long time, so its valid rate is high (0.8).
- There are some PageFiles with a lower valid rate (0.2) (naming them `pf2_0`, `pf3_0`), and the highest WriteBatch sequence among them is 900.

We need to compact `pf1_0`, `pf2_0`, `pf3_0` into a new PageFile `pf3_1` with sequence=900 to push the GC forward, or it will cause problems like the one described in "Reduce the memory cost when there are stale snapshots for PageStorage" (#2199 (comment)). The key points are:

- We need to generate `pf3_1` with sequence=900 instead of sequence=100, or it will block the LegacyCompactor from compacting those legacy files.
- We can not directly rename `pf1_0/data` to `pf3_1/data`, because it may be being read by other threads.
- There are still some valid pages on `pf2_0`, `pf3_0` that we need to migrate.
Here may be a solution for reducing the read/write amplification without rewriting the whole PageStorage (a rough sketch of steps 1-3 follows the list):

1. Create `.tmp.pf3_1/data` as a hard link to `pf1_0/data`.
2. Read the entries from `pf1_0/meta`, point them to `.tmp.pf3_1/data` with sequence=900, and save them to `.tmp.pf3_1/meta`.
3. Rename `.tmp.pf3_1` to `pf3_1` (data "compaction" for high valid rate PageFiles).
4. Create another PageFile `.tmp.pf3_2` and migrate the valid pages on `pf2_0`, `pf3_0` to `.tmp.pf3_2` with sequence=900.
5. Rename `.tmp.pf3_2` to `pf3_2` (data compaction for low valid rate PageFiles).
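A minimal filesystem-level sketch of steps 1-3, assuming the directory layout from the example above. `rewriteMetaWithSequence` is a hypothetical placeholder, not an existing PageStorage function; the real code would re-encode the meta entries rather than copy the file:

```cpp
#include <cstdint>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical placeholder: the real code would decode each entry of src_meta,
// repoint it to the hard-linked data file, attach `sequence`, and encode the
// result into dst_meta. A plain copy keeps this sketch self-contained.
void rewriteMetaWithSequence(const fs::path & src_meta, const fs::path & dst_meta, uint64_t sequence)
{
    fs::copy_file(src_meta, dst_meta);
    (void)sequence;
}

void compactHighValidRatePageFile()
{
    const fs::path src = "pf1_0";
    const fs::path tmp = ".tmp.pf3_1";
    const fs::path dst = "pf3_1";

    fs::create_directory(tmp);
    // Step 1: hard link the data file; no page data is copied, both paths share one inode.
    fs::create_hard_link(src / "data", tmp / "data");
    // Step 2: rewrite the meta entries with the new sequence, pointing at the linked data file.
    rewriteMetaWithSequence(src / "meta", tmp / "meta", /*sequence=*/900);
    // Step 3: publish the new PageFile by renaming the temporary directory.
    fs::rename(tmp, dst);
}
```

Because adding a hard link and renaming a directory only touch filesystem metadata, steps 1-3 never copy the 2GiB of page data, and readers that already opened `pf1_0/data` keep reading the same inode undisturbed.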
Because `pf3_1/data` is a hard link to `pf1_0/data`, the two paths share the same inode, so the read/write amplification is greatly reduced. But we should not append the valid pages from `pf2_0`, `pf3_0` to `pf3_1`, or it may cause other problems(?) for threads that are already reading `pf1_0`.
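The shared-inode property can be verified with a small standalone POSIX check (illustration only, not PageStorage code; the paths are just the ones from the example above):

```cpp
#include <sys/stat.h>

// Returns true when the two paths refer to the same inode on the same device,
// i.e. they are hard links to the same underlying data.
bool sharesInode(const char * path_a, const char * path_b)
{
    struct stat sa, sb;
    if (stat(path_a, &sa) != 0 || stat(path_b, &sb) != 0)
        return false;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

// Usage: sharesInode("pf1_0/data", "pf3_1/data") should return true after step 3.
```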
Finally, we need to redesign the PageStorage in the near future. Splitting the "meta" part from the "data" part could greatly reduce the complexity of GC strategies. But it is not discussed in this issue.
Originally posted by @JaySon-Huang in #2382 (comment)