Reduce the memory cost when there are stale snapshots for PageStorage #2199

Closed
JaySon-Huang opened this issue Jun 17, 2021 · 3 comments

JaySon-Huang commented Jun 17, 2021

We have run into some extreme cases that block PageStorage GC from running normally.

These extreme cases/bugs leave lots of PageFiles on disk. When running GC, we open many (several hundred to thousands of) PageFiles and read all of their meta parts in PageFile::MetaMergingReader::initialize.

https://github.com/pingcap/tics/blob/ec5f976a8fb85db497d3f9f67cd0717885d8075a/dbms/src/Storages/Page/gc/LegacyCompactor.cpp#L110-L119
https://github.com/pingcap/tics/blob/ec5f976a8fb85db497d3f9f67cd0717885d8075a/dbms/src/Storages/Page/PageFile.cpp#L249-L255

Assume that the meta part of each PageFile is 500 KiB. If there are 2000 PageFiles left on disk, then each round of GC needs to scan about 1 GiB (2000 × 500 KiB ≈ 977 MiB).
Instead of reading all meta parts of (thousands of) PageFiles at once, we can allocate a smaller buffer in PageFile::MetaMergingReader::initialize, and read the rest of the data from disk while running PageFile::MetaMergingReader::moveNext.
This change will keep some file descriptors open for longer and call ::read several times per file, but it reduces the memory cost when there are lots of PageFiles.
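
To make the proposal concrete, here is a minimal sketch of a reader that holds a small fixed-size buffer and refills it with ::read on demand from moveNext. This is not the TiFlash implementation: the class name `ChunkedMetaReader`, the helper `writeRecords`, and the record layout (a `uint32` length followed by the payload) are assumptions made only for illustration, and the real PageFile meta format is different.

```cpp
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

#include <fcntl.h>
#include <unistd.h>

// Hypothetical stand-in for a meta reader; NOT the real PageFile::MetaMergingReader.
// Assumed record layout (illustration only): [uint32 length, host byte order][payload].
class ChunkedMetaReader
{
public:
    ChunkedMetaReader(const std::string & path, size_t buf_size)
        : fd(::open(path.c_str(), O_RDONLY)), buf(buf_size)
    {
        assert(buf_size > 0);
        if (fd < 0)
            throw std::runtime_error("cannot open " + path);
    }
    ~ChunkedMetaReader() { if (fd >= 0) ::close(fd); }
    ChunkedMetaReader(const ChunkedMetaReader &) = delete;
    ChunkedMetaReader & operator=(const ChunkedMetaReader &) = delete;

    // Decode the next record into `out`; returns false at a clean end of file.
    bool moveNext(std::string & out)
    {
        uint32_t len = 0;
        const size_t got = readBytes(reinterpret_cast<char *>(&len), sizeof(len));
        if (got == 0)
            return false;
        if (got != sizeof(len))
            throw std::runtime_error("truncated record header");
        out.resize(len);
        if (readBytes(&out[0], len) != len)
            throw std::runtime_error("truncated record payload");
        return true;
    }

    size_t bufferSize() const { return buf.size(); }
    size_t readCalls() const { return read_calls; }

private:
    // Copy up to n bytes into dst, refilling the small buffer as needed.
    // Returns fewer than n bytes only when the file ends.
    size_t readBytes(char * dst, size_t n)
    {
        size_t copied = 0;
        while (copied < n)
        {
            if (pos == end && !refill())
                break;
            const size_t take = std::min(n - copied, end - pos);
            std::memcpy(dst + copied, buf.data() + pos, take);
            pos += take;
            copied += take;
        }
        return copied;
    }

    // One ::read call into the fixed-size buffer; returns false at end of file.
    bool refill()
    {
        ssize_t r;
        do
            r = ::read(fd, buf.data(), buf.size());
        while (r < 0 && errno == EINTR);
        if (r < 0)
            throw std::runtime_error("read failed");
        ++read_calls;
        pos = 0;
        end = static_cast<size_t>(r);
        return r > 0;
    }

    int fd;
    std::vector<char> buf; // the only per-file buffer kept alive between moveNext calls
    size_t pos = 0;
    size_t end = 0;
    size_t read_calls = 0;
};

// Hypothetical helper that writes records in the assumed layout, for trying the reader out.
void writeRecords(const std::string & path, const std::vector<std::string> & records)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    for (const auto & rec : records)
    {
        const uint32_t len = static_cast<uint32_t>(rec.size());
        out.write(reinterpret_cast<const char *>(&len), sizeof(len));
        out.write(rec.data(), static_cast<std::streamsize>(rec.size()));
    }
}
```

With this shape, the per-file memory held between moveNext calls is bounded by the buffer size rather than by the size of the meta part. For instance, with a hypothetical 4 KiB buffer, 2000 open readers would hold about 7.8 MiB of buffers (2000 × 4 KiB) instead of roughly 977 MiB, at the cost of one open file descriptor per reader and more ::read calls.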

@JaySon-Huang

  • Related intro doc for PageStorage (not up-to-date): DeltaMerge Storage Engine in Alpha
  • See dbms/src/IO/ReadBufferFromFileDescriptor.h; it should be helpful for this task

@jiangyuzhao

@JaySon-Huang

An example of what happens if we don't resolve this issue:
[image]

@JaySon-Huang

Closing, as PageStorage V3 was released in v6.2.0 and this problem is addressed in #4989.
