util/chunk: optimize (*ListInDisk).GetChunk and add a fast row container reader
#45130
Signed-off-by: Yang Keao <[email protected]>
As tested, spawning a new goroutine to read is faster for long rows but slower for short rows, because creating a new goroutine just to read 64KB is a waste 🤦.
Both methods are still much faster than the original implementation.
func TestCloseRowContainerReader(t *testing.T) {
What does this test mean? It seems the same as the last one.
It only reads 8.5 chunks, but doesn't drain out the whole row container. It tests whether the reader can be closed successfully (and no goroutine leaks) when it doesn't reach the end.
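The property described in this reply can be illustrated with a minimal sketch. The `rowReader` type, its fields, and `newRowReader` are hypothetical names invented for this example and are not the reader added by the PR; the sketch only assumes that a background goroutine feeds rows through a channel and that `Close` must stop it even when the consumer has not drained everything.

```go
package main

import (
	"fmt"
	"sync"
)

// rowReader is a hypothetical stand-in for a reader backed by a background
// goroutine. It is not the type introduced by the PR.
type rowReader struct {
	rows chan int      // rows produced by the background goroutine
	done chan struct{} // closed by Close to ask the producer to stop
	wg   sync.WaitGroup
	once sync.Once
}

// newRowReader starts a producer that emits the rows 0..total-1.
func newRowReader(total int) *rowReader {
	r := &rowReader{rows: make(chan int, 4), done: make(chan struct{})}
	r.wg.Add(1)
	go func() {
		defer r.wg.Done()
		defer close(r.rows)
		for i := 0; i < total; i++ {
			select {
			case r.rows <- i:
			case <-r.done:
				// The consumer closed the reader early: stop producing.
				return
			}
		}
	}()
	return r
}

// Next returns the next row. ok is false once the rows channel is closed
// and empty, either because all rows were read or the reader was closed.
func (r *rowReader) Next() (row int, ok bool) {
	row, ok = <-r.rows
	return row, ok
}

// Close asks the producer to stop and blocks until it has exited, so no
// goroutine outlives the reader. Calling Close more than once is safe.
func (r *rowReader) Close() {
	r.once.Do(func() { close(r.done) })
	r.wg.Wait()
}

func main() {
	r := newRowReader(1000)
	for i := 0; i < 5; i++ {
		row, _ := r.Next()
		fmt.Println("read row", row)
	}
	// Close without draining the remaining rows.
	r.Close()
	fmt.Println("reader closed, producer goroutine has exited")
}
```

Without the `done` case in the `select`, the producer would block forever on a full channel once the consumer stops reading, which is the kind of leak a partial-read test is meant to catch.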
/retest
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: wshwsh12, xhebox
In response to a cherrypick label: new pull request created to branch |
What problem does this PR solve?
Issue Number: close #45125
Problem Summary:
The existing reading method of `RowContainer` (`GetChunk(...)`) is not fast enough for dumping a lot of rows from disk (for the `cursorFetch` use case).
The existing `Iterator4RowContainer` is even slower, as it allocates a new chunk for each row 🤦.
This PR is extracted from #44730 (with some refactoring).
What is changed and how it works?
This PR pipelines the IO and CPU calculation to make full use of the IO bandwidth. It should also help other features using `rowContainer`, as `GetChunk` is now much faster.
The existing benchmark `BenchmarkListInDisk_GetChunk` improves from `2877471 ns/op` to `462864 ns/op`.
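The general idea of overlapping IO with CPU work can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `readBlocks`, `sumUint64`, `pipelinedSum`, and `encodeValues` are hypothetical helpers, the "disk" is an in-memory `bytes.Reader`, and the "CPU calculation" is simply decoding and summing little-endian `uint64` values.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// block is one read result handed from the IO stage to the CPU stage.
type block struct {
	data []byte
	err  error
}

// readBlocks is the IO stage. It reads blocks of up to blockSize bytes from
// r in a separate goroutine and sends them on a channel buffered to depth,
// so the next read can proceed while earlier blocks are being decoded.
func readBlocks(r io.Reader, blockSize, depth int) <-chan block {
	out := make(chan block, depth)
	go func() {
		defer close(out)
		for {
			buf := make([]byte, blockSize)
			n, err := io.ReadFull(r, buf)
			if n > 0 {
				out <- block{data: buf[:n]}
			}
			if err == io.EOF || err == io.ErrUnexpectedEOF {
				return // end of input, possibly after a short final block
			}
			if err != nil {
				out <- block{err: err}
				return
			}
		}
	}()
	return out
}

// sumUint64 is the CPU stage. It decodes little-endian uint64 values from
// each block and sums them. blockSize is assumed to be a multiple of 8 so
// that no value straddles two blocks.
func sumUint64(blocks <-chan block) (uint64, error) {
	var sum uint64
	for b := range blocks {
		if b.err != nil {
			return 0, b.err
		}
		for off := 0; off+8 <= len(b.data); off += 8 {
			sum += binary.LittleEndian.Uint64(b.data[off:])
		}
	}
	return sum, nil
}

// pipelinedSum wires the two stages together over an in-memory source.
func pipelinedSum(data []byte, blockSize, depth int) (uint64, error) {
	return sumUint64(readBlocks(bytes.NewReader(data), blockSize, depth))
}

// encodeValues builds test input holding the values 1..n.
func encodeValues(n int) []byte {
	buf := make([]byte, 8*n)
	for i := 0; i < n; i++ {
		binary.LittleEndian.PutUint64(buf[8*i:], uint64(i+1))
	}
	return buf
}

func main() {
	sum, err := pipelinedSum(encodeValues(1000), 64, 4)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println("sum:", sum)
}
```

The channel depth bounds how far the IO stage can run ahead of decoding. As the review discussion notes, a dedicated goroutine only pays off when each read is large enough to amortize the handoff cost.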
Check List
Tests