-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Memory limited merge join #5632
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
assert_contains!( | ||
err.to_string(), | ||
"Resources exhausted: Failed to allocate additional" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the unit test
.map(|arr| arr.get_array_memory_size()) | ||
.sum::<usize>() | ||
+ batch.num_rows().next_power_of_two() * 8 | ||
+ 24; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+ 24; | |
+ sizeof::<Range>() | |
+ sizeof::<usize>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, it may vary. I've fixed it in separate commit, thank you!
Which issue does this PR close?
Closes #5220.
Rationale for this change
Control over memory allocation in
SortMergeJoinExec
What changes are included in this PR?
SMJStream.reservation
-- memory reservation to control buffered data size -- it may require significant amount of memory, for example, in case of multiple batches with the same join key valuespeak_memory_used
-- during join execution, normally, bothtry_grow
andshrink
methods are called -- on adding new buffered batches and on flushing them if join key has changed, respectively. In this case it seems reasonable to track only peak memory for each partition -- after that it's summed up across all partitions which is not really precise, but still valuable in terms of showing "worst case" (in fact peak memory used, probably, can be lower)SessionConfig
and perform multiple checks on returned error (used to ensure that memory allocation failed in specific operator/stream)Are these changes tested?
Are there any user-facing changes?
SortMergeJoinExec
should now respect runtime memory limitations.