-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory limited nested-loop join #5564
Conversation
Thank you @korowa -- I plan to review this tomorrow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @korowa -- I went through this PR commit by commit and I think it looks great. Breaking it into two commits made it much easier to review.
cc @liukun4515 who contributed this operator originally in #4562
/// Operator-level memory reservation for left data | ||
reservation: OperatorMemoryReservation, | ||
/// Execution metrics | ||
metrics: ExecutionPlanMetricsSet, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for unifying the metrics at the same time
); | ||
let filter = prepare_join_filter(); | ||
|
||
let join_types = vec![ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
Benchmark runs are scheduled for baseline = 26eb406 and contender = 612eb1d. 612eb1d is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Part of #5220.
Rationale for this change
Control over memory allocations in
NestedLoopJoinExec
What changes are included in this PR?
First commit modifies join operator:
NestedLoopJoinExec
(for build-side data) and toNestedLoopJoinStream
(forvisited_left_side
)NestedLoopJoinExec
now usesBuildProbeJoinMetrics
and is able to expose metrics viaexplain analyze
Second commit is an attempt to avoid nested mutexes (discussion) -- it's solved via structs for
SharedMemoryReservation
&SharedOptionalMemoryReservation
(instead ofOperatorMemoryReservation
), andTryGrow
trait, used in build-side data collection inHashJoinExec
(both types of reservations potentially could be passed intocollect_left
closure).Are these changes tested?
There are test cases for
MemoryReservation
wrappersAre there any user-facing changes?
Nested-loop joins should respect memory pool limits