Skip to content

Commit

Permalink
memory pool example (#12849)
Browse files Browse the repository at this point in the history
  • Loading branch information
2010YOUY01 authored Oct 11, 2024
1 parent 3b6aac2 commit 3bc7714
Showing 1 changed file with 25 additions and 1 deletion.
26 changes: 25 additions & 1 deletion datafusion/execution/src/memory_pool/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -68,11 +68,35 @@ pub use pool::*;
/// Note that a `MemoryPool` can be shared by concurrently executing plans,
/// which can be used to control memory usage in a multi-tenant system.
///
/// # How MemoryPool works by example
///
/// Scenario 1:
/// For `Filter` operator, `RecordBatch`es will stream through it, so it
/// don't have to keep track of memory usage through [`MemoryPool`].
///
/// Scenario 2:
/// For `CrossJoin` operator, if the input size gets larger, the intermediate
/// state will also grow. So `CrossJoin` operator will use [`MemoryPool`] to
/// limit the memory usage.
/// 2.1 `CrossJoin` operator has read a new batch, asked memory pool for
/// additional memory. Memory pool updates the usage and returns success.
/// 2.2 `CrossJoin` has read another batch, and tries to reserve more memory
/// again, memory pool does not have enough memory. Since `CrossJoin` operator
/// has not implemented spilling, it will stop execution and return an error.
///
/// Scenario 3:
/// For `Aggregate` operator, its intermediate states will also accumulate as
/// the input size gets larger, but with spilling capability. When it tries to
/// reserve more memory from the memory pool, and the memory pool has already
/// reached the memory limit, it will return an error. Then, `Aggregate`
/// operator will spill the intermediate buffers to disk, and release memory
/// from the memory pool, and continue to retry memory reservation.
///
/// # Implementing `MemoryPool`
///
/// You can implement a custom allocation policy by implementing the
/// [`MemoryPool`] trait and configuring a `SessionContext` appropriately.
/// However, mDataFusion comes with the following simple memory pool implementations that
/// However, DataFusion comes with the following simple memory pool implementations that
/// handle many common cases:
///
/// * [`UnboundedMemoryPool`]: no memory limits (the default)
Expand Down

0 comments on commit 3bc7714

Please sign in to comment.