Add options for growable memory and single state buffers #104

Merged
merged 7 commits into from
Nov 15, 2023
Changes from 4 commits
30 changes: 30 additions & 0 deletions protobuf/model_config.proto
@@ -1382,6 +1382,36 @@ message ModelSequenceBatching
//@@ The optional field to specify the initial state for the model.
//@@
repeated InitialState initial_state = 5;

//@@ .. cpp:var:: bool use_same_buffer_for_input_output
//@@
//@@ The optional field to use a single buffer for both input and output
//@@ state. Without this option, Triton uses separate buffers for input
//@@ and output state which can be problematic if the state size is
//@@ large. This option can help reduce the memory usage when
//@@ the state size is large. There is no harm in always enabling
//@@ this option if the output state will be written after
//@@ the input state is read by the framework. Note that when using this
//@@ option, `TRITONBACKEND_StateUpdate` has no effect and the state
//@@ will always be updated.
//@@ The default value is false.
//@@
bool use_same_buffer_for_input_output = 6;

//@@ .. cpp:var:: bool use_growable_memory
//@@
//@@ The optional field to allow an implicit state buffer to grow the
nnshah1 (Contributor), Nov 11, 2023:

Does this need to be a user visible option? Wondering if we could always grow memory if using CUDA memory in the implementation.

//@@ buffer when the size changes during a sequence. When using this
//@@ option, Triton guarantees that it will use the same allocations
//@@ even if the state size changes.
//@@ The added size will be appended to the end of the buffer and the
//@@ existing data will be preserved. Currently, this option only applies
//@@ to implicit state that uses CUDA memory, and
//@@ use_same_buffer_for_input_output must be enabled.
//@@
//@@ The default value is false.
//@@
bool use_growable_memory = 7;
}

//@@ .. cpp:var:: message StrategyDirect
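
For reference, a model configuration that turns on both new fields might look like the sketch below. It assumes the fields belong to the `state` entries of `sequence_batching`, since the hunk directly follows `initial_state`. The tensor names, data type, and dims are placeholders for illustration and are not taken from this PR.

```
sequence_batching {
  state [
    {
      # Placeholder tensor names; use the model's own state tensors.
      input_name: "INPUT_STATE"
      output_name: "OUTPUT_STATE"
      data_type: TYPE_INT32
      dims: [ -1 ]

      # Share one buffer between input and output state.
      use_same_buffer_for_input_output: true

      # Let the state buffer grow in place. Per the field comments, this
      # applies only to CUDA memory and requires the single-buffer option.
      use_growable_memory: true
    }
  ]
}
```

Both fields default to false, so a configuration that omits them keeps the existing behavior of separate input and output state buffers.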