Which GPU will model instances be placed on if gpus: [] is not set? #1609
1. Triton will load all models in the model repository in model control modes "poll" and "none".
Our pipeline runs many models, or multiple copies of the same model, concurrently.
When trtserver starts up, I use the default model control mode ("poll"), and our model repository contains many models.
If we use only one GPU, inference sometimes fails with OOM.
If we use two GPUs as shown below and set a different gpus: [] list for different models, everything works fine.
```
#!/bin/bash
export CUDA_VISIBLE_DEVICES=6,7
trtserver --model-store=/data/ --grpc-infer-thread-count=16 --grpc-stream-infer-thread-count=16
```
```
instance_group [
  {
    count: 8
    kind: KIND_GPU
    gpus: [ 1 ]
  }
]
```
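To make the "different gpus: [] for different models" setup concrete: with CUDA_VISIBLE_DEVICES=6,7 the server sees the two devices renumbered as 0 and 1, so the per-model config.pbtxt files might look like the sketch below (the model names model_a and model_b are hypothetical):

```
# models/model_a/config.pbtxt -- pinned to the first visible GPU
instance_group [
  {
    count: 8
    kind: KIND_GPU
    gpus: [ 0 ]
  }
]

# models/model_b/config.pbtxt -- pinned to the second visible GPU
instance_group [
  {
    count: 8
    kind: KIND_GPU
    gpus: [ 1 ]
  }
]
```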
My questions are:
1. If I don't set gpus: [], which GPU will the model instances be placed on?
2. If there is not enough GPU memory, which models will be unloaded?
3. Does the server load all models at startup (our model repository is large)?
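On the first question, as I understand the Triton documentation: when a model's config omits instance_group entirely, the default is one KIND_GPU instance on every available GPU, roughly equivalent to writing:

```
instance_group [
  {
    count: 1
    kind: KIND_GPU
    # no gpus: [] entry -- an instance is created on each visible GPU
  }
]
```

So with no gpus: [] the instances are spread across all visible GPUs rather than pinned to a single one.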