warm up issue #1810

Closed
loveppdog opened this issue Jul 20, 2020 · 1 comment

loveppdog commented Jul 20, 2020

I am testing warmup in Triton 20.03 with my "tensorflow_savedmodel" models to solve OOM issues.
I ran some tests like this (the models load successfully in all tests):

  1. Warm up model A with instance_group count 1. After Triton starts up, GPU memory usage shows 3200 MB. While one inference is running, GPU memory is still 3200 MB.
  2. Warm up model A with instance_group count 2. After Triton starts up, GPU memory usage shows only 3209 MB. (I think it should be 6000 MB+, why not? If two inferences run simultaneously, does GPU memory grow to 6000 MB+?)
  3. Warm up models A, B, C, ... F with different instance_group counts. After Triton starts up, GPU memory usage shows only 9000 MB. (During inference, I find GPU memory grows to 11000 MB and it goes OOM. Since I have warmed up, why does GPU memory still change?)

Note: our models' input/output shapes are fixed.

instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
model_warmup {
  batch_size: 1
  inputs {
    key: "input_1"
    value {
      data_type: TYPE_FP32
      dims: [ 1, 128, 128, 128 ]
      random_data: true
    }
  }
}

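For test 2 the only change from the config above was the instance count; a sketch of that stanza (everything else in the config stays the same):

instance_group [
  {
    count: 2
    kind: KIND_GPU
  }
]
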
deadeyegoodwin (Contributor) commented

When you say "trt" do you mean "Triton"? TRT is TensorRT, which is a different thing from Triton, so it is confusing.

Warming up models will not necessarily fix OOM issues. Please see #1507 for a discussion of how frameworks allocate some memory at load time and then additional memory at inference time. The TensorFlow framework dynamically allocates memory at inference time.
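
To illustrate that behavior outside of Triton, here is a minimal sketch of the plain TensorFlow 1.x session options that govern GPU memory allocation when running a SavedModel directly; the fraction value is an assumption for illustration only, not a Triton 20.03 setting:

# Sketch (assumption): plain TensorFlow 1.x GPU options, shown only to illustrate
# that TF allocates GPU memory dynamically as inference requests arrive rather
# than reserving its full working set at load/warmup time.
import tensorflow as tf

gpu_options = tf.compat.v1.GPUOptions(
    allow_growth=True,                    # grow allocations on demand instead of mapping all GPU memory up front
    per_process_gpu_memory_fraction=0.5,  # hypothetical cap: at most ~50% of GPU memory for this process
)
config = tf.compat.v1.ConfigProto(gpu_options=gpu_options)

with tf.compat.v1.Session(config=config) as sess:
    # Load and run the SavedModel here; peak memory is typically not reached
    # until real inference requests are processed, which matches the behavior
    # described above.
    pass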
