Memory Overflows when Using CodeLlama7B Model on Titan XP #587

Closed
ClarkWain opened this issue Oct 18, 2023 · 8 comments
Labels: bug (Something isn't working)

Comments

@ClarkWain

Hello,

I have been using your Tabby project and it's been very helpful. However, I've encountered an issue regarding memory management when I use the CodeLlama7B model.

Here's a detailed description of the problem:

Description

After starting the server with the CodeLlama7B model on a Titan XP, the initial memory usage is around 6-7GB. After a single code completion, memory usage increases to roughly 8GB. Following a few more code completions, the system throws an Out-Of-Memory (OOM) error.

Here is the error log:

2023-10-18T08:50:49.589409Z INFO tabby::serve: crates/tabby/src/serve/mod.rs:165: Starting server, this might takes a few minutes...
2023-10-18T08:52:40.522802Z INFO tabby::serve: crates/tabby/src/serve/mod.rs:183: Listening at 0.0.0.0:8080
terminate called after throwing an instance of 'std::runtime_error'
what(): CUDA failed with error out of memory

I've tested other models including StarCoder1B, StarCoder3B, and StarCoder7B, and they all work well.
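
For reference, the per-completion growth can be watched from the host with plain nvidia-smi polling; the /v1/completions request below follows Tabby's documented completion API, but the exact payload is an assumption for this build:

# Poll memory usage of all visible GPUs once per second
nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv -l 1

# In another shell, trigger completions directly against the server to reproduce the growth
curl -X POST http://localhost:8080/v1/completions \
  -H 'Content-Type: application/json' \
  -d '{"language": "python", "segments": {"prefix": "def fib(n):\n    "}}'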

ClarkWain added the bug label Oct 18, 2023
@ClarkWain
Author

Here is the command:

docker run -it \
  --gpus all -p 8080:8080 -v /data1/docker_main:/data \
  tabbyml/tabby \
  serve --model TabbyML/CodeLlama-7B --device cuda
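
If the box has more than one GPU, the same command can also be pinned to a single card with Docker's device syntax, which keeps the per-card memory numbers easier to read (device index 0 here is just an example):

docker run -it \
  --gpus '"device=0"' -p 8080:8080 -v /data1/docker_main:/data \
  tabbyml/tabby \
  serve --model TabbyML/CodeLlama-7B --device cuda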

@wsxiaoys
Member

Thank you for reporting this. It seems to be a recurring issue, as reported in #541 (comment).

I'm investigating to identify the culprit between versions 0.2.0 and 0.3.0.

@wsxiaoys
Member

Would you mind sharing the output of your health endpoint? You can acquire it using the following command:

curl -X POST http://localhost:8080/v1/health

@ClarkWain
Author

> Would you mind sharing the output of your health endpoint? You can acquire it using the following command:
>
> curl -X POST http://localhost:8080/v1/health

{"model":"TabbyML/CodeLlama-7B","device":"cuda","compute_type":"auto","arch":"x86_64","cpu_info":"Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz","cpu_count":32,"cuda_devices":["NVIDIA TITAN Xp","NVIDIA TITAN Xp","NVIDIA TITAN Xp","NVIDIA TITAN Xp"],"version":{"build_date":"2023-10-10","build_timestamp":"2023-10-10T02:46:12.584424280Z","git_sha":"3580d6f5510060714266d3031ee352d43826d56d","git_describe":"v0.2.2"}}

@wsxiaoys
Member

Since the issue mentioned in #541 (comment) states that the out-of-memory (OOM) problem is resolved with version 0.3.0, would you mind trying out version 0.3.0 to check if the OOM issue persists?
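
For reference, that only means pinning the image tag in the original command; a minimal sketch, assuming the Docker Hub tag for that release is 0.3.0:

docker run -it \
  --gpus all -p 8080:8080 -v /data1/docker_main:/data \
  tabbyml/tabby:0.3.0 \
  serve --model TabbyML/CodeLlama-7B --device cuda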

@ClarkWain
Author

> Since the issue mentioned in #541 (comment) states that the out-of-memory (OOM) problem is resolved with version 0.3.0, would you mind trying out version 0.3.0 to check if the OOM issue persists?

OK, I will try.

@ClarkWain
Author

I have tried it. The memory usage of CodeLlama7B stays within the range of 7GB to 8GB, and there is no OOM anymore. Thanks
