Memory Overflows when Using CodeLlama7B Model on Titan XP #587
Here is the command:
Thank you for reporting this. It seems to be a recurring issue, as reported in #541 (comment). I'm investigating to identify the culprit between versions 0.2.0 and 0.3.0.
Would you mind sharing the output of your health endpoint? You can acquire it using the following command:
{"model":"TabbyML/CodeLlama-7B","device":"cuda","compute_type":"auto","arch":"x86_64","cpu_info":"Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz","cpu_count":32,"cuda_devices":["NVIDIA TITAN Xp","NVIDIA TITAN Xp","NVIDIA TITAN Xp","NVIDIA TITAN Xp"],"version":{"build_date":"2023-10-10","build_timestamp":"2023-10-10T02:46:12.584424280Z","git_sha":"3580d6f5510060714266d3031ee352d43826d56d","git_describe":"v0.2.2"}}
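For quick triage, the fields in that health payload that matter here can be pulled out with a short script. This is a minimal sketch: `summarize_health` is a hypothetical helper, and the JSON below is an abbreviated copy of the payload above, keeping only the fields the sketch reads.

```python
import json

# Abbreviated copy of the health payload reported above (assumption:
# only the fields relevant to this OOM report are kept).
HEALTH_JSON = (
    '{"model":"TabbyML/CodeLlama-7B","device":"cuda",'
    '"cuda_devices":["NVIDIA TITAN Xp","NVIDIA TITAN Xp",'
    '"NVIDIA TITAN Xp","NVIDIA TITAN Xp"],'
    '"version":{"git_describe":"v0.2.2"}}'
)

def summarize_health(payload: str) -> dict:
    """Extract the model, device, GPU count, and version from a health payload."""
    data = json.loads(payload)
    return {
        "model": data["model"],
        "device": data["device"],
        "gpu_count": len(data["cuda_devices"]),
        "version": data["version"]["git_describe"],
    }

summary = summarize_health(HEALTH_JSON)
# summary["version"] is "v0.2.2", i.e. the server is still running a
# release older than 0.3.0.
```

The `git_describe` value of `v0.2.2` is the key data point: the server predates the 0.3.0 release where the OOM was reportedly fixed.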
Since the issue mentioned in #541 (comment) states that the out-of-memory (OOM) problem is resolved with version 0.3.0, would you mind trying out version 0.3.0 to check if the OOM issue persists?
OK, I will try.
I have tried it; the memory usage of CodeLlama-7B stays within 7GB to 8GB, and there is no OOM anymore. Thanks!
Hello,
I have been using your Tabby project and it's been very helpful. However, I've encountered an issue regarding memory management when I use the CodeLlama7B model.
Here's a detailed description of the problem:
Description
After starting the server with CodeLlama7B model on a Titan XP, the initial memory usage is around 6-7GB. After a single code completion, the memory usage increases to roughly 8GB. Following a few more code completions, the system throws an Out-Of-Memory (OOM) error.
Here is the error log:
I've tested other models including StarCoder1B, StarCoder3B, and StarCoder7B, and they all work well.