Request: Deepseek Coder V2 model #2451
Comments
Thanks for the feature request.
Could you turn on debug logging and share the output?
Sorry, but it cannot log anything: I added the environment variables and the log is completely empty. It just hangs in the running status and nothing works. It utilizes 98.42% CPU in Docker Desktop, and Task Manager's hardware monitor looks like this:
It's very likely the hang is caused by model loading / computation, though. To pass an environment flag to docker, you need to do something like the command below.
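(A minimal sketch; RUST_LOG=debug is an assumption about which variable gives more verbose tabby logs, and the image tag and model name are placeholders taken from elsewhere in this thread.)

```bash
# Pass an environment variable into the container with -e.
# RUST_LOG=debug is an assumed setting, not a confirmed tabby flag.
docker run -it --gpus all \
  -e RUST_LOG=debug \
  -p 8080:8080 \
  -v $HOME/.tabby:/data \
  ghcr.io/tabbyml/tabby:0.13.0-rc.3 \
  serve --model StarCoder-1B --device cuda   # substitute the model you are testing
```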
I guess it's because of the version of llama.cpp tabby is using; the model was quantized with this llama.cpp release: https://github.com/ggerganov/llama.cpp/releases/tag/b3166. Logs:
Sorry for the external links, but I found some people stuck with this problem on a Russian forum called Habr. I'm attaching the link with the text of the error they hit (did I receive the same message? I don't know).
Right - this means support for DeepseekCoder v2 was only added to llama.cpp very recently; we will try to include it in the upcoming 0.13 release.
Just for added context, ollama just started supporting DeepSeek Coder V2, see https://github.com/ollama/ollama/releases/tag/v0.1.45. I was wondering the same about tabby. Thanks again, looking forward to release 0.13.
For context - you can actually connect tabby to ollama by using a config.toml-based model configuration: https://tabby.tabbyml.com/docs/administration/model/#ollama
Can you give an example showing how to set up a tabby server with this kind of model configuration via config.toml?
Here is a simple example I'm currently using to run tabby against ollama.

config.toml:

```toml
[model.completion.http]
kind = "ollama/completion"
model_name = "deepseek-coder"
api_endpoint = "http://ollama:11434" # Insert your URL here
prompt_template = "<|fim▁begin|>{prefix}<|fim▁hole|>{suffix}<|fim▁end|>"

[model.chat.http]
kind = "ollama/chat"
model_name = "deepseek-coder-v2"
api_endpoint = "http://ollama:11434" # Insert your URL here
```

docker-compose.yml:

```yaml
version: '3.5'
services:
  tabby:
    restart: always
    image: ghcr.io/tabbyml/tabby:0.13.0-rc.3
    command: serve
    volumes:
      - "./tabby:/data"
      - "./config.toml:/data/config.toml"
    ports:
      - 8080:8080
```

Basically, to use the config.toml you just mount it into the container's /data directory, as the volumes section above does.
Thanks for contributing such an example, @LLukas22. Right - since ollama is a backend that can serve multiple models concurrently, a single ollama instance can back both the completion and chat configurations here. If interested, consider making an edit at https://github.com/TabbyML/tabby/edit/main/website/docs/administration/model.md to contribute a PR, thank you!
Hi @LLukas22, I noticed you're specifying "prompt_template" for Ollama. As far as I know, Ollama expects pure prompt text; it maintains its own templates in its modelfiles. Is Tabby ignoring "prompt_template" for Ollama? Otherwise, if Tabby is formatting its prompts using "prompt_template" and passing that to Ollama, the results won't be correct. Edit: Oh, the state of prompt templates is still a total mess! Ollama doesn't support FIM in prompt templates yet; see unit-mesh/auto-dev-vscode#61 and ollama/ollama#5207. It looks like CodeGPT is trying to make some Ollama changes (carlrobertoh/ProxyAI#510), but they realized llama.cpp can't get it right either (carlrobertoh/ProxyAI#510 (comment)). What a mess! I guess defining "prompt_template" is the only reliable way to implement FIM with Ollama and llama.cpp?
As far as I can tell, tabby applies the `prompt_template` itself: to perform the fill-in-the-middle (FIM) task, it formats the request as a FIM task by applying the configured `prompt_template`. This basically results in a combined prompt of the form `<|fim▁begin|>{prefix}<|fim▁hole|>{suffix}<|fim▁end|>`, where the prefix and suffix are inserted by tabby. But I could be wrong here, since tabby is using the completion endpoint of ollama instead of the chat endpoint.
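To make that concrete, here is a hypothetical example of what such a combined FIM prompt could look like; the small Python snippet is purely illustrative and not taken from an actual request:

```
<|fim▁begin|>def add(a, b):
    return <|fim▁hole|>

print(add(1, 2))<|fim▁end|>
```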
@LLukas22 Thanks for the response. I dug in deeper and figured out a few things (about the Ollama Modelfile):
The newlines matter.
This allows Tabby's prompt to pass through without anything being added to it. System context is not supported (I read that on DeepSeek's GitHub).
You must leave off the begin▁of▁sentence token.
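A minimal sketch of an Ollama Modelfile along those lines, assuming a plain passthrough TEMPLATE; the base model tag below is an assumption:

```
# Sketch: an Ollama Modelfile that forwards Tabby's prompt verbatim.
# The FROM tag is an assumption - use whichever deepseek-coder-v2 tag you pulled.
FROM deepseek-coder-v2

# No SYSTEM block and no chat wrapping: the raw FIM prompt from Tabby,
# including its newlines, reaches the model unchanged.
TEMPLATE """{{ .Prompt }}"""
```

You would then build it with something like `ollama create deepseek-coder-v2-raw -f Modelfile` (the name is arbitrary) and point `model_name` in config.toml at that new tag.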
@LLukas22 The debug log from ollama says that everything is OK with that. I used a different model for now.
However, stop words are a common problem with ollama; starcoder2 has the same issue, for example, and creating a Modelfile is required (a sketch follows below).
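For illustration, stop sequences can also be declared in the Modelfile; the base tag and the specific stop tokens below are assumptions based on StarCoder2's published special tokens:

```
# Sketch: declaring stop words for starcoder2 in an Ollama Modelfile.
# The stop tokens are assumptions taken from StarCoder2's special-token set.
FROM starcoder2:3b

PARAMETER stop "<file_sep>"
PARAMETER stop "<|endoftext|>"
```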
Supported since https://github.com/TabbyML/tabby/releases/tag/v0.13.1 (though we haven't added it to the official registry yet).
I would like to use bartowski/DeepSeek-Coder-V2-Lite-Instruct-GGUF, but I have a problem launching it properly with TabbyML.
I forked a registry and the download starts correctly, but after the download finishes, tabby is not responding. I tried to wait, but it doesn't work and the logs are completely empty.
Additional context
Registry: https://github.com/kba-tmn3/registry-tabby
Command line
Source: https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
GGUF: https://huggingface.co/bartowski/DeepSeek-Coder-V2-Lite-Instruct-GGUF
Accuracy of the model
Please reply with a 👍 if you want this feature.