Support openai compatible APIs #106
This almost works already - the code you linked to here isn't the code that talks to the language model to run prompts. The prompt code talks to OpenAI directly like this:

llm/llm/default_plugins/openai_models.py, lines 173 to 194 in 3f1388a

Since it's using the openai Python library, pointing it at a different endpoint should already be possible by setting the OPENAI_API_BASE environment variable.

But... that's not going to get you all of the way there, because like you pointed out you need to be able to specify a different model name. The current official way of solving that is to write a plugin, as detailed here: https://llm.datasette.io/en/stable/plugins/tutorial-model-plugin.html

So a plugin would work, but it would be nice if you could use the existing OpenAI plugin to access other OpenAI-compatible models. The challenge is how best to design that feature. One option would be to use the existing options mechanism:

llm -m chatgpt "Say hello" -o api_base "http://localhost:8080/" -o custom_model "name-of-model"

But I don't like that, because it would result in all of those other models being logged in the same place as the real OpenAI models. Really we want to be able to define new models - I'll have a think about ways that might work.
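For illustration, here is a minimal sketch of what such a plugin could look like, following the pattern in the linked tutorial. The class name, model_id, endpoint URL and the use of the pre-1.0 openai client are assumptions for this example, not anything that ships with llm:

import llm
import openai  # pre-1.0 openai client, which exposes openai.api_base


class LocalAIChat(llm.Model):
    # Hypothetical alias under which the LocalAI-hosted model would appear in `llm models`
    model_id = "localai-orca-mini-3b"

    def execute(self, prompt, stream, response, conversation):
        # Redirect the openai client to an OpenAI-compatible server (assumed here to be
        # LocalAI on localhost:8080) and pass the model name that server expects.
        openai.api_base = "http://localhost:8080/v1"
        openai.api_key = "not-needed"  # assumption: LocalAI ignores the key
        completion = openai.ChatCompletion.create(
            model="orca-mini-3b.ggmlv3",
            messages=[{"role": "user", "content": prompt.prompt}],
        )
        yield completion["choices"][0]["message"]["content"]


@llm.hookimpl
def register_models(register):
    register(LocalAIChat())

Packaging and installing it would still follow the steps in the tutorial; this only shows the model definition itself.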
I got LocalAI working on my Mac:

git clone https://github.com/go-skynet/LocalAI
cd LocalAI
cp ~/.cache/gpt4all/orca-mini-3b.ggmlv3.q4_0.bin models/orca-mini-3b.ggmlv3
cp prompt-templates/alpaca.tmpl models/orca-mini-3b.ggmlv3.tmpl
docker-compose up -d --pull always

At this point it didn't seem to work. It turned out it has a LOT of things it needs to do on first launch before the web server becomes ready - running GCC a bunch of times etc.

I ran this to find its container ID:

docker ps

Then repeatedly ran this to see how far it had got:

docker logs c0041c248973

Eventually it started properly and this worked:

curl http://localhost:8080/v1/models | jq

{
  "object": "list",
  "data": [
    {
      "id": "ggml-gpt4all-j",
      "object": "model"
    },
    {
      "id": "orca-mini-3b.ggmlv3",
      "object": "model"
    }
  ]
}

And this:

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "orca-mini-3b.ggmlv3",
  "messages": [{"role": "user", "content": "Say this is a test!"}],
  "temperature": 0.7
}' | jq

{
  "object": "chat.completion",
  "model": "orca-mini-3b.ggmlv3",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": " No, this is not a test!"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
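Since llm's OpenAI plugin goes through the openai Python library, the same chat completion can be reproduced from Python by pointing that library (pre-1.0 versions) at the LocalAI server. A small sketch, assuming the container from the steps above is still listening on localhost:8080:

import openai

# Assumptions: the LocalAI server set up above is running on localhost:8080 and
# ignores the API key; the pre-1.0 openai client honours api_base overrides.
openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sk-not-needed"

result = openai.ChatCompletion.create(
    model="orca-mini-3b.ggmlv3",
    messages=[{"role": "user", "content": "Say this is a test!"}],
    temperature=0.7,
)
print(result["choices"][0]["message"]["content"])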
Projects such as LocalAI offer an OpenAI-compatible web API: https://github.com/go-skynet/LocalAI

Maybe the hardcoded API endpoint could be parameterized using a new environment variable?

llm/llm/default_plugins/openai_models.py, line 36 in 3f1388a

For example, see this post about using the app ChatWizard as a front-end to LocalAI-hosted models: https://www.reddit.com/r/LocalLLaMA/comments/14w2767/recommendation_an_ingenious_frontend_localai/
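As a rough illustration of the suggestion (not the actual contents of openai_models.py), the override could be as small as reading an environment variable before any request is made; the variable name here is only an example:

import os
import openai


def apply_api_base_override():
    # Hypothetical: if LLM_OPENAI_API_BASE is set, send requests to that
    # OpenAI-compatible endpoint (e.g. http://localhost:8080/v1 for LocalAI)
    # instead of the default api.openai.com.
    api_base = os.environ.get("LLM_OPENAI_API_BASE")
    if api_base:
        openai.api_base = api_base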