Support openai compatible APIs #106

Closed
tmm1 opened this issue Jul 14, 2023 · 4 comments
Labels
enhancement New feature or request

tmm1 commented Jul 14, 2023

Projects such as LocalAI offer an OpenAI-compatible web API:

https://github.com/go-skynet/LocalAI

Maybe the hardcoded API endpoint could be parameterized using a new environment variable?

"https://api.openai.com/v1/models",
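A minimal sketch of that parameterization, assuming the override is read from an `OPENAI_API_BASE` environment variable. The helper `models_url` and the constant `DEFAULT_API_BASE` are hypothetical names for illustration and are not taken from LLM's code:

```python
import os

# Assumed default; matches the base of the hardcoded URL quoted above.
DEFAULT_API_BASE = "https://api.openai.com/v1"


def models_url(environ=None):
    """Build the models endpoint URL from an optional base-URL override.

    `environ` defaults to os.environ; passing a dict makes this easy to test.
    """
    environ = os.environ if environ is None else environ
    base = environ.get("OPENAI_API_BASE") or DEFAULT_API_BASE
    # Strip trailing slashes so the join never produces "//models".
    return base.rstrip("/") + "/models"
```

With no override set this returns the current hardcoded URL, so existing behavior would be unchanged.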

for example, see this post about using the app ChatWizard as a front-end to LocalAI hosted models: https://www.reddit.com/r/LocalLLaMA/comments/14w2767/recommendation_an_ingenious_frontend_localai/

Owner

simonw commented Jul 14, 2023

This almost works already. The code you linked to isn't the code that talks to the language model to run prompts; it's just the code that powers the llm openai models command.

LLM talks to OpenAI directly like this:

if stream:
    completion = openai.ChatCompletion.create(
        model=prompt.model.model_id,
        messages=messages,
        stream=True,
        **not_nulls(prompt.options),
    )
    chunks = []
    for chunk in completion:
        chunks.append(chunk)
        content = chunk["choices"][0].get("delta", {}).get("content")
        if content is not None:
            yield content
    response.response_json = combine_chunks(chunks)
else:
    completion = openai.ChatCompletion.create(
        model=prompt.model.model_id,
        messages=messages,
        stream=False,
    )
    response.response_json = completion.to_dict_recursive()
    yield completion.choices[0].message.content
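To make the streaming branch concrete, here is a self-contained sketch of the same pattern run against hand-written chunks instead of a live API. `stream_content` and `join_content` are hypothetical helpers; `join_content` is only a simplified stand-in for `combine_chunks`, whose real implementation is not shown in this thread.

```python
def stream_content(chunks, seen):
    """Yield text fragments from chat-completion stream chunks.

    Every chunk is recorded in `seen` so the full response can be rebuilt later.
    """
    for chunk in chunks:
        seen.append(chunk)
        content = chunk["choices"][0].get("delta", {}).get("content")
        # Chunks that carry only a role or a finish_reason have no content.
        if content is not None:
            yield content


def join_content(chunks):
    """Concatenate delta content (a simplified stand-in for combine_chunks)."""
    return "".join(
        chunk["choices"][0].get("delta", {}).get("content") or ""
        for chunk in chunks
    )


# Hand-written sample chunks, shaped like the deltas the loop above reads.
sample = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": " world"}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]
seen = []
fragments = list(stream_content(sample, seen))
```

Only the two chunks with a `content` delta produce output; the role-only and final chunks are kept in `seen` but yield nothing.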

Since it's using the openai library's ChatCompletion class directly, you should be able to point it at other endpoint URLs by setting an environment variable:

export OPENAI_API_BASE='http://localhost:8080/'

But... that's not going to get you all of the way there, because like you pointed out you need to be able to specify a different model name.

The current official way of solving that is to write a plugin, as detailed here: https://llm.datasette.io/en/stable/plugins/tutorial-model-plugin.html

So a llm-localai plugin could be one way forward here.

But it would be nice if you could use the existing OpenAI plugin to access other OpenAI-compatible models.

The challenge is how best to design that feature. One option would be to use the existing options mechanism:

llm -m chatgpt "Say hello" -o api_base "http://localhost:8080/" -o custom_model "name-of-model"

But I don't like that, because it would result in all of those other models being logged in the same place as gpt-3.5-turbo completions.

Really we want to be able to define new models - llm -m NAME - which under the hood use the existing OpenAI plugin code but with those extra settings.
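One way to picture that idea is a small registry that maps each `llm -m NAME` to the settings the shared OpenAI code would need. This is only a sketch under assumptions: the class `OpenAICompatibleModel`, its field names, the `REGISTRY` dict, and the `orca-mini` entry are all hypothetical, and it assumes the completion call accepts a per-call `api_base` keyword.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OpenAICompatibleModel:
    """Hypothetical description of one model served by an OpenAI-style API."""

    model_id: str  # the NAME used with `llm -m NAME`
    model_name: str  # the name sent to the API in the "model" field
    api_base: Optional[str] = None  # None means the default OpenAI endpoint

    def request_kwargs(self, messages):
        """Keyword arguments for a chat completion call (assumed signature)."""
        kwargs = {"model": self.model_name, "messages": messages}
        if self.api_base is not None:
            kwargs["api_base"] = self.api_base
        return kwargs


# Example entries only; the second one is made up for illustration.
REGISTRY = {
    model.model_id: model
    for model in [
        OpenAICompatibleModel("chatgpt", "gpt-3.5-turbo"),
        OpenAICompatibleModel(
            "orca-mini", "name-of-model", "http://localhost:8080/"
        ),
    ]
}
```

Because each entry has its own `model_id`, completions from a local model could be logged under that name rather than alongside gpt-3.5-turbo.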

I'll have a think about ways that might work.

Owner

simonw commented Jul 14, 2023

I got LocalAI working on my Mac:

git clone https://github.com/go-skynet/LocalAI
cd LocalAI
cp ~/.cache/gpt4all/orca-mini-3b.ggmlv3.q4_0.bin models/orca-mini-3b.ggmlv3
cp prompt-templates/alpaca.tmpl models/orca-mini-3b.ggmlv3.tmpl
docker-compose up -d --pull always

At this point it didn't seem to work. It turned out it has a LOT of things it needs to do on first launch before the web server becomes ready - running GCC a bunch of times etc.

I ran this to find its container ID:

docker ps

Then repeatedly ran this to see how far it had got:

docker logs c0041c248973
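Instead of re-running `docker logs` by hand, the wait could be scripted by polling the server until it answers. A rough sketch using only the standard library; `server_ready` and `wait_until_ready` are hypothetical helpers, and the URL, attempt count, and delay are assumptions to adjust:

```python
import time
import urllib.error
import urllib.request


def server_ready(url="http://localhost:8080/v1/models", timeout=2.0):
    """Return True once the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, reset, or timed out: still starting up.
        return False


def wait_until_ready(check=server_ready, attempts=60, delay=5.0, sleep=time.sleep):
    """Call `check` until it returns True, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

`check` and `sleep` are parameters so the loop can be exercised without a running container.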

Eventually it started properly and this worked:

curl http://localhost:8080/v1/models | jq
{
  "object": "list",
  "data": [
    {
      "id": "ggml-gpt4all-j",
      "object": "model"
    },
    {
      "id": "orca-mini-3b.ggmlv3",
      "object": "model"
    }
  ]
}
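The model names a client would need can be pulled out of that response with a few lines of Python. A small sketch; `model_ids` is a hypothetical helper and `body` is the response shown above pasted in as a string:

```python
import json

# The JSON body returned by GET /v1/models, copied from the output above.
body = """
{
  "object": "list",
  "data": [
    {"id": "ggml-gpt4all-j", "object": "model"},
    {"id": "orca-mini-3b.ggmlv3", "object": "model"}
  ]
}
"""


def model_ids(payload):
    """Return the id of every entry in a /v1/models style listing."""
    return [
        entry["id"]
        for entry in payload.get("data", [])
        if entry.get("object") == "model"
    ]


ids = model_ids(json.loads(body))
```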

And this:

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
     "model": "orca-mini-3b.ggmlv3",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }' | jq
{
  "object": "chat.completion",
  "model": "orca-mini-3b.ggmlv3",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": " No, this is not a test!"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
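The same call can be expressed in Python without any third-party packages. This is a sketch, not how LLM itself talks to the server: `build_chat_request` and `first_message` are hypothetical helpers, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_chat_request(api_base, model, prompt, temperature=0.7):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        api_base.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_message(response_json):
    """Return the assistant text from a chat.completion response."""
    return response_json["choices"][0]["message"]["content"]


request = build_chat_request(
    "http://localhost:8080", "orca-mini-3b.ggmlv3", "Say this is a test!"
)
# To send it against a running server:
#     with urllib.request.urlopen(request) as resp:
#         print(first_message(json.load(resp)))
```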

@simonw simonw closed this as completed in e2072f7 Jul 15, 2023
Owner

simonw commented Jul 15, 2023

Documentation: https://llm.datasette.io/en/latest/other-models.html#openai-compatible-models

@simonw simonw added the enhancement New feature or request label Jul 15, 2023
@simonw simonw added this to the 0.6 milestone Jul 15, 2023
simonw added a commit that referenced this issue Jul 18, 2023