Support openai compatible APIs #106
This almost works already - the code you linked to here isn't the code that talks to the language model to run prompts. The prompt code talks to OpenAI directly like this:

llm/llm/default_plugins/openai_models.py, lines 173 to 194 in 3f1388a

Since it's using the openai Python library, pointing it at a different endpoint should already be possible by setting the OPENAI_API_BASE environment variable.

But... that's not going to get you all of the way there, because like you pointed out you need to be able to specify a different model name. The current official way of solving that is to write a plugin, as detailed here: https://llm.datasette.io/en/stable/plugins/tutorial-model-plugin.html

So a plugin would work, but it would be nice if you could use the existing OpenAI plugin to access other OpenAI-compatible models. The challenge is how best to design that feature. One option would be to use the existing options mechanism:

llm -m chatgpt "Say hello" -o api_base "http://localhost:8080/" -o custom_model "name-of-model"

But I don't like that, because it would result in all of those other models being logged in the same place as the real OpenAI models. Really we want to be able to define new models - I'll have a think about ways that might work.
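For illustration, here is a minimal sketch of what such a plugin could look like, following the pattern in the linked tutorial. The class name, model_id, endpoint URL and the use of the pre-1.0 openai client are assumptions for this example, not anything that ships with llm:

import llm
import openai  # pre-1.0 openai client, which exposes openai.api_base


class LocalAIChat(llm.Model):
    # Hypothetical alias under which the LocalAI-hosted model would appear in `llm models`
    model_id = "localai-orca-mini-3b"

    def execute(self, prompt, stream, response, conversation):
        # Redirect the openai client to an OpenAI-compatible server (assumed here to be
        # LocalAI on localhost:8080) and pass the model name that server expects.
        openai.api_base = "http://localhost:8080/v1"
        openai.api_key = "not-needed"  # assumption: LocalAI ignores the key
        completion = openai.ChatCompletion.create(
            model="orca-mini-3b.ggmlv3",
            messages=[{"role": "user", "content": prompt.prompt}],
        )
        yield completion["choices"][0]["message"]["content"]


@llm.hookimpl
def register_models(register):
    register(LocalAIChat())

Packaging and installing it would still follow the steps in the tutorial; this only shows the model definition itself.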
I got LocalAI working on my Mac:

git clone https://github.com/go-skynet/LocalAI
cd LocalAI
cp ~/.cache/gpt4all/orca-mini-3b.ggmlv3.q4_0.bin models/orca-mini-3b.ggmlv3
cp prompt-templates/alpaca.tmpl models/orca-mini-3b.ggmlv3.tmpl
docker-compose up -d --pull always

At this point it didn't seem to work. It turned out it has a LOT of things it needs to do on first launch before the web server becomes ready - running GCC a bunch of times etc.

I ran this to find its container ID:

docker ps

Then repeatedly ran this to see how far it had got:

docker logs c0041c248973

Eventually it started properly and this worked:

curl http://localhost:8080/v1/models | jq

{
  "object": "list",
  "data": [
    {
      "id": "ggml-gpt4all-j",
      "object": "model"
    },
    {
      "id": "orca-mini-3b.ggmlv3",
      "object": "model"
    }
  ]
}

And this:

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "orca-mini-3b.ggmlv3",
  "messages": [{"role": "user", "content": "Say this is a test!"}],
  "temperature": 0.7
}' | jq

{
  "object": "chat.completion",
  "model": "orca-mini-3b.ggmlv3",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": " No, this is not a test!"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
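Since llm's OpenAI plugin goes through the openai Python library, the same chat completion can be reproduced from Python by pointing that library (pre-1.0 versions) at the LocalAI server. A small sketch, assuming the container from the steps above is still listening on localhost:8080:

import openai

# Assumptions: the LocalAI server set up above is running on localhost:8080 and
# ignores the API key; the pre-1.0 openai client honours api_base overrides.
openai.api_base = "http://localhost:8080/v1"
openai.api_key = "sk-not-needed"

result = openai.ChatCompletion.create(
    model="orca-mini-3b.ggmlv3",
    messages=[{"role": "user", "content": "Say this is a test!"}],
    temperature=0.7,
)
print(result["choices"][0]["message"]["content"])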
Projects such as LocalAI offer an OpenAI-compatible web API: https://github.com/go-skynet/LocalAI

Maybe the hardcoded API endpoint could be parameterized using a new environment variable?

llm/llm/default_plugins/openai_models.py, line 36 in 3f1388a

For example, see this post about using the app ChatWizard as a front-end to LocalAI-hosted models: https://www.reddit.com/r/LocalLLaMA/comments/14w2767/recommendation_an_ingenious_frontend_localai/
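As a rough illustration of the suggestion (not the actual contents of openai_models.py), the override could be as small as reading an environment variable before any request is made; the variable name here is only an example:

import os
import openai


def apply_api_base_override():
    # Hypothetical: if LLM_OPENAI_API_BASE is set, send requests to that
    # OpenAI-compatible endpoint (e.g. http://localhost:8080/v1 for LocalAI)
    # instead of the default api.openai.com.
    api_base = os.environ.get("LLM_OPENAI_API_BASE")
    if api_base:
        openai.api_base = api_base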