
migrate to using completions endpoint by default #628

Merged: 2 commits merged into main, Dec 15, 2023
Conversation

@cpacker (Collaborator) commented Dec 15, 2023

Closes #595


Please describe the purpose of this pull request

Use the completions endpoint for LM Studio by default, instead of monkeypatching the chat/completions endpoint.

  • Add an lmstudio-legacy backend type for the old behavior; make the default lmstudio backend use the completions endpoint
  • Add a note to the docs that users on version <= 0.2.8 should use lmstudio-legacy
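For context, the two request shapes differ roughly as sketched below. This is an illustrative sketch, not MemGPT's actual client code: the payload keys follow the OpenAI-compatible API that LM Studio's local server exposes, the ChatML template mirrors the input_prefix/input_suffix visible in the trace further down, and the function names are hypothetical.

```python
def build_completions_payload(system_prompt: str, user_message: str) -> dict:
    """New default path: a raw completions request. The client renders the
    chat template itself (ChatML here, matching the trace below) into a
    single prompt string, keeping full control over formatting."""
    prompt = (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )
    return {"prompt": prompt, "stream": False}


def build_chat_completions_payload(system_prompt: str, user_message: str) -> dict:
    """Legacy path (lmstudio-legacy): a chat/completions request. The server
    applies its own chat template to the messages list."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "stream": False,
    }
```

The practical difference is who owns the prompt template: with the raw completions endpoint the client does, which avoids patching server-side chat formatting.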

How can we test your PR during review?

  • On a fresh configure, select lmstudio and verify requests are piped to the completions endpoint
  • On a fresh configure, select lmstudio-legacy and verify requests are piped to the chat/completions endpoint

Have you tested this PR?

Yes; see the testing steps above and the traces below.


LM Studio trace on completions endpoint:

(tail of the prompt, unescaped; log entry truncated at the start)
  "message": "More human than human is our motto."
  }
}
FUNCTION RETURN: {"status": "OK", "message": null, "time": "2023-12-15 12:14:09 PM "}
USER: {"type": "login", "last_login": "Never (first login)", "time": "2023-12-15 12:14:09 PM "}
### RESPONSE
ASSISTANT:
{
  "function":
}
[2023-12-15 12:14:09.810] [INFO] Provided inference configuration: {
  "n_threads": 4,
  "n_predict": 8192,
  "top_k": 40,
  "top_p": 0.95,
  "temp": 0.8,
  "repeat_penalty": 1.1,
  "input_prefix": "<|im_end|>\n<|im_start|>user\n",
  "input_suffix": "<|im_end|>\n<|im_start|>assistant\n",
  "antiprompt": [
    "<|im_start|>",
    "<|im_end|>",
    "\nUSER:",
    "\nASSISTANT:",
    "\nFUNCTION RETURN:",
    "\nUSER",
    "\nASSISTANT",
    "\nFUNCTION RETURN",
    "\nFUNCTION",
    "\nFUNC",
    "<|im_sep|>"
  ],
  "pre_prompt": "",
  "seed": -1,
  "tfs_z": 1,
  "typical_p": 1,
  "repeat_last_n": 64,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "n_keep": 0,
  "logit_bias": {},
  "mirostat": 0,
  "mirostat_tau": 5,
  "mirostat_eta": 0.1,
  "memory_f16": true,
  "multiline_input": false,
  "penalize_nl": true
}
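The antiprompt list in the config above is what halts generation at role markers. On the client side of a raw completions call, the equivalent knob is the standard OpenAI-compatible stop parameter. A hypothetical helper (the function name and the particular subset of sequences are illustrative, taken from the antiprompt list above):

```python
# Stop sequences mirroring the antiprompt entries in the LM Studio config.
STOP_SEQUENCES = [
    "<|im_start|>",
    "<|im_end|>",
    "\nUSER:",
    "\nASSISTANT:",
    "\nFUNCTION RETURN:",
]


def with_stop_sequences(payload: dict, stops: list[str] = STOP_SEQUENCES) -> dict:
    """Return a copy of a completions payload with stop sequences attached,
    so the model stops before emitting the next role marker."""
    out = dict(payload)
    out["stop"] = list(stops)
    return out
```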
[2023-12-15 12:14:25.162] [INFO] Accumulated 62 tokens:  "send_message",
  "params": {
    "inner_thoughts": "I have been activated. I am now engaging with the user for the first time.",
    "message": "Hello Chad! It's great to meet you. What brings you here today?"
  }
}
[2023-12-15 12:14:25.285] [INFO] Generated prediction: {
  "id": "cmpl-",
  "object": "text_completion",
  "created": 1702671249,
  "model": "/lmstudio_models/TheBloke/OpenHermes-2.5-Mistral-7B-16k-GGUF/openhermes-2.5-mistral-7b-16k.Q8_0.gguf",
  "choices": [
    {
      "index": 0,
      "text": " \"send_message\",\n  \"params\": {\n    \"inner_thoughts\": \"I have been activated. I am now engaging with the user for the first time.\",\n    \"message\": \"Hello Chad! It's great to meet you. What brings you here today?\"\n  }\n}",
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 2873,
    "completion_tokens": 62,
    "total_tokens": 2935
  }
}
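Note how the prompt in the trace ends with the fragment `{ "function":` and the returned completion text is the remainder of that JSON object. A client can recover the full function call by prepending the prompt tail before parsing. A minimal sketch of that idea (parse_function_call is hypothetical, not MemGPT's actual parser):

```python
import json


def parse_function_call(completion_text: str,
                        prompt_tail: str = '{\n  "function":') -> dict:
    """Prepend the JSON fragment the prompt ended with, then parse the
    assistant's completion into a function-call dict."""
    return json.loads(prompt_tail + completion_text)
```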

LM Studio trace on chat/completions endpoint:

[2023-12-15 12:26:48.756] [INFO] [LM STUDIO SERVER] Last message: { role: 'user', content: 'You are MemGPT, the latest version of Limnal Corporation's digital companion, developed in 2023.
You... (truncated in these logs)' } (total messages = 1)
[2023-12-15 12:27:00.315] [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
...
[2023-12-15 12:27:02.827] [INFO] [LM STUDIO SERVER] Generated prediction: {
  "id": "chatcmpl-",
  "object": "chat.completion",
  "created": 1702672008,
  "model": "/lmstudio_models/TheBloke/OpenHermes-2.5-Mistral-7B-16k-GGUF/openhermes-2.5-mistral-7b-16k.Q8_0.gguf",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "\"pause_heartbeats\",\n{\n  \"inner_thoughts\": \"Need some time to think.\",\n  \"minutes\": 5\n}\n### INPUT"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 2908,
    "completion_tokens": 34,
    "total_tokens": 2942
  }
}

@cpacker cpacker marked this pull request as ready for review December 15, 2023 20:20
@cpacker cpacker merged commit 5c49265 into main Dec 15, 2023
2 checks passed
@cpacker cpacker deleted the lmstudio-legacy branch December 15, 2023 20:29
sarahwooders pushed a commit that referenced this pull request Dec 26, 2023
* migrate to using completions endpoint by default

* added note about version to docs
norton120 pushed a commit to norton120/MemGPT that referenced this pull request Feb 15, 2024
* migrate to using completions endpoint by default

* added note about version to docs
mattzh72 pushed a commit that referenced this pull request Oct 9, 2024
* migrate to using completions endpoint by default

* added note about version to docs
Successfully merging this pull request may close these issues.

Migrate LM Studio API calls from chat to completions