migrate to using completions endpoint by default #628

Merged
merged 2 commits on Dec 15, 2023
6 changes: 6 additions & 0 deletions docs/lmstudio.md
@@ -6,6 +6,12 @@

**Context Overflow Policy = Stop at limit**: If you see "Context Overflow Policy" inside LM Studio's "Tools" panel on the right side (below "Server Model Settings"), set it to **Stop at limit**. The default setting "Keep the system prompt ... truncate middle" will break MemGPT.

!!! note "Update your LM Studio"

    The current `lmstudio` backend will only work if your LM Studio is version 0.2.9 or newer.

    If you are on a version of LM Studio older than 0.2.9 (<= 0.2.8), select `lmstudio-legacy` as your backend type.

<img width="911" alt="image" src="https://github.com/cpacker/MemGPT/assets/5475622/d499e82e-348c-4468-9ea6-fd15a13eb7fa">

1. Download [LM Studio](https://lmstudio.ai/) and the model you want to test with
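If you are unsure which backend applies to your install, you can probe whether the local server answers on the OpenAI-style `/v1/completions` route that LM Studio added in 0.2.9. The sketch below is illustrative only: it is not part of MemGPT, and the 404 fallback is an assumption about how older servers respond.

```python
# Illustrative probe, not MemGPT code: older LM Studio servers (<= 0.2.8) are
# assumed to answer 404 on /v1/completions, in which case `lmstudio-legacy`
# (which talks to /v1/chat/completions) is the backend to pick.
import requests

def pick_lmstudio_backend(base_url: str = "http://localhost:1234") -> str:
    payload = {"prompt": "ping", "max_tokens": 1, "stream": False}
    try:
        resp = requests.post(f"{base_url}/v1/completions", json=payload, timeout=10)
    except requests.RequestException as err:
        raise RuntimeError("LM Studio server not reachable; is the local server running?") from err
    return "lmstudio" if resp.status_code != 404 else "lmstudio-legacy"

if __name__ == "__main__":
    print(pick_lmstudio_backend())
```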
2 changes: 1 addition & 1 deletion memgpt/cli/cli_config.py
@@ -60,7 +60,7 @@ def configure_llm_endpoint(config: MemGPTConfig):
model_endpoint_type = "azure"
model_endpoint = get_azure_credentials()["azure_endpoint"]
else: # local models
backend_options = ["webui", "webui-legacy", "llamacpp", "koboldcpp", "ollama", "lmstudio", "vllm", "openai"]
backend_options = ["webui", "webui-legacy", "llamacpp", "koboldcpp", "ollama", "lmstudio", "lmstudio-legacy", "vllm", "openai"]
default_model_endpoint_type = None
if config.model_endpoint_type in backend_options:
# set from previous config
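As a rough illustration of how the widened `backend_options` list is used by the interactive configuration flow (`memgpt configure`): the snippet below is a simplified stand-in, not the actual prompt logic in `configure_llm_endpoint`.

```python
# Simplified stand-in for the interactive backend selection: a previously
# configured endpoint type is kept as the default when it is still valid.
backend_options = [
    "webui", "webui-legacy", "llamacpp", "koboldcpp", "ollama",
    "lmstudio", "lmstudio-legacy", "vllm", "openai",
]

def choose_backend(previous: str | None) -> str:
    default = previous if previous in backend_options else backend_options[0]
    answer = input(f"Backend type [{default}]: ").strip()
    return answer if answer in backend_options else default
```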
4 changes: 3 additions & 1 deletion memgpt/local_llm/chat_completion_proxy.py
@@ -89,7 +89,9 @@ def get_chat_completion(
    elif endpoint_type == "webui-legacy":
        result, usage = get_webui_completion_legacy(endpoint, prompt, context_window, grammar=grammar_name)
    elif endpoint_type == "lmstudio":
-       result, usage = get_lmstudio_completion(endpoint, prompt, context_window)
+       result, usage = get_lmstudio_completion(endpoint, prompt, context_window, api="completions")
+   elif endpoint_type == "lmstudio-legacy":
+       result, usage = get_lmstudio_completion(endpoint, prompt, context_window, api="chat")
    elif endpoint_type == "llamacpp":
        result, usage = get_llamacpp_completion(endpoint, prompt, context_window, grammar=grammar_name)
    elif endpoint_type == "koboldcpp":
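The `api` argument decides which of LM Studio's two OpenAI-compatible routes gets called. The sketch below shows the difference, assuming the server mirrors the OpenAI request shapes; it is not the code in `memgpt/local_llm/lmstudio/api.py`.

```python
# Sketch only: "completions" posts a raw prompt string to /v1/completions,
# while "chat" wraps the prompt in a messages list for /v1/chat/completions.
import requests

def lmstudio_request(endpoint: str, prompt: str, api: str = "completions") -> dict:
    if api == "completions":
        url = f"{endpoint}/v1/completions"
        body = {"prompt": prompt, "stream": False}
    elif api == "chat":
        url = f"{endpoint}/v1/chat/completions"
        body = {"messages": [{"role": "user", "content": prompt}], "stream": False}
    else:
        raise ValueError(f"unknown api type: {api}")
    response = requests.post(url, json=body)
    response.raise_for_status()
    return response.json()
```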
1 change: 1 addition & 0 deletions memgpt/local_llm/constants.py
@@ -4,6 +4,7 @@
"koboldcpp": "http://localhost:5001",
"llamacpp": "http://localhost:8080",
"lmstudio": "http://localhost:1234",
"lmstudio-legacy": "http://localhost:1234",
"ollama": "http://localhost:11434",
"webui-legacy": "http://localhost:5000",
"webui": "http://localhost:5000",
2 changes: 1 addition & 1 deletion memgpt/local_llm/lmstudio/api.py
@@ -10,7 +10,7 @@


# TODO move to "completions" by default, not "chat"
-def get_lmstudio_completion(endpoint, prompt, context_window, settings=SIMPLE, api="chat"):
+def get_lmstudio_completion(endpoint, prompt, context_window, settings=SIMPLE, api="completions"):
    """Based on the example for using LM Studio as a backend from https://github.com/lmstudio-ai/examples/tree/main/Hello%2C%20world%20-%20OpenAI%20python%20client"""
    from memgpt.utils import printd

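With the default flipped, a direct call now hits the completions route unless `api="chat"` is passed explicitly. A hypothetical invocation, assuming an LM Studio server on the default port; the endpoint, prompt, and context window values are example inputs only, and MemGPT normally reaches this function through `get_chat_completion` rather than calling it directly.

```python
# Hypothetical direct call using the signature shown in the diff above.
from memgpt.local_llm.lmstudio.api import get_lmstudio_completion

result, usage = get_lmstudio_completion(
    endpoint="http://localhost:1234",
    prompt="You are MemGPT. Reply with a short greeting.",
    context_window=8192,
)  # api defaults to "completions"; pass api="chat" against LM Studio <= 0.2.8
print(result)
print(usage)
```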