Set the ollama_models list based on locally available models when litellm starts. #8116

Open: sakoht wants to merge 3 commits into main
Conversation


@sakoht sakoht commented Jan 30, 2025

Title

Set ollama_models from the locally installed list at bootup.

Relevant issues

Fixes #8095

Type

🆕 New Feature
🐛 Bug Fix

Changes

Previously: the ollama_models list was hard-coded to a single model that might or might not be available on the host.
Now: if the host has an Ollama install available, litellm fetches the installed-model list at startup; if not, the list is empty.

Previously: using ollama/* or ollama_chat/* in a config failed wildcard expansion.
Now: either prefix matches all Ollama models available when litellm starts.
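As a rough sketch of the approach (illustrative only, not the PR's exact code; fetch_local_ollama_models is a hypothetical name): probe Ollama's standard /api/tags endpoint on localhost:11434 and prefix each installed model with the provider name.

import requests

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # Ollama's standard tags endpoint

def fetch_local_ollama_models() -> list[str]:
    """List locally installed Ollama models as 'ollama/<name>' entries.

    Returns an empty list when no local Ollama server answers, so systems
    without Ollama see a zero-size model list rather than an error.
    """
    try:
        response = requests.get(OLLAMA_TAGS_URL, timeout=2)
        response.raise_for_status()
    except requests.RequestException:
        return []
    models = response.json().get("models", [])
    return [f"ollama/{m['name']}" for m in models]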

Attach a screenshot of any new tests passing locally

No new tests.

Functionality Demonstration

A CLI-based demonstration of litellm finding the full list of local ollama models:

> curl http://localhost:4000/models | jq '.data[].id' | grep ollama 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 11154  100 11154    0     0  3370k      0 --:--:-- --:--:-- --:--:-- 3630k
"ollama/chsword/deepseek-v3:latest"
"ollama/command-r7b:7b"
"ollama/command-r7b:latest"
"ollama/deepseek-coder:6.7b"
"ollama/deepseek-r1:14b"
"ollama/dolphin3:8b"
"ollama/granite3.1-dense:8b"
"ollama/granite3.1-moe:3b"
"ollama/llama3-groq-tool-use:8b"
"ollama/llama3.2:3b"
"ollama/llama3.2:latest"
"ollama/nezahatkorkmaz/deepseek-v3:latest"
"ollama/olmo2:13b"
"ollama/orca2:7b"
"ollama/phi4:14b"
"ollama/qwen2.5-coder:14b"
"ollama/qwen2.5-coder:32b"

Demonstration of equivalent data in the local ollama system:

> ollama ls
NAME                                 ID              SIZE      MODIFIED     
deepseek-r1:14b                      ea35dfe18182    9.0 GB    9 days ago      
phi4:14b                             ac896e5b8b34    9.1 GB    12 days ago     
llama3.2:3b                          a80c4f17acd5    2.0 GB    12 days ago     
qwen2.5-coder:32b                    4bd6cbf2d094    19 GB     2 weeks ago     
command-r7b:latest                   bb9fe394b3e9    5.1 GB    2 weeks ago     
command-r7b:7b                       bb9fe394b3e9    5.1 GB    2 weeks ago     
chsword/deepseek-v3:latest           2ef1746094f1    2.0 GB    2 weeks ago     
olmo2:13b                            6c279ebc980f    8.4 GB    2 weeks ago     
nezahatkorkmaz/deepseek-v3:latest    2ef1746094f1    2.0 GB    2 weeks ago     
granite3.1-moe:3b                    df6f6578dba8    2.0 GB    2 weeks ago     
granite3.1-dense:8b                  86ac4cf0cb84    4.9 GB    2 weeks ago     
llama3-groq-tool-use:8b              36211dad2b15    4.7 GB    2 weeks ago     
dolphin3:8b                          d5ab9ae8e1f2    4.9 GB    2 weeks ago     
deepseek-coder:6.7b                  ce298d984115    3.8 GB    3 weeks ago     
qwen2.5-coder:14b                    3028237cc8c5    9.0 GB    2 months ago    
orca2:7b                             ea98cc422de3    3.8 GB    2 months ago    
llama3.2:latest                      a80c4f17acd5    2.0 GB    2 months ago    

Demonstration of the model counts matching (off by one because ollama ls includes a header line):

> ollama ls | wc -l
      18
>
> curl http://localhost:4000/models | jq '.data[].id' | grep ollama | sort | wc -l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 11154  100 11154    0     0  2829k      0 --:--:-- --:--:-- --:--:-- 3630k
      17


@sakoht sakoht changed the title Ollama refresh at bootup Set the ollama_models list based on locally available models when litellm starts. Jan 30, 2025
@sakoht sakoht marked this pull request as ready for review January 30, 2025 19:22
@@ -4146,7 +4146,7 @@ def _get_model_info_helper(  # noqa: PLR0915
     supports_prompt_caching=None,
     supports_pdf_input=None,
 )
-elif custom_llm_provider == "ollama" or custom_llm_provider == "ollama_chat":
+elif (custom_llm_provider == "ollama" or custom_llm_provider == "ollama_chat") and "*" not in model:
     return litellm.OllamaConfig().get_model_info(model)
@sakoht (Author) commented:

This allows ollama/* and ollama_chat/* to fall through to the else clause, where the entry becomes a placeholder that expands to the full model list later.
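To illustrate the expansion step (a hypothetical helper, not the PR's code), matching a wildcard placeholder against the refreshed list might look like:

import fnmatch

def expand_wildcard(pattern: str, known_models: list[str]) -> list[str]:
    # "ollama/*" (or "ollama_chat/*") matches every refreshed Ollama model.
    return [m for m in known_models if fnmatch.fnmatch(m, pattern)]

expand_wildcard("ollama/*", ["ollama/llama3.2:3b", "ollama/phi4:14b", "gpt-4o"])
# -> ["ollama/llama3.2:3b", "ollama/phi4:14b"]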

# Since Ollama models are local, it is inexpensive to refresh the list at startup.
# Also, this list can change very quickly.

ollama_dir = os.environ.get("OLLAMA_MODELS", None)
@sakoht (Author) commented:

This is the canonical way to specify which directory Ollama uses for models; when it is not set, Ollama falls back to its default under the home directory.
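For illustration, resolving that directory with the per-user fallback (assuming Ollama's usual ~/.ollama/models default) could look like:

import os
from pathlib import Path

# OLLAMA_MODELS wins when set; otherwise assume Ollama's per-user default.
ollama_dir = os.environ.get("OLLAMA_MODELS") or str(Path.home() / ".ollama" / "models")
has_local_ollama = os.path.isdir(ollama_dir)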

url = "http://localhost:11434/api/tags"
try:
response = requests.get(url)
json_dict = response.json()
@sakoht (Author) commented:

Note: this is fast and light, and only runs if an Ollama directory is specified and present, so it should be a no-op on systems that do not use Ollama.
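For reference, /api/tags returns a JSON object whose models array carries a name field per model, so extracting the names from json_dict above is roughly:

# Abridged /api/tags response shape:
#   {"models": [{"name": "llama3.2:3b", "modified_at": "...", "size": ...}, ...]}
names = [m["name"] for m in json_dict.get("models", [])]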

@@ -925,6 +949,7 @@ def add_known_models():
     "bedrock": bedrock_models + bedrock_converse_models,
     "petals": petals_models,
     "ollama": ollama_models,
+    "ollama_chat": ollama_models,
@sakoht (Author) commented:

This was required for ollama_chat/* to work, keeping it symmetrical with the existing support for ollama_chat/MODELNAME.
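A small illustration of the symmetry (ollama_models here is a stand-in list, not litellm's actual module state): because both keys reference the same list object, a refresh updates ollama/* and ollama_chat/* together.

ollama_models = ["ollama/llama3.2:3b"]  # stand-in for litellm's module-level list
models_by_provider = {
    "ollama": ollama_models,
    "ollama_chat": ollama_models,  # same list object, so both stay in sync
}
ollama_models.append("ollama/phi4:14b")
assert models_by_provider["ollama_chat"][-1] == "ollama/phi4:14b"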

@sakoht (Author) commented Jan 30, 2025

@ishaan-jaff I just pushed a change to:

  • set the ollama_models list in initialize()
  • do it by calling a function refresh_local_model_lists() that can cover Ollama and other local model services (huggingface, etc.)

@@ -2833,6 +2836,39 @@ async def initialize(  # noqa: PLR0915
     if experimental:
         pass
     user_telemetry = telemetry
+    refresh_local_model_lists()
Contributor commented:

Doesn't this assume the user always has Ollama models set up?

@sakoht (Author) replied:

It should be a no-op if there are no local models accessible?

from litellm import ollama_models

# Use the same list reference in the module to avoid import order problems.
ollama_models.clear()
@sakoht (Author) commented:

This list remains empty unless there are local models found.
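A short sketch of why in-place mutation matters here (illustrative names only): a module that imported the list earlier keeps seeing updates, while rebinding the name would orphan that earlier reference.

shared = ["stale-entry"]
alias = shared            # e.g., another module's earlier `from litellm import ollama_models`

shared.clear()            # mutate in place: alias observes the change
shared.extend(["ollama/llama3.2:3b"])
assert alias == ["ollama/llama3.2:3b"]

shared = ["rebound"]      # rebinding creates a new object; alias is unchanged
assert alias == ["ollama/llama3.2:3b"]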

Successfully merging this pull request may close these issues:

Wildcard not expanding for ollama_chat to find models dynamically.