Improve model token limit detection #3292

Weves · 2024-11-29T23:20:54Z

A few fixes:

(1) Ollama previously used a default context size of 2048. Upping that a bit.
(2) The older version of LiteLLM had a bug that made Ollama + Danswer error out intermittently. Fixed by upgrading (this required some small changes to chat_llm to accommodate minor behavior changes).
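
For readers without the diff open, here is a rough sketch of the kind of lookup-with-fallback that token limit detection implies. It is not the code from this PR: `get_max_input_tokens`, `FALLBACK_MAX_TOKENS`, and the shape of `model_map` are all hypothetical, and the fallback value is a placeholder rather than the number chosen here.

```python
from typing import Any, Mapping

# Hypothetical fallback for models with no known limit (placeholder, not from the PR).
FALLBACK_MAX_TOKENS = 4096


def get_max_input_tokens(
    model_name: str,
    provider: str,
    model_map: Mapping[str, Mapping[str, Any]],
) -> int:
    """Return the context window for a model, falling back to a default.

    `model_map` is assumed to map model keys to metadata dicts that may
    contain `max_input_tokens` or `max_tokens`.
    """
    # Try the provider-qualified key first (e.g. "ollama/llama3"), then the bare name.
    for key in (f"{provider}/{model_name}", model_name):
        entry = model_map.get(key)
        if not entry:
            continue
        # Prefer the input limit; fall back to the general token limit.
        for field in ("max_input_tokens", "max_tokens"):
            value = entry.get(field)
            if isinstance(value, int) and value > 0:
                return value
    # Nothing usable was found, so use the default.
    return FALLBACK_MAX_TOKENS


# Example with a made-up map:
example_map = {
    "ollama/llama3": {"max_input_tokens": 8192},
    "some-model": {"max_tokens": 16000},
}
limit = get_max_input_tokens("llama3", "ollama", example_map)
```

Checking the provider-qualified key before the bare name keeps an Ollama-served model from silently picking up limits registered for a same-named model on another provider.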

vercel · 2024-11-29T23:20:58Z

The latest updates on your projects.

Name	Status	Preview	Comments	Updated (UTC)
internal-search	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Nov 30, 2024 0:14am

Weves added 2 commits November 29, 2024 14:07

Properly find context window for ollama llama

213c6d5

Better ollama support + upgrade litellm

55a1f4d

vercel bot deployed to Preview November 29, 2024 23:21 View deployment

Upgrade OpenAI as well

8e13e17

vercel bot deployed to Preview November 29, 2024 23:25 View deployment

Fix mypy

e1920ab

vercel bot deployed to Preview November 30, 2024 00:14 View deployment

pablonyx approved these changes Nov 30, 2024

pablonyx added this pull request to the merge queue Nov 30, 2024

Merged via the queue into main with commit 16863de Nov 30, 2024
12 of 13 checks passed