[BUG] ollama models context size not properly imported/reflected #309
Comments
Thanks @XReyRobert. Unfortunately Ollama does not usually provide the context size, so it's assumed to be 4k across the board. The /models API does not provide it, and neither does the models list. In your particular case the model name includes the context size, but that's a rarity. What's the best way to deal with this, or to get context sizes for all models?
Hi @enricoros, there's a "show" endpoint that returns additional parameters when they are available.
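To illustrate, here is a minimal Python sketch of querying that endpoint and extracting the context size. It assumes the default Ollama host (localhost:11434), that the endpoint is `POST /api/show`, and that the response's `parameters` field is a plain-text block containing a line like `num_ctx 131072`; the helper names (`parse_num_ctx`, `get_context_size`) are mine, not from the thread.

```python
import json
import re
import urllib.request
from typing import Optional

# Assumption: default Ollama host; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434"


def parse_num_ctx(parameters: Optional[str]) -> Optional[int]:
    """Extract num_ctx from the plain-text 'parameters' block of /api/show.

    The block contains lines such as 'num_ctx 131072'. Many models have no
    parameters at all, or no num_ctx line; return None in those cases.
    """
    if not parameters:
        return None
    match = re.search(r"^num_ctx\s+(\d+)", parameters, re.MULTILINE)
    return int(match.group(1)) if match else None


def get_context_size(model: str, default: int = 4096) -> int:
    """Query /api/show for a model; fall back to 4096 when unknown."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/show",
        data=json.dumps({"name": model}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        info = json.load(resp)
    return parse_num_ctx(info.get("parameters")) or default
```

The fallback to 4096 mirrors the assumption described above, so models that don't advertise a context size keep the previous behavior.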
I confirm the bug. Also, for what it's worth, this Ollama release changelog describes how to pass a 32k context window to Mixtral (and presumably other models as well): https://github.com/jmorganca/ollama/releases/tag/v0.1.19
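Based on that changelog, the context window can be overridden per request via the `options` object. A hedged Python sketch of building such a request body for `/api/chat` follows; the helper names are mine, and the exact option shape (`options.num_ctx`) is assumed from the changelog rather than confirmed here.

```python
import json
import urllib.request
from typing import Optional

# Assumption: default Ollama host; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434"


def build_chat_payload(model: str, messages: list, num_ctx: Optional[int] = None) -> dict:
    """Build an /api/chat request body; num_ctx overrides the context window."""
    payload = {"model": model, "messages": messages, "stream": False}
    if num_ctx is not None:
        # e.g. 32768 for the 32k Mixtral case discussed above
        payload["options"] = {"num_ctx": num_ctx}
    return payload


def chat(model: str, messages: list, num_ctx: Optional[int] = None) -> dict:
    """Send a chat request to Ollama and return the decoded JSON response."""
    body = json.dumps(build_chat_payload(model, messages, num_ctx)).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

As noted in the next comment, this is backwards from most APIs: the client should not have to tell the server what the model's context window is.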
Thanks! I'll prioritize this issue. I can quickly fix it as far as knowing the context size. For the "32k Mixtral" case, the odd part is that the developer should not be telling the API what the context window is; it should be the other way around. APIs commonly take a "max_tokens" parameter as a hard limit on response length, and I'm sure the Ollama folks will make the API more standard. Their recent /chat endpoint shows that they're on a good path. Prioritized.
@XReyRobert implemented, releasing in 3 hours in |
Note that in testing, only yarn-mistral reports a num_ctx other than 4096; the other models either have no parameters at all, have no 'num_ctx' value to parse, or have it set to 4096.
Describe the bug
Ollama models' context size is not properly imported/reflected in the UI.
Where is it happening?
To Reproduce
1. Import a 128K Ollama model (e.g. yarn-mistral:7b-128k).
2. Show the model details / max model tokens in the UI.
Expected behavior
Screenshots / context