
Llama3.2-vision Run Error #7300

Closed
mruckman1 opened this issue Oct 21, 2024 · 21 comments
Labels
bug Something isn't working

Comments

@mruckman1

What is the issue?

  1. Updated Ollama this morning.
  2. Ran ollama run x/llama3.2-vision on a MacBook.
  3. Got the output below:

pulling manifest
pulling 652e85aa1e14... 100% ▕████████████████▏ 6.0 GB
pulling 622429e8d318... 100% ▕████████████████▏ 1.9 GB
pulling 962e0f69a367... 100% ▕████████████████▏ 163 B
pulling dc49c86b8ebb... 100% ▕████████████████▏ 30 B
pulling 6a50468ba2a8... 100% ▕████████████████▏ 498 B
verifying sha256 digest
writing manifest
success
> Error: llama runner process has terminated: error:Missing required key: clip.has_text_encoder

Expected: the model downloads and runs without error.

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.3.14

@mruckman1 mruckman1 added the bug Something isn't working label Oct 21, 2024
@rick-github
Collaborator

Vision support was merged recently (#6963), 0.3.14 doesn't include it.
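(For anyone comparing versions by eye: a minimal sketch of why 0.3.14 predates 0.4.0, using plain numeric tuple comparison. The helper name is hypothetical; real version schemes with suffixes like "-rc8" need a proper parser.)

```python
# Hypothetical helper: compare simple dotted version strings numerically.
def version_tuple(v: str) -> tuple:
    return tuple(int(part) for part in v.split("."))

# 0.3.14 sorts before 0.4.0, so it predates the release that
# includes the vision support merged in #6963.
assert version_tuple("0.3.14") < version_tuple("0.4.0")
```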

@silasalves

What does "vision support" mean? Does it enable submitting multiple images for inference, or video inference? Or is it just support for this particular model?

AFAIK, video and multiple images are still an open issue (#3184).

@rick-github
Collaborator

Vision support for llama3.2. llama3.2 doesn't do video, and doesn't work reliably with multiple images.

@pavan-otthi123

pavan-otthi123 commented Oct 22, 2024

Does this mean that llama3.2-vision can't be used in the current version of Ollama?

I'm also getting the same error when attempting to run the model

@rick-github
Collaborator

Version 0.4.0 will support llama3.2-vision.

@Animaxx

Animaxx commented Oct 22, 2024

Thank you for the hard work. Could we also bring this change to the llama.cpp repo?
And how can we convert the model from HF to GGUF with the llama vision structure?

@silasalves

@rick-github thanks for the clarification! Also, any plans for making it run on the GPU? Llama3.2 runs on my GPU (GTX1660Ti), but llama3.2-vision runs on CPU only.

@jessegross
Contributor

@rick-github thanks for the clarification! Also, any plans for making it run on the GPU? Llama3.2 runs on my GPU (GTX1660Ti), but llama3.2-vision runs on CPU only.

It can run on the GPU, but it needs more RAM than the text-only versions, so it has likely exceeded the limit of your GPU.

@rick-github
Collaborator

It should run on GPU if it fits:

$ ollama ps
NAME                            ID              SIZE    PROCESSOR       UNTIL   
x/llama3.2-vision:latest        25e973636a29    11 GB   100% GPU        Forever

If you can provide server logs perhaps we can see why it's not working for you.

@silasalves

@jessegross Thanks for pointing that out. That sounds correct; my GPU is quite old and has only 4 GB of RAM.

@rick-github Thanks for the support, this is my server.log https://gist.github.com/silasalves/f2bdfc195618f19ecd557b945cab32b9

I think this is the important part?

time=2024-10-22T14:22:10.644-04:00 level=INFO source=llama-server.go:72 msg="system memory" total="31.9 GiB" free="13.6 GiB" free_swap="19.0 GiB"
time=2024-10-22T14:22:10.649-04:00 level=INFO source=memory.go:346 msg="offload to cuda" projector.weights="1.8 GiB" projector.graph="2.8 GiB" layers.requested=-1 layers.model=41 layers.offload=0 layers.split="" memory.available="[4.1 GiB]" memory.gpu_overhead="0 B" memory.required.full="5.9 GiB" memory.required.partial="0 B" memory.required.kv="320.0 MiB" memory.required.allocations="[0 B]" memory.weights.total="5.2 GiB" memory.weights.repeating="4.8 GiB" memory.weights.nonrepeating="411.0 MiB" memory.graph.full="213.3 MiB" memory.graph.partial="213.3 MiB"

@rick-github
Collaborator

Yep, too big for your card.
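(The arithmetic is right there in the "offload to cuda" log line: memory.required.full is 5.9 GiB against 4.1 GiB available, so zero layers are offloaded. A rough sketch of that fit check, illustrative only, not Ollama's actual scheduler code:)

```python
# Rough sketch of the fit decision reflected in the log line above;
# values are taken from the "offload to cuda" entry.
GIB = 1024 ** 3

memory_available = 4.1 * GIB       # memory.available="[4.1 GiB]" (4 GB-class card)
memory_required_full = 5.9 * GIB   # memory.required.full="5.9 GiB"
layers_model = 41                  # layers.model=41

# If the full load (weights + graph + KV cache) doesn't fit in VRAM,
# nothing is offloaded and the model runs on CPU, matching
# layers.offload=0 in the log.
layers_offload = layers_model if memory_required_full <= memory_available else 0
```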

@pdevine
Contributor

pdevine commented Oct 23, 2024

@Animaxx unfortunately backporting it to work with llama.cpp would be tricky because the image preparsing step is written in Go, not C++.

I'm going to go ahead and close the issue since things are working as expected. You just need to use the pre-release to make it work.

@ludos1978

I've read that Ollama 0.4 should support vision tasks, but I also understood that 0.3.14 should be able to load the x/llama3.2-vision model. Is that correct?

If so, I'm getting the same error as mentioned above, on a 90 GB M2 MacBook using 0.3.14:
Error: llama runner process has terminated: error:Missing required key: clip.has_text_encoder

@rick-github
Collaborator

rick-github commented Oct 25, 2024

0.3.14 cannot load x/llama3.2-vision.

@eulercat

eulercat commented Oct 26, 2024

@pdevine
Is it possible to use the REST API like this on the latest version?

curl -X POST http://127.0.0.1:11434/api/chat \
-H "Content-Type: application/json" \
-d '{ "model": "x/llama3.2-vision", 
 "message": [
     {"role": "user", 
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
          },
        },
      ],
     }
] }'

@pdevine
Contributor

pdevine commented Oct 28, 2024

@eulercat we don't support pulling images w/ image_url. You'll have to base64 encode your image, so it looks like:

curl http://localhost:11434/api/chat -d '{
  "model": "x/llama3.2-vision",
  "messages": [
    {
      "role": "user",
      "content": "what is in this image?",
      "images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS
4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrU
wKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
    }
  ]
}'

You can find more information in the Ollama API documentation.
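(The same request body can be built programmatically. A minimal Python sketch, assuming a local image file and the /api/chat endpoint shown above; the helper name is hypothetical. Note the "images" entries must be the raw base64 encoding of the bytes, with no "data:" URL prefix:)

```python
import base64
import json

def build_chat_payload(model: str, prompt: str, image_bytes: bytes) -> str:
    """Build the JSON body for POST /api/chat with one inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({
        "model": model,
        "messages": [
            {"role": "user", "content": prompt, "images": [encoded]},
        ],
    })

# POST the returned string to http://localhost:11434/api/chat.
payload = build_chat_payload("x/llama3.2-vision", "what is in this image?",
                             b"<image bytes here>")
```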

@pdevine
Contributor

pdevine commented Oct 28, 2024

@ludos1978 you'll need 0.4.0 for it to work. Unfortunately we're still working through some issues w/ the release candidates.

@rick-github
Collaborator

If the image is large, it will exceed the maximum argument length of the shell.

(echo '{
         "model":"x/llama3.2-vision",
         "messages":[
           { "role":"user",
             "content":"describe this image",
             "images":["' ;
               curl -s https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg | base64 -w0 ; echo '"
             ]
           }
         ],
         "stream":false
       }') | curl -s localhost:11434/api/chat -d @- | jq
{
  "model": "x/llama3.2-vision",
  "created_at": "2024-10-28T23:14:35.376161501Z",
  "message": {
    "role": "assistant",
    "content": "The image depicts a serene and peaceful scene, with a wooden boardwalk winding its way through a lush grassy field. The boardwalk is made of light-colored wood and features a simple design, with no visible railings or obstacles to obstruct the view.\n\nAs the boardwalk stretches out into the distance, it disappears from sight, inviting the viewer to imagine where it might lead. The surrounding grass is tall and green, swaying gently in the breeze, while trees dot the horizon, adding depth and texture to the landscape.\n\nAbove, a brilliant blue sky with white clouds provides a stunning backdrop, casting dappled shadows across the boardwalk and creating a sense of warmth and tranquility. Overall, the image exudes a sense of calmness and serenity, inviting the viewer to step into its peaceful world."
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 3744887728,
  "load_duration": 34980268,
  "prompt_eval_count": 13,
  "prompt_eval_duration": 45000000,
  "eval_count": 164,
  "eval_duration": 3302000000
}
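(The duration fields in the response above are in nanoseconds, so generation speed falls out directly. A quick sketch using the numbers from this reply:)

```python
# Duration fields in Ollama responses are nanoseconds.
eval_count = 164                # tokens generated ("eval_count")
eval_duration = 3_302_000_000   # ns ("eval_duration")

# Throughput: roughly 49.7 tokens per second for this reply.
tokens_per_second = eval_count / (eval_duration / 1e9)
```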

@jhowilbur

@Animaxx unfortunately backporting it to work with llama.cpp would be tricky because the image preparsing step is written in Go, not C++.

I'm going to go ahead and close the issue since things are working as expected. You just need to use the pre-release to make it work.

But with some effort, I believe it would be possible to use a Go binding to the C++ code; they did it with whisper.cpp:
https://github.com/ggerganov/whisper.cpp/tree/master/bindings/go

To our surprise, it calls the same libraries as llama.cpp: the core tensor-computation library, GGML, written in C++.

@delenius

delenius commented Nov 5, 2024

I am getting the same error on an M3 MacBook with 64 GB, with Ollama 0.4.0-rc8.

@rick-github
Collaborator

Server logs will help in debugging.

$ curl localhost:11434/api/version
{"version":"0.4.0-rc8"}
$ (echo '{
         "model":"x/llama3.2-vision",
         "messages":[
           { "role":"user",
             "content":"describe this image",
             "images":["' ;
               curl -s https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg | base64 -w0 ; echo '"
             ]
           }
         ],
         "stream":false
       }') | curl -s localhost:11434/api/chat -d @- | jq
{
  "model": "x/llama3.2-vision",
  "created_at": "2024-11-05T16:15:16.856668179Z",
  "message": {
    "role": "assistant",
    "content": "The image depicts a serene and peaceful scene, with a wooden boardwalk winding its way through a lush grassy field. The purpose of the image is to showcase the beauty of nature and the tranquility that can be found in such settings.\n\n* A wooden boardwalk:\n\t+ Winding its way through a grassy field\n\t+ Made of light-colored wood planks\n\t+ Surrounded by tall blades of grass on either side\n* Tall grass:\n\t+ Swaying gently in the breeze\n\t+ Varying shades of green, from light to dark\n\t+ Creating a sense of depth and texture in the image\n* Trees in the background:\n\t+ Scattered throughout the field\n\t+ Providing shade and shelter for wildlife\n\t+ Adding to the overall sense of serenity and calmness\n\nThe image effectively captures the beauty and tranquility of nature, inviting the viewer to step into the peaceful atmosphere. The use of natural colors and textures adds to the sense of realism, making the scene feel more immersive and engaging."
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 79628322199,
  "load_duration": 70623694007,
  "prompt_eval_count": 14,
  "prompt_eval_duration": 2349000000,
  "eval_count": 212,
  "eval_duration": 6235000000
}
