Llama3.2-vision Run Error #7300
Comments
Vision support was merged recently (#6963); 0.3.14 doesn't include it. |
What does "vision support" mean? Does it enable submitting multiple images for inference, or video inference? Or is it just support for this particular model? AFAIK, video and multiple images are still an open issue (#3184). |
Vision support for llama3.2. llama3.2 doesn't do video, and doesn't work reliably with multiple images. |
Does this mean that llama3.2-vision can't be used in the current version of Ollama? I'm also getting the same error when attempting to run the model |
Version 0.4.0 will support llama3.2-vision. |
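A minimal sketch of checking a version string against the 0.4.0 cutoff mentioned above. The function name `version_supports_vision` is hypothetical, and the rule it encodes (0.4.0 or later, pre-releases such as `0.4.0-rc8` included) is an assumption drawn from this thread rather than an official compatibility table.

```shell
# Hypothetical helper: succeeds if the given Ollama version is 0.4.0 or newer.
# Assumption: pre-release builds (e.g. 0.4.0-rc8) count as supported.
version_supports_vision() {
  v=${1%%-*}         # drop a pre-release suffix: 0.4.0-rc8 -> 0.4.0
  major=${v%%.*}     # text before the first dot
  rest=${v#*.}       # text after the first dot
  minor=${rest%%.*}  # text before the next dot
  [ "$major" -gt 0 ] || [ "$minor" -ge 4 ]
}

# Usage (needs a running server and jq):
#   version=$(curl -s localhost:11434/api/version | jq -r .version)
#   version_supports_vision "$version" || echo "upgrade needed: $version"
```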
Thank you for the hard work. Could we also bring this change to the llama.cpp repo as well? |
@rick-github thanks for the clarification! Also, any plans for making it run on the GPU? Llama3.2 runs on my GPU (GTX1660Ti), but llama3.2-vision runs on CPU only. |
It can run on the GPU, but it needs more RAM than the text-only versions, so it has likely exceeded the limit of your GPU. |
It should run on GPU if it fits:
$ ollama ps
NAME                      ID            SIZE   PROCESSOR  UNTIL
x/llama3.2-vision:latest  25e973636a29  11 GB  100% GPU   Forever
If you can provide server logs perhaps we can see why it's not working for you. |
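As a rough sketch, the `ollama ps` check above can be scripted. The filter below assumes the column layout shown in this thread (a header line, then one model per line with a PROCESSOR value such as `100% GPU`). The function name `models_not_fully_on_gpu` is hypothetical.

```shell
# Hypothetical helper: read `ollama ps` output on stdin and print the name of
# every loaded model whose PROCESSOR column is anything other than "100% GPU".
# Assumption: line 1 is the header and the model name is the first field.
models_not_fully_on_gpu() {
  awk 'NR > 1 && !/100% GPU/ { print $1 }'
}

# Usage (needs a running server):
#   ollama ps | models_not_fully_on_gpu
```

An empty result means every loaded model fits entirely on the GPU.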
@jessegross Thanks for pointing that out. That sounds correct, my GPU is quite old and has only 4GB RAM. @rick-github Thanks for the support, this is my server.log https://gist.github.com/silasalves/f2bdfc195618f19ecd557b945cab32b9 I think this is the important part?
|
Yep, too big for your card. |
@Animaxx unfortunately backporting it to work with llama.cpp would be tricky because the image preparsing step is written in golang, and not c++. I'm going to go ahead and close the issue since things are working as expected. You just need to use the pre-release to make it work. |
I've read that Ollama 0.4 should support vision tasks. If that's correct, I am getting the same error as mentioned above on a 90 GB M2 MacBook using 0.3.14: |
0.3.14 cannot load x/llama3.2-vision. |
@pdevine
|
@eulercat we don't support pulling images w/
You can find out more information here |
@ludos1978 you'll need |
If the image is large, it will exceed the maximum argument length of the shell.
(echo '{
"model":"x/llama3.2-vision",
"messages":[
{ "role":"user",
"content":"describe this image",
"images":["' ;
curl -s https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg | base64 -w0 ; echo '"
]
}
],
"stream":false
}') | curl -s localhost:11434/api/chat -d @- | jq
{
"model": "x/llama3.2-vision",
"created_at": "2024-10-28T23:14:35.376161501Z",
"message": {
"role": "assistant",
"content": "The image depicts a serene and peaceful scene, with a wooden boardwalk winding its way through a lush grassy field. The boardwalk is made of light-colored wood and features a simple design, with no visible railings or obstacles to obstruct the view.\n\nAs the boardwalk stretches out into the distance, it disappears from sight, inviting the viewer to imagine where it might lead. The surrounding grass is tall and green, swaying gently in the breeze, while trees dot the horizon, adding depth and texture to the landscape.\n\nAbove, a brilliant blue sky with white clouds provides a stunning backdrop, casting dappled shadows across the boardwalk and creating a sense of warmth and tranquility. Overall, the image exudes a sense of calmness and serenity, inviting the viewer to step into its peaceful world."
},
"done_reason": "stop",
"done": true,
"total_duration": 3744887728,
"load_duration": 34980268,
"prompt_eval_count": 13,
"prompt_eval_duration": 45000000,
"eval_count": 164,
"eval_duration": 3302000000
} |
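The same request can be wrapped in a small function for a local image file. This is a sketch under stated assumptions: the function name `build_chat_payload` is hypothetical, the prompt and model name are pasted into the JSON without escaping (so they must not contain quotes or backslashes), and newlines are stripped with `tr` because `-w0` is a GNU coreutils flag that other `base64` implementations may not accept.

```shell
# Hypothetical helper: print an /api/chat request body for one image file.
# Assumption: prompt and model contain no characters that need JSON escaping.
build_chat_payload() {
  image_file=$1
  prompt=${2:-describe this image}
  model=${3:-x/llama3.2-vision}
  printf '{"model":"%s","messages":[{"role":"user","content":"%s","images":["' \
    "$model" "$prompt"
  base64 < "$image_file" | tr -d '\n'   # single-line base64 without -w0
  printf '"]}],"stream":false}\n'
}

# Usage (needs a running server and jq):
#   build_chat_payload photo.jpg | curl -s localhost:11434/api/chat -d @- | jq
```

Streaming the body through a pipe and `-d @-` keeps the image data off the command line, which is what avoids the argument-length limit.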
But with some effort, I believe it will be possible to use their Golang binding to C++. To our surprise, it's calling the same libraries as those used in llama.cpp: GGML, the core library written in C++ that does the tensor computations. |
I am getting the same error on an M3 MacBook with 64 GB, with Ollama 0.4.0-rc8. |
Server logs will help in debugging.
$ curl localhost:11434/api/version
{"version":"0.4.0-rc8"}
$ (echo '{
"model":"x/llama3.2-vision",
"messages":[
{ "role":"user",
"content":"describe this image",
"images":["' ;
curl -s https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg | base64 -w0 ; echo '"
]
}
],
"stream":false
}') | curl -s localhost:11434/api/chat -d @- | jq
{
"model": "x/llama3.2-vision",
"created_at": "2024-11-05T16:15:16.856668179Z",
"message": {
"role": "assistant",
"content": "The image depicts a serene and peaceful scene, with a wooden boardwalk winding its way through a lush grassy field. The purpose of the image is to showcase the beauty of nature and the tranquility that can be found in such settings.\n\n* A wooden boardwalk:\n\t+ Winding its way through a grassy field\n\t+ Made of light-colored wood planks\n\t+ Surrounded by tall blades of grass on either side\n* Tall grass:\n\t+ Swaying gently in the breeze\n\t+ Varying shades of green, from light to dark\n\t+ Creating a sense of depth and texture in the image\n* Trees in the background:\n\t+ Scattered throughout the field\n\t+ Providing shade and shelter for wildlife\n\t+ Adding to the overall sense of serenity and calmness\n\nThe image effectively captures the beauty and tranquility of nature, inviting the viewer to step into the peaceful atmosphere. The use of natural colors and textures adds to the sense of realism, making the scene feel more immersive and engaging."
},
"done_reason": "stop",
"done": true,
"total_duration": 79628322199,
"load_duration": 70623694007,
"prompt_eval_count": 14,
"prompt_eval_duration": 2349000000,
"eval_count": 212,
"eval_duration": 6235000000
}
|
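The duration fields in these responses are reported in nanoseconds, so generation speed can be derived from `eval_count` and `eval_duration`. A minimal sketch, with the hypothetical function name `tokens_per_second`:

```shell
# Hypothetical helper: tokens per second from eval_count and eval_duration (ns).
tokens_per_second() {
  awk -v n="$1" -v ns="$2" 'BEGIN { printf "%.1f\n", n / (ns / 1000000000) }'
}

# Values from the 0.4.0-rc8 response above:
tokens_per_second 212 6235000000   # prints 34.0
```

In that response most of the 79.6 s `total_duration` is `load_duration` (about 70.6 s), so the delay comes from loading the model rather than from generation.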
What is the issue?
ollama run x/llama3.2-vision
on macbook
Expected: Ollama download without error.
OS
macOS
GPU
Apple
CPU
Apple
Ollama version
0.3.14