When I execute a Local GPT action, Ollama frequently performs unnecessary model reloads before processing the action. This occurs even at intervals shorter than the model's unload timeout.
This issue occurs with all models regardless of type or size, although it seems to happen more frequently with larger models. As for impact: with a 2B model the reload time is short enough to tolerate, but with a 32B model the wait before generation starts is significant and cannot be ignored.
Has anyone else encountered this issue?
I apologize for not being clear. I wasn't referring to any specific operation - this is something I noticed during normal usage, which is why I wanted to ask about it.
Since I'm not sure what information would be helpful, I've recorded a video showing my usage pattern. I'm using the gemma2:9b-instruct-q4_K_M model in the video, though this behavior occurs with all models regardless of which one I use.
When executing an action, sometimes it starts generating immediately, while other times there's about a 6-second delay. During these delays, I notice the dedicated GPU memory temporarily decreases before increasing again. This appears to be model reloading.
With the 9B model, the weights fit in the file cache, so there is no disk access; but when I use a 27B model in the same situation, disk access occurs, which led me to conclude the model is being reloaded. (This is what the screenshot in my original post shows.)
I am using Ollama on Windows 11.
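For anyone trying to reproduce this, here is how I have been checking whether the model actually gets evicted between actions, and a way to rule out the idle timeout as the cause. This is a sketch assuming a default Ollama install listening on `localhost:11434`; the model name is the one from my video.

```shell
# List models currently loaded in memory along with their expiry ("UNTIL") times.
# If the model disappears from this list between actions, it was unloaded.
ollama ps

# Pin the model in memory by sending a request with keep_alive set.
# -1 means "never unload"; a duration string like "30m" also works.
curl http://localhost:11434/api/generate -d '{
  "model": "gemma2:9b-instruct-q4_K_M",
  "keep_alive": -1
}'
```

Note that `keep_alive` is per-request, so a later request without it (or with a different value) resets the timeout; the server-wide default can be set via the `OLLAMA_KEEP_ALIVE` environment variable instead. In my case the reloads happen even within the timeout window, so this only narrows down the cause rather than fixing it.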