It would be nice to be able to run GGUFs on the CPU, like you can with llama.cpp GGUFs. I don't know what the speed would look like, but it could be better for people with low-VRAM GPUs.
Also, I haven't looked at the code, but I believe GGUFs have more efficient memory allocation built in, i.e. if you choose to split the model between GPU and CPU, it won't be as bad as the typical memory overflow you get from PyTorch. If it's possible to implement, that would also be a nice feature for those with low-VRAM GPUs.
We're only using GGUF as a storage medium here, without the surrounding llama.cpp library, so we would have to rely on ComfyUI's lowvram mode (which will need some extra changes to work).
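For context, a minimal sketch of what "GGUF as a storage medium" means: the file can be read entirely into system RAM with the `gguf` Python package (the reader from llama.cpp's gguf-py), without any GPU involvement. This is not this repo's actual loader, just an illustration; the function name is made up, and quantized tensor types would still need to be dequantized before PyTorch could use them.

```python
# Sketch only: read unquantized tensors from a GGUF file onto the CPU.
# Assumes the gguf Python package (gguf-py) is installed.
import numpy as np
import torch
from gguf import GGUFReader, GGMLQuantizationType

def load_unquantized_tensors_cpu(path):
    """Load only F16/F32 tensors from a GGUF file into CPU torch tensors."""
    reader = GGUFReader(path)  # memory-maps the file; nothing touches the GPU
    tensors = {}
    for t in reader.tensors:
        if t.tensor_type in (GGMLQuantizationType.F32, GGMLQuantizationType.F16):
            # t.data is a numpy view over the memory-mapped file; copy it so
            # the resulting torch tensor owns writable CPU memory.
            # (Shapes may still need reordering for PyTorch's dimension order.)
            tensors[t.name] = torch.from_numpy(np.array(t.data))
    return tensors
```

Quantized tensors (Q4_K, Q8_0, etc.) are the interesting part here, and handling those on the CPU is exactly where the speed question above comes in, since dequantization would happen in Python/PyTorch rather than in llama.cpp's optimized kernels.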