
Is it possible to run the ggufs on cpu like llama? #10

Open
TingTingin opened this issue Aug 15, 2024 · 1 comment

Comments

@TingTingin

It would be nice to be able to run GGUFs on CPU like you can with llama.cpp GGUFs. I don't know what the speed would look like, but it could be better for people with low-VRAM GPUs.

Also, I haven't looked at the code, but I believe GGUFs have more efficient memory allocation built in, i.e. if you choose to split the model between GPU and CPU, it won't be as bad as the typical memory overflow from PyTorch. If it's possible for this to be implemented, it would also be a nice feature to have for those with low-VRAM GPUs.

@city96 (Owner) commented Aug 15, 2024

We're only using gguf as a storage medium here without the surrounding llama.cpp library, so we would have to rely on the ComfyUI lowvram mode (which will need some extra changes to work).
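
For reference, the point about gguf being just a storage medium can be seen with the `gguf` Python package, which reads the file as a plain tensor container on CPU without any of llama.cpp. A minimal sketch (not this repo's actual loader, and assuming a reasonably recent `gguf` package; the file path is hypothetical):

```python
import numpy as np
import torch
from gguf import GGUFReader, GGMLQuantizationType

# Types stored without quantization; everything else is raw quantized blocks.
UNQUANTIZED = (GGMLQuantizationType.F32, GGMLQuantizationType.F16)

def gguf_to_cpu_state_dict(path: str) -> dict[str, torch.Tensor]:
    """Collect the unquantized tensors of a GGUF file as CPU torch tensors."""
    reader = GGUFReader(path)
    state_dict = {}
    for t in reader.tensors:
        if t.tensor_type in UNQUANTIZED:
            # Memory-mapped numpy view -> owned CPU tensor.
            state_dict[t.name] = torch.from_numpy(np.asarray(t.data).copy())
        else:
            # Quantized tensors (Q4_K, Q8_0, ...) would still need per-format
            # dequantization before any CPU compute could use them.
            print(f"skipping quantized tensor {t.name} ({t.tensor_type.name})")
    return state_dict

# Hypothetical usage:
# sd = gguf_to_cpu_state_dict("flux1-dev-Q8_0.gguf")
```

Dequantizing and running the quantized weights is the part llama.cpp normally provides, which is why CPU execution here would go through ComfyUI's own model management rather than gguf itself.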
