miniSD/nanoSD (256x256 and 128x128 image generation) results #28
Discussions is indeed a good place for discussions and sharing. I've set up Discussions for this project. Would you mind moving this issue to Discussions?
By the way, I took a look at these two projects, and they appear to be fine-tuned weights based on the official models, so they should work with this project.
Sure, I don't mind.
Yes, I've even tested one of them and got perfectly coherent outputs. I was just wondering if this implementation matches the official implementation perfectly, and if quantization has any special effects when lowering the resolution.
Quantization doesn't have any special effects when lowering the resolution. Quantization mainly involves sacrificing some computational precision in exchange for lower memory and storage usage.
Yeah, no doubt. IMO, it's because stable diffusion has not been optimized for such devices yet. My phone, which comes with a chipset released 2 years ago, took 55 mins to generate a 512x512 image using the code in this repo (which runs purely on CPU). But with a different approach to stable diffusion in which Vulkan is enabled, it took only 13 mins. In a paper published by researchers from Google (link, link), the result is even more impressive when GPU support is there. Sadly no code has been released yet, but there is clearly potential for such devices.
I made ckpt and ggml versions.
P.S.: I tried to run the f32 model but got this error:
I used this script to make the ckpt:
I ran it this way:
I also managed to find a ckpt for an older version of SD nano: https://huggingface.co/Simona198710/stable-diffusion-nano-ckpt/resolve/main/stable-diffusion-nano.ckpt.ckpt
I converted it to f32 ggml and it worked. I ran it this way:
sd.exe -m stable-diffusion-nano.ckpt-ggml-model-f32.bin -t 8 --steps 40 --height 128 --width 128 --seed -1 -p "a female face"
It took 63 seconds to make the image on a Ryzen 7 4700U (turbo boost was off):
https://huggingface.co/NikolayKozloff/stable-diffusion-nano-2-1-ckpt/resolve/main/stable-diffusion-nano-2-1.ckpt is a model from SD2.x, which is currently not supported. I suggest you use models from SD1.x for now (for example, https://huggingface.co/Simona198710/stable-diffusion-nano-ckpt/resolve/main/stable-diffusion-nano.ckpt.ckpt), as I plan to add support for SD2.x models in the near future.
I did as you advised but got this error:
C:\sd>sd.exe -m stable-diffusion-nano-2-1-ggml-model-f32.bin -t 8 --steps 10 --height 128 --width 128 --seed -1 -p "a wolf wearing sun glasses, highly detailed"
It raised a new error:
C:\sd>sd.exe -m stable-diffusion-nano-2-1-ggml-model-f32.bin -t 8 --steps 10 --height 128 --width 128 --seed -1 -p "a wolf wearing sun glasses, highly detailed"
Sampling failed. Image wasn't created.
It should be fixed by now. You can pull the latest code and rebuild the executable. |
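(For anyone following along, here is a rough sketch of what "pull the latest code and rebuild" can look like; the exact build commands may differ from the repository's README, so treat this as an assumption rather than the canonical procedure.)

```sh
# Update to the latest code (assumes a git clone of stable-diffusion.cpp)
git pull origin master
git submodule update --init --recursive   # ggml is typically vendored as a submodule

# Rebuild the executable (typical CMake flow; generator/flags may differ on your platform)
mkdir -p build && cd build
cmake ..
cmake --build . --config Release
```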
Thank you. It works. An image was generated. However, an error message still remains in the console log:
[WARN] stable-diffusion.cpp:2991 - unknown tensor 'cond_stage_model.model.transformer.text_model.embeddings.position_ids' in model file
Don't worry, this won't have any effect on image generation; you can just ignore it.
Hi, I have the same error with the latest version:
This model:
Error:
Code:
Model list: What models can I use with this code? Is there a list of them? 128, 256, or 512x512 models? Thanks.
This is because the model was converted using the older models/convert.py, which did not support SD2.x at the time. You can use the latest models/convert.py for the conversion.
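(A hedged sketch of the re-conversion step; the argument names for models/convert.py below are assumptions, so check `python models/convert.py --help` in your checkout.)

```sh
# Make sure you have the current conversion script from master
git pull origin master

# Re-convert the original .ckpt with the up-to-date script.
# The --out_type flag name is an assumption; verify it against --help.
python models/convert.py stable-diffusion-nano-2-1.ckpt --out_type f32
```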
I'm using the current master script:
What was the original model you used before the conversion? I used the latest conversion script to convert this model (https://huggingface.co/NikolayKozloff/stable-diffusion-nano-2-1-ckpt/resolve/main/stable-diffusion-nano-2-1.ckpt), and I didn't encounter this issue. By the way, please don't download the ggml model (https://huggingface.co/NikolayKozloff/stable-diffusion-nano-2-1-ggml/tree/main) directly. You need to download the ckpt (https://huggingface.co/NikolayKozloff/stable-diffusion-nano-2-1-ckpt/resolve/main/stable-diffusion-nano-2-1.ckpt) file and then convert it.
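(In other words, the flow is roughly: grab the .ckpt, not the pre-made ggml file, and convert it locally. A sketch under the same assumption as above about convert.py's arguments.)

```sh
# Download the original checkpoint (not the pre-converted ggml file)
wget https://huggingface.co/NikolayKozloff/stable-diffusion-nano-2-1-ckpt/resolve/main/stable-diffusion-nano-2-1.ckpt

# Convert it with the current models/convert.py (flag name is an assumption)
python models/convert.py stable-diffusion-nano-2-1.ckpt --out_type f32
```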
Hi, now I can convert and it generates the model. But when I try it on my Mac M1, it takes around 87 secs for each sample:
Am I doing something wrong?
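(For reference, the main speed knobs on CPU are the thread count (-t), the number of steps (--steps), and the resolution, all flags already shown earlier in this thread. A hedged example; the binary name/path and the model file name are just illustrations, reusing the one from the earlier comment.)

```sh
# Fewer steps and a lower resolution cut sampling time roughly proportionally;
# -t should usually match the number of performance cores (e.g. 4 on an M1).
./sd -m stable-diffusion-nano.ckpt-ggml-model-f32.bin \
     -t 4 --steps 20 --height 128 --width 128 --seed -1 \
     -p "a wolf wearing sun glasses, highly detailed"
```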
I'm sorry for polluting the GitHub issues with non-bugs, but since that precedent was already set by #1 and there's no Discussions enabled, I thought it might be appropriate to share it here.
Laptop CPUs are always rather underpowered. As said in #15, even old desktop CPUs perform much better than modern mid-range laptops. Even more so, phones and ARM micro-computers are laughably slow.
Sampling can be sped up considerably by using a lower resolution, but models expectedly perform very poorly at resolutions lower than the ones they were trained at, producing only colorful abstract shapes that vaguely resemble the expected objects.
But someone on HuggingFace managed to fine-tune the model on 256x256 and 128x128 images to the point of getting coherent outputs!
(The weights need to be converted to .ckpt first.) This is great news for CPU inference, since the sampling time was cut in half! The outputs might have looked slightly less detailed, but were perfectly coherent.
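(To reproduce this kind of run, the flow is roughly: convert the fine-tuned checkpoint with models/convert.py and then sample at the fine-tune's native low resolution. The file names below are hypothetical placeholders, and the convert.py argument names are assumptions; the sd flags are the ones used elsewhere in this thread.)

```sh
# Hypothetical file name for a 256x256 fine-tune; substitute the actual checkpoint.
python models/convert.py miniSD.ckpt --out_type f32

# Sample at the resolution the fine-tune was trained on (256x256 here)
./sd -m miniSD.ckpt-ggml-model-f32.bin -t 8 --steps 40 \
     --height 256 --width 256 --seed -1 -p "a female face"
```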
I haven't investigated whether there are any differences in outputs between stable-diffusion.cpp and the official implementation, or whether quantization has a greater impact at lower resolutions, but it does seem promising for real-life usage of this project.