miniSD/nanoSD (256x256 and 128x128 image generation) results #28
Discussions is indeed a good place for discussions and sharing. I've set up Discussions for this project. Would you mind moving this issue to Discussions?
By the way, I took a look at these two projects, and they appear to be fine-tuned weights based on the official models, so they should work with this project.
Sure, I don't mind.
Yes, I've even tested one of them and got perfectly coherent outputs. I was just wondering if this implementation matches the official implementation perfectly, and if quantization has any special effects when lowering the resolution.
Quantization doesn't have any special effects when lowering the resolution. Quantization mainly involves sacrificing some computational precision in exchange for lower memory and storage usage.
Yeah, no doubt. IMO, it's because stable diffusion has not been optimized for such devices yet. My phone, which comes with a chipset released 2 years ago, took 55 mins to generate a 512x512 image using the code in this repo (which runs purely on CPU). But with a different approach to stable diffusion in which Vulkan is enabled, it took only 13 mins. In a paper published by researchers from Google (link, link), the result is even more impressive when GPU support is there. Sadly no code has been released yet, but there is clearly potential for such devices.
I made ckpt and ggml versions.
P.S.: I tried to run the f32 model but got this error:
I used this script to make the ckpt:
I ran it this way:
I also managed to find a ckpt for an older version of SD nano: https://huggingface.co/Simona198710/stable-diffusion-nano-ckpt/resolve/main/stable-diffusion-nano.ckpt.ckpt
I converted it to f32 ggml and it worked. I ran it this way:
sd.exe -m stable-diffusion-nano.ckpt-ggml-model-f32.bin -t 8 --steps 40 --height 128 --width 128 --seed -1 -p "a female face"
It took 63 seconds to make the image on a Ryzen 7 4700U (turbo boost was off):
https://huggingface.co/NikolayKozloff/stable-diffusion-nano-2-1-ckpt/resolve/main/stable-diffusion-nano-2-1.ckpt is a model from SD2.x, which is currently not supported. I suggest you use models from SD1.x for now (for example, https://huggingface.co/Simona198710/stable-diffusion-nano-ckpt/resolve/main/stable-diffusion-nano.ckpt.ckpt), as I plan to add support for SD2.x models in the near future.
I did as you advised but got this error:
C:\sd>sd.exe -m stable-diffusion-nano-2-1-ggml-model-f32.bin -t 8 --steps 10 --height 128 --width 128 --seed -1 -p "a wolf wearing sun glasses, highly detailed"
It raised a new error:
C:\sd>sd.exe -m stable-diffusion-nano-2-1-ggml-model-f32.bin -t 8 --steps 10 --height 128 --width 128 --seed -1 -p "a wolf wearing sun glasses, highly detailed"
Sampling failed. Image wasn't created.
It should be fixed by now. You can pull the latest code and rebuild the executable. |
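(For anyone following along, here is a rough sketch of what "pull the latest code and rebuild" can look like; the exact build commands may differ from the repository's README, so treat this as an assumption rather than the canonical procedure.)

```sh
# Update to the latest code (assumes a git clone of stable-diffusion.cpp)
git pull origin master
git submodule update --init --recursive   # ggml is typically vendored as a submodule

# Rebuild the executable (typical CMake flow; generator/flags may differ on your platform)
mkdir -p build && cd build
cmake ..
cmake --build . --config Release
```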
Thank you. It works. An image was generated. However, an error message still remains in the console log:
[WARN] stable-diffusion.cpp:2991 - unknown tensor 'cond_stage_model.model.transformer.text_model.embeddings.position_ids' in model file
Don't worry, this won't have any effect on image generation; you can just ignore it.
Hi, I have the same error with the latest version:
This model:
Error:
Code:
Model list: What models can I use with this code? Is there a list of them? 128, 256, or 512x512 models? Thanks.
This is because the model was converted using the older models/convert.py, which did not support SD2.x at the time. You can use the latest models/convert.py for the conversion.
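(A hedged sketch of the re-conversion step; the argument names for models/convert.py below are assumptions, so check `python models/convert.py --help` in your checkout.)

```sh
# Make sure you have the current conversion script from master
git pull origin master

# Re-convert the original .ckpt with the up-to-date script.
# The --out_type flag name is an assumption; verify it against --help.
python models/convert.py stable-diffusion-nano-2-1.ckpt --out_type f32
```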
I'm using the current master script:
What was the original model you used before the conversion? I used the latest conversion script to convert this model (https://huggingface.co/NikolayKozloff/stable-diffusion-nano-2-1-ckpt/resolve/main/stable-diffusion-nano-2-1.ckpt), and I didn't encounter this issue. By the way, please don't download the ggml model (https://huggingface.co/NikolayKozloff/stable-diffusion-nano-2-1-ggml/tree/main) directly. You need to download the ckpt (https://huggingface.co/NikolayKozloff/stable-diffusion-nano-2-1-ckpt/resolve/main/stable-diffusion-nano-2-1.ckpt) file and then convert it.
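(In other words, the flow is roughly: grab the .ckpt, not the pre-made ggml file, and convert it locally. A sketch under the same assumption as above about convert.py's arguments.)

```sh
# Download the original checkpoint (not the pre-converted ggml file)
wget https://huggingface.co/NikolayKozloff/stable-diffusion-nano-2-1-ckpt/resolve/main/stable-diffusion-nano-2-1.ckpt

# Convert it with the current models/convert.py (flag name is an assumption)
python models/convert.py stable-diffusion-nano-2-1.ckpt --out_type f32
```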
Hi, now I can convert and it generates the model. But when I try it on my Mac M1, it takes around 87 secs for each sample:
Am I doing something wrong?
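(For reference, the main speed knobs on CPU are the thread count (-t), the number of steps (--steps), and the resolution, all flags already shown earlier in this thread. A hedged example; the binary name/path and the model file name are just illustrations, reusing the one from the earlier comment.)

```sh
# Fewer steps and a lower resolution cut sampling time roughly proportionally;
# -t should usually match the number of performance cores (e.g. 4 on an M1).
./sd -m stable-diffusion-nano.ckpt-ggml-model-f32.bin \
     -t 4 --steps 20 --height 128 --width 128 --seed -1 \
     -p "a wolf wearing sun glasses, highly detailed"
```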
I'm sorry for polluting the GitHub issues with non-bugs, but since that precedent was already set by #1 and there's no Discussions enabled, I thought it might be appropriate to share it here.
Laptop CPUs are always rather underpowered. As said in #15, even old desktop CPUs perform much better than modern mid-range laptops. Even more so, phones and ARM micro-computers are laughably slow.
Sampling can be sped up considerably by using a lower resolution, but models expectedly perform very poorly at resolutions lower than the ones they were trained at, producing only colorful abstract shapes that vaguely resemble the expected objects.
But someone on HuggingFace managed to fine-tune the model on 256x256 and 128x128 images to the point of getting coherent outputs!
(The weights need to be converted to .ckpt first.) This is great news for CPU inference, since the sampling time was cut in half! The outputs might have looked slightly less detailed, but were perfectly coherent.
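(To reproduce this kind of run, the flow is roughly: convert the fine-tuned checkpoint with models/convert.py and then sample at the fine-tune's native low resolution. The file names below are hypothetical placeholders, and the convert.py argument names are assumptions; the sd flags are the ones used elsewhere in this thread.)

```sh
# Hypothetical file name for a 256x256 fine-tune; substitute the actual checkpoint.
python models/convert.py miniSD.ckpt --out_type f32

# Sample at the resolution the fine-tune was trained on (256x256 here)
./sd -m miniSD.ckpt-ggml-model-f32.bin -t 8 --steps 40 \
     --height 256 --width 256 --seed -1 -p "a female face"
```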
I haven't investigated whether there are any differences in outputs between stable-diffusion.cpp and the official implementation, or whether quantization has a greater impact at lower resolutions, but it does seem promising for real-life usage of this project.