
DDIM sampler on Stable Diffusion does not work well with CFG guidance scale larger than 6~7 #1602

Closed
Randolph-zeng opened this issue Dec 8, 2022 · 5 comments
Labels
bug Something isn't working

Comments

@Randolph-zeng
Contributor

Randolph-zeng commented Dec 8, 2022

Describe the bug

I have noticed that SD v1.4 and v1.5 work poorly if I swap the scheduler to DDIM and use a guidance scale larger than 7.
This behavior does not seem to occur with other samplers such as the default PNDM scheduler. At first I suspected it was related to the "train-test mismatch" mentioned in the Imagen paper, Sec. 2.3. However, I found that the same DDIM sampler in WebUI does not suffer from the same performance degradation at the same guidance scale. I have manually traced the scales of the predicted epsilon values under the same prompt/guidance scale in both the WebUI DDIM sampler and the diffusers DDIM sampler; they are all within a similar range of [-4, 4]. I have also printed out the beta schedules of both and they are very close (of course I tried replacing the diffusers DDIM betas with WebUI's DDIM betas, but it does not help).
[Screenshot 2022-12-08 at 15:08:04]
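
For context, the guidance scale enters through the standard classifier-free guidance combination of the unconditional and text-conditional noise predictions; a minimal sketch (variable names are illustrative, not the pipeline's internals):

import torch

# Classifier-free guidance: start from the unconditional prediction and push it
# toward the conditional one, scaled by guidance_scale. At scale 10 the difference
# is amplified tenfold, so the combined epsilon can leave the range that either
# individual prediction occupies.
def cfg_combine(eps_uncond: torch.Tensor, eps_cond: torch.Tensor, guidance_scale: float) -> torch.Tensor:
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)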

Reproduction

Reproduction is easy, and the behavior of the following code snippet is consistent across diffusers versions 0.3 to 0.9:

from diffusers import StableDiffusionPipeline, DDIMScheduler
# swap any SD model here
pipe = StableDiffusionPipeline.from_pretrained('/xxxx/stable-diffusion-v1-5').to(0)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
prompt = "A photograph of an astronaut riding a horse on Mars, high resolution, high definition."
# change the guidance scale from 1 - 15 and observe the performance degradation
image = pipe(prompt=prompt, num_inference_steps=100, guidance_scale=10.).images[0]

guidance_scale = 2.5
[Screenshot 2022-12-08 at 15:19:25]

guidance_scale = 5.0
[Screenshot 2022-12-08 at 15:20:10]

guidance_scale = 7.5
[Screenshot 2022-12-08 at 15:22:16]

guidance_scale = 10.
[Screenshot 2022-12-08 at 15:21:10]

Logs

No response

System Info

Both diffusers == 0.3.0 and diffusers == 0.9.0 suffer from this issue.
Both SD v1-4 and v1-5 suffer from it as well.

@anton-l
Member

anton-l commented Dec 8, 2022

cc @patrickvonplaten @patil-suraj this could be related to the missing dynamic thresholding
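
(For reference, dynamic thresholding as described in the Imagen paper, Sec. 2.3, replaces the fixed [-1, 1] clipping of the predicted x0 with a per-sample percentile threshold; a rough sketch with illustrative names, not diffusers code:)

import torch

# Clip the predicted x0 to a per-sample percentile s of its absolute values
# (never below 1.0), then rescale by s so the result stays in [-1, 1] without
# destroying the relative structure that guidance added.
def dynamic_threshold(x0: torch.Tensor, percentile: float = 0.995) -> torch.Tensor:
    s = torch.quantile(x0.abs().reshape(x0.shape[0], -1), percentile, dim=1)
    s = torch.clamp(s, min=1.0).view(-1, *([1] * (x0.ndim - 1)))
    return torch.clamp(x0, -s, s) / s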

@Randolph-zeng
Contributor Author

Randolph-zeng commented Dec 9, 2022

@anton-l @patrickvonplaten @patil-suraj Hi, thanks for the response! However, I don't think this is related to the missing dynamic thresholding (other samplers work just fine at the same CFG scale). I did the same thing using the same model, same prompt, and same sampler in WebUI (where DDIM functions normally). I used a debugger to trace through the code and I believe there is no dynamic thresholding there. In fact, they are using the sampler implemented in the runwayml/stable-diffusion repo. I suspected it was the formula, but I don't see any inconsistency, and the beta calculation is very close.

@patrickvonplaten
Contributor

Hey @Randolph-zeng,

I could not reproduce the bug you've shown in your code example above.
The following code snippet works well for me and gives subjectively good results:

from diffusers import StableDiffusionPipeline, DDIMScheduler
import torch

# swap any SD model here
pipe = StableDiffusionPipeline.from_pretrained('runwayml/stable-diffusion-v1-5').to(0)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
prompt = "A photograph of an astronaut riding a horse on Mars, high resolution, high definition."

generator = torch.Generator(device="cuda").manual_seed(33)

# change the guidance scale from 1 - 15 and observe the performance degradation
image = pipe(prompt=prompt, num_inference_steps=100, guidance_scale=7.5, generator=generator).images[0]
image.save("/home/patrick_huggingface_co/images/aa.png")

E.g.:
[Generated image: aa (15)]

A reason why your scheduler might not have been working correctly could be that you were using the fp16 branch, which previously had clip_sample set incorrectly. This has now been corrected: https://huggingface.co/runwayml/stable-diffusion-v1-5/commit/ded79e214aa69e42c24d3f5ac14b76d568679cc2
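
One way to verify this locally (a sketch, not an official check) is to inspect the flag after building the DDIM scheduler and, if an older converted checkpoint still carries the wrong value, override it explicitly:

from diffusers import StableDiffusionPipeline, DDIMScheduler

pipe = StableDiffusionPipeline.from_pretrained('runwayml/stable-diffusion-v1-5')
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
print(pipe.scheduler.config.clip_sample)  # should be False for Stable Diffusion latents

# If a local/converted checkpoint still has clip_sample=True, override it when
# constructing the scheduler:
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, clip_sample=False)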

We will also make sure to update the conversion script accordingly!

@patrickvonplaten
Contributor

Here is the PR that updates the conversion script: #1667

@Randolph-zeng
Contributor Author

@patrickvonplaten OMG!! Yes, this is exactly why it was failing!!! This bug really tortured me for a week! I checked everything but just did not check the clipping; that's why DDIM was failing: the normal value range is [-4, 4], and clipping makes all the guidance signal go away!
Thanks a lot for finding it, I will close this issue now.
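
A toy illustration of the effect (numbers are made up to make the point, not traced from the model):

import torch

# Stand-in for a CFG-amplified predicted x0 whose values span roughly [-4, 4].
pred_original_sample = 4 * (torch.rand(1, 4, 64, 64) * 2 - 1)

# clip_sample=True hard-clips this to [-1, 1] inside the DDIM step.
clipped = pred_original_sample.clamp(-1.0, 1.0)

# Most of the dynamic range that the guidance scale added is simply saturated away.
print(f"{(pred_original_sample.abs() > 1).float().mean().item():.0%} of values clipped")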
