DDIM sampler on Stable Diffusion does not work well with CFG guidance scale larger than 6~7 #1602
Comments
cc @patrickvonplaten @patil-suraj this could be related to the missing dynamic thresholding.
@anton-l @patrickvonplaten @patil-suraj Hi, thanks for the response! However, I don't think this is related to the missing dynamic thresholding (other samplers work just fine at the same CFG scale). I did the same thing using the same model, same prompt, and same sampler in WebUI, where DDIM functions normally. I used a debugger to trace through the code and I believe there is no dynamic thresholding there. In fact, they are using the sampler implemented in the runwayml/stable-diffusion repo. I suspect it is the formula, but I don't see any inconsistency, and the beta calculation is very close.
Hey @Randolph-zeng, I could not reproduce the bug you've shown in your code example above.

```python
from diffusers import StableDiffusionPipeline, DDIMScheduler
import torch

# swap any SD model here
pipe = StableDiffusionPipeline.from_pretrained('runwayml/stable-diffusion-v1-5').to("cuda")
# replace the default scheduler with DDIM, reusing the existing scheduler config
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

prompt = "A photograph of an astronaut riding a horse on Mars, high resolution, high definition."
# fixed seed so runs are comparable across guidance scales
generator = torch.Generator(device="cuda").manual_seed(33)

# change the guidance scale from 1 - 15 and observe the performance degradation
image = pipe(prompt=prompt, num_inference_steps=100, guidance_scale=7.5, generator=generator).images[0]
image.save("/home/patrick_huggingface_co/images/aa.png")
```

A reason why your scheduler might not have been working correctly could be that you were using the fp16 branch, which previously had an incorrect setting. We will also make sure to update the conversion script accordingly!
Here is the PR that updates the conversion script: #1667
@patrickvonplaten OMG!! Yes, this is exactly the reason why it is failing! Thanks a lot for finding it. This bug has tortured me for a week! I checked everything except the clipping, and that's why DDIM is failing: its normal range is [-4, 4], and clipping makes all the guidance signal go away!
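To make the clipping effect concrete, here is a minimal pure-Python sketch with scalar stand-ins for the latents. It assumes the clip is applied to the predicted original sample and uses the range [-1, 1] (in diffusers this is, as far as I know, controlled by the scheduler's `clip_sample` config flag). The numbers `x_t`, `eps_uncond`, `eps_text`, and `alpha_bar` are made up for illustration and do not come from the model.

```python
def guided_eps(eps_uncond, eps_text, scale):
    """Classifier-free guidance: push the unconditional prediction toward the text one."""
    return eps_uncond + scale * (eps_text - eps_uncond)


def predict_x0(x_t, eps, alpha_bar):
    """Predicted original sample from the noisy sample and the predicted noise."""
    return (x_t - (1.0 - alpha_bar) ** 0.5 * eps) / alpha_bar ** 0.5


def clip(x, lo=-1.0, hi=1.0):
    """Clamp a scalar to [lo, hi]; [-1, 1] is the assumed clipping range."""
    return max(lo, min(hi, x))


# Hypothetical scalar values; SD latents are reported above to span about [-4, 4].
x_t, eps_uncond, eps_text, alpha_bar = 2.5, 0.2, 0.4, 0.5

x0_low = predict_x0(x_t, guided_eps(eps_uncond, eps_text, 2.5), alpha_bar)
x0_high = predict_x0(x_t, guided_eps(eps_uncond, eps_text, 10.0), alpha_bar)

# Without clipping the two guidance scales give different predictions;
# after clipping both collapse to the same boundary value.
print("unclipped:", x0_low, x0_high)
print("clipped:  ", clip(x0_low), clip(x0_high))
```

In this toy case both predictions lie outside [-1, 1], so the clipped values are identical and the difference introduced by the guidance scale is lost.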
Describe the bug
I have noticed that SD v1.4 and v1.5 work poorly if I swap the scheduler to DDIM and use a guidance scale larger than 7.
![Screenshot 2022-12-08 at 15 08 04](https://user-images.githubusercontent.com/11933185/206381558-8c4ff9a1-dbd4-4073-bcf6-a3d62d8ced76.png)
This behavior does not seem to appear with other samplers, such as the default PNDM scheduler. At first I suspected it was related to the "train-test mismatch" mentioned in Sec. 2.3 of the Imagen paper. However, I found that the same DDIM sampler in WebUI does not suffer from the same performance degradation at the same guidance scale. I manually traced the scale of the predicted epsilon values under the same prompt and guidance scale in both the WebUI DDIM sampler and the diffusers DDIM sampler; they are all within a similar range of [-4, 4]. I also printed out the beta schedules of both, and they are very close (of course I tried replacing the diffusers DDIM betas with WebUI's DDIM betas, but it did not help).
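A comparison like the one described above can be sketched in plain Python. The schedule below is an assumption, not something stated in this issue: a "scaled linear" schedule with `beta_start=0.00085`, `beta_end=0.012`, and 1000 training steps, which are the values I believe SD v1 scheduler configs ship with. The helper names are hypothetical.

```python
def scaled_linear_betas(beta_start=0.00085, beta_end=0.012, num_steps=1000):
    """Betas spaced linearly in sqrt-space, then squared (assumed SD v1 schedule)."""
    lo, hi = beta_start ** 0.5, beta_end ** 0.5
    return [(lo + (hi - lo) * i / (num_steps - 1)) ** 2 for i in range(num_steps)]


def max_abs_diff(a, b):
    """Largest element-wise gap between two equally long schedules."""
    if len(a) != len(b):
        raise ValueError("schedules must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def value_range(values):
    """(min, max) of a flat sequence, e.g. flattened predicted-epsilon values."""
    return min(values), max(values)
```

Calling `max_abs_diff` on the two beta lists dumped from each implementation gives a single number for "how close", and `value_range` on the flattened epsilon predictions gives the kind of [-4, 4] range check mentioned above.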
Reproduction
Reproducing is easy, and the behavior of the following code snippet is consistent across diffusers versions from 0.3 to 0.9.
guidance_scale = 2.5
![Screenshot 2022-12-08 at 15 19 25](https://user-images.githubusercontent.com/11933185/206383653-4d8e4a86-ef97-4676-8d8f-6efa5d3d2214.png)
guidance_scale = 5.0
![Screenshot 2022-12-08 at 15 20 10](https://user-images.githubusercontent.com/11933185/206383789-c4b13e05-aff9-4244-a0ac-5b775a64f4e2.png)
guidance_scale = 7.5
![Screenshot 2022-12-08 at 15 22 16](https://user-images.githubusercontent.com/11933185/206384175-7cb5959f-9001-40f4-a645-937312c0cfd2.png)
guidance_scale = 10.0
![Screenshot 2022-12-08 at 15 21 10](https://user-images.githubusercontent.com/11933185/206383953-ce7d1b10-511c-4a73-bacc-1b683933f5f7.png)
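A sweep like the one pictured above could be scripted roughly as follows. This is only a sketch: `sweep_guidance_scales` and `make_generator` are hypothetical names, and `pipe` is assumed to be a diffusers-style pipeline callable that accepts `prompt`, `num_inference_steps`, `guidance_scale`, and `generator` and returns an object with an `images` list.

```python
def sweep_guidance_scales(pipe, prompt, scales, make_generator, num_inference_steps=100):
    """Run the same prompt at several guidance scales, each from the same seed."""
    images = {}
    for scale in scales:
        # Build a fresh generator per run so every scale starts from identical noise.
        generator = make_generator()
        result = pipe(
            prompt=prompt,
            num_inference_steps=num_inference_steps,
            guidance_scale=scale,
            generator=generator,
        )
        images[scale] = result.images[0]
    return images


# Example call (requires torch, diffusers, and a GPU; shown as a comment only):
# images = sweep_guidance_scales(
#     pipe, prompt, [2.5, 5.0, 7.5, 10.0],
#     make_generator=lambda: torch.Generator(device="cuda").manual_seed(33),
# )
```

Recreating the generator for each scale matters because a generator's state advances as it is used, so reusing one object would give each run different starting noise.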
Logs
No response
System Info
Both diffusers == 0.3.0 and diffusers == 0.9.0 show this issue.
SD v1-4 and v1-5 both show it as well.