Generating preview images with Stable Diffusion XL pipeline results in black images #6810

MonkeeMan1 · 2024-02-01T14:06:20Z

I'm working with the Stable Diffusion XL (SDXL) model from Hugging Face's diffusers library and encountering an issue where my callback function, intended to generate preview images during the diffusion process, only produces black images. This setup used to work with Stable Diffusion 1.5, but seems to have issues with SDXL.

The main difference I've noticed is in the handling of callbacks in SDXL, where latents are now stored in callback_kwargs. I've tried to adapt my code accordingly, but the previews are still not generated correctly.

Here's a minimal example of my current implementation:
```python
from diffusers import StableDiffusionXLPipeline
import torch

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
).to("cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"

def callback(pipe, step_index, timestep, callback_kwargs):
    latents = callback_kwargs.get("latents")

    with torch.no_grad():
        latents = 1 / 0.18215 * latents
        image = pipe.vae.decode(latents).sample
        image = (image / 2 + 0.5).clamp(0, 1)

        image = image.cpu().permute(0, 2, 3, 1).float().numpy()

        image = pipe.numpy_to_pil(image)[0]
        image.save(f"./imgs/{step_index}.png")

    return callback_kwargs

image = pipe(prompt=prompt, callback_on_step_end=callback).images[0]
```

The resulting images saved in ./imgs/ are just black. I suspect the issue might be related to the handling of latents or the image conversion process, but I'm not sure what specifically is going wrong.

Has anyone experienced a similar issue or can provide insight into why this might be happening with the SDXL model?

asomoza (Maintainer) · 2024-02-01T21:19:16Z

There's a known issue with the default VAE from SAI (Stability AI): it needs to run in full precision (float32), otherwise it overflows in float16 and outputs black images most of the time. The pipelines handle this automatically:

diffusers/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py

Lines 1290 to 1301 in adcbe67

    
    # make sure the VAE is in float32 mode, as it overflows in float16
    needs_upcasting = self.vae.dtype == torch.float16 and self.vae.config.force_upcast

    if needs_upcasting:
        self.upcast_vae()
        latents = latents.to(next(iter(self.vae.post_quant_conv.parameters())).dtype)

    image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False)[0]

    # cast back to fp16 if needed
    if needs_upcasting:
        self.vae.to(dtype=torch.float16)
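
For a preview callback, the same upcasting step could look roughly like the sketch below. It is not the pipeline's own code: `decode_latents_upcast` and `save_preview` are hypothetical names, and the sketch casts the whole VAE with `vae.to(dtype=torch.float32)` instead of calling `upcast_vae()`. It also reads the scaling factor from `vae.config.scaling_factor` rather than hard-coding 0.18215, which is the SD 1.5 value (the SDXL VAE config uses 0.13025).

```python
import os

import torch


def decode_latents_upcast(vae, latents):
    """Decode latents to an image tensor in [0, 1], shape (batch, height, width, channels)."""
    # Same condition the pipeline checks: the default SDXL VAE overflows in float16.
    needs_upcasting = vae.dtype == torch.float16 and vae.config.force_upcast
    if needs_upcasting:
        vae.to(dtype=torch.float32)
    try:
        with torch.no_grad():
            # Match the VAE's dtype and undo the latent scaling stored in the config.
            scaled = latents.to(vae.dtype) / vae.config.scaling_factor
            image = vae.decode(scaled, return_dict=False)[0]
    finally:
        # Cast back so the rest of the fp16 pipeline keeps working.
        if needs_upcasting:
            vae.to(dtype=torch.float16)
    image = (image / 2 + 0.5).clamp(0, 1)
    return image.cpu().permute(0, 2, 3, 1).float()


def save_preview(pipe, step_index, timestep, callback_kwargs):
    """Hypothetical callback: write one preview PNG per step into ./imgs/."""
    os.makedirs("./imgs", exist_ok=True)
    image = decode_latents_upcast(pipe.vae, callback_kwargs["latents"])
    pipe.numpy_to_pil(image.numpy())[0].save(f"./imgs/{step_index}.png")
    return callback_kwargs
```

It would be passed as `callback_on_step_end=save_preview`. Casting the VAE up and back on every step adds overhead, so decoding only every few steps may be preferable.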

If you're going to decode the latents manually, you'll need to implement the same workaround, or you can use a VAE that doesn't have this problem, such as this one:

https://huggingface.co/madebyollin/sdxl-vae-fp16-fix
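
Swapping that VAE into the pipeline might look like the following sketch. The function name `load_sdxl_with_fp16_vae` is hypothetical; the two repository IDs are the ones mentioned in this thread, and the imports sit inside the function only so that defining it stays cheap.

```python
def load_sdxl_with_fp16_vae(device="cuda"):
    """Load SDXL base with the fp16-fix VAE replacing the default one."""
    # Heavy imports are kept local; they run only when the function is called.
    import torch
    from diffusers import AutoencoderKL, StableDiffusionXLPipeline

    # This VAE is intended to run in float16 without overflowing.
    vae = AutoencoderKL.from_pretrained(
        "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
    )
    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        vae=vae,
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    )
    return pipe.to(device)
```

With this VAE, a manual `pipe.vae.decode(latents / pipe.vae.config.scaling_factor)` inside the callback should no longer need the float32 upcast, though the latents still have to be divided by the VAE's own scaling factor.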
