
Flux Control Lora not unloaded correctly #10202

Closed
christopher5106 opened this issue Dec 12, 2024 · 14 comments · Fixed by #10206
Labels
bug Something isn't working lora

Comments

@christopher5106

Describe the bug

Hi,

There is a bug when switching pipelines from Flux dev after the Control LoRA has been loaded:

Reproduction

import torch
from controlnet_aux import CannyDetector
from diffusers import FluxControlPipeline
from diffusers.utils import load_image
from diffusers import FluxImg2ImgPipeline


model = "black-forest-labs/FLUX.1-dev"
pipe = FluxControlPipeline.from_pretrained(
    model, 
    torch_dtype=torch.bfloat16
).to("cuda")

pipe.load_lora_weights(
    "black-forest-labs/FLUX.1-Canny-dev-lora"
)

prompt = "A robot made of exotic candies and chocolates of different kinds. The background is filled with confetti and celebratory gifts."
control_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/robot.png")

processor = CannyDetector()
control_image = processor(control_image, low_threshold=50, high_threshold=200, detect_resolution=1024, image_resolution=1024)

image = pipe(
    prompt=prompt,
    control_image=control_image,
    num_inference_steps=50,
    guidance_scale=30.0,
).images[0]

# pipe = FluxImg2ImgPipeline.from_pretrained(
#     "black-forest-labs/FLUX.1-dev", 
#     torch_dtype=torch.bfloat16
# )

pipe = FluxImg2ImgPipeline.from_pipe(
    pipe,
    torch_dtype=torch.bfloat16
)

pipe = pipe.to("cuda")

init_image = load_image("https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg").resize((1024, 1024))
prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"
image = pipe(
    prompt=prompt, 
    image=init_image, 
    num_inference_steps=28, 
    strength=0.5, 
    guidance_scale=2.5
).images[0]

Replacing the from_pipe loading with standard loading shows the previous code should work:

pipe = FluxImg2ImgPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-dev", 
     torch_dtype=torch.bfloat16
)

Logs

No response

System Info

Ubuntu

Who can help?

@yiyixuxu @sayakpaul @DN6 @asomoza

@christopher5106 christopher5106 added the bug Something isn't working label Dec 12, 2024
@yiyixuxu yiyixuxu added the lora label Dec 13, 2024
@sayakpaul
Member

I think this is expected.

If you're doing a from_pipe on the Control pipeline, the transformer has 128 input channels, which is not compatible with the Flux img2img pipeline.

We expand the transformer state dict to support loading the LoRA in question and we make explicit logging about that too.

So, we'd want to call pipe.unload_lora_weights() before calling from_pipe().
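
A minimal sketch of why the shapes clash (toy shapes, assumed from the 64 → 128 input-channel expansion; not the actual pipeline code):

```python
import torch

# After loading the Control LoRA, the transformer's input projection is
# expanded to accept 128 packed-latent channels (image + control latents).
# The plain img2img pipeline still feeds 64 channels, so the projection fails.
x_img2img = torch.randn(2, 64)               # 64-channel packed latents (toy)
expanded_proj = torch.nn.Linear(128, 3072)   # expanded input projection (toy)

try:
    expanded_proj(x_img2img)
except RuntimeError as e:
    print(f"shape mismatch: {e}")
```

pipe.unload_lora_weights() is meant to restore the original 64-channel projection, which is why it should be called before from_pipe() hands the transformer to the img2img pipeline.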

@christopher5106
Author

I tried with pipe.unload_lora_weights() before from_pipe and it still breaks

@sayakpaul
Member

Yeah looking into that now.

@sayakpaul
Member

@christopher5106 #10206 should probably solve the problem.

@christopher5106
Author

Yes, it does work when using unload_lora_weights before the switch.

@sayakpaul
Member

That is expected for the reasons I mentioned:
#10202 (comment)

Do you think this should be documented?

@yiyixuxu any thoughts?

@christopher5106
Author

@sayakpaul
Just for my knowledge, what is the purpose of the torch.cat on a single-element list:

converted_state_dict[f"{block_prefix}attn.to_q.{lora_key}.weight"] = torch.cat([sample_q])

@a-r-r-o-w
Member

It's a no-op here, I think. It comes from copying the Flux conversion script, which also has this:

converted_state_dict[f"{block_prefix}attn.to_q.weight"] = torch.cat([sample_q])

It probably slipped through by mistake, or was serving a different purpose when it was added and eventually evolved into a single-element list. It can be tackled in a refactor.
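
For illustration, torch.cat over a single-element list simply returns a tensor with the same shape and values as its input:

```python
import torch

sample_q = torch.randn(4, 8)
out = torch.cat([sample_q])  # concatenating a single tensor changes nothing

assert out.shape == sample_q.shape
assert torch.equal(out, sample_q)
```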

@christopher5106
Author

@sayakpaul I thought it was solved, but it no longer is; I don't know why, but something has changed. Please reopen.

@a-r-r-o-w a-r-r-o-w reopened this Jan 8, 2025
@sayakpaul
Member

What is the problem?

@christopher5106
Author

Unloading flux control lora does not work anymore.

import torch
from controlnet_aux import CannyDetector
from diffusers import FluxControlPipeline
from diffusers.utils import load_image
from diffusers import FluxImg2ImgPipeline


model = "black-forest-labs/FLUX.1-dev"
pipe = FluxControlPipeline.from_pretrained(
    model, 
    torch_dtype=torch.bfloat16
).to("cuda")

pipe.load_lora_weights(
    "black-forest-labs/FLUX.1-Canny-dev-lora"
)

prompt = "A robot made of exotic candies and chocolates of different kinds. The background is filled with confetti and celebratory gifts."
control_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/robot.png")

processor = CannyDetector()
control_image = processor(control_image, low_threshold=50, high_threshold=200, detect_resolution=1024, image_resolution=1024)

image = pipe(
    prompt=prompt,
    control_image=control_image,
    num_inference_steps=50,
    guidance_scale=30.0,
).images[0]

pipe.unload_lora_weights()

pipe = FluxImg2ImgPipeline.from_pipe(
    pipe,
    torch_dtype=torch.bfloat16
)

pipe = pipe.to("cuda")

init_image = load_image("https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg").resize((1024, 1024))
prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"
image = pipe(
    prompt=prompt, 
    image=init_image, 
    num_inference_steps=28, 
    strength=0.5, 
    guidance_scale=2.5
).images[0]
    image = pipe(
            ^^^^^
  File "venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "venv/lib/python3.12/site-packages/diffusers/pipelines/flux/pipeline_flux_img2img.py", line 773, in __call__
    latents, latent_image_ids = self.prepare_latents(
                                ^^^^^^^^^^^^^^^^^^^^^
  File "venv/lib/python3.12/site-packages/diffusers/pipelines/flux/pipeline_flux_img2img.py", line 565, in prepare_latents
    latents = self.scheduler.scale_noise(image_latents, timestep, noise)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "venv/lib/python3.12/site-packages/diffusers/schedulers/scheduling_flow_match_euler_discrete.py", line 187, in scale_noise
    sample = sigma * noise + (1.0 - sigma) * sample
             ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
RuntimeError: The size of tensor a (32) must match the size of tensor b (16) at non-singleton dimension 1
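
The last frame can be reproduced in isolation with toy tensors (channel counts taken from the error message; the other shapes are assumed):

```python
import torch

sigma = 0.5
sample = torch.randn(1, 32, 64, 64)  # latents from the still-expanded model
noise = torch.randn(1, 16, 64, 64)   # noise shaped for the stock img2img config

try:
    sigma * noise + (1.0 - sigma) * sample  # same op as in scale_noise
except RuntimeError as e:
    print(e)  # mismatched channel dimension, as in the traceback above
```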

@christopher5106
Author

When you asked me last month, it was working. In particular, I double-checked now: unloading works well with commit 1b202c5 but no longer with a recent version of diffusers.

@sayakpaul
Member

@christopher5106
Author

A parameter, reset_to_overwritten_params=True, was introduced in between my validations... So with this parameter, i.e. pipe.unload_lora_weights(reset_to_overwritten_params=True), it now works well.
