Flux Control LoRA #9999

Merged: 79 commits merged into main from flux-control-lora on Dec 10, 2024
Conversation

a-r-r-o-w
Member

As discussed internally, Control LoRA requires some additional changes to the LoRA loading support that currently exists, which need to be made carefully without breaking backwards compatibility. This PR continues the discussion in #9985 and depends on it being merged first for the pipelines. The other, unrelated changes will go away once we rebase.

https://huggingface.slack.com/archives/C065E480NN9/p1732303859757229?thread_ts=1732208754.296369&cid=C065E480NN9

@a-r-r-o-w a-r-r-o-w requested a review from sayakpaul November 23, 2024 00:46
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@a-r-r-o-w a-r-r-o-w marked this pull request as draft November 23, 2024 01:12
@sayakpaul
Member

@a-r-r-o-w I was able to load the LoRA after applying @BenjaminBossan's suggestions. My updates are in the sayak-flux-control-lora branch.

However, I am facing a problem when running inference:

Traceback (most recent call last):
  File "/home/sayak/diffusers/check_flux_control_lora.py", line 26, in <module>
    image = pipeline(
  File "/home/sayak/.pyenv/versions/3.10.12/envs/diffusers/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/sayak/diffusers/src/diffusers/pipelines/flux/pipeline_flux_control.py", line 782, in __call__
    control_image = self._pack_latents(
  File "/home/sayak/diffusers/src/diffusers/pipelines/flux/pipeline_flux_control.py", line 475, in _pack_latents
    latents = latents.view(batch_size, num_channels_latents, height // 2, 2, width // 2, 2)
RuntimeError: shape '[1, 8, 64, 2, 64, 2]' is invalid for input of size 262144

This is happening because self.transformer.config.in_channels is still 64, whereas it should be 128 after the state dict is expanded. We could leverage the register_to_config() method to update the transformer's config whenever we expand a key, as sketched below.
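
A minimal sketch of that idea, assuming the expanded input projection is transformer.x_embedder (the names here are illustrative, not the final implementation):

# After the input projection is expanded from 64 to 128 in_features,
# sync the config so _pack_latents computes the right shapes.
expanded_in_features = transformer.x_embedder.in_features
if transformer.config.in_channels != expanded_in_features:
    transformer.register_to_config(in_channels=expanded_in_features)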

There are some problems w.r.t. unexpected keys (mostly concerning bias keys) during LoRA loading as well. I need to investigate this further.

Code:
from diffusers import FluxControlPipeline, FluxTransformer2DModel
from diffusers.utils import load_image
from image_gen_aux import DepthPreprocessor
import torch

# Base transformer from FLUX.1-dev; its input projection is expanded
# (64 -> 128 channels) when the Control LoRA is loaded.
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev", subfolder="transformer", torch_dtype=torch.bfloat16
)
pipeline = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Depth-dev",
    transformer=transformer,
    revision="refs/pr/1",
    torch_dtype=torch.bfloat16,

# Obtained by running `convert_flux_control_lora_to_diffusers.py` on the depth LoRA:
# https://hf.co/black-forest-labs/FLUX.1-Depth-dev-lora
pipeline.load_lora_weights("/home/sayak/diffusers/scripts/converted.safetensors")

prompt = "A robot made of exotic candies and chocolates of different kinds. The background is filled with confetti and celebratory gifts."
control_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/robot.png")

processor = DepthPreprocessor.from_pretrained("LiheYoung/depth-anything-large-hf")
control_image = processor(control_image)[0].convert("RGB")

image = pipeline(
    prompt=prompt,
    control_image=control_image,
    height=1024,
    width=1024,
    num_inference_steps=30,
    joint_attention_kwargs={"scale": 0.85},
    guidance_scale=10.0,
    generator=torch.Generator().manual_seed(42),
).images[0]
image.save("output_depth_lora.png")

I have to run some errands now, so I will let you investigate this. I will look into it after I return.

@sayakpaul
Member

Okay, I'm back. Looking into it now.

@yiyixuxu
Collaborator

yiyixuxu commented Dec 6, 2024

@BenjaminBossan ohh sorry for the trouble! 😬

@a-r-r-o-w
Member Author

@sayakpaul Updated the docs with a LoRA example. For the integration tests, I'm unable to get the slices on our L40 runners because the process keeps getting killed due to high CPU usage. I've pushed the tests (without expected slices) here for the moment. Where do the existing LoRA test slices come from, if it isn't possible to run these tests on the L40s?

@sayakpaul
Member

sayakpaul commented Dec 7, 2024

@a-r-r-o-w thanks! I have added the integration tests with the proper slices. I think this PR is now ready to be reviewed.
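
For anyone following along, a hedged sketch of how such expected slices are typically captured, reusing pipeline and control_image from the snippet above (the actual test harness differs in its exact inputs):

import numpy as np
import torch

# Run the pipeline deterministically and record a small corner of the
# output array; these values get hard-coded as the expected slice.
image = pipeline(
    prompt="a robot",
    control_image=control_image,
    num_inference_steps=2,
    output_type="np",
    generator=torch.manual_seed(0),
).images[0]
print(np.round(image[-3:, -3:, -1].flatten(), 4))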


@sayakpaul sayakpaul requested a review from yiyixuxu December 7, 2024 05:17
@a-r-r-o-w a-r-r-o-w added close-to-merge roadmap Add to current release roadmap labels Dec 10, 2024
@yiyixuxu yiyixuxu merged commit 49a9143 into main Dec 10, 2024
17 of 18 checks passed
@yiyixuxu yiyixuxu deleted the flux-control-lora branch December 10, 2024 19:08
@vladmandic
Contributor

pipe.load_lora_weights("black-forest-labs/FLUX.1-Depth-dev-lora")

This works for the standard model only.
For an NF4-quantized model, it fails with:

NotImplementedError: Only LoRAs with input/output features higher than the current module's input/output features are currently supported. The provided LoRA contains in_features=256 and out_features=3072, which are lower than module_in_features=1 and module_out_features=393216. If you require support for this please open an issue at https://github.com/huggingface/diffusers/issues.
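
The setup that triggers it is roughly the following (a hedged sketch; the exact quantization settings may differ):

from diffusers import BitsAndBytesConfig, FluxControlPipeline, FluxTransformer2DModel
import torch

# Quantize the transformer to 4-bit NF4; bitsandbytes packs the weights,
# so the reported module shapes no longer match the LoRA's expanded shapes.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.bfloat16
)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)
pipe = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.load_lora_weights("black-forest-labs/FLUX.1-Depth-dev-lora")  # raises the error above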

@sayakpaul
Member

Could you open a new issue and supplement it with a snippet?

@christopher5106

#10202

sayakpaul added a commit that referenced this pull request Dec 23, 2024
@scarbain

Hi, I'm having a really hard time trying to convert a trained Flux Control LoRA from the diffusers format back to the original BFL format (for use in ComfyUI). Do you have any tips or a script available for this, please?

@sayakpaul
Member

Sorry, we don't have a reverse script.
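
The general shape of one would be a key-renaming pass over the state dict. An entirely hypothetical sketch, where DIFFUSERS_TO_BFL would have to be built by inverting convert_flux_control_lora_to_diffusers.py (note the forward conversion also reshapes some tensors, e.g. fused QKV projections, so renaming alone is not sufficient):

import safetensors.torch

# Hypothetical placeholder: fill by inverting the forward conversion script.
DIFFUSERS_TO_BFL = {}

state_dict = safetensors.torch.load_file("converted.safetensors")
# Rename keys back to the BFL convention; tensors that were split or
# fused during the forward conversion would need the inverse op as well.
reversed_sd = {DIFFUSERS_TO_BFL.get(k, k): v for k, v in state_dict.items()}
safetensors.torch.save_file(reversed_sd, "bfl_format.safetensors")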
