Add Flux inpainting and Flux Img2Img #9135
Conversation
@Gothos This is looking great! Since this PR is not yet marked for review, I assume it is incomplete in some ways. Let us know if you're facing any problems and we'd be happy to help. There are a couple of issues and messages from folks asking to have this implemented and usable from diffusers, so really nice of you to take this up :) cc @asomoza here for more testing and implementation/noising improvements 🤩 |
It works out of the box. I probably should have marked it as ready for review, since it's missing only support for inpainting-specific checkpoints (i.e. models similar to stable-diffusion-xl-1.0-inpainting-0.1, which we don't have for Flux) and docs. |
Hi @Gothos 👋🏻 can you provide any usage example showing how to run the inpainting pipeline? |
Sure! First install diffusers from this PR's branch, then:

```python
from diffusers import FluxInpaintPipeline
from PIL import Image
import torch

pipe = FluxInpaintPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)

prompt = "your prompt here"
image = pipe(
    prompt,
    image=Image.open("path/to/image"),
    mask_image=Image.open("path/to/mask"),
    strength=0.85,  # below 0.85 doesn't seem to cause a lot of change
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]
```

Just replacing the path to the image, the path to the mask, and the prompt should work. |
@asomoza if I'm not wrong the inpainting trained flux endpoint should check for 132 channels? If this is the case I'll probably finish the PR today. |
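For context on where 132 could come from: Flux packs latents into 2x2 patches, so an inpainting checkpoint that concatenates image latents, masked-image latents, and the mask would see the channel count below. This accounting is an assumption for illustration, not something confirmed in this PR:

```python
# Hypothetical channel accounting for an inpainting-specific Flux checkpoint.
# Flux packs 2x2 latent patches, multiplying the channel count by 4 (assumption).
LATENT_CHANNELS = 16   # Flux VAE latent channels
PACK_FACTOR = 4        # 2x2 patchification
MASK_CHANNELS = 1      # binary inpainting mask

image_latents = LATENT_CHANNELS * PACK_FACTOR         # 64
masked_image_latents = LATENT_CHANNELS * PACK_FACTOR  # 64
packed_mask = MASK_CHANNELS * PACK_FACTOR             # 4

total = image_latents + masked_image_latents + packed_mask
print(total)  # 132
```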
Also correct me if I'm wrong, but isn't img2img equivalent to inpainting with an all-white mask, i.e. not selectively blending latents in the denoise step? I can add an img2img pipeline as well if this is the case, since it'll involve minimal changes from inpainting. @a-r-r-o-w @asomoza |
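To illustrate the equivalence being claimed: with an all-white mask the inpainting blend keeps nothing from the original latents, which is exactly plain img2img. A minimal numpy sketch of that blend (the 255-means-repaint mask convention and the per-pixel blend formula are stated assumptions, not code from this PR):

```python
import numpy as np

def all_white_mask(width, height):
    # 255 = fully masked, i.e. "repaint this pixel" (assumed convention)
    return np.full((height, width), 255, dtype=np.uint8)

def blend(denoised, original_latents, mask):
    # Inpainting blend: keep original where mask == 0, denoised where mask == 255.
    m = mask.astype(np.float32) / 255.0
    return m * denoised + (1.0 - m) * original_latents

denoised = np.zeros((4, 4), np.float32)
original = np.ones((4, 4), np.float32)
out = blend(denoised, original, all_white_mask(4, 4))
print(out.max())  # 0.0 -> the original is ignored everywhere, i.e. plain img2img
```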
@Gothos awesome work! I built a FLUX.1 inpainting HF space using code from this PR: https://huggingface.co/spaces/SkalskiP/FLUX.1-inpaint |
Yeah saw the space and the linkedin post! Thanks for the mention! |
Nice work @Gothos! A few things before we merge. Can we
|
Will do today. |
@Gothos Yeah you can do that as well 👍🏽 |
Cool, will do all these and request a review. |
Looks like this fails on mac with MPS. There's been some recent fixes to FLUX for diffusers that might have to be added here as well? |
This is the error I get: |
Hmm I don't really have a mac to test this. Could you point out the PR? |
```python
shape = (batch_size, num_channels_latents, height, width)
latent_image_ids = self._prepare_latent_image_ids(batch_size, height, width, device, dtype)

if latents is not None:
```
To be consistent with the definition of the latents input in our other img2img pipelines (they are image latents). See diffusers/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_img2img.py, line 1277 at 007ad0e:

```python
if latents is None:
```
```python
if latents is None:
    noise = randn_tensor(shape, generator=generator, device=device, dtype=dtype)
    latents = self.scheduler.scale_noise(image_latents, timestep, noise)
```
Note that we do not need is_strength_max for flow-match-based models: the result is pure noise when strength == 1, since

```python
sample = sigma * noise + (1.0 - sigma) * sample
```

Will remove that for SD3 inpaint too.
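The point above can be checked directly: flow-match noising is a linear interpolation between the sample and the noise, so sigma = 1 (strength 1) yields the noise tensor exactly, with no need for an is_strength_max special case. A numpy sketch (the interpolation formula is the one quoted above; the tensor shapes are illustrative):

```python
import numpy as np

def scale_noise(sample, sigma, noise):
    # Flow-match noising: linear interpolation between sample and noise.
    return sigma * noise + (1.0 - sigma) * sample

rng = np.random.default_rng(0)
sample = rng.standard_normal((2, 16, 8, 8))
noise = rng.standard_normal((2, 16, 8, 8))

# strength == 1 -> sigma == 1 -> output is exactly the noise tensor,
# so no is_strength_max branch is needed.
assert np.array_equal(scale_noise(sample, 1.0, noise), noise)
# strength == 0 -> sigma == 0 -> output is the original image latents.
assert np.array_equal(scale_noise(sample, 0.0, noise), sample)
```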
@Gothos If you can make some final checks, that would be great! (no worries if not) nd sorry we're a bit slow in this |
Haha I should be the one apologising, I've been too slow on this! I'll run
some examples on my end.
|
What I can tell you is that as good as Flux is for modest inpainting (filling in a masked region) it is very poor at outpainting (replacing everything but the mask object). Flux needs an inpainting version. |
It will lose the ability to inpaint; it becomes a text2img task in the designated masked area. |
When I use higher strength, FLUX overpaints a completely new image into my masked region. It doesn't scale or blend properly, as inpainting should. Yes, if I reduce strength it will somewhat work, but not as well as an inpainting model should. The whole point is that FLUX needs an inpainting model to work as well as SDXL inpainting works.
> @ssxxx1a try higher denoising strength. Larger than 0.85 works fine; start from 1.0 to understand if that is the issue.
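For readers puzzled by the strength behavior discussed above: in diffusers img2img-style pipelines, strength determines how many of the scheduled denoising steps are actually run, so high strength starts from nearly pure noise and can "overpaint" the masked region. A simplified sketch of that timestep-truncation pattern (modeled on the get_timesteps helpers in diffusers img2img pipelines; the toy schedule is illustrative):

```python
def get_timesteps(num_inference_steps, strength, timesteps):
    # Run only the last `strength` fraction of the schedule:
    # strength=1.0 -> all steps (start from pure noise),
    # strength=0.3 -> last 30% of steps (stay close to the input image).
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return timesteps[t_start:], num_inference_steps - t_start

timesteps = list(range(50, 0, -1))  # a toy 50-step schedule
ts, n = get_timesteps(50, 0.85, timesteps)
print(n)  # 42 steps actually executed at strength 0.85
```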
We need a Flux inpainting model like this one: https://huggingface.co/diffusers/stable-diffusion-xl-1.0-inpainting-0.1. But it requires a lot of GPU resources/money to train. |
Unfortunately yes if FLUX is to do proper inpainting / outpainting just
like SDXL does.
|
Thanks for your code! How much GPU VRAM does it consume? |
What does this PR do?
PR to add Flux inpainting and Flux Img2Img pipelines.
Adds basic Flux inpainting. This still has some way to go, especially since a Flux equivalent of 9-channel inpainting checkpoints is not supported yet. I'd also like comments on the noising.


Image, mask, and inpainting a cactus at strengths from 0.65 to 0.9