[Community Pipeline] MagicMix #1839

Merged: 7 commits, Dec 28, 2022

daspartho
Contributor

Community pipeline based on my implementation of the MagicMix: Semantic Mixing with Diffusion Models paper.

This Diffusion Pipeline allows for the semantic mixing of an image and a text prompt to create a new concept while preserving the spatial layout and geometry of the subject in the image.

Here are some examples I reproduced from the paper using my implementation:

Input Image: [image: telephone]
Prompt: "Bed"
Output Image: [image: telephone-bed]

Input Image: [image: sign]
Prompt: "Family"
Output Image: [image: sign-family]

Input Image: [image: sushi]
Prompt: "ice-cream"
Output Image: [image: sushi-ice-cream]

Input Image: [image: pineapple]
Prompt: "Cake"
Output Image: [image: pineapple-cake]

@patrickvonplaten could you please take a look? Looking forward to any comments!
Thanks :)

Reference: #841

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Dec 26, 2022

The documentation is not available anymore as the PR was closed or merged.

@aengusng8
Contributor

Cool! But why did you put it in the community pipeline instead of the internal pipeline?

Contributor

@patil-suraj left a comment

Very cool, @daspartho! The PR looks good; I just left some nits.

pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="magic_mix",
    scheduler = DDIMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", clip_sample=False, set_alpha_to_one=False),
Contributor

The scheduler can be loaded using

DDIMScheduler.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="scheduler")

Comment on lines +837 to +839
pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="magic_mix",
Contributor

Also, maybe load the pipeline in fp16, by passing the torch_dtype argument, to make inference faster.
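
A minimal sketch of how the two suggestions above (loading the scheduler with `from_pretrained` and loading the pipeline in fp16) might be combined. The helper name `load_magic_mix` and its `device` argument are hypothetical and not part of the PR; the fallback to fp32 on non-CUDA devices is an assumption added for illustration.

```python
def load_magic_mix(model_id: str = "CompVis/stable-diffusion-v1-4", device: str = "cuda"):
    """Load the magic_mix community pipeline with a DDIM scheduler (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require the
    # libraries or a network connection until it is actually called.
    import torch
    from diffusers import DDIMScheduler, DiffusionPipeline

    # Build the scheduler from the config stored with the checkpoint instead of
    # spelling out beta_start, beta_end, etc. by hand.
    scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")

    # Assumption: use half precision only on a CUDA device, fp32 otherwise.
    dtype = torch.float16 if device == "cuda" else torch.float32

    pipe = DiffusionPipeline.from_pretrained(
        model_id,
        custom_pipeline="magic_mix",
        scheduler=scheduler,
        torch_dtype=dtype,
    )
    return pipe.to(device)
```

Calling `load_magic_mix()` downloads the model weights on first use, so it needs network access and, with the default arguments, a CUDA-capable GPU.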

    prompt: str,
    kmin: float = 0.3,
    kmax: float = 0.6,
    v: float = 0.5,
Contributor

@patil-suraj patil-suraj Dec 27, 2022

Could we use a more descriptive name for this argument? One-letter variables aren't informative.

@daspartho
Contributor Author

@patil-suraj made some changes :)

@daspartho
Contributor Author

daspartho commented Dec 28, 2022

Could we use a more descriptive name for this argument? One-letter variables aren't informative.

The v parameter is the interpolation constant used in the layout generation process, so I settled on mix_factor as the new parameter name.
It is more descriptive and conveys the purpose of the parameter in the context of the code.
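
To illustrate what an interpolation constant like mix_factor does, here is a rough sketch that assumes a plain linear interpolation between a prompt-conditioned (denoised) latent and the noised layout latent. The function `mix_latents` and its argument names are hypothetical, and it works on plain Python lists rather than the tensors the pipeline uses; the weighting direction (larger values favor the denoised latent) is also an assumption.

```python
def mix_latents(denoised, noised_layout, mix_factor=0.5):
    """Linearly interpolate two equal-length sequences of floats (illustrative only)."""
    if not 0.0 <= mix_factor <= 1.0:
        raise ValueError("mix_factor is expected to lie in [0, 1]")
    if len(denoised) != len(noised_layout):
        raise ValueError("both inputs must have the same length")
    # Assumed form: mix_factor weights the denoised latent, and
    # (1 - mix_factor) weights the noised layout latent.
    return [
        mix_factor * d + (1.0 - mix_factor) * n
        for d, n in zip(denoised, noised_layout)
    ]


mixed = mix_latents([1.0, 2.0], [3.0, 4.0], mix_factor=0.5)
print(mixed)
```

Under this assumption, a mix_factor of 1.0 would keep only the denoised latent, while 0.0 would keep only the noised layout latent.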

What do you think, @patil-suraj?

Contributor

@patil-suraj left a comment

Thanks a lot! mix_factor sounds good to me.

@patil-suraj merged commit 2ba42aa into huggingface:main on Dec 28, 2022
@daspartho deleted the magic_mix branch on January 3, 2023, 17:58
4 participants