[Major Update] Reference-only Control #1236
67 comments · 152 replies
This method can rediffuse Midjourney images. Midjourney V5 example (https://twitter.com/kajikent/status/1654409097041817601):
Prompt: woman in street, masterpiece, best quality
Negative prompt: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
I am not even using high-res fix; this is just a random run without much attention to quality, for a simple record. (Edit: to reproduce this example in 1.1.170, use "balanced" mode, style fidelity = 1.0, batch count 4, batch size 1.)
Handles cartoons without problems. (Edit: to reproduce this example in 1.1.170, use "balanced" mode, style fidelity = 1.0.)
Hi @lllyasviel, this preprocessor seems pretty great, but I can't run it. I updated my webui and ControlNet to v1.1.157, but I'm getting the following error: RuntimeError: Given groups=1, weight of size [320, 8, 3, 3], expected input [3, 4, 64, 64] to have 8 channels, but got 4 channels instead. The error disappears if I disable the preprocessor.
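For context, that error means the first convolution of the loaded model expects an 8-channel input (typical of inpainting-style checkpoints, which concatenate mask and masked-image latents) but received a standard 4-channel latent; note that reference-only only began supporting inpaint/variation models in 1.1.170 (see below). A minimal PyTorch repro of the shape mismatch, purely illustrative:

```python
import torch
import torch.nn as nn

# First conv of an inpainting-style UNet: weight shape [320, 8, 3, 3],
# i.e. it expects latents with 8 channels.
conv_in = nn.Conv2d(in_channels=8, out_channels=320, kernel_size=3, padding=1)

latent = torch.randn(3, 4, 64, 64)  # a standard SD latent has only 4 channels

try:
    conv_in(latent)
except RuntimeError as e:
    print(e)  # "... expected input[3, 4, 64, 64] to have 8 channels, but got 4 channels instead"
```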
This is great, but I think it needs a bit of optimizing, as it's a bit slow to start generating once you select it. Can't wait for future updates. Another thing I noticed, which I think has to do with batch generation: if you batch-generate, pick one result with a specific seed, and try to regenerate it, the image changes.
I presume this is a bug. I'm not sure if it's just a display bug or something strange happening in the background as well, but if you use Highres Fix, during each Highres Fix step it will show a message stating "ControlNet has used VAE to encode Latent Shape". Sorry about the dodgy formatting.
Some results are very good, but it only works in balanced mode. Can this be adjusted so it also works in the "prompt is more important" / "ControlNet is more important" modes?
Resolved in 1.1.170.
It seems we will double-check some technical parts of this to try to resolve some problems, but in most tests it works as expected.
I tried doubling, quadrupling, and maxing out the number of sampling steps, but I still can't get the image to be sharp and correct with the balanced mode option turned on. As Arron17 pointed out earlier, the only thing that seems to fix the image is turning on either the "prompt is more important" option or the "ControlNet is more important" option. With either of those two options turned on, I can use a normal number of sampling steps and the images come out as good as ever. I would prefer to be able to use balanced mode, though, as it seems to produce the best results. EDIT: Again, as pointed out by Arron17, lowering the ending control step to about 0.35 seems to resolve the blur/color issue in balanced mode as well.
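For reference, that workaround corresponds to the unit's "Ending Control Step" setting. A hedged sketch of what it might look like as an API unit (field names assumed from the sd-webui-controlnet txt2img API of that era; verify against your installed version):

```python
# Hypothetical ControlNet unit mirroring the workaround above.
unit = {
    "module": "reference_only",  # preprocessor name as exposed to the API (assumed)
    "model": "None",             # reference-only needs no control model
    "weight": 1.0,
    "guidance_start": 0.0,
    "guidance_end": 0.35,        # stop control at ~35% of the steps to avoid the blur
    "control_mode": "Balanced",
}
```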
Hi all. 1.1.162 fixed all VAE precision and double-load problems.
I noticed that strengths above 1 seem to behave the same as strength 1, while strengths below 1 do have different effects. But with multi-ControlNet, I get a stronger effect just by doubling up on this preprocessor. Could it be made to double up like that internally if you choose a strength above 1?
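A sketch of that doubling-up workaround via the API (assumed field names; the two-unit list relies on Multi-ControlNet being enabled in settings):

```python
import base64

# Encode the reference image once and reuse it in two identical units.
# "reference.png" is a placeholder file name.
with open("reference.png", "rb") as f:
    ref_b64 = base64.b64encode(f.read()).decode()

reference_unit = {
    "module": "reference_only",
    "model": "None",
    "weight": 1.0,
    "input_image": ref_b64,
}

# Stacking the same unit twice approximates a strength above 1.
payload = {
    "prompt": "a dog running on grassland, best quality",
    "alwayson_scripts": {
        "controlnet": {"args": [reference_unit, dict(reference_unit)]},
    },
}
# POST payload to /sdapi/v1/txt2img as usual.
```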
Is this going to be released as a model that can be used from Diffusers?
Thanks for the amazing work, @lllyasviel! Is there a specific PR/commit I can use to look at what changed in the ControlNet code? I am not familiar with this code base, but it looks to me like it is just the UI and not the modelling itself. I am a CV engineer and I love your work; I'd like to see the training and modelling code so I can learn something new :) Thanks a lot
Hi all. Mac MPS problems are fixed in 1.1.167 (I think, hopefully).
Test 2 (meta): [images at Style Fidelity = 0.0, 0.25, 0.5, 0.75, 1.0]
You need at least 1.1.168 to use the Style Fidelity slider feature.
Would Guess Mode be possible for reference-only?
Hi all. In 1.1.170, reference-only begins to support inpaint/variation models.
What's the argument name for style fidelity in the API? Edit: Just noticed that it's not yet supported, so hopefully we will see it added soon. threshold_a could be used instead of adding a new argument.
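If threshold_a were indeed reused to carry Style Fidelity, a request might look like the following (speculative sketch; verify against your version's source before relying on it):

```python
import requests

unit = {
    "module": "reference_only",
    "model": "None",
    "threshold_a": 0.75,  # speculatively carrying Style Fidelity (0.0-1.0)
    "control_mode": "Balanced",
    "input_image": "<base64-encoded reference image>",  # placeholder
}

resp = requests.post(
    "http://127.0.0.1:7860/sdapi/v1/txt2img",
    json={
        "prompt": "a dog running on grassland, best quality",
        "alwayson_scripts": {"controlnet": {"args": [unit]}},
    },
)
print(resp.status_code)
```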
Hi, I've tried using 1.1.166-1.1.168, and I either get bad-looking images or the person looks nothing like the original photo. It does take the clothes and background and produce something similar, but the face and hair belong to a completely different person. Any ideas why? Thanks.
OK, 1.1.170: I think this feature is finished, and we won't make any big modifications to it now.
Thank you for this feature, it's supremely useful. Here's a neat trick I found for replacing characters:
What happens if Style Fidelity has negative values or values greater than 1? It would be interesting to see.
Hi! Does anyone know what causes this error? It happens when I try to render in batch: File "D:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\functional.py", line 360, in einsum
I seem to be getting this error: "ControlNet used torch.float32 VAE to encode torch.Size([2, 4, 96, 96])".
Hi all, we added more reference preprocessors. Please go to that page for discussion. This discussion will be locked as finished and too heated.
Reference-Only Control
Now we have a `reference-only` preprocessor that does not require any control models. It can guide the diffusion directly, using images as references. (Prompt: "a dog running on grassland, best quality, ...")
This method is similar to inpaint-based referencing, but it does not make your image disordered.
Many professional A1111 users know a trick for diffusing an image with a reference via inpainting. For example, if you have a 512x512 image of a dog and want to generate another 512x512 image with the same dog, some users will join the 512x512 dog image and a 512x512 blank image into a 1024x512 image, send it to inpaint, and mask out the blank 512x512 part to diffuse a dog with a similar appearance. However, that method is usually not very satisfying, since the images are connected and many distortions will appear.
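For concreteness, here is a minimal sketch of that manual trick with PIL (file names are placeholders):

```python
from PIL import Image

# Paste the 512x512 reference and a blank 512x512 canvas side by side,
# then mask only the blank half for inpainting.
ref = Image.open("dog.png").resize((512, 512))

canvas = Image.new("RGB", (1024, 512), "gray")
canvas.paste(ref, (0, 0))             # reference on the left
canvas.save("inpaint_input.png")      # send this to img2img inpaint

mask = Image.new("L", (1024, 512), 0)                   # black = keep
mask.paste(Image.new("L", (512, 512), 255), (512, 0))   # white = diffuse
mask.save("inpaint_mask.png")
```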
This `reference-only` ControlNet can directly link the attention layers of your SD to any independent images, so that your SD will read arbitrary images for reference. You need at least ControlNet 1.1.153 to use it.
To use it, just select `reference-only` as the preprocessor and put in an image. Your SD will use the image as a reference.
Note that this method is as "non-opinionated" as possible. It contains only very basic connection code, without any personal preferences, to connect the attention layers to your reference images. However, even though we tried our best not to include any opinionated code, we still needed to write some subjective implementations to deal with weighting, cfg-scale, etc. A tech report is on the way.
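To make the attention-linking idea concrete, here is a toy sketch of the general mechanism (not the extension's actual code): during self-attention, the reference image's features at the same layer are appended as extra keys and values, so the generation can attend to the reference. Real implementations also handle weighting, cfg-scale, and Style Fidelity, all of which this omits:

```python
import torch
import torch.nn as nn

d = 64  # feature dimension (illustrative)
q_proj, k_proj, v_proj = nn.Linear(d, d), nn.Linear(d, d), nn.Linear(d, d)

def self_attn_with_reference(x, x_ref):
    """x: (batch, tokens, dim) features of the image being generated;
    x_ref: features of the reference image at the same layer."""
    q = q_proj(x)                       # queries come only from the generated image
    kv = torch.cat([x, x_ref], dim=1)   # keys/values also include the reference tokens
    k, v = k_proj(kv), v_proj(kv)
    attn = torch.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
    return attn @ v                     # (batch, tokens, dim)

out = self_attn_with_reference(torch.randn(1, 16, d), torch.randn(1, 16, d))
print(out.shape)  # torch.Size([1, 16, 64])
```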