[Major Update] Reference-only Control #1236

lllyasviel · 2023-05-13T09:59:18Z

lllyasviel
May 13, 2023
Collaborator

Reference-Only Control

Now we have a reference-only preprocessor that does not require any control models. It can guide the diffusion directly using images as references.

(Prompt "a dog running on grassland, best quality, ...")

This method is similar to inpaint-based reference but it does not make your image disordered.

Many professional A1111 users know a trick to diffuse image with references by inpaint. For example, if you have a 512x512 image of a dog, and want to generate another 512x512 image with the same dog, some users will connect the 512x512 dog image and a 512x512 blank image into a 1024x512 image, send to inpaint, and mask out the blank 512x512 part to diffuse a dog with similar appearance. However, that method is usually not very satisfying since images are connected and many distortions will appear.

This reference-only ControlNet can directly link the attention layers of your SD to any independent images, so that your SD will read arbitary images for reference. You need at least ControlNet 1.1.153 to use it.

To use, just select reference-only as preprocessor and put an image. Your SD will just use the image as reference.

Note that this method is as "non-opinioned" as possible. It only contains very basic connection codes, without any personal preferences, to connect the attention layers with your reference images. However, even if we tried best to not include any opinioned codes, we still need to write some subjective implementations to deal with weighting, cfg-scale, etc - tech report is on the way.

lllyasviel · 2023-05-13T10:35:16Z

lllyasviel
May 13, 2023
Collaborator Author

This method works with anime without problem since it is model-free and reference-only

input (actually this is a human-drawn image)

output

Now your problem of "drawing a same person with some changes" should be solved now

(Edit - if you want to reproduce this example in 1.1.170, please use "balanced" mode, style fidelity=1.0)

0 replies

lllyasviel · 2023-05-13T10:43:59Z

lllyasviel
May 13, 2023
Collaborator Author

For comparison, this is T2I adapter style, same inputs, prompts, and parameters

this is CN 1.1 Shuffle, same inputs, prompts, and parameters

It is hard to say which is better but to me this example is very difficult and both methods just fail. (because neither of these two models is for anime)

but reference-only is model free so it does not have this problem

if you have a clear goal in mind then you should definitely need the new "reference-only"

0 replies

lllyasviel · 2023-05-13T11:11:35Z

lllyasviel
May 13, 2023
Collaborator Author

This method can rediffuse midjourney images

Midjourney V5 (https://twitter.com/kajikent/status/1654409097041817601):

SD1.5 (Realistic Vision V20):

woman in street, masterpiece, best quality,

Negative prompt: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 123456, Size: 768x512, Model hash: c0d1994c73, Model: realisticVisionV20_v20, ENSD: 31337, Version: 1.1.1-115-g89f9faa6, ControlNet 0: "preprocessor: reference_only, model: None, weight: 1, starting/ending: (0, 1), resize mode: Crop and Resize, pixel perfect: True, control mode: ControlNet is more important, preprocessor params: (64, 1, 64)"

I am not even using high-res fix. just a random run without many quality consideration. for a simple record

(Edit - if you want to reproduce this example in 1.1.170, please use "balanced" mode, style fidelity=1.0, batch count 4, batch size 1)

5 replies

trendland May 13, 2023

This method can rediffuse midjourney images

Midjourney V5 (https://twitter.com/kajikent/status/1654409097041817601):

SD1.5 (Realistic Vision V20): !

woman in street, masterpiece, best quality, Negative prompt: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 123456, Size: 768x512, Model hash: c0d1994c73, Model: realisticVisionV20_v20, ENSD: 31337, Version: 1.1.1-104-gb08500ce, ControlNet 0: "preprocessor: reference_only, model: None, weight: 1, starting/ending: (0, 1), resize mode: Crop and Resize, pixel perfect: True, control mode: Balanced, preprocessor params: (64, 64, 64)"

I am not even using high-res fix. just a random run without many quality consideration. for a simple record

Well, been trying the same setup (img from MJ) and same settings (steps, model, steps...) with the new CN pre-processor and Im far from the gens you're getting...

Will keep on playing with it but for now, images are blurry, and very low quality...

lllyasviel May 13, 2023
Collaborator Author

@trendland

can you give more info for debugging? like input images and prompt

shubhdotai May 14, 2023

I'm also facing the same issues. Generated images are blurry and far off from the provided image

Here are my configurations -
Stable Diffusion.pdf

Rkkss May 14, 2023

I am not getting the same results either

lllyasviel May 14, 2023
Collaborator Author

Edit - if you want to reproduce this example in 1.1.170, please use "balanced" mode, style fidelity=1.0
batch count 4, batch size 1
(This is added to the post)

lllyasviel · 2023-05-13T11:18:33Z

lllyasviel
May 13, 2023
Collaborator Author

handles cartoon without problem

(Edit - if you want to reproduce this example in 1.1.170, please use "balanced" mode, style fidelity=1.0)

1 reply

herval May 13, 2023

oy that's my cartoon! 🤣

BlastedRemnants · 2023-05-13T13:56:20Z

BlastedRemnants May 13, 2023

I got mine to work, and narrowed down the cause a bit. I've got a bunch of different webui bats with different args for trying different things, and the one that gives the error includes --precision full, --opt-channelslast, and --upcast-sampling. Two bats that DID work included --medvram and --xformers, with the other having --medvram and --opt-sdp-attention, so it's got to be one of those three flags causing the error for me. Avoid those and you should probably be fine :D

Edit: Tested further and nailed it down to "--precision full" causing my error. Not sure if it would've worked with a different model, I'm just going to not use that arg instead since it was only in there for an experiment anyway.

Eagle07-Sudo · 2023-05-13T13:59:23Z

Eagle07-Sudo May 13, 2023

fixed in 1.1.157 eaf9939

Thank you! The issue has been resolved completely.

system1system2 · 2023-05-13T17:06:42Z

system1system2 May 13, 2023

@herval I had the same issue with the Vladmandic fork on macOS.

Solved by flagging Full Precision for models in the Settings and restarting the server.
It's much slower now, but it works.

zagm0g · 2023-05-13T18:34:01Z

zagm0g May 13, 2023

I get the RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same when I run on OSX. Works fine on Ubuntu

I am getting the same error.

steffenr · 2023-05-14T17:27:36Z

steffenr May 14, 2023

I am getting the same error.

Same for me.. Running A1111 on a M2 Macbook Pro. ControlNet v1.1.166 (latest from today)

scarbain · 2023-05-13T15:53:48Z

scarbain
May 13, 2023

Hi @lllyasviel,

This preprocessor seems pretty great but I can't run it. I updated my webui and Controlnet to v1.1.157 but I'm having the following error : RuntimeError: Given Groups=1, weight of size [320,8,3,3] expected input [3,4,64,64] to have 8channels, but got 4 channels instead.

The error disappears if I disable the preprocessor.

2 replies

dilectiogames May 13, 2023

I've got this if a model that is intended for inpaint is selected. use a normal model

scarbain May 13, 2023

I can confirm it's working with a default model.
Why doesn't it work with inpainting models?

Thawneflash · 2023-05-13T16:10:52Z

Thawneflash
May 13, 2023

This is great but I think it needs a bit of optimizing as its a bit slow to start generating once u select it. Cant wait for future updates

Another thing noticed. I think this has to do with batch generating. If u batch generate and select one with a specific seed and try to generate it, the image changes

2 replies

Thawneflash May 13, 2023

I am not sure it just takes time to start compared to other controlnet models. Not talking about generation speed but start speed.

Or maybe I am just overthinking, its not unbearable, just thought I will mention something I noticed. I could be wrong tho and might be my machines problem

dilectiogames May 13, 2023

on my minimum specs computer it runs quite fast. way faster than the other models that require a preprocessor calculation.

Arron17 · 2023-05-13T16:47:49Z

Arron17
May 13, 2023

I presume this is a bug, not sure if it's just a display bug or something strange happening in the background as well, but if you use Highres Fix, during each HighRes Fix step it will show a message stating "ControlNet has used VAE to encode Latent Shape"

Loading preprocessor: reference_only███████████████████████████████████████████████████| 50/50 [00:40<00:00, 1.03it/s] Pixel Perfect Mode Enabled. resize_mode = ResizeMode.INNER_FIT raw_H = 886 raw_W = 800 target_H = 768 target_W = 512 estimation = 693.4537246049662 preprocessor resolution = 693 0%| | 0/35 [00:00<?, ?it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 96, 64]) | 0/15 [00:00<?, ?it/s] 100%|██████████████████████████████████████████████████████████████████████████████████| 35/35 [00:09<00:00, 3.61it/s] 0%| | 0/15 [00:00<?, ?it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92]) 7%|█████▌ | 1/15 [00:00<00:13, 1.00it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92]) | 36/50 [00:12<00:16, 1.15s/it] 13%|███████████ | 2/15 [00:01<00:12, 1.02it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92]) | 37/50 [00:13<00:14, 1.09s/it] 20%|████████████████▌ | 3/15 [00:02<00:11, 1.03it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92]) | 38/50 [00:14<00:12, 1.06s/it] 27%|██████████████████████▏ | 4/15 [00:03<00:10, 1.03it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92]) | 39/50 [00:15<00:11, 1.03s/it] 33%|███████████████████████████▋ | 5/15 [00:04<00:09, 1.03it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92])▊ | 40/50 [00:16<00:10, 1.01s/it] 40%|█████████████████████████████████▏ | 6/15 [00:05<00:08, 1.03it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92])██ | 41/50 [00:17<00:08, 1.00it/s] 47%|██████████████████████████████████████▋ | 7/15 [00:06<00:07, 1.03it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92])███▍ | 42/50 [00:18<00:07, 1.01it/s] 53%|████████████████████████████████████████████▎ | 8/15 [00:07<00:06, 1.03it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92])████▊ | 43/50 [00:19<00:06, 1.02it/s] 60%|█████████████████████████████████████████████████▊ | 9/15 [00:08<00:05, 1.03it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92])██████ | 44/50 [00:20<00:05, 1.02it/s] 67%|██████████████████████████████████████████████████████▋ | 10/15 [00:09<00:04, 1.03it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92])███████▍ | 45/50 [00:21<00:04, 1.03it/s] 73%|████████████████████████████████████████████████████████████▏ | 11/15 [00:10<00:03, 1.03it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92])████████▋ | 46/50 [00:22<00:03, 1.03it/s] 80%|█████████████████████████████████████████████████████████████████▌ | 12/15 [00:11<00:02, 1.03it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92])██████████ | 47/50 [00:23<00:02, 1.03it/s] 87%|███████████████████████████████████████████████████████████████████████ | 13/15 [00:12<00:01, 1.03it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92])███████████▎ | 48/50 [00:24<00:01, 1.03it/s] 93%|████████████████████████████████████████████████████████████████████████████▌ | 14/15 [00:13<00:00, 1.03it/s] ControlNet has used VAE to encode latent shape torch.Size([2, 4, 139, 92])████████████▋ | 49/50 [00:25<00:00, 1.03it/s] 100%|██████████████████████████████████████████████████████████████████████████████████| 15/15 [00:14<00:00, 1.03it/s] Total progress: 100%|██████████████████████████████████████████████████████████████████| 50/50 [00:28<00:00, 1.77it/s] Total progress: 100%|██████████████████████████████████████████████████████████████████| 50/50 [00:28<00:00, 1.03it/s]]

Sorry about the dodgy formatting.

8 replies

lllyasviel May 13, 2023
Collaborator Author

hi can you show input image and more detail for debug

Arron17 May 13, 2023

Sure, I've just run through a few scenarios.

Parameters: "cinematic photograph of woman with blonde hair and green eyes, wearing a sleeveless turtleneck, standing in the street
Negative prompt: painted by bad-artist, child, loli, claws
Steps: 35, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 3197667430, Size: 512x768"

Model analogMadness_v40, SD VAE v1-5-pruned-emaonly

Input Image:

Result with Reference Only (Balanced Control Mode):

Result with Reference Only (My Prompt is more important Control Mode):

Result with ControlNet is more important is the exact same as "My Prompt is more important"

Here some results with a different type of model, this time it's mixProv4_v4 and SD VAE wd-1-4-epoch2-fp16

Result with Reference Only (Balanced Control Mode):

Result with Reference Only (My Prompt is More Important Control Mode):

Result with ControlNet is more important gives the same results as "My Prompt is more important"

Looking at these, it seems like maybe it's not an issue with the VAE, but the Balance Control Mode is creating strange outputs.

Arron17 May 13, 2023

For another comparison, this is the prompt on Analogue Madness without the Reference Only

The images are a lot less washed out

enlyth May 13, 2023

I'm seeing the same, where an image will suddenly jump to correct after a certain threshold of steps. See image below (XY plot of four different individual generations with the same seed, but with a different number of steps):

Arron17 May 13, 2023

Seems like you can eliviate some of the washing out/pixelisation by changing the ending control step.

Prompt: "cinematic photo of a young woman"

Input image:

Ending Control Step 1:

Ending Control Step 0.75:

Ending Control Step 0.5:

Ending Control Step 0.35:

However the problem with this is it's not always going to get what you want, in this I just wanted the jumper pattern, so it's worked quite well, however it won't in other circumstances

2blackbar · 2023-05-13T17:25:58Z

2blackbar
May 13, 2023

some results are very good, it only works in balanced mode, can this be adjusted so it also works in prompt /cnet more important modes ?

0 replies

lllyasviel · 2023-05-13T17:57:39Z

lllyasviel
May 13, 2023
Collaborator Author

~~it seems that this method needs you to double the sampling step~~

resolved in 1.1.170

4 replies

matrix4767 May 13, 2023

It already doubles the generation time.

enlyth May 13, 2023

It seems it suddenly jumps from blurry to sharp with a certain number of steps, depending on sampler as well: (for the ancestral ones they never converge on a sharp image)

Thawneflash May 13, 2023

Does this also apply to high res fix? If originally using 10 steps for high res fix, should I use 20?

Thawneflash May 13, 2023

Does this also apply to high res fix? If originally using 10 steps for high res fix, should I use 20?

I haven't tried myself yet, but I'd try that out if I were you :) Try 20 steps instead of 10.

Ok I tried it out, can confirm it gives more details!

lllyasviel · 2023-05-13T18:15:12Z

lllyasviel
May 13, 2023
Collaborator Author

it seems that we will double review some tech parts of this to try to resolve some problems. but in most tests it works in expected way.

0 replies

swissmas · 2023-05-13T19:11:00Z

swissmas
May 13, 2023

I tried doubling, quadrupling, maxing out the number of sampling steps but I still can't get the image to be sharp and correct; that is, with the balanced mode option turned on. As Arron17 pointed out earlier the only thing that seems to fix the image is turning on either the prompt is more important option or the control net is more important option. With either of those two options turned on, I can have a normal number of sampling steps and the images come out just as good as ever. I would prefer to be able to use the balanced mode though, as it seems to produce the best results.

EDIT: Again, as pointed out by Arron17, lowering the ending control step to about 0.35 seems to resolve the blur/color in balanced mode as well.

0 replies

lllyasviel · 2023-05-13T19:26:05Z

lllyasviel
May 13, 2023
Collaborator Author

hi all. 1.1.162 fixed all vae precision or double load problems

2 replies

lllyasviel May 13, 2023
Collaborator Author

and we will begin to investigate blurring/collapsing problems today. hopefully we will be able to find a fix. God bless ControlNet

justifiedfalsebeliefs May 13, 2023

@lllyasviel thank you! I can confirm I also have this blur behavior, and it is also dependent on number of steps. The preprocessor is clearly working though, I'm getting incredible results other than the blurring!

Dekker3D · 2023-05-13T19:55:15Z

Dekker3D
May 13, 2023

I noticed that strengths above 1 seem to behave the same as strength 1, but strengths below 1 do have different effects. But with multi-controlnet, I do get a stronger effect just doubling up on this preprocessor. Could it be made to double up like that internally, if you choose a strength above 1?

1 reply

2blackbar May 13, 2023

This is very true and having 2 of them gets you even closer results to reference image, it should work the same by having strength 2 on just one controlnet, but having even more than 2 would be interesting to test as well

dkackman · 2023-05-13T20:10:40Z

dkackman
May 13, 2023

Is this going to be released as a model that can be used from Diffusers?

3 replies

herval May 13, 2023

My understanding is this is not a model, it’s just a preprocessor?

dkackman May 13, 2023

Good point. Still would like to use it outside of webui. Will look through the code.

cihankaradogan May 14, 2023

Same here

FrancescoSaverioZuppichini · 2023-05-14T18:25:27Z

FrancescoSaverioZuppichini
May 14, 2023

Thanks for the amazing work! @lllyasviel

Is there a specific pr/commit that I can use to have a look at what changes on the control net code? I am not familiar with this code base but it looks to me that is just the UI and not the modelling itself. I am a CV engineer and I love your work, I'd like to see the training and modelling code so I can learn something new :)

Thanks a lot

0 replies

lllyasviel · 2023-05-14T20:44:30Z

lllyasviel
May 14, 2023
Collaborator Author

hi all. MAC MPS problems are fixed in 1.1.167 (I think, hopefully)

0 replies

lllyasviel · 2023-05-14T22:12:46Z

lllyasviel
May 14, 2023
Collaborator Author

Update in 1.1.168 (Style Fidelity Slider)

Since different people have different preferences, we will add a Style Fidelity Slider slider since 1.1.168

This slider only work in "balanced" mode
higher value mean more fidelity in style, but also more risk to collapse
lower value mean less fidelity in style, but also more robust

Example:

1girl, masterpiece, best quality,
Negative prompt: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 3439472849, Size: 512x640, Model hash: 8a6d1ef16b, Model: Anything-V3.0-non-ema-fp32, Clip skip: 2, ENSD: 31337, Version: 1.1.1-115-g89f9faa6, ControlNet 0: "preprocessor: reference_only, model: None, weight: 1, starting/ending: (0, 1), resize mode: Crop and Resize, pixel perfect: True, control mode: Balanced, preprocessor params: (64, 0, 64)"

Input:

Style Fidelity = 0

Style Fidelity = 0.25

Style Fidelity = 0.5

Style Fidelity = 0.75

Style Fidelity = 1.0

1 reply

vonvanzu May 15, 2023

Amazing. One question: the checkpoint is important? or you can use SD1.5 or SD2.1?

lllyasviel · 2023-05-14T22:15:51Z

lllyasviel
May 14, 2023
Collaborator Author

Test 2

Input image

meta:
woman in street, masterpiece, best quality,
Negative prompt: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 4118128803, Size: 768x512, Model hash: c0d1994c73, Model: realisticVisionV20_v20, ENSD: 31337, Version: 1.1.1-115-g89f9faa6, ControlNet 0: "preprocessor: reference_only, model: None, weight: 1, starting/ending: (0, 1), resize mode: Crop and Resize, pixel perfect: True, control mode: Balanced, preprocessor params: (64, 0, 64)"

Style Fidelity = 0.0

Style Fidelity = 0.25

Style Fidelity = 0.5

Style Fidelity = 0.75

Style Fidelity = 1.0

0 replies

lllyasviel · 2023-05-14T22:19:58Z

lllyasviel
May 14, 2023
Collaborator Author

You need at least 1.1.168 to use the Style Fidelity Slider feature

0 replies

demirklvc · 2023-05-14T22:23:28Z

demirklvc
May 14, 2023

would guess mode be possible for reference only?

0 replies

lllyasviel · 2023-05-14T23:22:03Z

lllyasviel
May 14, 2023
Collaborator Author

hi all 1.1.170 reference-only begin to support inpaint variation models

0 replies

kft334 · 2023-05-14T23:27:06Z

kft334
May 14, 2023

What's the argument name for style fidelity for the API?

Edit: Just noticed that it's not yet supported so hopefully we will see it added soon. threshold_a could be used instead of adding a new argument.

0 replies

JasonMarkT · 2023-05-15T00:00:53Z

JasonMarkT
May 15, 2023

Hi - I've tried using 1.66-1.68 - I either get bad looking images - or the person looks nothing like the original photo - but - it does take the clothes and background and produces something similar - but face and hair are completely different people.. any ideas why? thx..

6 replies

JasonMarkT May 15, 2023

will do..

JasonMarkT May 15, 2023

I tried all the above - including using pth files instead of safetensors - same result really - it looks a bit better with the slider up - or 'c-net important' on, i'd say it looks like an array of different actors who look a bit like the original.. but it is far off from the examples I have seen.

JasonMarkT May 15, 2023

and the update is now 1.7

JasonMarkT May 15, 2023

The result that comes up always has the original photo next to it..I'm wondering if this is normal or points to an issue..

JasonMarkT May 15, 2023

that looks cool..nice..

I've tried a variety of models - R-Vision 2 seems to work the best .

I've tried different click skip levels and tried going control weight 2 - but results are the same - just a bit wackier - cfg slider changes things at different levels - but doesn't make the image better.

lllyasviel · 2023-05-15T00:06:59Z

lllyasviel
May 15, 2023
Collaborator Author

OK - 1.1.170 - i think this feature is finished and we wont make any big modifications to it now.
In the future perhaps we will add different preprocessors like "reference_XXXX" but we will not make big modification to "reference_only". Begin from 1.1.170 and you can basically make sure that all results can be reproduced in the future

1 reply

BahzBeih May 15, 2023

OK - 1.1.170 - i think this feature is finished and we wont make any big modifications to it now. In the future perhaps we will add different preprocessors like "reference_XXXX" but we will not make big modification to "reference_only". Begin from 1.1.170 and you can basically make sure that all results can be reproduced in the future

i hope you add option to reference feature, to use multiple photos as reference instead of one photo, i guess this will solve output images is not look alike the original person 🤞🤞🤞🤞🤞

v0xie · 2023-05-15T00:55:03Z

v0xie
May 15, 2023

Thank you for this feature, it's supremely useful.

Here's a neat trick I found for replacing characters —
Adding the name of a character to the prompt dramatically changes the composition, if using reference_only alone.
Using tile_resample ControlNet with an early ending step helps maintain a similar composition/pose.

Without tile_resample

With tile_resample

0 replies

loboere · 2023-05-15T02:01:46Z

loboere
May 15, 2023

What happens if Style Fidelity has negative values and greater than 1? it would be interesting to see

0 replies

josplit · 2023-05-15T02:25:59Z

josplit
May 15, 2023

Hi! Does anyone know what causes this error? It happens to me when I try to render in batch

File "D:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\functional.py", line 360, in einsum
return _VF.einsum(equation, operands) # type: ignore[attr-defined]
RuntimeError: einsum(): operands do not broadcast with remapped shapes [original->remapped]: [16, 9216, 40]->[16, 9216, 1, 40] [8, 154, 40]->[8, 1, 154, 40]

0 replies

kelvin-zhao · 2023-05-15T03:26:42Z

kelvin-zhao
May 15, 2023

I seem to be getting this error,

ControlNet used torch.float32 VAE to encode torch.Size([2, 4, 96, 96]).
/AppleInternal/Library/BuildRoots/a0876c02-1788-11ed-b9c4-96898e02b808/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:705: failed assertion `[MPSTemporaryNDArray initWithDevice:descriptor:] Error: product of dimension sizes > 2**31'
Abort trap: 6
stable-diffusion-webui $ /opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

0 replies

lllyasviel · 2023-05-15T05:34:16Z

lllyasviel
May 15, 2023
Collaborator Author

Hi all, we added more reference preprocessors.
#1280

Please go to that page for discussion.

This discussion will be locked as finished and too heated.
Note that BUG reports should go directly to Issue.

0 replies

[Major Update] Reference-only Control #1236

lllyasviel May 13, 2023 Collaborator

Reference-Only Control

Replies: 67 comments · 152 replies

lllyasviel May 13, 2023 Collaborator Author

lllyasviel May 13, 2023 Collaborator Author

lllyasviel May 13, 2023 Collaborator Author

lllyasviel May 13, 2023 Collaborator Author

lllyasviel May 14, 2023 Collaborator Author

lllyasviel May 13, 2023 Collaborator Author

This comment has been hidden.

lllyasviel May 13, 2023 Collaborator Author

lllyasviel May 13, 2023 Collaborator Author

lllyasviel May 13, 2023 Collaborator Author

lllyasviel May 13, 2023 Collaborator Author

lllyasviel May 13, 2023 Collaborator Author

lllyasviel May 14, 2023 Collaborator Author

lllyasviel May 14, 2023 Collaborator Author

Update in 1.1.168 (Style Fidelity Slider)

lllyasviel May 14, 2023 Collaborator Author

Test 2

lllyasviel
May 13, 2023
Collaborator

Replies: 67 comments 152 replies

lllyasviel
May 13, 2023
Collaborator Author

lllyasviel
May 13, 2023
Collaborator Author

lllyasviel
May 13, 2023
Collaborator Author

lllyasviel May 13, 2023
Collaborator Author

lllyasviel May 14, 2023
Collaborator Author

lllyasviel
May 13, 2023
Collaborator Author

lllyasviel May 13, 2023
Collaborator Author

lllyasviel
May 13, 2023
Collaborator Author

lllyasviel
May 13, 2023
Collaborator Author

lllyasviel
May 13, 2023
Collaborator Author

lllyasviel May 13, 2023
Collaborator Author

lllyasviel
May 14, 2023
Collaborator Author

lllyasviel
May 14, 2023
Collaborator Author

lllyasviel
May 14, 2023
Collaborator Author