[Feature Request]: Latent diffusion upscaler for the Stable Diffusion autoencoder #4446
Comments
Is it similar to the "scale latents" option for A1111's highres fix ? (see #2613 for example for some info) |
Nope |
I tested this out, wasn't hard to convert it to a stand-alone script. Note that there is a stray config file used from here - https://huggingface.co/spaces/multimodalart/latentdiffusion/blob/main/latent-diffusion/models/first_stage_models/kl-f8/config.yaml (not the CompVis repo), but just putting that file there did work. Also added |
|
@specblades probably better to start with the notebook, then, it will be friendlier to use. |
Goal is to use it in webui, i mean |
Okay, I was able to convert @pbaylies sdu.txt into a script for automatic1111 https://gist.github.com/nagolinc/3993e7329cafab5d5bd4698977ebebcc Before you can run it, you will need to download the two files: into your {automatic}/models/LDSR/ folder |
Could you provide some kind of tutorial on how to use it? |
Many OOMs, "RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same" and other errors. Does not save the result automatically. |
Exception: bad file inside ./models/LDSR/laion_text_cond_latent_upscaler_2_1_00470000_slim.pth: laion_text_cond_latent_upscaler_2_1_00470000_slim/data.pkl The file may be malicious, so the program is not going to read it. |
it starts with --disable-safe-unpickle, then works only once, then a restart is required to make it work once more :( but it's good for testing, i like the result very much! could it be implemented as an upscaler, to be usable in Extras tab too, instead of being a script? |
hmm this is also interesting: "[Stability AI]Nov 10 |
i've done some cleanup on the script, removed unused imports and functions, and moved a few model.to() around, and now it works fine, no crash, feel free to test: also added some test code to main() so it can be used standalone from cli. |
i've replaced the bilinear interpolate function in processing.py (for highres fix latent scaling) to a call to this new model (which operates in latent space anyway), and IT WORKS! if you enable highres fix & scale latent, and set denoising to zero, then it will immediately upscale by 2x the generated image! if you set denoising to some low value <0.3 then it will work further on the upscaled version and fix some artifacts too! !!!!!!!!!!!!!!!!!! UPSCALING LATENT !!!!!!!!!!!!!!!!!!!!! @AUTOMATIC1111 please look at run_sdu_latent_upscale() in http://thot.banki.hu/arpi/sdu_upscale_mod.py |
You can re-create the discussion so you have the authorship |
@arpitest COMMANDLINE_ARGS= --disable-safe-unpickle --xformers --allow-code |
@specblades the _mod version is not a script, it should be in modules/ and the func called from hires-fix part of processing.py, needs some small changes to the code... i hope @AUTOMATIC1111 will do it soon in proper way, instead of my ugly hack :) |
@arpitest i understand i think we need both - upscale in extras and hr-fix |
here is my modified version: http://thot.banki.hu/arpi/processing.py sorry this is a proof of concept only, not intended for wide use this ugly way :) |
@arpitest nvm, do ur best, please! |
Out of curiosity, does anybody know of some comparison between this new latent upscaler and the "scale latent hires-fix" already implemented in A1111 (that performs bilinear interpolation in latent space) ? |
using bilinear interpolation on vectors (latents are not pixels!) is a bad idea anyway... there are methods for vector interpolation, like euler, quaternion etc. it's a very different math... |
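For reference, one commonly used way to interpolate between two latent vectors without treating them like pixels is spherical linear interpolation (slerp), which is also how quaternions are usually interpolated. This is only a minimal sketch of that general technique on plain Python lists. It is not what the upscaler in this thread does, and the function name is made up for illustration.

```python
import math

def slerp(v0, v1, t):
    """Spherical linear interpolation between two vectors (illustrative helper)."""
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    # Angle between the two vectors, clamped against rounding error.
    cos_omega = max(-1.0, min(1.0, dot / (n0 * n1)))
    omega = math.acos(cos_omega)
    if omega < 1e-6:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    s = math.sin(omega)
    w0 = math.sin((1.0 - t) * omega) / s
    w1 = math.sin(t * omega) / s
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

mid = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
print(mid)
```

Unlike a straight linear blend, the midpoint here keeps unit length when both inputs have unit length.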
I see a way for improving highres-fix's scale latents option here. TY |
FYI : feature request for new latent vector interpolation methods here |
it looks bad... i've found the best formula: conditioning=1.0-denoising. for higher denoising values you need lower conditioning to compensate |
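The rule of thumb above (conditioning = 1.0 - denoising) could be sketched like this. The function name and the clamping to [0, 1] are assumptions for illustration, not something taken from sdu_upscale_mod.py.

```python
def conditioning_for(denoising: float) -> float:
    """Hypothetical helper: lower the conditioning as denoising rises."""
    # Clamping to [0, 1] is an assumption so out-of-range values stay sane.
    denoising = min(max(denoising, 0.0), 1.0)
    return 1.0 - denoising

for d in (0.0, 0.3, 0.7):
    print(f"denoising={d:.1f} -> conditioning={conditioning_for(d):.1f}")
```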
At high denoisings it will oversaturate like CFG>20 |
broken since last update. I'm getting an error whenever the modified processing.py is used after my last git pull today. "Upscale latent space image when doing hires. fix" is no longer available in webui settings btw |
which repo/branch? it is still working for me. what error? |
File "F:\Programs\stable-diffusion-webui\launch.py", line 295, in |
are you overwriting the processing.py by the old patched version or patching the new one? |
until now i was using the old patched version, how do i use the diff file ? |
Thanks for this one! Works really great on square images, but when I try to switch to rectangular like 768x1024 then it crashes at |
Can someone please create simplified instructions on making this work? |
How can we still use this mod with latest version of webui? |
easiest way: copy to modules/ http://thot.banki.hu/arpi/sdu_upscale_mod.py add
to
|
it doesn't seem to work, now it gives me this error : RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 42 but got size 43 for tensor number 1 in the list. EDIT : nevermind, i was using exotic resolution. it's working fine, thank you. |
I'm trying to use the ratio defined in hr fix (1.25 in my tests) instead of 2.0 as latent upscale, but I have no idea on how to proceed. Maybe scaling by 2.0 then downsizing afterward ? |
this is a fixed 2x scale, it cannot do other sizes. and the original size must be divisible by 64 (so the resulting size by 128). |
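A quick way to check a resolution against that constraint before generating, as a minimal sketch. The helper name is hypothetical. Only the "divisible by 64" rule and the fixed 2x factor come from the comment above.

```python
from typing import Tuple

def upscaled_size(width: int, height: int) -> Tuple[int, int]:
    """Hypothetical check: both sides must be divisible by 64, output is 2x."""
    for name, value in (("width", width), ("height", height)):
        if value % 64 != 0:
            raise ValueError(f"{name}={value} is not divisible by 64")
    return width * 2, height * 2

print(upscaled_size(768, 896))
```

A size such as 700x512 raises a ValueError here instead of failing later inside the model with a tensor size mismatch.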
Oh. Too bad, 2.0 is really too much to be useful as hr fix in my opinion. Either you have a very small source size and it contains less details, or you have a too large destination size and the model may be generating incoherent pictures. |
please check post above yours. |
Yes I have the exact same settings, and the code from http://thot.banki.hu/arpi/sdu_upscale_mod.py . Very strange. Anyone else having any success with 1.25 ratio ? |
I tried everything I could think of, and it always gives me a x2.0 scale. My upload log for a 768x896 picture x1.25 is :
The first torch.Size log is (112, 96), which corresponds to 896x768 (multiply by 8). But the second torch.Size log is always doubled, whatever my original hr fix ratio was. |
the code in sdu_upscale_mod.py always does a 2x upscale, it's fixed by design. it does not even know your scaling settings, it only gets the original size and the latent matrix: [batch_size, C, H, W] = low_res_latent.shape. if you get a different scale, then it is not in use! |
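To make the shape arithmetic concrete, here is a small sketch of what a fixed 2x step implies for a latent of shape [batch_size, C, H, W]. The function name is made up. The factor of 8 between latent and pixel size is the one mentioned in the previous comments, and the 4 channels in the example are just an illustrative value.

```python
def after_fixed_2x(latent_shape):
    """Hypothetical helper: shapes implied by a fixed 2x latent upscale."""
    batch_size, channels, h, w = latent_shape
    upscaled = (batch_size, channels, h * 2, w * 2)
    # One latent cell covers 8 pixels per side, as noted in the thread.
    pixels = (w * 2 * 8, h * 2 * 8)  # (width, height)
    return upscaled, pixels

# (112, 96) is the latent size of the 768x896 picture from the log above.
print(after_fixed_2x((1, 4, 112, 96)))
```

The hr fix ratio never appears as an input, which is why the second logged size is doubled no matter what ratio is set.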
Next thing I tried was to downscale the samples after the 2x scale by using the old interpolate call :
It keeps the correct target resolution, but it only does a marginally better job at lower denoising (<0.5). And it takes quite a while to compute. Isn't there any way to do a better downscale before the image generation ? |
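The arithmetic behind the "2x, then resize back down" workaround can be sketched as follows. Both function names are hypothetical, and rounding to whole latent cells is an assumption. The actual resize call used in the workaround is not shown in the thread.

```python
def target_latent_size(h: int, w: int, ratio: float):
    """Latent size the hr fix ratio asks for (rounding is an assumption)."""
    return round(h * ratio), round(w * ratio)

def resize_factor_after_2x(ratio: float, model_scale: float = 2.0) -> float:
    """Factor to apply to the 2x output so the net scale equals `ratio`."""
    return ratio / model_scale

h, w = 112, 96  # latent size of a 768x896 picture
print(target_latent_size(h, w, 1.25))
print(resize_factor_after_2x(1.25))
```

For a 1.25 ratio the 2x output has to be shrunk by 0.625, so a good part of the resolution the upscaler added is thrown away again before the second pass.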
No success for me either - no matter what value I enter as a ratio parameter, the result is always 2.0. |
I'm probably stating something obvious but just a heads up that this new upscaler is not a silver bullet. In my use case it consistently produces images with fewer details compared to the Latent (nearest) upscaler, especially with the DPM++ 2M sampler. I do believe it can be useful in some workflows or styles, not to mention that right now the prototype still has room for tuning. Definitely going to keep an eye on it. |
Hey! Any of you guys still using this? It was my favorite upscaling method but I'm getting an error on the newer webui versions, EDIT: It has to do with changes in the k-diffusion repo, I guess.
|
Is there an existing issue for this?
What would your feature do ?
Can we implement it?
Q from Twitter @RiversHaveWings:
I've trained a latent diffusion upscaler for the Stable Diffusion autoencoder (and anything you feel like feeding into it if you can tolerate a little artifacts) in collaboration with @stabilityai. Try the Colab written by @nshepperd1 https://colab.research.google.com/drive/1o1qYJcFeywzCIdkfKJy7cTpgZTCM2EI4
Proposed workflow
See in colab
Additional information
No response