
sd3.5 integration (naive) #2183

Merged: 1 commit into lllyasviel:sd35 on Oct 29, 2024

Conversation

graemeniedermayer

Minimal integration of SD3.5. Requires an update to huggingface_guess: lllyasviel/huggingface_guess#1

There are a few hacky things; in particular, backend/loader.py has a few pieces that could be structured better.

example

All text models are necessary for this implementation. Works with 8 GB VRAM cards.

@Shahfahad7866

Excited! ;) How do I use it? Can you share a few steps?

@graemeniedermayer
Author

graemeniedermayer commented Oct 26, 2024

  1. It's probably easiest to just add a remote. In the base repo directory/folder:

git remote add grae https://github.com/graemeniedermayer/stable-diffusion-webui-forge.git
git fetch grae
git checkout grae/sd35_integration

  2. Go to the folder repositories/huggingface_guess/huggingface_guess/ and paste the following class into model_list.py:
class SD35(BASE):
    huggingface_repo = "stabilityai/stable-diffusion-3.5-large"

    unet_config = {
        "in_channels": 16,
        "pos_embed_scaling_factor": None,
    }

    sampling_settings = {
        "shift": 3.0,
    }

    unet_extra_config = {}
    latent_format = latent.SD3

    memory_usage_factor = 1.2

    text_encoder_key_prefix = ["text_encoders."]
    unet_target = 'transformer'

    def clip_target(self, state_dict={}):
        result = {}
        pref = self.text_encoder_key_prefix[0]

        if "{}clip_l.transformer.text_model.final_layer_norm.weight".format(pref) in state_dict:
            result['clip_l'] = 'text_encoder'

        if "{}clip_g.transformer.text_model.final_layer_norm.weight".format(pref) in state_dict:
            result['clip_g'] = 'text_encoder_2'

        if "{}t5xxl.transformer.encoder.final_layer_norm.weight".format(pref) in state_dict:
            result['t5xxl'] = 'text_encoder_3'

        return result

and add SD35 before SD3 in the models list (the order matters):

models = [Stable_Zero123, SD15_instructpix2pix, SD15, SD20, SD21UnclipL, SD21UnclipH, SDXL_instructpix2pix, SDXLRefiner, SDXL, SSD1B, KOALA_700M, KOALA_1B, Segmind_Vega, SD_X4Upscaler, Stable_Cascade_C, Stable_Cascade_B, SV3D_u, SV3D_p, SD35, SD3,  StableAudio, AuraFlow, HunyuanDiT, HunyuanDiT1, Flux, FluxSchnell]

Otherwise you'll get an error like: module 'huggingface_guess.model_list' has no attribute 'SD35'. Did you mean: 'SD15'?
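The ordering requirement can be sketched like this. It's an illustrative, hypothetical re-creation of first-match architecture detection, not huggingface_guess's real API; `matches`, `guess`, and the dict layout are invented for the example.

```python
# Hypothetical sketch: detection returns the FIRST entry whose required
# unet_config keys all appear (with equal values) in the config detected
# from the checkpoint. `matches`/`guess` are invented names.
_MISSING = object()

def matches(required, detected):
    """True if every required key is present in `detected` with the same value."""
    return all(detected.get(k, _MISSING) == v for k, v in required.items())

def guess(models, detected):
    for model in models:
        if matches(model["unet_config"], detected):
            return model["name"]
    raise ValueError("unknown architecture")

# In this toy setup SD35's config is a superset of SD3's, so SD3 would
# also match an SD3.5 checkpoint; listing SD35 first lets the more
# specific class win.
MODELS = [
    {"name": "SD35", "unet_config": {"in_channels": 16, "pos_embed_scaling_factor": None}},
    {"name": "SD3", "unet_config": {"in_channels": 16}},
]

print(guess(MODELS, {"in_channels": 16, "pos_embed_scaling_factor": None}))  # SD35
print(guess(MODELS, {"in_channels": 16}))  # SD3
```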

  3. It's probably necessary to download all the models manually from Hugging Face, because the repo requires you to log in: https://huggingface.co/stabilityai/stable-diffusion-3.5-large . The main checkpoint goes into the models/Stable-diffusion folder and the text encoders can go into the models/text_encoders folder.

  4. You should be able to start Forge normally and use SD3.5 after that! SGM Uniform seems to be the main scheduler.
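The download step can be sketched in Python. This is a hypothetical helper, not part of Forge: the `FILES` mapping and `destination` function are invented, and the file names are assumed from the stabilityai/stable-diffusion-3.5-large repo page (verify them there before downloading). `hf_hub_download` is huggingface_hub's real download API and needs a logged-in token (`huggingface-cli login`) because the repo is gated.

```python
import os

# hf_hub_download keeps each file's repo-relative path under local_dir,
# so the "text_encoders/..." files use local_dir="models" to land in
# models/text_encoders/. File names are assumptions from the repo page.
FILES = {
    "sd3.5_large.safetensors": "models/Stable-diffusion",
    "text_encoders/clip_l.safetensors": "models",
    "text_encoders/clip_g.safetensors": "models",
    "text_encoders/t5xxl_fp16.safetensors": "models",
}

def destination(repo_file, webui_root="."):
    """Where a downloaded file ends up: local_dir plus its repo-relative path."""
    return os.path.join(webui_root, FILES[repo_file], repo_file)

def download_all(webui_root="."):
    # Requires `pip install huggingface_hub` and `huggingface-cli login`,
    # since the SD3.5 repo is gated.
    from huggingface_hub import hf_hub_download
    for repo_file, local_dir in FILES.items():
        hf_hub_download(
            repo_id="stabilityai/stable-diffusion-3.5-large",
            filename=repo_file,
            local_dir=os.path.join(webui_root, local_dir),
        )
```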

@lukas9936

lukas9936 commented Oct 26, 2024

Hi, just a small question: I did everything exactly as you wrote above. At first I tried a GGUF version of SD3.5, but that failed directly. FP8 loads but gives me this error: "RuntimeError: Promotion for Float8 Types is not supported, attempted to promote Half and Float8_e4m3fn". I haven't managed to try the full/normal FP16 version of 3.5 yet. Is FP8 just not working with your fix, or did I screw something else up?

EDIT: FP16/normal SD 3.5 works. Is there any way to make FP8 work as well? FP16 is really close to an OOM error; it uses about 55 GB of my 64 GB RAM and all of my 16 GB VRAM.

@Mathreex

Tested and works, thank you.
Bugs found: additional options like batch and hires fix are not working.

@abzaloff

It really works) thanks bro!)

@lllyasviel lllyasviel changed the base branch from main to sd35 October 29, 2024 04:52
@lllyasviel
Owner

I am going to take a look at GGUF and the quality of HF clip-g vs this clip-g.
Before that, people can use the sd35 branch if interested.

@lllyasviel lllyasviel merged commit 2d5b6ca into lllyasviel:sd35 Oct 29, 2024
@Giribot

Giribot commented Oct 29, 2024

Hello lllyasviel !
(Very stupid question) Will the SD35 branch be merged into main soon?
(And how can we tell when it's done?)

Thanks!

(I'm lost now with the roadmap.)

@E2GO

E2GO commented Oct 29, 2024

> Hello lllyasviel ! (very stupid question) Will the SD35 branch be merged into main soon? (And how can we tell when it's done?) Thanks! (I'm lost now with the roadmap.)

I'm sure we'll see that announcement on the main page in the news :)

@dan4ik94

Seems like it doesn't work with the new SD3.5 medium:

File "D:\webui_forge_cu121_torch231\webui\modules\models\sd35\mmditx.py", line 70, in modulate
    return x * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)
RuntimeError: The size of tensor a (1536) must match the size of tensor b (2304) at non-singleton dimension 2
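One shape-level reading of that traceback, as a hypothetical sketch rather than Forge's actual mmditx.py code: `x` is (batch, tokens, hidden), while `scale` and `shift` are per-sample vectors (batch, hidden) unsqueezed to (batch, 1, hidden). Broadcasting then requires the modulation width to equal the transformer's hidden size, which fails when code or weights sized for one SD3.x variant meet another.

```python
# Illustrative only: re-creates the failing broadcast at the shape level.
def modulate_shapes(x_shape, scale_shape):
    b, t, h = x_shape        # (batch, tokens, hidden)
    _, h_mod = scale_shape   # (batch, hidden), unsqueezed to (batch, 1, hidden)
    if h_mod != h:  # broadcasting needs the hidden dims to agree
        raise RuntimeError(
            f"The size of tensor a ({h}) must match the size of tensor b "
            f"({h_mod}) at non-singleton dimension 2"
        )
    return (b, t, h)

# Matching widths broadcast fine; mismatched widths reproduce the error:
print(modulate_shapes((1, 4096, 1536), (1, 1536)))  # (1, 4096, 1536)
try:
    modulate_shapes((1, 4096, 1536), (1, 2304))
except RuntimeError as e:
    print(e)
```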

@graemeniedermayer
Author

The 3.5 medium issue looks like it mostly requires updating the files in modules/models/sd35 with the additions from https://github.com/Stability-AI/sd3.5 .

I'll try to get to it later today.

@dan4ik94

> The 3.5 medium issue looks like it mostly requires updating the files in modules/models/sd35 with the additions from https://github.com/Stability-AI/sd3.5 .

ty, gonna have a look too.

@DukenNukem47

Didn't know where to add this, but I had an issue yesterday where, after updating Forge, it suddenly seemed as if the quality of the images went downhill, and I could not replicate any of my older generated images. It was on version 612 and my last used one was 591, so I struggled to roll back, but eventually got it rolled back and now I have glorious images again. Oh wait, I just figured out you can use the version number added in the file metadata without knowing the whole hash: I went to custom, pasted f2.0.1v1.10.1-previous-591-gb592142f, and it updated to that version, and now things are normal again. Still can't get over how drastically image quality went downhill; no more variety, and widely different images between these commits.

@abzaloff

abzaloff commented Nov 8, 2024

> Didn't know where to add this, but I had an issue yesterday where, after updating Forge, it suddenly seemed as if the quality of the images went downhill.

That's weird. My images are exactly the same as they were before the latest updates. Even rechecked after your post.

@Seedmanc

So what are the VRAM requirements? Can it run with offloading like Flux, or is there no point in even trying on 8 GB?

@E2GO

E2GO commented Nov 26, 2024

Would be nice to get some updates on this, please.
