
Worse results? #7

Open
catboxanon opened this issue Mar 17, 2024 · 3 comments
@catboxanon
Hi, I'm a maintainer of the Stable Diffusion webui. I tried implementing the composite method outlined in the paper and this repo, but it seems to produce worse results in all cases I've tested. See AUTOMATIC1111/stable-diffusion-webui#15037 (comment) for reference, which includes sample images and the code I used.

This also seems to introduce some major performance issues, as mentioned in #6, but perhaps that's unavoidable?

@maszhongming (Owner)

Hi, I truly appreciate your input and the work you've put into integrating our methods. I'm not familiar with the Stable Diffusion webui codebase myself, but from what you've shared, the combine_denoised function appears to be implemented correctly.
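For concreteness, here is a minimal sketch of the Composite combine step as I understand it: each of the k LoRAs is activated alone for one denoiser pass, classifier-free guidance is applied per LoRA, and the guided predictions are averaged. Plain floats stand in for the noise tensors so the sketch stays self-contained; the function name and signature are illustrative, not the webui's actual API.

```python
# Sketch of the LoRA Composite combine step: each LoRA is activated alone
# for one denoiser pass, classifier-free guidance is applied per LoRA, and
# the guided predictions are averaged. Floats stand in for noise tensors;
# in practice these are UNet outputs.

def combine_denoised(cond_preds, uncond_preds, weights, cfg_scale=7.0):
    """Average per-LoRA guided noise predictions.

    cond_preds[i] / uncond_preds[i]: predictions with only LoRA i active.
    weights[i]: per-LoRA weight (typically 1.0 for all).
    """
    assert len(cond_preds) == len(uncond_preds) == len(weights)
    guided = [u + cfg_scale * (c - u)  # classifier-free guidance per LoRA
              for c, u in zip(cond_preds, uncond_preds)]
    return sum(w * g for w, g in zip(weights, guided)) / sum(weights)

# Toy check with k = 2 LoRAs:
print(combine_denoised([1.0, 3.0], [0.5, 2.5], [1.0, 1.0], cfg_scale=2.0))  # 2.5
```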

Regarding the shared samples, do you think LoRA Merge looks better mainly because of the style? From my perspective, LoRA Composite's characters and poses align more closely with the reference image in Concept, but its style isn't pronounced enough; LoRA Merge has a stronger style, but there's an odd vertical bar on the right side of the curtain.

We have observed in both automatic and human evaluations that LoRA Merge performs better when combining two LoRAs, especially when one of them is a "style" LoRA (so the generated image shows distinct "style" features). However, when combining two other types of LoRAs (e.g., character + clothing), or when the number of LoRAs grows (3-5 LoRAs), LoRA Merge is significantly worse than our approach. Would you mind testing these scenarios and sharing your findings? I'm also willing to run the same examples under the diffusers and peft codebases to pinpoint potential issues.

As for inference speed, it's true that LoRA Composite demands more processing time. The increase should be linear: combining k LoRAs takes roughly k times as long as using one. I haven't found a way to optimize this yet, but the slowdown shouldn't be as drastic as you describe (from a few seconds to over a minute).

By the way, have you had a chance to try the LoRA Switch method? Its inference time is on par with LoRA Merge, but our evaluations suggest it surpasses both LoRA Composite and Merge in terms of composition quality (Section 3.2 in our paper, observation 2).
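For reference, a minimal sketch of why Switch keeps inference cost on par with Merge: only one LoRA is active per denoising step, rotating on a fixed interval, so each step costs the same as a single-LoRA run. The exact schedule below is illustrative, not our exact implementation.

```python
# Sketch of a LoRA Switch schedule: one LoRA active per denoising step,
# rotating every `interval` steps, so each step costs the same as a
# single-LoRA (or merged) run. The schedule is illustrative.

def active_lora(step, num_loras, interval=1):
    """Index of the LoRA active at a given denoising step."""
    return (step // interval) % num_loras

print([active_lora(s, num_loras=3, interval=2) for s in range(8)])
# [0, 0, 1, 1, 2, 2, 0, 0]
```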

@catboxanon (Author) commented Mar 18, 2024

Thank you for the in-depth reply!

I will take more time to run tests with LoRAs other than style-focused ones (e.g., character + clothing, as you mentioned). The degraded inference speed may be caused by webui internals; I'll look into that as well.

As for the Switch method, this will actually be simpler to test, since an extension already exists that makes invoking it easy: https://github.com/cheald/sd-webui-loractl. I may even open a PR to extend its syntax so that invoking the Switch method is simpler.

@maszhongming (Owner)

Thank you for your efforts and for exploring further tests!

Additionally, I strongly recommend testing with a combination of 3-5 LoRAs, as highlighted in our paper. As the number of combined LoRAs grows, vanilla LoRA Merge increasingly destabilizes the generation process.

Regarding the degraded inference speed you've observed: while I'm not familiar with the webui's internals, comparing them with how diffusers manages active adapters might shed light on potential optimizations or differences.

Glad to hear the Switch method is simpler to test in your setup. Your work in extending the syntax for easier invocation is greatly appreciated.
