[Feature Request]: Add Facebook's Token Merging feature for faster inference time #4364

Closed
1 task done
ghost opened this issue Nov 6, 2022 · 5 comments
Labels
enhancement New feature or request

Comments

ghost commented Nov 6, 2022

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

Implement ToMe (https://github.com/facebookresearch/ToMe), which speeds up image inference.

Proposed workflow

Maybe a CLI option? From what I read it decreases accuracy by a bit, so some people won't want to have it enabled.
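
A rough sketch of what such an opt-in CLI option could look like. The flag names `--token-merging` and `--token-merging-ratio` and the helper `maybe_apply_tome` are hypothetical, not existing webui options. The only real API assumed is the `tomesd.apply_patch(model, ratio=...)` call quoted later in this thread.

```python
import argparse


def build_parser():
    # Both flag names are hypothetical; the issue only suggests "a CLI option".
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--token-merging",
        action="store_true",
        help="enable ToMe token merging (faster, slightly less accurate)",
    )
    parser.add_argument(
        "--token-merging-ratio",
        type=float,
        default=0.5,
        help="fraction of tokens to merge when --token-merging is set",
    )
    return parser


def maybe_apply_tome(model, args, apply_patch=None):
    """Patch `model` only when the user opted in; return True if patched."""
    if not args.token_merging:
        return False
    if apply_patch is None:
        # Imported lazily so users who leave the flag off do not need tomesd.
        import tomesd

        apply_patch = tomesd.apply_patch
    apply_patch(model, ratio=args.token_merging_ratio)
    return True
```

Keeping the feature off by default means users who care about accuracy see no change unless they pass the flag.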

Additional information

See code in facebookresearch/ToMe#7 and https://github.com/Birch-san/stable-diffusion/

@ghost ghost changed the title [Feature Request]: Add Facebook's Token Merging feature [Feature Request]: Add Facebook's Token Merging feature for faster inference time Nov 6, 2022

ghost commented Nov 8, 2022

Based on my results and those of others, it seems to speed up inference by ~20-25% at 512x512, which is quite significant, and it allows generating much bigger images.


ghost commented Nov 8, 2022

I did an extremely quick and dirty patch of the webui and stable-diffusion repos with code from https://github.com/Birch-san/stable-diffusion to check ToMe. I also hard-coded doggettx's attention into the code because I'm not good enough to figure out how to add this properly, so if you use xformers, xformers will still be faster for you than this patch.

https://gist.github.com/Yardanico/081e7e23ea1d51dd70f1a75a6df8b876 if you want to try.

I'm getting a 25% speed increase on my RX6700XT, from 6it/s to 7.5it/s, and I can also generate at bigger resolutions while still being faster.

There is some accuracy loss though, but it largely depends on the prompt.
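
For anyone comparing numbers: the same 6 it/s to 7.5 it/s measurement can be quoted either as a throughput gain or as a per-iteration time reduction, and the two percentages differ. A small sketch of the arithmetic (the function names are just illustrative):

```python
def throughput_gain(before_its, after_its):
    # Relative increase in iterations per second.
    return after_its / before_its - 1.0


def step_time_reduction(before_its, after_its):
    # Relative decrease in seconds per iteration (the inverse of it/s).
    return 1.0 - before_its / after_its


print(f"{throughput_gain(6.0, 7.5):.0%} more iterations per second")
print(f"{step_time_reduction(6.0, 7.5):.0%} less time per iteration")
```

So 25% more it/s corresponds to 20% less time per step, which is consistent with the "~20-25%" range reported above.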

@ultranity
Contributor

very interesting work, hope we can enjoy this feature soon (with xformers if possible).

Also note that ToMe is drafting Stable Diffusion support (with examples and code "coming soon"); see also facebookresearch/ToMe#4

@mezotaken mezotaken added the enhancement New feature or request label Jan 12, 2023

SLAPaper commented Apr 1, 2023

Update: there is already a ToMe implementation for Stable Diffusion: https://github.com/dbolya/tomesd
It looks straightforward to support, since applying it to a model takes only one line of code:

import tomesd

# Patch a Stable Diffusion model with ToMe for SD using a 50% merging ratio.
# Using the default options is recommended for the highest quality; tune ratio to suit your needs.
tomesd.apply_patch(model, ratio=0.5)

# However, if you want to tinker around with the settings, we expose several options.
# See docstring and paper for details. Note: you can patch the same model multiple times.
tomesd.apply_patch(model, ratio=0.9, sx=4, sy=4, max_downsample=2) # Extreme merging, expect diminishing returns

Update: There is already a PR working on this, see below

Update again: I implemented an extension to use ToMe (https://github.com/SLAPaper/a1111-sd-webui-tome), but it seems to give only a ~13% speedup when using batch size 8
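
To give a feel for what `ratio`, `sx`, and `sy` in the snippet above mean in token counts, here is a back-of-the-envelope sketch. It assumes the usual 8x latent downsampling of Stable Diffusion, and it assumes (from my reading of the tomesd docstring) that the merge ratio is capped at `1 - 1/(sx*sy)`. The helper names are hypothetical and not part of tomesd, and only the highest-resolution attention layers are counted.

```python
def max_merge_ratio(sx=2, sy=2):
    # Assumption: one destination token is kept per sx*sy patch,
    # so at most the remaining tokens in the patch can be merged away.
    return 1.0 - 1.0 / (sx * sy)


def tokens_after_merge(width, height, ratio, sx=2, sy=2, latent_scale=8):
    # latent_scale=8 is the usual Stable Diffusion VAE downsampling factor.
    tokens = (width // latent_scale) * (height // latent_scale)
    effective = min(ratio, max_merge_ratio(sx, sy))
    return tokens - int(tokens * effective)


print(tokens_after_merge(512, 512, ratio=0.5))              # default-style settings
print(tokens_after_merge(512, 512, ratio=0.9, sx=4, sy=4))  # "extreme merging"
```

Under these assumptions a 512x512 image has 64 * 64 = 4096 tokens at the top level, and `ratio=0.9` only becomes reachable once `sx` and `sy` are raised to 4, which would explain why the extreme example changes all three options together.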

@papuSpartan
Contributor

Update: there is already a ToMe implementation for Stable Diffusion: https://github.com/dbolya/tomesd
It looks straightforward to support, since applying it to a model takes only one line of code:

import tomesd

# Patch a Stable Diffusion model with ToMe for SD using a 50% merging ratio.
# Using the default options is recommended for the highest quality; tune ratio to suit your needs.
tomesd.apply_patch(model, ratio=0.5)

# However, if you want to tinker around with the settings, we expose several options.
# See docstring and paper for details. Note: you can patch the same model multiple times.
tomesd.apply_patch(model, ratio=0.9, sx=4, sy=4, max_downsample=2) # Extreme merging, expect diminishing returns

I'll try to find a proper place to add the line of code myself, and I'll edit this comment if I make some progress

Working on this in #9256

5 participants