[Feature Request]: Add Facebook's Token Merging feature for faster inference time #4364

ghost · 2022-11-06T01:56:05Z

Is there an existing issue for this?

I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

Implement https://github.com/facebookresearch/ToMe, which allows for faster image inference.

Proposed workflow

Maybe a CLI option? From what I read it decreases accuracy by a bit, so some people won't want to have it enabled.
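A minimal sketch of what such an opt-in switch could look like, assuming a plain `argparse` parser. The flag names `--token-merging` and `--token-merging-ratio`, the default ratio, and the helper `maybe_apply_tome` are all hypothetical and are not existing webui options. The patch function is passed in as an argument, so the sketch does not assume any particular ToMe implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser with a hypothetical opt-in switch for token merging."""
    parser = argparse.ArgumentParser()
    # Off by default, since merging tokens trades some accuracy for speed.
    parser.add_argument("--token-merging", action="store_true",
                        help="enable token merging (faster, slightly less accurate)")
    # The default and the accepted range are assumptions made for this sketch.
    parser.add_argument("--token-merging-ratio", type=float, default=0.5,
                        help="fraction of tokens to merge, in [0, 1)")
    return parser


def maybe_apply_tome(model, args, patch_fn):
    """Call patch_fn(model, ratio=...) only when the user opted in.

    patch_fn stands in for whichever ToMe implementation is used.
    Returns True if the model was patched, False otherwise.
    """
    if not args.token_merging:
        return False
    if not 0.0 <= args.token_merging_ratio < 1.0:
        raise ValueError("--token-merging-ratio must be in [0, 1)")
    patch_fn(model, ratio=args.token_merging_ratio)
    return True


if __name__ == "__main__":
    args = build_parser().parse_args(["--token-merging", "--token-merging-ratio", "0.3"])
    calls = []
    # A stand-in patch function that only records how it was called.
    maybe_apply_tome("dummy-model", args, lambda m, ratio: calls.append((m, ratio)))
    print(calls)
```

Keeping the feature behind a flag that defaults to off means users who care about exact output are unaffected unless they ask for it.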

Additional information

See code in facebookresearch/ToMe#7 and https://github.com/Birch-san/stable-diffusion/

ghost · 2022-11-08T02:43:19Z

Based on my results and those of others, it seems to speed up inference by ~20-25% at 512x512, which is quite significant, and it allows generating much bigger images.

ghost · 2022-11-08T09:35:21Z

I did an extremely quick and dirty patch of the webui and stable-diffusion repos with code from https://github.com/Birch-san/stable-diffusion to check ToMe. I also hard-coded doggettx's attention into the code because I'm not good enough to figure out how to add this properly, so if you use xformers, xformers will be faster for you than this patch.

https://gist.github.com/Yardanico/081e7e23ea1d51dd70f1a75a6df8b876 if you want to try.

I'm getting a 25% speed increase on my RX 6700 XT (from 6 it/s to 7.5 it/s), and I can also generate at bigger resolutions while it is still faster.

There is some accuracy loss, but how much largely depends on the prompt.
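For reference, the arithmetic behind the 6 it/s to 7.5 it/s figure can be sketched as follows (the helper names are made up for illustration). A 25% gain in iterations per second corresponds to about 20% less time per iteration, which is worth keeping in mind when comparing numbers reported in different units.

```python
def throughput_gain_percent(before_its: float, after_its: float) -> float:
    """Relative increase in iterations per second, in percent."""
    return (after_its / before_its - 1.0) * 100.0


def time_saved_percent(before_its: float, after_its: float) -> float:
    """Relative reduction in seconds per iteration, in percent."""
    return (1.0 - before_its / after_its) * 100.0


print(f"{throughput_gain_percent(6.0, 7.5):.1f}")  # 25.0
print(f"{time_saved_percent(6.0, 7.5):.1f}")       # 20.0
```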

ultranity · 2022-11-10T09:57:13Z

Very interesting work, hope we can enjoy this feature soon (with xformers if possible).

Also note that ToMe is drafting Stable Diffusion support (with examples and code "coming soon"); see also facebookresearch/ToMe#4

SLAPaper · 2023-04-01T19:11:20Z

Update: there is already a ToMe implementation for Stable Diffusion: https://github.com/dbolya/tomesd
It seems straightforward to support, since applying it to a model takes only one line of code:

import tomesd

# Patch a Stable Diffusion model with ToMe for SD using a 50% merging ratio.
# Using the default options is recommended for the highest quality; tune ratio to suit your needs.
tomesd.apply_patch(model, ratio=0.5)

# However, if you want to tinker around with the settings, we expose several options.
# See docstring and paper for details. Note: you can patch the same model multiple times.
tomesd.apply_patch(model, ratio=0.9, sx=4, sy=4, max_downsample=2) # Extreme merging, expect diminishing returns

Update: There is already a PR working on this, see below

Update again: I implemented an extension to use ToMe (https://github.com/SLAPaper/a1111-sd-webui-tome), but it seems to give only a ~13% speedup when using batch size 8
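For readers wondering what the `ratio` argument controls, here is a rough, pure-Python sketch of the general idea behind token merging: match similar tokens and average them so later layers process fewer tokens. This is a simplified illustration and not the code of tomesd or ToMe. The function names are made up for this sketch, and it leaves out details of real implementations, such as how the two token sets are chosen and how merged tokens are restored afterward.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_tokens(tokens, ratio):
    """Merge roughly `ratio` of the tokens by averaging similar pairs.

    Tokens are split alternately into sets A and B, each A token is linked
    to its most similar B token, and the strongest links are merged.
    """
    a, b = tokens[0::2], tokens[1::2]
    # At most len(a) tokens can be merged away in a single pass.
    r = min(int(len(tokens) * ratio), len(a))
    if r == 0 or not b:
        return [list(tok) for tok in tokens]
    # One edge per A token: (similarity, index in A, index of best match in B).
    edges = []
    for i, tok in enumerate(a):
        sims = [cosine(tok, other) for other in b]
        j = max(range(len(b)), key=sims.__getitem__)
        edges.append((sims[j], i, j))
    edges.sort(reverse=True)
    merged_away = {i: j for _, i, j in edges[:r]}
    # Accumulate sums and counts so each B token becomes the mean of its group.
    sums = [list(tok) for tok in b]
    counts = [1] * len(b)
    for i, j in merged_away.items():
        sums[j] = [s + x for s, x in zip(sums[j], a[i])]
        counts[j] += 1
    kept_a = [list(tok) for i, tok in enumerate(a) if i not in merged_away]
    merged_b = [[s / c for s in vec] for vec, c in zip(sums, counts)]
    return kept_a + merged_b


tokens = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
print(len(merge_tokens(tokens, 0.5)))  # 2
```

Fewer tokens means less work in attention, which is where the speedup comes from; averaging is also why some accuracy is lost.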

papuSpartan · 2023-04-01T19:41:39Z

> Update: there is already a ToMe implementation for Stable Diffusion: https://github.com/dbolya/tomesd
> It seems that it's intuitive to support it since there is only one line of code to apply it to a model:
>
> import tomesd
>
> # Patch a Stable Diffusion model with ToMe for SD using a 50% merging ratio.
> # Using the default options are recommended for the highest quality, tune ratio to suit your needs.
> tomesd.apply_patch(model, ratio=0.5)
>
> # However, if you want to tinker around with the settings, we expose several options.
> # See docstring and paper for details. Note: you can patch the same model multiple times.
> tomesd.apply_patch(model, ratio=0.9, sx=4, sy=4, max_downsample=2) # Extreme merging, expect diminishing returns
>
> I'll try myself to find a proper place to add the line of code and edit comment if making some progress

Working on this in #9256

ghost changed the title ~~[Feature Request]: Add Facebook's Token Merging feature~~ [Feature Request]: Add Facebook's Token Merging feature for faster inference time Nov 6, 2022

mezotaken added the enhancement New feature or request label Jan 12, 2023

missionfloyd closed this as completed May 20, 2023
