Running time on H100 #14

Open
harveymilk opened this issue Aug 29, 2024 · 4 comments

Comments

@harveymilk

Hi, we're running the demo script with a 768x768 input image, and it takes 22 seconds to generate a 2-second clip, even though we're running on an H100 SXM GPU. Is this generation time normal, or should it be a lot faster?

@AA-Developer

I think it's OK, because it takes 150 seconds on an RTX 3090 24GB.

@harveymilk
Author

Thank you for responding so quickly. Can you think of any optimizations that would make it faster on a single H100? We're looking to get as close to real-time generation as possible. Thanks a lot for developing this!

@AA-Developer

AA-Developer commented Aug 29, 2024

Hi, yes, it is possible.

I am not part of the development team, but I have read through the code carefully and I am developing an interface for this system.
I will also add API support to the interface, along with settings for the system's properties and the ability to load and use LoRA.
The extraction quality is currently poor due to dimension inconsistency.
I have corrected this in the interface I am working on, but I have not published the code yet; I am still working on it.

Unfortunately, there is no setting for this right now, so you need to modify this part of the code manually.
Go to the file
fancyvideo/pipelines/fancyvideo_infer_pipeline.py
and, on lines 393, 421, and 440, find
num_inference_steps = 50,
Change the number to the required number of steps.
The system uses 50 steps per frame, which can be reduced to 30 steps.
You can also use an LCM LoRA to use only 8 or 16 steps per frame.
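
To get a rough feel for what those step counts might buy, here is a minimal back-of-the-envelope sketch. It assumes generation time scales linearly with the number of steps, which ignores fixed costs such as model loading, text encoding, and VAE decoding, so real timings will likely be somewhat higher. The 22-second baseline at 50 steps is the H100 figure reported above; the function name is only illustrative and is not part of the FancyVideo code.

```python
def estimate_seconds(baseline_seconds: float, baseline_steps: int, new_steps: int) -> float:
    """Scale a measured generation time linearly with the number of denoising steps.

    Assumption: the step loop dominates the runtime, so fixed overhead is ignored.
    """
    if baseline_steps <= 0 or new_steps <= 0:
        raise ValueError("step counts must be positive")
    return baseline_seconds * new_steps / baseline_steps


# Measured figure from this thread: 22 s for a 2-second clip at 50 steps on an H100.
BASELINE_SECONDS = 22.0
BASELINE_STEPS = 50

# Step counts mentioned above: the default, the reduced value, and the LCM LoRA range.
estimates = {
    steps: estimate_seconds(BASELINE_SECONDS, BASELINE_STEPS, steps)
    for steps in (50, 30, 16, 8)
}

for steps, seconds in estimates.items():
    print(f"{steps:>2} steps: ~{seconds:.1f} s")
```

Under that linear assumption, 30 steps would land at roughly 13 seconds and 8 steps at roughly 3.5 seconds, so even the lowest step count would probably still be slower than real time for a 2-second clip.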

@nitinmukesh

@AA-Developer

Thanks for explaining the steps part.

Could you please also explain how to use LCM-LoRA?

You can also use LCM Lora to use only 8 or 16 steps per frame
