video support #12

Open
ehartford opened this issue Aug 19, 2023 · 23 comments


@ehartford

ehartford commented Aug 19, 2023

(rewriting sloppy request)
I was wondering if video support can be added?

At first I found lucidrains' video-diffusion-pytorch:
https://github.com/lucidrains/video-diffusion-pytorch

But after some research, it seems like zeroscope might be the right model to use:
https://huggingface.co/cerspense/zeroscope_v2_576w

@leejet
Owner

leejet commented Aug 20, 2023

This model appears to be significantly different from stable-diffusion, so there are no plans to support it currently. If there's time in the future, I will consider providing support for it.

@ehartford
Author

ehartford commented Aug 20, 2023

I didn't necessarily mean this specific model, more "video" in general.

I think zeroscope would probably be the right place to start.

Sorry for being sloppy.

https://huggingface.co/cerspense/zeroscope_v2_576w

@leejet
Owner

leejet commented Aug 21, 2023

It looks like this needs some work, and there are no plans to support it currently. Maybe in the future?

@Green-Sky
Contributor

Stable Video Diffusion (SVD) models from Stability were released!

> SVD was trained to generate 14 frames at resolution 576x1024 given a context frame of the same size. We use the standard image encoder from SD 2.1, but replace the decoder with a temporally-aware deflickering decoder.

https://stability.ai/news/stable-video-diffusion-open-ai-video-model
https://huggingface.co/stabilityai/stable-video-diffusion-img2vid / https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt
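
For a sense of scale, the figures quoted above imply the following latent tensor size. This is only a rough sketch: it assumes the SD 2.1 autoencoder's usual 8x spatial downsampling and 4 latent channels (neither value is stated in this thread), and the function name `svd_latent_shape` is made up for illustration.

```python
def svd_latent_shape(frames=14, height=576, width=1024,
                     downscale=8, latent_channels=4):
    """Return (frames, channels, latent_h, latent_w) for one clip.

    downscale=8 and latent_channels=4 are assumptions based on the
    SD 2.1 autoencoder, not values taken from this thread.
    """
    if height % downscale or width % downscale:
        raise ValueError("height and width must be divisible by downscale")
    return (frames, latent_channels, height // downscale, width // downscale)


shape = svd_latent_shape()

# Total number of latent values the denoiser works on per clip.
elements = 1
for dim in shape:
    elements *= dim

print(shape, elements)  # (14, 4, 72, 128) 516096
```

Under these assumptions, one clip carries 14 times as many latent values as a single 576x1024 image would.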

@leejet
Owner

leejet commented Nov 22, 2023

The SVD demo looks quite good. I'll make time in the next few days to study it, starting by running the official code to see its performance.

@Amin456789

Patiently waiting for SVD to be released and quantized!

@FSSRepo
Contributor

FSSRepo commented Nov 28, 2023

@leejet
It seems to have almost the same architecture as SD 2.1 but includes some temporal consistency blocks called "time_stack." We'll need to see how they work and whether new functions need to be added to ggml. The conversion program works with this model; however, please note that we'll need to implement the vision version of CLIP to generate embeddings from images.
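
Video diffusion models commonly interleave per-frame (spatial) layers with temporal layers by folding the frame axis into the batch axis and regrouping it when a layer needs to look across time. The toy sketch below shows only that bookkeeping. Whether SVD's "time_stack" blocks work exactly this way is not established in this thread, and every function here (`split_clips`, `merge_clips`, `spatial_layer`, `temporal_layer`) is a hypothetical stand-in, with each frame reduced to a single number.

```python
def split_clips(frames, num_frames):
    """Regroup a flat (clip, frame) batch into one list per clip."""
    if num_frames <= 0 or len(frames) % num_frames:
        raise ValueError("batch size must be a multiple of num_frames")
    return [frames[i:i + num_frames] for i in range(0, len(frames), num_frames)]


def merge_clips(clips):
    """Inverse of split_clips: flatten the clips back into one batch."""
    return [frame for clip in clips for frame in clip]


def spatial_layer(frame):
    """Stand-in for a per-frame layer: it sees one frame at a time."""
    return frame * 2


def temporal_layer(clip):
    """Stand-in for a temporal layer: averages each frame with its
    predecessor inside the same clip (the first frame is left as-is)."""
    return [clip[0]] + [(clip[i - 1] + clip[i]) / 2 for i in range(1, len(clip))]


# Two clips of three frames each, flattened into one batch of six "frames".
batch = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]

after_spatial = [spatial_layer(f) for f in batch]
after_temporal = merge_clips(
    [temporal_layer(clip) for clip in split_clips(after_spatial, 3)]
)
print(after_temporal)  # [2.0, 3.0, 5.0, 20.0, 30.0, 50.0]
```

The fourth value stays at 20.0 rather than being averaged with the last frame of the first clip, which is the point of regrouping by clip before the temporal step.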

@leejet
Owner

leejet commented Nov 28, 2023

I'm currently reviewing the SVD implementation code in comfyui. Perhaps I can learn how to conveniently implement SVD within sd.cpp from this.

@FSSRepo
Contributor

FSSRepo commented Nov 28, 2023

> I'm currently reviewing the SVD implementation code in comfyui. Perhaps I can learn how to conveniently implement SVD within sd.cpp from this.

Amazing! Good luck! Unfortunately, my time is limited as I am a student; otherwise, I would be more than happy to help.

@Amin456789

Bless you guys! SVD in cpp will be a dream! Good luck to all of you!

@Amin456789

@leejet Any updates or progress on SVD and inpainting? Really excited to try them out in cpp!

@leejet
Owner

leejet commented Dec 21, 2023

I've got a basic understanding of the SVD model architecture. Once I merge #104 and #117, I'll attempt to implement SVD.

@Amin456789

niceee! so excited, thanks

@Jonathhhan

Jonathhhan commented Dec 29, 2023

Hotshot-XL looks interesting too, and it works with SDXL models: https://huggingface.co/hotshotco/Hotshot-XL

@Amin456789

@leejet It would be great if you supported the fp16 version of SVD once it is done:
https://huggingface.co/becausecurious/stable-video-diffusion-img2vid-fp16/tree/main

They are smaller and probably more RAM-friendly.
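
The size argument comes down to bytes per weight: f32 stores each weight in 4 bytes and f16 in 2, so the weights take roughly half the space. A minimal sketch of that arithmetic follows; the parameter count is a placeholder chosen for illustration, not SVD's real size, and `weights_size_gib` is a made-up helper that ignores activations and runtime overhead.

```python
# Storage cost of one weight for each format.
BYTES_PER_WEIGHT = {"f32": 4, "f16": 2}


def weights_size_gib(n_params, dtype):
    """Approximate size of the weights alone, in GiB."""
    return n_params * BYTES_PER_WEIGHT[dtype] / 2**30


# Placeholder parameter count, NOT the actual size of any SVD checkpoint.
n_params = 1_000_000_000

for dtype in ("f32", "f16"):
    print(dtype, round(weights_size_gib(n_params, dtype), 2), "GiB")
```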

@engineer1109

I need this as well.

@Amin456789

@leejet Any update on SVD, please?

@mirix

mirix commented Oct 24, 2024

I don't know if this is even remotely related to the SD architecture, but it would be good to support the new kid on the block:

https://huggingface.co/genmo/mochi-1-preview

https://huggingface.co/Kijai/Mochi_preview_comfy/tree/main

@patrickjonesdotca

Any updates on SVD?

@Zctoylm0927

Any updates on SVD?

@bombless

There are more img2vid and txt2vid models coming:
https://github.com/THUDM/CogVideo
https://huggingface.co/IamCreateAI/Ruyi-Mini-7B
https://github.com/Tencent/HunyuanVideo

@delldu
Contributor

delldu commented Jan 17, 2025

Good luck to you! In my opinion, applying ggml to video (AI) may be a nightmare, because it does not support tensors with more than 4 dimensions, so conv3d, batchnorm3d, etc. will drive you crazy!
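
One common way around a 4-dimension limit is to express a 3D convolution as a sum of 2D convolutions over time-shifted frames, so no single operation needs the extra axis. The sketch below checks that identity in plain Python on a tiny single-channel clip. It is an illustration of the idea only, not how ggml or sd.cpp implement anything, and all three function names are hypothetical.

```python
def conv2d(frame, kernel):
    """Valid 2D cross-correlation of an H x W frame with a kh x kw kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(frame) - kh + 1, len(frame[0]) - kw + 1
    return [
        [
            sum(frame[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def conv3d_direct(video, kernel):
    """Valid 3D cross-correlation of a T x H x W clip with a kt x kh x kw kernel."""
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out_t = len(video) - kt + 1
    out_h = len(video[0]) - kh + 1
    out_w = len(video[0][0]) - kw + 1
    return [
        [
            [
                sum(video[t + d][y + i][x + j] * kernel[d][i][j]
                    for d in range(kt) for i in range(kh) for j in range(kw))
                for x in range(out_w)
            ]
            for y in range(out_h)
        ]
        for t in range(out_t)
    ]


def conv3d_via_conv2d(video, kernel):
    """Same result, built only from 2D convolutions on individual frames."""
    kt = len(kernel)
    out = []
    for t in range(len(video) - kt + 1):
        acc = None
        for d in range(kt):
            # Temporal slice d of the kernel is applied to frame t + d.
            part = conv2d(video[t + d], kernel[d])
            if acc is None:
                acc = part
            else:
                acc = [[a + b for a, b in zip(row_a, row_b)]
                       for row_a, row_b in zip(acc, part)]
        out.append(acc)
    return out


# Small integer clip (3 frames of 4 x 4) so the comparison is exact.
video = [[[(t * 16 + y * 4 + x) % 7 for x in range(4)] for y in range(4)]
         for t in range(3)]
kernel = [[[1, 0], [0, -1]], [[2, 1], [1, 0]]]  # kt = kh = kw = 2

result = conv3d_via_conv2d(video, kernel)
print(result == conv3d_direct(video, kernel))  # True
```

The trade-off is extra bookkeeping and one 2D convolution per temporal kernel tap, and layers such as batchnorm3d would need their own reshaping tricks.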

@stduhpf
Contributor

stduhpf commented Jan 17, 2025

> Good luck to you! In my opinion, applying ggml to video (AI) may be a nightmare, because it does not support tensors with more than 4 dimensions, so conv3d, batchnorm3d, etc. will drive you crazy!

I can confirm
