Yet another unofficial Diffuser support #32
You are an absolute champ for doing this -- I am really hoping this works without a hitch with OneDiff so I can compile it for realtime. I have ControlNet running in realtime and integrated with Unity 3D via NDI to AI-generate the entire game world in real time from just prompts, a WASD-controlled third-person OpenPose skeleton, and a stream of the depth image of randomly placed cubes.

I plan to migrate my app to a microservices architecture, with two separate NDI streams coming out of Unity (they will be migrated to WebRTC to make them usable over WAN), each streamed into its own StableDiffusionControlNetImg2Img pipeline that runs a single ControlNet for its assigned layer; the layers are then alpha-blended back together for the output. I believe this has the potential to produce absolutely groundbreaking results. I am posting this right before I begin work on it, but if your port works, there's a chance I will report back with a working demo of a viable framework for AI-generating an entire videogame frame by frame, in realtime, as it is played. I will also be setting up a multimodal LLM agent (likely Pixtral) with a sandbox inside the game's runtime for function calling, to spawn enemies and objects using just the existing pose skeletons. The last step is getting LayerDiffuse applied so that I can focus specialized pipelines onto separate render layers in the game.

The only thing I don't know is whether it will work with what I have to do in my existing pipeline code to include the VAE in the pipeline compilation (roughly sketched below), which is needed to achieve the frame rate and responsiveness that make the output an actually playable game. If it works, we are golden; if not, I'll post an issue on OneDiff and report back here to reassess. Fingers crossed it just works.

vlc-record-2024-09-15-15h52m16s-2024-09-12.23-14-47.mkv-.mp4
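For reference, this is roughly what I mean by including the VAE in the compilation -- a sketch, not my actual app code, assuming OneDiff's OneFlow backend and its `oneflow_compile` helper; the model and ControlNet IDs here are placeholders:

```python
# Sketch only -- not the actual app code. Assumes OneDiff's OneFlow backend
# (onediff.infer_compiler.oneflow_compile); model/ControlNet IDs are placeholders.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline
from onediff.infer_compiler import oneflow_compile

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# Compile the heavy submodules up front so per-frame latency stays low.
# Including the VAE decoder here is the part I'm unsure will work.
pipe.unet = oneflow_compile(pipe.unet)
pipe.vae.decoder = oneflow_compile(pipe.vae.decoder)
```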
Hello, I have received your email and will handle it as soon as possible ^_^
@WyattAutomation I am happy to hear that it works flawlessly with OneFlow.
Actually, I was unfortunately not successful -- to clarify, the video I posted is just OneDiff/OneFlow without layer diffusion; I want to use layer diffusion in the app I am developing here. My plan is to achieve much better quality by using multiple separate pipelines running in separate threads or containers, each with checkpoints, ControlNets, and LoRAs specialized for generating only specific, dedicated parts or features of the frames as they render in realtime. To do that, I need transparent backgrounds in the output of each layer so I can composite the layers into output frames. I could use YOLO and segmentation masks in a sort of realtime video-generator version of what ADetailer does, but that would take time and isn't as ideal as having the diffusion model take care of that step already (and at higher quality).

The error that OneDiff gives me when using the OneFlow backend is, I think, related to the classes in attention_parameters.py. I don't know if it's because they inherit nn.Module without declaring forward(), or something else, but I upgraded everything to the latest versions, tried adding a stub forward() method to those classes, and tried several other things, and couldn't resolve it. The pipeline instantiates just fine; it's when you try to run inference with the pipeline that this error occurs:

There were a bunch of other lines with similar errors, all pointing at “Unsupported type”; some of them referenced “attention_processor” in my installation of diffusers and some referenced “attention_processors” from your diffusers port. I tested your repo without OneDiff/OneFlow and it works fine. It only throws the error when it tries to generate an image after using compile_pipe or infer_compiler. In fact, it even worked when I compiled only the VAE and did not compile the UNet -- compiling the UNet is specifically what triggers the failure. I read a similar “Unsupported type” issue on the OneDiff repo from someone else using OneFlow, but all they said was something along the lines of “I figured it out, a submodule of torch.nn.Module has to have a declaration of forward() in the class”, and then they closed the issue. I added stubs declaring the forward() method to all the classes in your attention_processors.py, but the result was unchanged (same error).

I did manage to get images generating using the nexfort backend for OneDiff; however, all the images come out completely blank/transparent, and I get a CUDA warning about the “graph is empty”. I will try again tomorrow, but if you have any idea how to get your attention_processors working with OneDiff/OneFlow, let me know. Thank you for your work on this; it remains the best option out there for what I am trying to do, if I can just get it working with OneDiff.
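To make those two observations concrete, this is roughly how I am compiling things -- a sketch rather than my exact code, where `pipe` stands in for the already-built LayerDiffuse SD1.5 pipeline:

```python
# Sketch only: `pipe` is a placeholder for the already-constructed
# LayerDiffuse SD1.5 pipeline from this repo.
from onediff.infer_compiler import oneflow_compile


def compile_for_realtime(pipe, compile_unet: bool):
    # Compiling only the VAE decoder works for me -- images still generate.
    pipe.vae.decoder = oneflow_compile(pipe.vae.decoder)
    if compile_unet:
        # Compiling the UNet is specifically what triggers the
        # "Unsupported type" error, and only at inference time --
        # pipeline construction itself succeeds.
        pipe.unet = oneflow_compile(pipe.unet)
    return pipe
```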
GitHub: https://github.com/rootonchair/diffuser_layerdiffuse
This project is a port to Diffusers. It lets you generate transparent images with SD1.5 (transparent-only or joint generation) and SDXL (Attn and Conv injection) through a Diffusers-friendly API.
Don't hesitate to give it a try.
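For readers landing here, a rough sketch of the intended flow, assuming a patched SD1.5 pipeline; `apply_layerdiffuse` below is a placeholder name used only for illustration, and the real loading/patching entry points are documented in the repo's README:

```python
# Illustrative sketch only -- `apply_layerdiffuse` is a placeholder name;
# see the repo's README for the actual loading/patching functions.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# apply_layerdiffuse(pipe)  # placeholder: patch attention + VAE for transparency

image = pipe("a glass cup, studio lighting", num_inference_steps=25).images[0]
image.save("cup_rgba.png")  # with the patch applied, the PNG keeps its alpha channel
```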