[Community Pipeline] Speech to Image #871
Comments
Hi, I would be glad to work on it.
Awesome! Feel free to open a PR and if you have any questions just let us know :) (ideally in the parent issue unless it's very specific to this 🙏)
I'm very interested in this; I'll try this feature.
Hi @osanseviero, I opened a PR for this issue. So far I have tested my pipeline with Whisper and it works pretty well. I'm waiting for your suggestions (maybe making the pipeline more generic so that we can use other speech-recognition models?)
It works very well! Well done @MikailINTech :-)
📌 A Gradio demo is available here: https://huggingface.co/spaces/fffiloni/speech-to-image
This is amazing @fffiloni! Great work!
Closing this issue as this is now done 🔥 Amazing work everyone!
I tried the sample from https://github.com/huggingface/diffusers/tree/main/examples/community#speech-to-image. Is there anything wrong with the code?
Intro
Community Pipelines are introduced in diffusers==0.4.0 with the idea of allowing the community to quickly add, integrate, and share their custom pipelines on top of diffusers. You can find a guide about Community Pipelines here. You can also find all the community examples under examples/community/. If you have questions about the Community Pipelines feature, please head to the parent issue.
Idea: Speech to Image
You can use a transformers automatic-speech-recognition model, such as OpenAI whisper, to transcribe the speech to text, and pass that text to Stable Diffusion as the prompt. Together, this would create a nice speech-to-image pipeline.
Resources
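For anyone picking this up, here is a rough sketch of the idea described above: transcribe the audio, then feed the transcript to Stable Diffusion as the prompt. This is not the code from the merged community pipeline. The function names speech_to_image and build_default_stages are made up for illustration, and the model ids openai/whisper-small and CompVis/stable-diffusion-v1-4 are just assumed defaults; swap in whatever you prefer.

```python
from typing import Any, Callable, Tuple


def speech_to_image(
    audio: Any,
    transcribe: Callable[[Any], str],
    generate: Callable[[str], Any],
) -> Tuple[str, Any]:
    """Transcribe `audio`, then use the transcript as the image prompt.

    Both stages are passed in as plain callables so that any
    speech-recognition model or image generator can be plugged in.
    """
    prompt = transcribe(audio).strip()
    if not prompt:
        raise ValueError("transcription is empty; nothing to use as a prompt")
    return prompt, generate(prompt)


def build_default_stages(
    asr_model: str = "openai/whisper-small",          # assumed default, not from the issue
    sd_model: str = "CompVis/stable-diffusion-v1-4",  # assumed default, not from the issue
) -> Tuple[Callable[[Any], str], Callable[[str], Any]]:
    """Hypothetical helper that wires up Whisper and Stable Diffusion."""
    # Heavy imports stay local so the module imports cheaply.
    from diffusers import StableDiffusionPipeline
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=asr_model)
    sd = StableDiffusionPipeline.from_pretrained(sd_model)

    def transcribe(audio: Any) -> str:
        # The ASR pipeline returns a dict with the transcript under "text".
        return asr(audio)["text"]

    def generate(prompt: str) -> Any:
        # The Stable Diffusion pipeline output exposes a list of PIL images.
        return sd(prompt).images[0]

    return transcribe, generate
```

Usage would look something like `transcribe, generate = build_default_stages()` followed by `prompt, image = speech_to_image("clip.wav", transcribe, generate)`, where clip.wav is a placeholder path. Keeping the two stages as separate callables is one way to get the "more generic" behaviour suggested in the comments, since any other speech-recognition model can be dropped in without touching the rest.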