Voice-to-GPT is a web application that lets users interact with an AI assistant by voice. The application records the user's voice input, transcribes it, and sends the transcribed text to OpenAI's GPT-4 for processing. The assistant's answer is then converted back to speech and played to the user. This version is faster than https://github.com/mkdev-me/voice-to-gpt because it uses the Whisper API instead of the free code. However, it is not cheap.
- Voice input: Users can speak their questions or commands directly into their microphone.
- Automatic speech recognition (ASR): The application transcribes users' voice input using Whisper ASR.
- AI assistant: The transcribed text is sent to OpenAI's GPT-4, which processes the input and generates an appropriate response.
- Text-to-speech (TTS): The AI assistant's response is converted back to speech and played to the user.
- Please follow the instructions in the "Installation" section to set up and run the application.
Remember to add your OpenAI API key to your environment first:
export OPENAI_API_KEY=......
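If the key is missing, API calls will fail only when the first request is made. A minimal sketch of a startup check is shown here; `require_api_key` is a hypothetical helper, not a function from this repository.

```python
import os


def require_api_key(env=None):
    """Return the OpenAI API key from the environment, or raise a clear error.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    with a plain dict to make the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before starting the app"
        )
    return key
```

Calling `require_api_key()` once at startup would surface a missing key immediately instead of at the first voice request.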
You only need to say what you want to ask the GPT API.
To build the image, run:
docker build -t audio-to-gpt .
To run the container:
docker run -p 5001:5000 -e OPENAI_API_KEY=$OPENAI_API_KEY audio-to-gpt
Then open your browser on the host port published by the command above (5001) and enjoy.
Remember that speed will vary depending on the resources of your computer, Lambda, Cloud Run, etc.
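To see where the time goes in a given environment, each stage can be wrapped in a small timer. This is only a sketch: `timed` is a hypothetical helper, and the stage it wraps is whatever function the application uses for transcription, completion, or speech synthesis.

```python
import time


def timed(label, func, *args, timings=None, **kwargs):
    """Call func(*args, **kwargs) and record its wall-clock duration.

    Hypothetical helper: the elapsed seconds are stored in the `timings`
    dict under `label`, so several stages can share one dict.
    """
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    if timings is not None:
        timings[label] = elapsed
    return result


if __name__ == "__main__":
    timings = {}
    # sum() stands in for a real stage such as transcription.
    total = timed("example", sum, range(1_000_000), timings=timings)
    print(total, timings)
```

Comparing the recorded timings across machines shows whether transcription, the GPT request, or speech synthesis dominates the response time.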
- Open the application in a web browser.
- Click the "Record" button and speak your question or command into the microphone.
- Click the "Stop" button when you're done speaking.
- The application will transcribe your speech, send the text to GPT-4, and play the AI assistant's response.
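The flow in the steps above can be sketched as one function that chains three stages. The names `handle_recording`, `transcribe`, `ask_gpt`, and `synthesize` are assumptions made for illustration; in the real application the stages would wrap the Whisper API, GPT-4, and a text-to-speech engine.

```python
from typing import Callable


def handle_recording(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    ask_gpt: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> dict:
    """Run one voice request through ASR -> GPT -> TTS.

    Hypothetical sketch: each stage is passed in as a callable so the
    flow can be exercised without network access or API keys.
    """
    if not audio:
        raise ValueError("no audio was recorded")
    question = transcribe(audio).strip()   # speech -> text
    answer = ask_gpt(question)             # text -> assistant reply
    speech = synthesize(answer)            # reply -> audio bytes
    return {"question": question, "answer": answer, "speech": speech}


if __name__ == "__main__":
    # Stand-in stages; none of these call a real API.
    result = handle_recording(
        b"fake-audio",
        transcribe=lambda audio: " what is Flask? ",
        ask_gpt=lambda question: f"You asked: {question}",
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(result["answer"])
```

Passing the stages in as arguments keeps the flow testable offline and makes it easy to swap one stage without touching the others.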
- Flask
- Flask-CORS
- OpenAI
- Whisper
If you'd like to contribute to this project, please submit a pull request with your proposed changes. Be sure to provide a clear description of the changes and any relevant information.
This project is licensed under the MIT License. Please refer to the LICENSE file for more information.