Road Map
Kyle Weems edited this page Apr 29, 2024 · 6 revisions
- Emotional parsing to determine the emotional state of the puppet as it speaks.
- Multimodal personality to provide more complex personality traits and responsiveness.
- Documentation
- ComfyJS: Generic reading of chat messages, either treating them as generic input (à la the normal transcription mode) or parsing them for specific phrases to respond to.
- Ollama support as an alternative LLM source to OpenAI's ChatGPT.
- Speech synthesis: Bark (it's slower, so it will work best when the "speak chat message outloud" option is active).
- Additional speech synthesis options.
- Replacing react-mic with a better audio capture package (possibly use-whisper or something similar).
- More natural transcription timing: sensing pauses (and no longer sending audio to transcription during them, to prevent Whisper from hallucinating text) and updating the transcription at each pause so that whole sentences are captured at once.
- ChatGPT streaming mode support to decrease LLM response times.
- An executable program version of the application.
- More to come
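
The pause-sensing item above could be approached in several ways. As a rough illustration only, and not the project's planned implementation, the sketch below uses a simple energy threshold: each audio frame is reduced to its RMS energy, and a speech segment is closed once enough consecutive quiet frames have passed. The names `rms`, `segmentByPauses`, `threshold`, and `minPauseFrames` are hypothetical and chosen for this example; real values would need tuning against actual microphone input.

```typescript
// Hypothetical sketch: none of these names exist in the project.
interface Segment {
  start: number; // index of the first voiced frame
  end: number;   // index one past the last voiced frame
}

// Root-mean-square energy of one frame of PCM samples in [-1, 1].
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < samples.length; i++) {
    sum += samples[i] * samples[i];
  }
  return Math.sqrt(sum / samples.length);
}

// Split a sequence of per-frame energies into speech segments.
// A segment closes after `minPauseFrames` consecutive frames below `threshold`.
function segmentByPauses(
  energies: number[],
  threshold: number,
  minPauseFrames: number
): Segment[] {
  const segments: Segment[] = [];
  let start = -1;      // -1 means we are not currently inside speech
  let lastVoiced = -1; // most recent frame at or above the threshold
  let quietRun = 0;    // consecutive quiet frames since lastVoiced

  energies.forEach((energy, i) => {
    if (energy >= threshold) {
      if (start === -1) start = i;
      lastVoiced = i;
      quietRun = 0;
    } else if (start !== -1) {
      quietRun += 1;
      if (quietRun >= minPauseFrames) {
        // Pause detected: this span could now be sent for transcription.
        segments.push({ start, end: lastVoiced + 1 });
        start = -1;
        quietRun = 0;
      }
    }
  });

  // Flush speech still open at the end of the input. A live capture loop
  // would instead keep waiting for the next pause.
  if (start !== -1) segments.push({ start, end: lastVoiced + 1 });
  return segments;
}
```

Only the frames inside each returned segment would be passed on for transcription, so stretches of silence are never sent. Brief dips shorter than `minPauseFrames` stay inside the current segment, which keeps a sentence from being split on a short hesitation.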