Road Map

Kyle Weems edited this page Apr 29, 2024 · 6 revisions

Sock Road Map

Just Down The Road

  • Emotional parsing to determine the puppet's emotional state as it speaks.
  • Multimodal personality to provide more complex personality traits and responsiveness.
  • Documentation
  • ComfyJS: Generic reading of chat messages, either treating them as generic intake (à la the normal transcription mode) or parsing them for specific phrases to respond to.
  • Ollama support as an alternative LLM source to OpenAI's ChatGPT.
  • Speech synthesis: Bark (it's slower, so it will work best when the "speak chat message outloud" option is active).
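
The two chat-handling modes in the ComfyJS item could be separated by a small routing function. The following is only a sketch of that idea: `ChatRoute`, `routeChatMessage`, and the trigger-phrase list are hypothetical names that do not come from Sock's codebase, and the case-insensitive substring match is an assumption about what "specific phrases" might mean.

```typescript
// Hypothetical types and names for illustration; not part of Sock or ComfyJS.
type ChatRoute =
  | { kind: "phrase"; phrase: string; message: string }
  | { kind: "generic"; message: string };

// Decide whether a chat message matches a configured trigger phrase
// or should be handled as generic intake.
function routeChatMessage(message: string, triggerPhrases: string[]): ChatRoute {
  const normalized = message.toLowerCase();
  for (const phrase of triggerPhrases) {
    // Assumption: a case-insensitive substring match is enough.
    if (normalized.includes(phrase.toLowerCase())) {
      return { kind: "phrase", phrase, message };
    }
  }
  // No trigger phrase found: treat the message like transcribed speech.
  return { kind: "generic", message };
}
```

A function like this would presumably be called from the chat handler that ComfyJS exposes, with the result deciding whether the puppet gives a targeted response or passes the message along as ordinary input.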

Enhancements To Look At

  • Additional speech synthesis options.
  • Replacing react-mic with a better audio capture package (possibly use-whisper or something similar).
  • More natural transcription timing: sensing pauses so that audio stops being sent for transcription (which prevents Whisper from hallucinating text), and updating the transcription at each pause so that whole sentences are captured at once.
  • ChatGPT streaming mode support to decrease LLM response times.
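
One possible way to sense pauses, as described in the transcription-timing item, is to track how long the incoming audio has stayed below a loudness threshold. The sketch below assumes audio arrives as frames of PCM samples in the range -1 to 1. `rms`, `createPauseDetector`, and the default threshold and duration are all hypothetical values chosen for illustration, not settings taken from Sock.

```typescript
// Root-mean-square level of one frame of PCM samples (assumed range -1..1).
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < frame.length; i++) {
    sum += frame[i] * frame[i];
  }
  return Math.sqrt(sum / frame.length);
}

// Hypothetical pause detector: threshold and pauseMs are illustrative defaults.
function createPauseDetector(threshold = 0.01, pauseMs = 700) {
  let silentMs = 0; // how long the input has been continuously quiet

  return {
    // Feed one frame; returns true only at the moment a pause is first detected.
    push(frame: Float32Array, frameMs: number): boolean {
      if (rms(frame) >= threshold) {
        silentMs = 0; // speech resumed, so reset the silence timer
        return false;
      }
      const before = silentMs;
      silentMs += frameMs;
      return before < pauseMs && silentMs >= pauseMs;
    },
    // True while the current silence has lasted at least pauseMs.
    isPaused(): boolean {
      return silentMs >= pauseMs;
    },
  };
}
```

The moment `push` returns true would be a natural point to send the buffered sentence for transcription, and `isPaused` could gate further uploads until speech resumes.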

Aspirations Farther Out

  • An executable program version of the application.
  • More to come