FIRSTinMI/live-captions

Live Captions

This application hosts a web page with captions generated by either the Google Speech-to-Text API or April-ASR from an input stream on the local machine. To add captions to your stream, add the URL http://localhost:3000/ as a browser input. Once the application is running, captions are sent to that browser input over a websocket.

Multiple inputs can be added, and each can be displayed in a different color. Each input stream has an adjustable threshold, so if you hold the microphone away from your mouth to talk but forget to mute it, you can avoid having that conversation broadcast on the screen. The application also stops streaming to the Google API after about a minute of silence: every minute of API use costs 1.6 cents per stream, so this reduces the cost when captions aren't needed. It also includes a configurable profanity filter in case the transcription API mishears what someone said.
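The threshold and silence timeout described above can be pictured with a small sketch. This is not the application's actual code: the `StreamGate` class, its field names, the 0.0 to 1.0 level scale, and the exact 60-second timeout are all assumptions made for illustration. Only the 1.6 cents per minute figure comes from the text above.

```python
from dataclasses import dataclass

# Figure quoted in this README: cost per minute of API use, per stream.
COST_PER_MINUTE_CENTS = 1.6


@dataclass
class StreamGate:
    """Hypothetical gate deciding whether audio is sent to the API."""

    threshold: float                 # assumed scale: 0.0 (silent) to 1.0 (loud)
    silence_timeout_s: float = 60.0  # assumed value for "about a minute"
    streaming: bool = False
    silent_for_s: float = 0.0

    def update(self, level: float, chunk_s: float) -> bool:
        """Feed one audio chunk's level; return True if it should be streamed."""
        if level >= self.threshold:
            # Loud enough to count as speech: open the stream, reset the timer.
            self.streaming = True
            self.silent_for_s = 0.0
        elif self.streaming:
            # Quiet chunk while streaming: count it towards the timeout.
            self.silent_for_s += chunk_s
            if self.silent_for_s >= self.silence_timeout_s:
                self.streaming = False
        return self.streaming


def estimated_cost_cents(streamed_seconds: float, streams: int = 1) -> float:
    """Rough cost estimate from the per-minute figure above."""
    return streamed_seconds / 60.0 * COST_PER_MINUTE_CENTS * streams
```

Under these assumptions, a microphone held away from the mouth stays below `threshold` and never opens the stream, and ten minutes of streaming on one input comes to roughly 16 cents.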

Usage

  1. Download the latest release
  2. Run the application
  3. Go to the settings page http://localhost:3000/settings.html
  4. Enter your Google API key in the server tab
  5. Create inputs for your microphones in the transcription tab
  6. Click apply to restart the server with the new settings
  7. Adjust the input thresholds to suit your needs
  8. Create a browser input in vMix pointing to http://localhost:3000/ and set it as the top overlay (4)
  9. ?
  10. Success
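If the browser input in vMix stays blank, it can help to confirm that the application is actually serving its pages. The following is a minimal sketch using only the Python standard library. The helper name `page_is_up` is hypothetical, and the only URLs it is meant for are the two given in the steps above.

```python
import urllib.error
import urllib.request


def page_is_up(url: str, timeout_s: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `page_is_up("http://localhost:3000/")` and `page_is_up("http://localhost:3000/settings.html")` while the application is running should both return `True`; a `False` suggests the application is not running or is listening elsewhere.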

Local Engine

If you happen to be at a venue with a poor internet connection, you can use the April engine. Its recognition is not as good as Google's, but at least it will work consistently. The April engine is currently in beta.

  1. Make sure Python is installed
  2. Open PowerShell and run this command: pip install april_asr websockets psutil
  3. Select the April engine on the transcription tab of settings and click apply
  4. Wait for the software to download the model and script
  5. Win
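Before selecting the April engine, you may want to confirm that the three packages from the pip command above are visible to your Python installation. This is a small sketch, not part of the application; the helper name `missing_modules` is hypothetical, and it assumes each package is imported under the same name it is installed with.

```python
import importlib.util

# Packages named in the pip command above.
REQUIRED_MODULES = ["april_asr", "websockets", "psutil"]


def missing_modules(names: list[str]) -> list[str]:
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All April engine dependencies were found.")
```

If anything is reported missing, rerunning the pip command with the same Python installation that the application uses is the first thing to try.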