- Run `audioWhisper\audioWhisper.exe --devices true` (or `get-device-list.bat`) and note the index of the audio device (the number in `[*]` at the end).
  - The input device is for the speech recognition and the output device is for the TTS (Text to Speech).
  - The list is divided into input and output devices.
  - You might need to scroll the list, depending on the number of devices you have.
- Run `audioWhisper\audioWhisper.exe`. By default, it tries to find your default microphone. Otherwise, add `--device_index *` to the run command.
  Set `--device_out_index *` to the index of the output device if you want to use TTS with a device other than your default output device.
  In both flags, `*` is the device index found in step 1. Find more command-line flags here.
- If the websocket option is enabled, you can control the Whisper task (translate or transcribe) as well as the textual translation options while the AI is running.
  To do so, open the `websocket_clients/websocket-remote/` folder and open the `index.html` file there.
  If the AI is running on a secondary PC, open the HTML file with that PC's IP as a parameter, like this:
  `index.html?ws_server=ws://127.0.0.1:5000`
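Steps 1 and 2 above can be sketched in Python as follows. This is only an illustration: the sample device list is made up (the real output of `--devices true` may be laid out differently, beyond ending each entry with the index in brackets), and the helper names `find_device_index` and `build_command` are hypothetical. Only the `--device_index` and `--device_out_index` flags come from the steps above.

```python
import re
from typing import List, Optional

# Hypothetical sample of a device list; the real output format is an assumption.
SAMPLE_LIST = """\
Input Devices:
Microphone (USB Audio Device) [1]
Headset Microphone (Wireless) [4]
Output Devices:
Speakers (Realtek Audio) [7]
"""


def find_device_index(listing: str, name_part: str) -> Optional[int]:
    """Return the trailing [n] index of the first line containing name_part."""
    for line in listing.splitlines():
        match = re.search(r"\[(\d+)\]\s*$", line)
        if match and name_part.lower() in line.lower():
            return int(match.group(1))
    return None


def build_command(device_index: Optional[int] = None,
                  device_out_index: Optional[int] = None) -> List[str]:
    """Build the run command as an argument list; omitted flags fall back to defaults."""
    cmd = [r"audioWhisper\audioWhisper.exe"]
    if device_index is not None:
        cmd += ["--device_index", str(device_index)]
    if device_out_index is not None:
        cmd += ["--device_out_index", str(device_out_index)]
    return cmd


mic = find_device_index(SAMPLE_LIST, "headset")
speakers = find_device_index(SAMPLE_LIST, "speakers")
print(" ".join(build_command(mic, speakers)))
```

The resulting list could be passed to `subprocess.run` on the machine where audioWhisper is installed. Leaving both arguments out corresponds to running with the default microphone and default output device.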
- Run the script with the `--osc_ip 127.0.0.1` parameter. This way, it automatically writes the recognized text into the in-game chat-box.
  Example:
  `audioWhisper\audioWhisper.exe --model medium --task transcribe --energy 300 --osc_ip 127.0.0.1 --phrase_time_limit 9`
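To give an idea of what writing to the chat-box over OSC involves, here is a minimal sketch of such a message built by hand. It is not audioWhisper's own code. It assumes the game is VRChat, which accepts chat-box text at the OSC address `/chatbox/input` on UDP port 9000 by default; the function names are hypothetical.

```python
import socket


def osc_string(value: str) -> bytes:
    """Encode an OSC string: bytes, null-terminated, padded to a multiple of 4."""
    data = value.encode("utf-8") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def chatbox_message(text: str) -> bytes:
    """Build an OSC message with one string and one boolean True argument.

    The type tag string ",sT" declares the string followed by True;
    True carries no payload bytes of its own.
    """
    # Assumed address: VRChat's chat-box input endpoint.
    return osc_string("/chatbox/input") + osc_string(",sT") + osc_string(text)


def send_chatbox(text: str, ip: str = "127.0.0.1", port: int = 9000) -> None:
    """Send the message over UDP; port 9000 is an assumed default."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(chatbox_message(text), (ip, port))
```

Calling `send_chatbox("Hello")` while the game is running with OSC enabled should make the text appear in the chat-box, under the assumptions above. With `--osc_ip` set, audioWhisper takes care of this for you, so the sketch is only useful for testing the connection.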
- Run the script with the `--websocket_ip 127.0.0.1` parameter (127.0.0.1 if you are running everything on the same machine), and set a `--phrase_time_limit` if you do not expect many pauses that the configured `--energy` and `--pause` values could recognize.
  Example:
  `audioWhisper\audioWhisper.exe --model medium --task translate --device_index 4 --energy 300 --phrase_time_limit 15 --websocket_ip 127.0.0.1`
- Find a streaming overlay website in the `websocket_clients` folder. (So far, only `streaming-overlay-01` is optimized as an overlay with a transparent background.)
- Add the HTML file to your streaming application, with additional arguments if needed. See [Websocket Clients] for all possible arguments.
  For example:
  `websocket_clients/streaming-overlay-01/index.html?no_scroll=1&no_loader=1&bottom_align=1&auto_rm_message=15`
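If you assemble these overlay addresses often, a small helper can build the query string for you. The sketch below uses only Python's standard library. The function name `overlay_url` is hypothetical, and the parameter names are the ones shown in this document, so check them against the Websocket Clients documentation before relying on them.

```python
from urllib.parse import urlencode


def overlay_url(html_file: str, **params) -> str:
    """Append the given arguments to the HTML file as a query string."""
    # safe=":/" keeps values such as ws://host:port readable instead of percent-encoding them.
    query = urlencode(params, safe=":/")
    return f"{html_file}?{query}" if query else html_file


print(overlay_url(
    "websocket_clients/streaming-overlay-01/index.html",
    no_scroll=1, no_loader=1, bottom_align=1, auto_rm_message=15,
))
```

The same helper works for the remote-control page, for example `overlay_url("index.html", ws_server="ws://127.0.0.1:5000")`.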
- Run the application listening on the audio device that carries the VRChat sound.
- Add the overlay in the Desktop+ Beta with the embedded browser, using `index.html?no_scroll=1&auto_hide_message=25`.
- Set the browser to allow transparency.
- Attach the browser to your VR headset.

Voilà, you now have live-translated subtitles in VR of other people speaking (or of videos playing), which automatically disappear after 25 seconds.