- Run `audioWhisper\audioWhisper.exe --devices true` (or `get-device-list.bat`) and note the index of the audio device (the number in `[*]` at the end).
  - The input device is used for speech recognition and the output device for TTS (Text to Speech).
  - The list is divided into input and output devices.
  - You might need to scroll the list, depending on the number of devices you have.
- Run `audioWhisper\audioWhisper.exe`. By default, it tries to find your default microphone. Otherwise, add `--device_index *` to the run command.
  To use TTS with a device other than your default output device, set `--device_out_index *` to the index of that output device.
  Here `*` is the device index found in step 1. Find more command-line flags here.
- If the websocket option is enabled, you can control the Whisper task (translate or transcribe) as well as the text translation options while the AI is running.
  To do this, open the `websocket_clients/websocket-remote/` folder and open the `index.html` file there.
  If the AI is running on a secondary PC, open the HTML file with that PC's IP as a parameter, like this: `index.html?ws_server=ws://127.0.0.1:5000`
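
The index lookup from the device list can also be scripted. The following is a minimal Python sketch, assuming each device line ends with its index in square brackets as described above. The sample listing, the device names, and the helper `find_device_index` are hypothetical, and the real output of `--devices true` may be laid out differently.

```python
import re
from typing import Optional

# Hypothetical sample of the device listing; the real output may look different.
SAMPLE_LISTING = """\
Input devices:
Microphone (USB Audio) [1]
Stereo Mix (Realtek Audio) [4]
Output devices:
Speakers (Realtek Audio) [7]
"""

# A device name followed by an index in square brackets at the end of the line.
DEVICE_LINE = re.compile(r"^(?P<name>.+?)\s*\[(?P<index>\d+)\]\s*$")


def find_device_index(listing: str, name_part: str) -> Optional[int]:
    """Return the index of the first device whose name contains name_part."""
    for line in listing.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match and name_part.lower() in match.group("name").lower():
            return int(match.group("index"))
    return None


index = find_device_index(SAMPLE_LISTING, "microphone")
if index is not None:
    # --device_index is the flag described in the steps above.
    print(f"audioWhisper\\audioWhisper.exe --device_index {index}")
```

The same helper could be used to look up the output device for `--device_out_index`.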
- Run the script with the `--osc_ip 127.0.0.1` parameter. This way, it automatically writes the recognized text into the in-game chat box.
  Example: `audioWhisper\audioWhisper.exe --model medium --task transcribe --energy 300 --osc_ip 127.0.0.1 --phrase_time_limit 9`
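
If you start the application from a launcher script, it can help to keep the flags in one place. The sketch below assembles the example command above from a dictionary; `build_command` is a hypothetical helper, not part of the application, and it only uses the flags shown in the example.

```python
from typing import Dict, List, Union

EXECUTABLE = r"audioWhisper\audioWhisper.exe"


def build_command(options: Dict[str, Union[str, int]]) -> List[str]:
    """Turn {"model": "medium"} into [EXECUTABLE, "--model", "medium"]."""
    command = [EXECUTABLE]
    for flag, value in options.items():
        command += [f"--{flag}", str(value)]
    return command


osc_command = build_command({
    "model": "medium",
    "task": "transcribe",
    "energy": 300,
    "osc_ip": "127.0.0.1",
    "phrase_time_limit": 9,
})
print(" ".join(osc_command))
# To launch it from Python: subprocess.run(osc_command, check=True)
```

Passing the command as a list to `subprocess.run` avoids quoting problems if a value ever contains spaces.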
- Run the script with the `--websocket_ip 127.0.0.1` parameter (use `127.0.0.1` if you are running everything on the same machine), and set a `--phrase_time_limit` if you expect few pauses that the configured `--energy` and `--pause` values could detect.
  Example: `audioWhisper\audioWhisper.exe --model medium --task translate --device_index 4 --energy 300 --phrase_time_limit 15 --websocket_ip 127.0.0.1`
- Find a streaming overlay website in the `websocket_clients` folder. (So far, only `streaming-overlay-01` is optimized as an overlay with a transparent background.)
- Add the HTML file to your streaming application, with additional arguments if needed. See [Websocket Clients] for all possible arguments.
  For example: `websocket_clients/streaming-overlay-01/index.html?no_scroll=1&no_loader=1&bottom_align=1&auto_rm_message=15`
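
The overlay arguments are ordinary URL query parameters, so the address can be generated rather than typed by hand. This is a minimal sketch using Python's standard `urllib.parse.urlencode`; `overlay_url` is a hypothetical helper, and the parameter names are the ones from the example above.

```python
from urllib.parse import urlencode

OVERLAY_PAGE = "websocket_clients/streaming-overlay-01/index.html"


def overlay_url(page: str, **params) -> str:
    """Append the given overlay arguments to the page as a query string."""
    if not params:
        return page
    # safe=":/" keeps values such as ws://host:port readable in the result.
    return f"{page}?{urlencode(params, safe=':/')}"


url = overlay_url(
    OVERLAY_PAGE,
    no_scroll=1,
    no_loader=1,
    bottom_align=1,
    auto_rm_message=15,
)
print(url)
```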
- Run the application listening on the audio device that carries the VRChat sound.
- Install Desktop+ (or the standalone version without Steam: https://github.com/elvissteinjr/DesktopPlus/) and install the embedded Browser DLC.
- Add the Browser Overlay with a URL like `file:///E:/AI/Whispering-Tiger/websocket_clients/streaming-overlay-02/index.html?ws_server=ws://127.0.0.1:5001&auto_hide_message=25&no_scroll=1&no_loader=1`, adjusted for your path, the overlay type you want, and its settings.
- Set the Browser to allow transparency.
- Attach the Browser to your VR headset.

Voilà, you now have live-translated subtitles in VR for other people speaking (or videos playing), which automatically disappear after 25 seconds.
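
The `file:///` address for the Browser Overlay depends on where the project is installed. As a rough sketch, Python's `pathlib` can convert a Windows path into that form; the install path and query string below are the ones from the example above, and `overlay_file_url` is a hypothetical helper.

```python
from pathlib import PureWindowsPath


def overlay_file_url(html_path: str, query: str = "") -> str:
    """Convert an absolute Windows path to a file:/// URL and append a query."""
    base = PureWindowsPath(html_path).as_uri()
    return f"{base}?{query}" if query else base


url = overlay_file_url(
    "E:/AI/Whispering-Tiger/websocket_clients/streaming-overlay-02/index.html",
    "ws_server=ws://127.0.0.1:5001&auto_hide_message=25&no_scroll=1&no_loader=1",
)
print(url)
```

`as_uri()` also percent-encodes characters such as spaces, which is useful if the install path contains them.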