Demo: Kws mode
Pocketsphinx supports a keyword spotting (kws) mode in which you specify a list of keywords to look for. The advantage of this mode is that each keyword has its own detection threshold, so keywords can be detected within continuous speech. All other modes try to match whatever is spoken against the words in the grammar, even when the spoken words are not part of the grammar. To find out what a template kwlist looks like, click here.
To test kws mode, use the launch file kws.launch with the following command:
roslaunch pocketsphinx kws.launch dict:=/home/<your_user_name>/catkin_ws/src/pocketsphinx/demo/voice_cmd.dic kws:=/home/<your_user_name>/catkin_ws/src/pocketsphinx/demo/voice_cmd.kwlist input:=/home/<your_user_name>/catkin_ws/src/pocketsphinx/demo/goforward.raw
This creates a topic, kws_data, on which the detected keywords are published as messages.
Note that roslaunch starts its own ROS master if one is not already running. If you need to run multiple nodes or ROS clients separately, keep roscore running in a separate terminal window.
The output of this launch file can be viewed using the following command in a new terminal window:
rostopic echo /kws_data
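To react to detections from your own node rather than just echoing them, a small subscriber can be sketched as below. This is only an illustration under assumptions: it assumes the kws_data topic carries std_msgs/String messages (you can check with rostopic type /kws_data), and the node name kws_listener and the helper normalize_keyword are hypothetical names chosen for this example.

```python
def normalize_keyword(text):
    """Collapse repeated whitespace and lowercase a detected keyphrase."""
    return " ".join(text.split()).lower()


def on_keyword(msg):
    """Callback: log the keyphrase carried in msg.data."""
    import rospy  # imported lazily so normalize_keyword works without ROS

    rospy.loginfo("Detected keyword: %s", normalize_keyword(msg.data))


def listen():
    """Subscribe to kws_data and block until the node is shut down."""
    import rospy
    from std_msgs.msg import String  # assumed message type of kws_data

    rospy.init_node("kws_listener", anonymous=True)  # hypothetical node name
    rospy.Subscriber("kws_data", String, on_keyword)
    rospy.spin()
```

Call listen() from a script while kws.launch is running in another terminal; each detection is then logged by the callback.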
To use your system's microphone instead of an audio file, remove the input argument from the above command:
roslaunch pocketsphinx kws.launch dict:=/home/<your_user_name>/catkin_ws/src/pocketsphinx/demo/voice_cmd.dic kws:=/home/<your_user_name>/catkin_ws/src/pocketsphinx/demo/voice_cmd.kwlist
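Since both commands differ only in the input argument and repeat the same long paths, it can be convenient to assemble them programmatically. The following is a minimal sketch, not part of the package: the function name build_kws_command is hypothetical, and it assumes the dictionary and keyword list keep the demo file names voice_cmd.dic and voice_cmd.kwlist used above.

```python
import os
import shlex


def build_kws_command(demo_dir, input_file=None):
    """Return the roslaunch argument list for kws.launch.

    demo_dir   -- directory holding voice_cmd.dic and voice_cmd.kwlist
    input_file -- audio file name inside demo_dir; None means microphone
    """
    demo_dir = os.path.abspath(os.path.expanduser(demo_dir))
    cmd = [
        "roslaunch", "pocketsphinx", "kws.launch",
        "dict:=" + os.path.join(demo_dir, "voice_cmd.dic"),
        "kws:=" + os.path.join(demo_dir, "voice_cmd.kwlist"),
    ]
    if input_file is not None:
        # Only pass input:= when reading from a file.
        cmd.append("input:=" + os.path.join(demo_dir, input_file))
    return cmd


if __name__ == "__main__":
    demo = "~/catkin_ws/src/pocketsphinx/demo"
    # Print both variants so they can be pasted into a terminal.
    print(" ".join(shlex.quote(a) for a in build_kws_command(demo, "goforward.raw")))
    print(" ".join(shlex.quote(a) for a in build_kws_command(demo)))
```

The returned list can also be handed to subprocess.run directly, which avoids shell quoting issues if a path contains spaces.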
The following arguments can be provided to this command:
dict
Absolute path of the dictionary. The dictionary contains the words used in the keyphrases, along with their pronunciations in the phone set of the acoustic model.
kwlist
A file containing a list of keyphrases along with their threshold values (passed as kws:= in the commands above). A typical line looks like this:
Hello World /1e-12/
keyphrase
If you only need one keyphrase, you can pass it with this argument together with its threshold (see the next argument). Use either the kwlist argument or the combination of keyphrase and threshold.
threshold
The threshold for the keyphrase given in the keyphrase argument.
input
Absolute path of an audio file to use as input. Omit this argument to use the system's microphone instead.
hmm
If the default acoustic model is missing or not suited to your needs, specify a different one by passing its absolute path as the value of this argument. You can find the existing ones here.
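The kwlist line format described above (a keyphrase followed by its threshold between slashes, as in Hello World /1e-12/) is easy to check before launching. The sketch below is a hypothetical helper, not part of pocketsphinx: the name parse_kwlist and the regular expression are assumptions based only on the example line shown in this document.

```python
import re

# A keyphrase, whitespace, then a threshold wrapped in slashes.
KWLIST_LINE = re.compile(r"^(?P<phrase>.+?)\s+/(?P<threshold>[^/]+)/\s*$")


def parse_kwlist(text):
    """Return a list of (keyphrase, threshold) tuples from kwlist text."""
    entries = []
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        match = KWLIST_LINE.match(line)
        if match is None:
            raise ValueError(
                "line %d is not 'keyphrase /threshold/': %r" % (number, line)
            )
        # float() raises ValueError if the threshold is not a number.
        entries.append((match.group("phrase"), float(match.group("threshold"))))
    return entries
```

Running parse_kwlist over the contents of a keyword list file before passing it to kws.launch surfaces malformed lines early, with the offending line number in the error message.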