
Speech recognition not working [Keyword Spotting] #170

Closed
mojosoeun opened this issue Mar 20, 2016 · 29 comments

@mojosoeun
Contributor

Hello.
From 9am to 4pm KST, voice recognition didn't work. But after that time it began working again. I'm trying to figure out why but I can't solve it.

@evancohen
Owner

This is a known issue. My current theory is that, because of the high volume of mirrors, we are collectively exhausting Electron's shared Google speech key.

I am currently investigating solutions (and am open to suggestions)

@evancohen evancohen self-assigned this Mar 23, 2016
@evancohen evancohen changed the title Speech recognition not working Speech recognition not working Mar 24, 2016
@shekit

shekit commented Apr 12, 2016

Any updated theory on this? I'm facing a very random issue: my code was working earlier in the day and then suddenly stopped working. No voice detection, nada.

Here's another issue I opened on annyang that details all my attempts https://github.com/TalAter/annyang/issues/188

@evancohen
Owner

evancohen commented Apr 18, 2016

You are correct in your assumption that the issue is with key utilization (there have been many discussions about this on the Gitter chat, which I suggest you check out).

I'm looking at alternatives (BlueMix, Microsoft, etc) as well as investigating offline Keyword Spotting to reduce quota usage.

@evancohen
Owner

evancohen commented Apr 20, 2016

Another update: I've got keyword spotting functioning in the evancohen/keyword-spotting branch. There are a number of issues with this implementation, namely poor performance and some compatibility issues with certain microphone setups.

@shrimp69

shrimp69 commented Apr 26, 2016

I just set it up today and it seems I've already made too many requests. I have my own Google API keys, but in a matter of 2-3 minutes I made over 500 requests, and now it seems to be down for me.

How do I add this branch to my existing git folder (on the Raspberry Pi)?

@skydazz

skydazz commented Apr 26, 2016

I am still getting "Google Speech Recognizer is down :(" when I plug any sort of microphone into it. I have my own speech keys. May I suggest that it's a driver issue? Is it set to only be compatible with a list of mics? (PS: it said "Say 'what can I say' to see a list of commands" when I unplugged the microphone.)

@skydazz

skydazz commented Apr 26, 2016

Also, I get the "[1444:0426/103700:ERROR:logging.h(813)] Failed to call method: org.freedesktop.NetworkManager.GetDevices: object_path= /org/freedesktop/NetworkManager: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.NetworkManager was not provided by any .service files" message in the terminal when I start it.

@skydazz

skydazz commented Apr 27, 2016

I have started from scratch. I downgraded from Jessie to Wheezy and am following the documentation exactly as printed (except config.js). I am also using a USB camera as a mic, as the documentation suggests. Will post results soon.

@skydazz

skydazz commented Apr 27, 2016

OK, my issue now, still with sound, is getting the USB camera's mic selected as the input device; I can't get any USB sound device to work. I have tried a Turtle Beach PX22 controller, a USB sound card, and a USB camera.

@evancohen
Owner

@skydazz have you tried following the directions in the troubleshooting section of the documentation? You may also want to look at #20, an older thread on this issue that may help you find an alternative solution.

@Sachin1968

@skydazz Were you able to resolve the issue you had with "Failed to call method: org.freedesktop.NetworkManager.GetDevices: object_path= /org/freedesktop/NetworkManager: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.NetworkManager was not provided by any .service files"?

I have the same issue and can't figure out how to resolve it. Thanks.

@evancohen
Owner

@Sachin1968 that error is unrelated to this thread and is harmless; you can safely ignore it.
You can find more info in this Chromium forum post.

@evancohen evancohen changed the title Speech recognition not working Speech recognition not working [Keyword Spotting] Apr 29, 2016
@evancohen
Owner

evancohen commented May 2, 2016

So, another update for you all :)
Keyword spotting officially works in the evancohen/keyword-spotting branch. Unfortunately the Pi is not quite powerful enough to process everything in real time. Because of that I've added a clap detector to the same branch: all you have to do is clap (a configurable number of times) and the mirror will start listening to you.

When using this branch on the Pi, there are a few things you need to know.
You'll have to install sox (it's a dependency for clap detection):

sudo apt-get install sox

You will also have to run npm install after switching to this branch because of the new dependencies. Make sure you update your config.js file to reflect the new properties in config.example.js!

Since this is all very new I haven't had the chance to test it extensively. I already anticipate issues with the clap detection microphone configuration; luckily, this is something you can tune yourself. In your config you can use the clap overrides object to change the following settings for clap detection:

overrides : {
    AUDIO_SOURCE: 'hw:1,0', // your microphone input; if you don't know it, see http://www.voxforge.org/home/docs/faq/faq/linux-how-to-determine-your-audio-cards-or-usb-mics-maximum-sampling-rate
    DETECTION_PERCENTAGE_START : '5%', // minimum noise percentage threshold necessary to start recording sound
    DETECTION_PERCENTAGE_END: '5%',  // minimum noise percentage threshold necessary to stop recording sound
    CLAP_AMPLITUDE_THRESHOLD: 0.7, // minimum amplitude threshold for a sound to be considered a clap
    CLAP_ENERGY_THRESHOLD: 0.3,  // maximum energy threshold for a sound to be considered a clap
    MAX_HISTORY_LENGTH: 10 // all claps are stored in history; this is its maximum length
}
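To make the two clap thresholds above more concrete, here's a minimal sketch of how they might be applied. This is illustrative only: the isClap function and the peak/energy values are hypothetical, not the branch's actual API. The intuition is that a clap is a short, loud transient, so it has a high peak amplitude but low total energy compared to sustained noise.

```javascript
// Hypothetical sketch of clap classification, assuming normalized audio
// measurements: peak amplitude in [0, 1] and a total-energy figure.
function isClap(peak, energy, cfg) {
  // High peak (loud) but low energy (short) suggests a clap rather than
  // sustained noise like speech or music.
  return peak >= cfg.CLAP_AMPLITUDE_THRESHOLD && energy <= cfg.CLAP_ENERGY_THRESHOLD;
}

const cfg = { CLAP_AMPLITUDE_THRESHOLD: 0.7, CLAP_ENERGY_THRESHOLD: 0.3 };
console.log(isClap(0.9, 0.1, cfg)); // loud, short burst -> true
console.log(isClap(0.4, 0.1, cfg)); // too quiet -> false
console.log(isClap(0.9, 0.5, cfg)); // loud but sustained -> false
```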

As always, if you have any questions you can post them here or ask on gitter.

When Commands aren't working
Since commands are intermittent in the dev branch, I've added a shim to annyang to "simulate" a request. This can be done in the dev console with the following:

annyang.simulate("what can I say");

@evancohen
Owner

@Keopss we'll get you sorted out in gitter :)

@joerod

joerod commented May 13, 2016

I had the same problem, using up my 50 speech API calls in about 10 minutes, so I'm happy to test the new "clap" feature.

@Keopss

Keopss commented May 18, 2016

Hi @evancohen! I don't know what's happening with my Raspberry Pi :(

I have installed smart-mirror-master and it works fine.

Then I installed sox and the keyword-spotting branch of smart-mirror, edited config.js, and added the overrides options, but there is no clap or speech detection.

@evancohen
Owner

The geniuses over at http://kitt.ai have created an offline keyword spotter that should work. To find out, I need your help training the keyword "smart mirror". Just follow these steps:

  1. Go to https://snowboy.kitt.ai/hotword/47
  2. Click "Record and download the model"
  3. Follow the instructions to train the model (be sure to click "save" at the end!)

I'll continue to keep this thread updated with my progress and should hopefully have a working prototype this weekend!

@Keopss

Keopss commented May 23, 2016

Hi! How's it going? Any progress?

@evancohen
Owner

I have a working implementation of keyword spotting. I'm currently trying to fix an issue on the Pi 3 that causes recognition to fail because of a native PulseAudio problem.

@trenkert

Hello, I've successfully installed smart-mirror and I am very impressed!

However, I also get "Google Speech Recognizer is down"...

How long will it take for you to implement the new solution into the main branch?

Just an idea:
Could you use Jasper with PocketSphinx (http://jasperproject.github.io/) to train a number of keywords that then either activate Google STT or Amazon Alexa? Or is Kitt.ai definitely better for keyword recognition?

@evancohen
Owner

I tried Jasper... it's quite resource intensive and painful to use as a dependency (building projects on top of it is great; building it into an existing project, not so much).

I also tried PocketSphinx (the same native recognition engine that Jasper uses) and it wasn't quick enough to recognize keywords without significant lag.

Snowboy (from the folks at kitt.ai) is super lightweight and very fast. Sure it requires a wrapper for their Python library, but that's not too difficult.

I actually have a working prototype with Snowboy on the kws branch (using the OSX binaries). The only problem on the Pi now is with PulseAudio, which is having issues on the Pi 3. Once I sort that out (and I think I have a fix) we'll be good to go.

It's been a long journey, with lots of painful dead ends, but I'm feeling really close!

tl;dr Snowboy is great, I should have something working really soon.

@trenkert

Cool! What's the problem with PulseAudio?

@evancohen
Owner

@trenkert it's an issue with Bluetooth that causes PulseAudio to crap out. Even after disabling Bluetooth in the config the issue persists (which makes me think there may be another root cause). I'm worried that the real cause is a conflict between dependencies of the mirror and the keyword spotter (but I haven't confirmed this yet, and I don't think it's the cause).

@chenguoguo

Sorry for coming into this late. Snowboy is a C++ library and doesn't have many dependencies. It will work as long as you can feed it linear PCM data sampled at 16 kHz, with 16 bits per sample and a single channel (mono). PyAudio is only used for demo purposes. If it turns out that PyAudio is the problem, we can turn to other alternatives for audio capturing.
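For anyone wiring this up from Node, here's a minimal sketch of the sample-format conversion described above: turning Float32 samples in [-1, 1] (the format Web Audio and many capture libraries produce) into 16-bit little-endian mono PCM. This is an illustrative helper, not part of Snowboy's API.

```javascript
// Convert an array of Float32 samples in [-1, 1] into a Buffer of
// 16-bit signed little-endian PCM, the format Snowboy expects.
function floatTo16BitPCM(samples) {
  const buf = Buffer.alloc(samples.length * 2); // 2 bytes per sample
  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    buf.writeInt16LE(Math.round(s < 0 ? s * 0x8000 : s * 0x7fff), i * 2);
  }
  return buf;
}

const pcm = floatTo16BitPCM([0, 0.5, -0.5, 1, -1]);
console.log(pcm.length); // 10 bytes: 5 samples * 2 bytes each
```

Note that this handles only the bit depth and channel count; resampling to 16 kHz would still need to happen upstream in the capture pipeline.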

In the Snowboy repo we are trying to add examples of using Snowboy in different programming languages. So far the examples use PortAudio or PyAudio, but if you look at the code (e.g., the C++ demo at https://github.com/Kitt-AI/snowboy/blob/master/examples/C%2B%2B/demo.cc), you can see that switching the audio capturing tool should be easy.

@evancohen, let me know if it turns out that PyAudio is the problem. We can look into other alternatives for audio capturing.

@evancohen
Owner

Hey @chenguoguo thanks for dropping in! Awesome to see you all so committed to your (super awesome) project. I managed to write a pretty hacky IPC between Node and your pre-packaged Snowboy binaries/Python wrapper. It's definitely not the ideal way to use Snowboy with Node, but I just wanted to see if I could get something that would work.

I don't think it would be too challenging to wrap the existing C++ code so it could be easily consumed via Node. I'll take a look at it this weekend if I get the chance 😄

For everyone else: I managed to coax PulseAudio into cooperating on my Pi, and everything seems to work super well! You can test it out by doing the following:

Using the kws branch on the Pi 2/3:
  1. First you'll want to check out the branch and update your config file to include the new config.kws object.
  2. Then you should train your own model, which will be most accurate when done on the Pi itself: https://snowboy.kitt.ai/hotword/47 (bonus points if you also help the Snowboy team train their model)
  3. Download your trained model and replace the smart_mirror.pmdl file in the root of the smart-mirror directory with it.
  4. Install the necessary dependencies: sudo apt-get install python-pyaudio python3-pyaudio sox
  5. Run the mirror! Say "smart mirror" and the mirror should start listening (there is no UI for this yet).

As always let me know if you have any issues over on gitter.

@chenguoguo

That's great @evancohen! @amir-s is also helping us work on the NodeJS module; see the issue here. He'll likely have something soon.

@trenkert

@evancohen I've experienced a similar issue. I would guess it has to do with pulseaudio-bluetooth. It works for me when I start PulseAudio manually again after login.

@ojrivera381

@evancohen Thanks. I rebuilt it and all seemed fine on my lab monitor in my office, but when I moved it to its permanent location speech stopped working. Also, how do I exit it and get to the main desktop with menus? Right now Alt+F4 closes the window, but I can't see any menus to go through the Pi settings, launch a terminal, etc. Thanks again.

@evancohen
Owner

@ojrivera381 is the Pi still connected to the same WiFi network? Have you exceeded your 50 query/day quota? The menu is missing because you have unclutter installed.
You can probably also press the Windows key on your keyboard (which opens the Raspbian equivalent of the start menu). You can also get to the terminal via the recycling bin on the desktop (hacky, I know).

If those two things look good, I would follow the instructions for troubleshooting in the docs.

Repository owner locked and limited conversation to collaborators Jan 25, 2017