Silence Detection #3

Closed
sammachin opened this issue May 16, 2016 · 4 comments

@sammachin

Awesome project guys, I'm the author of the original RaspberryPi Alexa project :)

I'm looking at integrating your code into AlexaPi. One thing I need to work out is silence detection; have you got any tips?

The flow would need to go Wake Word > Record Command > DETECT END OF COMMAND > Post to Alexa Voice Service.

@Timmmm

Timmmm commented May 16, 2016

The keyword to search for is VAD (Voice Activity Detection). There are quite a lot of implementations available; codecs often include one (e.g. Opus has one).

Here are a couple I found by Googling "github VAD":

https://github.com/mwv/vad

https://github.com/shriphani/Listener/blob/master/VAD.py
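To give a concrete sense of what these tools do, here is a minimal energy-threshold VAD sketch in Python (the simplest possible approach, and roughly what the first layer of Snowboy's own VAD does, as described further down). The 16 kHz / 30 ms framing and the threshold of 300 are arbitrary illustrative values, not parameters taken from either of the linked projects:

```python
import audioop  # stdlib helpers for raw 16-bit PCM audio

SAMPLE_RATE = 16000                       # assumed: 16 kHz, 16-bit mono PCM
FRAME_MS = 30                             # illustrative frame length
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2
ENERGY_THRESHOLD = 300                    # arbitrary; tune for your microphone

def frames(pcm, size=FRAME_BYTES):
    """Split a raw PCM buffer into fixed-size frames."""
    for i in range(0, len(pcm) - size + 1, size):
        yield pcm[i:i + size]

def is_speech(frame, threshold=ENERGY_THRESHOLD):
    """Crude VAD decision: a frame counts as speech if its RMS energy
    exceeds a fixed threshold."""
    return audioop.rms(frame, 2) > threshold
```

Real VADs (including the two linked above) are far more robust to background noise than a fixed energy threshold, but the interface is the same: audio frame in, speech/non-speech decision out.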

@chenguoguo
Collaborator

Great work on the RaspberryPi Alexa project, @sammachin!

Yes, one way is to try the third-party VAD tools that @Timmmm mentioned.

However, Snowboy does have an internal VAD. I just updated the internal code so that the RunDetection() function will return -2 if silence is detected. We will update the library in this repository as soon as possible. You may have to smooth the VAD output a little bit, e.g., only call it the end of a sentence after receiving several -2's in a row instead of just one -2.
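A minimal sketch of that smoothing, assuming each call to RunDetection() processes one short audio frame and that -2 means silence as described above (the 20-frame cutoff is an illustrative value, not something prescribed by Snowboy):

```python
END_SILENCE_FRAMES = 20  # illustrative: consecutive -2 results that count as "sentence over"

class SilenceSmoother(object):
    """Declare end-of-utterance only after several consecutive silence
    results, so a single stray -2 in the middle of speech is ignored."""

    def __init__(self, required=END_SILENCE_FRAMES):
        self.required = required
        self.silent = 0

    def update(self, result):
        """Feed one RunDetection() return value; returns True once
        silence has persisted long enough."""
        if result == -2:
            self.silent += 1
        else:
            self.silent = 0      # voice or a trigger resets the counter
        return self.silent >= self.required
```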

Another way is to use a stop hotword: e.g., you say a sentence like "Alexa, turn on the lights, thanks." and let Snowboy detect both "Alexa" and "thanks".

@xuchen
Collaborator

xuchen commented May 16, 2016

I updated the entire codebase with new VAD support and compiled binaries (v1.0.1). Now the return value of the SnowboyDetect.RunDetection() function (see snowboydecoder.py for detailed usage) indicates silence, voice, error, or a triggered hotword:

| Return value | Meaning |
| --- | --- |
| -2 | silence |
| -1 | error |
| 0 | voice |
| 1, 2, ... | index of the triggered hotword |

Snowboy's VAD model has two layers: the first layer is energy based, much like the examples linked above; the second layer is trained with neural networks to specifically recognize human voice.
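Putting the return codes above together with the flow from the original question (wake word, record command, detect end of command, post to AVS), here is a hedged sketch of a capture loop. Only the return values come from Snowboy; `detector` is assumed to be the SnowboyDetect object used in snowboydecoder.py, and `next_frame()` / `post_to_avs()` are hypothetical helpers for reading microphone audio and calling the Alexa Voice Service:

```python
SILENCE_FRAMES_TO_STOP = 15   # illustrative smoothing, as suggested above

def capture_command(detector, next_frame, post_to_avs):
    """Wait for a hotword, record until sustained silence, then post the audio."""
    recording = False
    command_audio = []
    silent = 0

    while True:
        frame = next_frame()                  # raw 16-bit PCM from the microphone
        result = detector.RunDetection(frame)

        if result == -1:                      # error: skip this frame
            continue
        if result >= 1 and not recording:     # hotword index: start recording the command
            recording = True
            command_audio, silent = [], 0
        elif recording:
            command_audio.append(frame)
            silent = silent + 1 if result == -2 else 0   # -2 = silence, 0 = voice
            if silent >= SILENCE_FRAMES_TO_STOP:         # end of command detected
                post_to_avs(b"".join(command_audio))
                recording = False
```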

@GemBro

GemBro commented May 17, 2016

Yes yes ... glad you guys are now talking together on this ...
