Voice detection detecting wrong phrases #29

Closed
berni2288 opened this issue Apr 11, 2014 · 24 comments

Comments

@berni2288

Hi,

I have a problem with the voice detection. I'm using a USB sound adapter (Daffodil US01) with a normal headset. When I record something with "arecord -r 48000 -d 600 -f S16_LE test.wav" and then play it back with "aplay test.wav", the recorded sound is perfectly clear, without any background noise.

When I run "python main.py", it does seem to get triggered when I say "Jasper". However, the recognition always reports that I'm saying "BUT", "BUT OF", "WHAT" and similar, and I never hear the beep. I've tried it a hundred times now. Is there something wrong with my setup?

Output:

===================
JASPER: OF OF
===================
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm_dmix.c:957:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
INFO: cmn.c(175): CMN: 27.03 -0.24  0.11 -0.21 -0.21 -1.67  0.22 -0.29 -0.40 -0.16  0.48 -0.41  0.35
INFO: ngram_search_fwdtree.c(1549):     1359 words recognized (13/fr)
INFO: ngram_search_fwdtree.c(1551):    33927 senones evaluated (333/fr)
INFO: ngram_search_fwdtree.c(1553):    20976 channels searched (205/fr), 1764 1st, 17447 last
INFO: ngram_search_fwdtree.c(1557):     1752 words for which last channels evaluated (17/fr)
INFO: ngram_search_fwdtree.c(1560):     1581 candidate words for entering last phone (15/fr)
INFO: ngram_search_fwdtree.c(1562): fwdtree 0.20 CPU 0.196 xRT
INFO: ngram_search_fwdtree.c(1565): fwdtree 0.21 wall 0.204 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 19 words
INFO: ngram_search_fwdflat.c(937):      649 words recognized (6/fr)
INFO: ngram_search_fwdflat.c(939):    35160 senones evaluated (345/fr)
INFO: ngram_search_fwdflat.c(941):    31614 channels searched (309/fr)
INFO: ngram_search_fwdflat.c(943):     1567 words searched (15/fr)
INFO: ngram_search_fwdflat.c(945):     1001 word transitions (9/fr)
INFO: ngram_search_fwdflat.c(948): fwdflat 0.15 CPU 0.147 xRT
INFO: ngram_search_fwdflat.c(951): fwdflat 0.15 wall 0.150 xRT
INFO: ngram_search.c(1266): lattice start node <s>.0 end node </s>.87
INFO: ngram_search.c(1294): Eliminated 0 nodes before end node
INFO: ngram_search.c(1399): Lattice has 55 nodes, 124 links
INFO: ps_lattice.c(1365): Normalizer P(O) = alpha(</s>:87:100) = -739532
INFO: ps_lattice.c(1403): Joint P(O,S) = -752904 P(S|O) = -13372
INFO: ngram_search.c(888): bestpath 0.01 CPU 0.010 xRT
INFO: ngram_search.c(891): bestpath 0.01 wall 0.014 xRT
===================
JASPER: IS OF
===================
@bsherertz

I'm having the same problem as well on my build with the same adapter. It could be the configuration but I haven't been able to fix it.

@ozett

ozett commented Apr 17, 2014

I've experienced similar misrecognitions. How do I log or debug this? Even though my plan is to experiment with my own modules and commands, how do I debug the words Jasper seems to 'understand'? I think it's more of a software problem than a hardware one...

@fritz-fritz

Jasper will only recognize commands that are in its WORDS list. That's probably why you are seeing the log above and why it looks like it doesn't recognize the words correctly.
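
To make the point about the WORDS list concrete, here is a minimal sketch, assuming each module exposes a WORDS list as described above. The module classes and the collect_vocabulary helper are made up for illustration and are not Jasper's actual code. If the recognizer is restricted to such a combined vocabulary, whatever it hears can only come back as some combination of these words.

```python
class TimeModule(object):
    """Hypothetical module; only the WORDS attribute matters here."""
    WORDS = ["TIME"]


class WeatherModule(object):
    """Hypothetical module; only the WORDS attribute matters here."""
    WORDS = ["WEATHER", "TODAY", "TOMORROW"]


def collect_vocabulary(modules, persona="JASPER"):
    """Return the sorted set of words the recognizer would be allowed to output."""
    vocabulary = {persona}
    for module in modules:
        # Modules without a WORDS attribute simply contribute nothing.
        vocabulary.update(word.upper() for word in getattr(module, "WORDS", []))
    return sorted(vocabulary)


print(collect_vocabulary([TimeModule, WeatherModule]))
```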

However, if Jasper isn't responding properly (e.g. not responding with beeps), you may have to correct the audio hardware settings for your setup. Jasper is currently hardcoded for a specific hardware address that may or may not apply to your setup. Chances are that if you are not using a Raspberry Pi you will have to change this.

Check out my fork that merges my work with that of @Exadrid

@berni2288
Author

Thanks for your useful reply, fritz-fritz. I will try what you suggested and report back on whether I was using the wrong hardware ID.

@ozett

ozett commented Apr 17, 2014

Another thing that would help is seeing what Jasper recognizes. On my default console screen (after reboot, not logged in), or logged in as user "pi", how do I follow the printouts?
Like this: ....
print "==================="
print "JASPER: " + result
print "==================="
...
I did not find any log to follow. What am I missing? It would help with debugging. Does anybody have a clue? Thanks...

@berni2288
Author

@ozett Terminate all running Python scripts and execute "python ~/jasper/client/main.py"; this will show you all the output.
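
If running in the foreground is not convenient, one option is to send each recognized phrase to a log file that can be followed with "tail -f". This is only a sketch: the logger name, the log path and the place where it would be called are assumptions, not part of Jasper.

```python
import logging


def make_transcript_logger(log_path):
    """Create a logger that appends each recognized phrase to log_path."""
    logger = logging.getLogger("jasper.transcripts")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    # Mirror the "JASPER: <result>" console output, with a timestamp in front.
    handler.setFormatter(logging.Formatter("%(asctime)s JASPER: %(message)s"))
    logger.addHandler(handler)
    return logger
```

It would be used next to the existing print statements, for example logger = make_transcript_logger("/tmp/jasper_transcripts.log") once at startup and logger.info(result) wherever the result is printed.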

@fritz-fritz Do you know in which files I can find the hardcoded hardware IDs? Thanks.

@fritz-fritz

@berni2288 For one, the file I linked you to, client/mic.py. There's also another one in client/main.py that is less important.

They look like this: os.system("aplay -D hw:0,0 beep_hi.wav"). I think what happens is that if the audio hardware is wrong it causes the whole method to crash, but I could be wrong. It was necessary to change it to get Jasper to work for me, though.
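
As a rough sketch of how such a call could take the device as a parameter instead of hardcoding it: the helper names below are invented for illustration and do not reproduce Jasper's code. Using subprocess instead of os.system also makes a failed playback visible to the caller.

```python
import subprocess

# "hw:0,0" is the value quoted in the comment above; other setups will differ.
DEFAULT_DEVICE = "hw:0,0"


def build_aplay_command(wav_path, device=DEFAULT_DEVICE):
    """Return the aplay argument list for playing wav_path on an ALSA device."""
    return ["aplay", "-D", device, wav_path]


def play(wav_path, device=DEFAULT_DEVICE):
    """Play a WAV file; return True on success, False otherwise.

    False covers both a missing aplay binary and a non-zero exit status,
    e.g. when the device name does not exist on this machine.
    """
    try:
        subprocess.check_call(build_aplay_command(wav_path, device))
        return True
    except (OSError, subprocess.CalledProcessError):
        return False
```

With a USB adapter on card 1, the call might look like play("beep_hi.wav", device="plughw:1,0"); the right device name can be found with "aplay -l".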

@blakejakopovic

I'm having similar problems after testing with different USB adapters (desktop mic, webcam). I've tested recording (using arecord/aplay), and it is fairly clear and has minimal noise/static. I've also tested using a powered USB hub, to no effect. The only command I can get to work consistently is 'Jasper', and occasionally 'time'. However, the 'time' phrase is often jumbled into a mix of random words (e.g. 'ITS TIME', 'TIME ON' or 'TIME TIME').

@mroswald

Same problem here.

pocketsphinx_continuous returns no results but reports "Warning: Could not find Capture element". Any ideas? Maybe arecord uses a different device for recording than Jasper or pocketsphinx?

@ghost

ghost commented May 15, 2014

I had a similar problem. When you run /jasper/client/main.py does the terminal keep scrolling even when no speech is present? I found that I had to say "Jasper" two or three times for him to start listening. I eventually figured out that my microphone was set too high in alsamixer. To fix it:

Open terminal > alsamixer > F4 (Capture) > Play around with the levels until the terminal stops scrolling and you see the text "No disturbance detected" > Test voice commands

My level is set to 0. If I use arecord -D plughw:0,0 test.wav and then aplay -D hw:1,0 test.wav I can hear myself quite well. I should also note that I am using the suggested microphone.
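
To put a number on the input level rather than judging by ear, a recording made with arecord can be measured. The following is a sketch for Python 3 that assumes a 16-bit PCM file (arecord -f S16_LE); the function name is made up, and which values count as "too high" is left to experimentation. A peak close to 1.0 usually points to clipping.

```python
import array
import math
import sys
import wave


def wav_levels(path):
    """Return (rms, peak) as fractions of full scale for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples (arecord -f S16_LE)")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        return 0.0, 0.0
    full_scale = 32768.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / full_scale
    peak = max(abs(s) for s in samples) / full_scale
    return rms, peak
```

For example, rms, peak = wav_levels("test.wav") after recording a few seconds of silence and a few seconds of speech gives two sets of numbers to compare while adjusting the capture level in alsamixer.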

@dennystc

dennystc commented Jul 1, 2014

Hi, berni2288

Have you fixed this issue? I ran into this problem when trying Jasper on my RPi. 'arecord' and 'aplay' work well, but Jasper can only detect and recognize 'JASPER'. When I say 'What is the time' or 'How's the weather?… What's the weather like tomorrow?', Jasper does not respond.

Meanwhile, the same log output also appears on my board.

...
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
.....
...

BR//Denny

@dennystc

dennystc commented Jul 2, 2014

In addition, I found another error today. However, Jasper keeps running the whole time.

JASPER: WHAT ON

ERROR:apscheduler.scheduler:Job "Notifier.gather (trigger: interval[0:00:30], next run at: 2014-07-02 06:33:09.591522)" raised an exception
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/apscheduler/scheduler.py", line 512, in _run_job
    retval = job.func(*job.args, **job.kwargs)
  File "/home/pi/jasper/client/notifier.py", line 31, in gather
    [client.run() for client in self.notifiers]
  File "/home/pi/jasper/client/notifier.py", line 17, in run
    self.timestamp = self.gather(self.timestamp)
  File "/home/pi/jasper/client/notifier.py", line 35, in handleEmailNotifications
    emails = Gmail.fetchUnreadEmails(self.profile, since=lastDate)
  File "/home/pi/jasper/client/modules/Gmail.py", line 60, in fetchUnreadEmails
    conn = imaplib.IMAP4_SSL('imap.gmail.com')
  File "/usr/lib/python2.7/imaplib.py", line 1148, in __init__
    IMAP4.__init__(self, host, port)
  File "/usr/lib/python2.7/imaplib.py", line 163, in __init__
    self.open(host, port)
  File "/usr/lib/python2.7/imaplib.py", line 1159, in open
    self.sock = socket.create_connection((host, port))
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 97] Address family not supported by protocol

@astahlman
Contributor

I also struggled to get Jasper to recognize my voice commands, so I abstracted out a speech-to-text module which defaults to the PocketSphinx implementation but optionally allows users to switch to the Google Speech API by specifying an API key in profile.yml (a lot of this was based on @fritz-fritz's fork). I've found that this implementation is much more reliable and makes for a completely different experience.
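
The general shape of such an abstraction might look like the sketch below. It is not the code from the fork or the pull request: the class names, the "keys" / "GOOGLE_SPEECH" profile entries and the transcribe method are assumptions chosen for illustration, and neither class does any real decoding.

```python
class PocketSphinxSTT(object):
    """Stand-in for the default offline engine; no real decoding happens here."""

    def transcribe(self, wav_path):
        raise NotImplementedError("placeholder for a PocketSphinx-backed decoder")


class GoogleSTT(object):
    """Stand-in for a hosted engine that needs an API key."""

    def __init__(self, api_key):
        self.api_key = api_key

    def transcribe(self, wav_path):
        raise NotImplementedError("placeholder for a call to a hosted speech API")


def select_stt_engine(profile):
    """Return the engine implied by an already-parsed profile dict.

    Without an API key the default offline engine is kept, so existing
    profiles behave exactly as before.
    """
    api_key = (profile.get("keys") or {}).get("GOOGLE_SPEECH")  # assumed key names
    if api_key:
        return GoogleSTT(api_key)
    return PocketSphinxSTT()
```

The point of the design is that the rest of the client only ever calls transcribe() and does not need to know which engine is behind it.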

I know one of the tenets of the jasper project is that the core should remain decoupled from third-party services. I'd be interested to hear from @shbhrsaha or @crm416 on whether they would be interested in a pull request that doesn't change the default behavior but that does allow users to plug in a third-party STT module by specifying configuration in their profile.

@johnwyles

I am using the EXACT same hardware that was used and/or suggested AND the Jasper disk image and cannot get the device to recognize any speech input. Like everyone here I can get it to record/playback just fine using the arecord/aplay programs.

@johnwyles

@astahlman and @fritz-fritz - this bug doesn't seem to be getting any attention from the authors @shbhrsaha and @crm416; perhaps they have abandoned the project. @fritz-fritz seems to have a fork, but it has a lot of dependencies. Perhaps a little documentation on what has changed and how to set it up would be the way to go? I would be interested in getting something up and running. I have a number of RPis hooked up to different things for home automation and want to tie them all together. The https://github.com/johnwyles/PySpeak project I wrote would be a pain to build out, and I would rather start with something already tested and vetted.

@fritz-fritz

@johnwyles - I haven't had the time to continue development on my fork for a while, though it seems to work for me in an Ubuntu environment and I believe others have it working on some RPis. I believe @astahlman has recently submitted a pull request to the main project based on my branch (admittedly I haven't reviewed it yet). It has indeed seen interest from @crm416, though it doesn't directly address everything that falls under this issue.

The sad truth is that while some of this comes down to the software limitations of pocketsphinx compared with Google STT, the major problem IMO is the difficulty of abstracting hardware configurations (addressed in another branch) and playing with input levels etc., which is somewhat outside the scope of the project. While I prefer to use Google STT, as its speech recognition is a lot more advanced, the original authors did not want to build that in as a dependency. It seems they may have decided to include it as an option after all, as long as the original STT engine is left as the default.

Hopefully I'll have the time soon to get back to working on this. I have tried to document my changes as I went and to add relevant information to the README on my fork, but it may still be somewhat incomplete. I do not have an RPi to test on, so help with RPi-specific documentation would be very welcome.

@johnwyles

I wanted to point out to others who, like me, might be using projects with Node.js that there is a project that seems more complete, with active development and fewer issues than this one. I have tested it on an RPi using the same hardware as this project: https://github.com/syl22-00/pocketsphinx.js

@Holzhaus
Member

The original authors are still merging in PRs, so this project is not dead after all.

@fritz-fritz - The main problem with Jasper is that a lot of the code seems to have been written quick and dirty, e.g.:

  • hardcoded paths
  • using os.system excessively (even though it can be replaced, e.g. aplay → pyaudio)
  • hardcoded hardware arguments (like 'aplay -D hw:1,0')
  • not using a config directory but writing everything in the program directory
  • not using the tempfile module
  • direct YAML access for configuration without a wrapping module
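
Two of the points above, the tempfile module and a per-user config directory, can be sketched with the standard library alone. The JASPER_CONFIG environment variable and the ~/.jasper default are hypothetical names picked for this example, not something the project defines.

```python
import os
import tempfile


def config_dir(environ=None):
    """Return a per-user config directory instead of the program directory.

    JASPER_CONFIG and ~/.jasper are assumed names for this sketch.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("JASPER_CONFIG")
    if override:
        return override
    return os.path.join(os.path.expanduser("~"), ".jasper")


def write_temp_wav(data):
    """Write audio bytes to a uniquely named temp file and return its path.

    The file is kept after closing so an external player can read it;
    the caller is responsible for removing it afterwards.
    """
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as handle:
        handle.write(data)
        return handle.name
```

Because the temporary file name is unique per call, two instances running on the same machine would not overwrite each other's recordings.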

Python is a perfect language for platform-independent development.
Of course there will still be hardware compatibility problems, but these are solvable after all.

I think Jasper is a great project and can become even greater if there's some more code cleanup and bugfixing (I've opened 2 pull requests for that).

@Marioheld

@Holzhaus I had the idea of a setup wizard for Jasper with a GUI similar to raspi-config. It would make it easy to install, change paths, change sound devices, etc. There would no longer be hardcoded paths, and not everything would have to be writable. And the installation would be clean and automatic.

@Holzhaus
Member

@Marioheld Nice. Something similar already exists (populate.py), although it lacks a nice ncurses interface. The problem is that this requires a lot of work, not only for writing the config program itself, but also for making all the values configurable in the first place.

@Marioheld

@Holzhaus Yeah, I know populate.py, but it's just for the user profile. And because the easy installation guide for Jasper did not work for me, I thought of a nice wizard. Sure, it's not easy to make everything configurable, but I think some people here on GitHub may help. That's the point of GitHub, I think. By the way, I'm also thinking about adding a feature to the wizard to change the STT and TTS services, so every user can easily choose between the independence of pocketsphinx and the advantages of other services, e.g. Google STT.

@Holzhaus
Member

@Marioheld That's where the setting belongs: in the user profile. Different users might want to use different audio setups.
Example:
There are two users on a machine: jasper1 and jasper2. Both run their own instance of jasper-client. There are two different sound cards connected to the machine, one with mic/speakers located in the living room, one with mic/speakers located in the bedroom. Because each instance accesses its own config, this works without a problem.
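
A sketch of what this could look like in code, assuming the profile has already been parsed into a dict. The "audio", "input_device" and "output_device" keys are invented for this example and are not existing profile.yml settings.

```python
def audio_devices(profile, default_input="default", default_output="default"):
    """Return (input_device, output_device) for one user's parsed profile.

    Falls back to ALSA's "default" device when the profile says nothing.
    """
    audio = profile.get("audio") or {}  # hypothetical profile section
    return (
        audio.get("input_device", default_input),
        audio.get("output_device", default_output),
    )


# Two users on the same machine, each pointing at a different sound card.
living_room = {"audio": {"input_device": "plughw:1,0", "output_device": "plughw:1,0"}}
bedroom = {"audio": {"input_device": "plughw:2,0", "output_device": "plughw:2,0"}}

print(audio_devices(living_room))
print(audio_devices(bedroom))
```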

Concerning STT/TTS Services: Support for different STT services has already been added in pull request #118. Support for different TTS engines is part of my pull request #124 (see Holzhaus@2eaf002)

@Holzhaus
Member

Closed due to no activity.

@YinxuWang

Is there anyone who has gotten good voice recognition with Jasper? If not, why not close the issue? I also got bad recognition results. If this issue has not been solved, why not put it at the top?
