Can't get speech_asynch_rest.py to work with Google Cloud Storage to transcribe long audio files #441
Comments
@mjgallow Can you please try the grpc version? https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/speech/api/speech_async_grpc.py
Sorry, I missed that I was supposed to use speech_async_grpc.py for long audio file transcription. Thanks!
speech_async_grpc.py works for transcribing short FLAC audio files and (with a little tweaking) long, properly prepared raw audio files in Google Cloud Storage (you need to specify the URI where the file is in Google Cloud Storage), at least for me, on Mac OS X El Capitan. (I didn't try on Windows 7.)
Again, thanks! I'll close this ticket/issue. Below are some notes that might help others.
I tried with my own publicly available Google Cloud Storage raw audio file. I converted the mp4 file to an mp3 file and then to the properly encoded raw file format using VLC (not shown below, as I didn't use the application through the command line) and sox.
Note that with sox, when you specify the input file to be converted, you need to specify the channels, bits, and rate. The soxi and file commands used before the sox command below provide the channels, bits, and rate information for the input file to be converted. The file command is probably unnecessary, actually.
Also, I had to change lines 64 and 65 in speech_async_grpc.py to the following (maintaining indentation) to transcribe the long raw audio file:
I wanted output to go to a text file instead of being printed out in the console, so I made the following code changes to my downloaded copy of speech_async_grpc.py.
Mac OS X El Capitan successful attempt: procedure and Terminal commands, with output in Terminal:
Click Applications > Utilities > Terminal.
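For readers who want to script the sox step described above, here is a minimal sketch. It is not the command used in this thread (that command is not shown here). The helper name `build_sox_command`, the file names, and the default output format (16-bit signed, mono, 16000 Hz) are assumptions for illustration; take the input values from what soxi reports and pick output values that match what you tell the Speech API.

```python
def build_sox_command(src, dst, channels, bits, rate,
                      out_channels=1, out_bits=16, out_rate=16000):
    """Build a sox argument list (hypothetical helper, not from the sample).

    In sox, format options placed before a file name apply to that file,
    so the input options come first, then the input file, then the
    output options and the output file.
    """
    return [
        'sox',
        # Input format, as reported by soxi for the source file.
        '-c', str(channels), '-b', str(bits), '-r', str(rate), src,
        # Output format: headerless (raw) signed-integer PCM.
        # The default output values are assumptions; adjust as needed.
        '-t', 'raw', '-e', 'signed-integer',
        '-c', str(out_channels), '-b', str(out_bits), '-r', str(out_rate),
        dst,
    ]


# Example with placeholder file names and input values.
command = build_sox_command('input.mp3', 'output.raw',
                            channels=2, bits=16, rate=44100)
print(' '.join(command))
```

The returned list can be passed to `subprocess.check_call` if sox is installed and on the PATH.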
I'm probably being dense or missing something, but speech_async_rest.py doesn't seem to work with Google Cloud Storage, and I believe you have to use Google Cloud Storage to transcribe long audio files.
I tried to find an answer to this on my own, but so far no luck. Hopefully I'm posting my issue in the right place, and following your guidelines for reporting issues (I did look over the guidelines that I could find). If not, let me know.
Here is an example Google Storage audio file to transcribe:
https://storage.googleapis.com/cloud-samples-tests/speech/brooklyn.flac
Below is what I'm entering and what errors I'm seeing when I try this in Windows 7 and Mac OS X El Capitan.
Also, note that I can run the examples listed at https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/speech/api/README.md without problems.
Also, note that I've been able to successfully obtain transcripts for long audio files in my Google Cloud Storage using another method that involves curl.
I changed the audio part of the JSON object in speech_async_rest.py (currently line 68 or so, I believe) to the following only when trying to access the Google Cloud Storage audio file:
If you need more information, let me know.
_Windows 7 PC Attempt, with error feedback included_
Open a cmd.exe window (click Start, type in "cmd" (without quotes), and press Enter/Return).
The "export" command doesn't work in cmd.exe, so I used "set" instead.
Here's what I typed in at the prompt (username and specific project name and id replaced).
C:\Python\python-docs-samples-master\speech\api> cd C:/Users/USERNAME/env/Scripts
C:\Python\python-docs-samples-master\speech\api> call activate.bat
C:\Python\python-docs-samples-master\speech\api> cd C:/Python/python-docs-samples-master/speech/api
(env) C:\Python\python-docs-samples-master\speech\api>set GOOGLE_APPLICATION_CREDENTIALS=C:\Python\My_Project-SOME_NUMBER.json
(env) C:\Python\python-docs-samples-master\speech\api>python speech_rest.py resources/audio.raw
{"results": [{"alternatives": [{"confidence": 0.98267895, "transcript": "how old is the Brooklyn Bridge"}]}]}
(env) C:\Python\python-docs-samples-master\speech\api>python speech_async_rest.py resources/audio.raw
{"name": "LONG_NUMBER_HERE"}
Waiting for server processing...
Waiting for server processing...
[{"alternatives": [{"transcript": "how old is the Brooklyn Bridge", "confidence": 0.98267895}]}]
(env) C:\Python\python-docs-samples-master\speech\api>python speech_async_rest.py gs://cloud-samples-tests/speech/brooklyn.flac
Traceback (most recent call last):
  File "speech_async_rest.py", line 101, in <module>
    main(args.speech_file)
  File "speech_async_rest.py", line 52, in main
    with open(speech_file, 'rb') as speech:
OSError: [Errno 22] Invalid argument: 'gs://cloud-samples-tests/speech/brooklyn.flac'
_Mac OS X El Capitan Attempt, with error feedback included_
Click Applications > Utilities > Terminal.
Here's what I typed in at the prompt (username, transcription number, and specific project name and id replaced).
$ cd Desktop/python-docs-samples-master/speech/api
$ source env/bin/activate
$ export GOOGLE_APPLICATION_CREDENTIALS=/Users/USERNAME/Desktop/google_stuff/My_Project-SOME_NUMBER.json
$ python speech_rest.py resources/audio.raw
{"results": [{"alternatives": [{"confidence": 0.98267895, "transcript": "how old is the Brooklyn Bridge"}]}]}
$ python speech_async_rest.py resources/audio.raw
{"name": "LONG_NUMBER_HERE"}
Waiting for server processing...
Waiting for server processing...
[{"alternatives": [{"confidence": 0.98267895, "transcript": "how old is the Brooklyn Bridge"}]}]
$ python speech_async_rest.py gs://cloud-samples-tests/speech/brooklyn.flac
Traceback (most recent call last):
  File "speech_async_rest.py", line 101, in <module>
    main(args.speech_file)
  File "speech_async_rest.py", line 52, in main
    with open(speech_file, 'rb') as speech:
IOError: [Errno 2] No such file or directory: 'gs://cloud-samples-tests/speech/brooklyn.flac'
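Both tracebacks point at the same line: the script passes the gs:// argument to open(), which only understands local paths. As a rough sketch of one way around that (not the sample's actual code), the audio part of the request could be chosen based on the argument. The helper name `build_audio_part` is hypothetical. The `content` (base64 audio bytes) and `uri` (Cloud Storage location) keys are the two alternatives the Speech API accepts for audio, as far as I know; check them against the API version you are calling.

```python
import base64
import os
import tempfile


def build_audio_part(speech_file):
    """Return the "audio" part of the request body (hypothetical helper)."""
    if speech_file.startswith('gs://'):
        # Cloud Storage object: hand the URI to the API instead of
        # trying to open() it as a local file.
        return {'uri': speech_file}
    # Local file: read the bytes and send them inline, base64-encoded.
    with open(speech_file, 'rb') as speech:
        encoded = base64.b64encode(speech.read()).decode('UTF-8')
    return {'content': encoded}


# A gs:// argument is passed through untouched.
print(build_audio_part('gs://cloud-samples-tests/speech/brooklyn.flac'))

# A local path is read and encoded (demonstrated with a throwaway file).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'abc')
local_part = build_audio_part(tmp.name)
os.remove(tmp.name)
print(local_part)
```

This keeps a single command-line argument working for both local files and Cloud Storage objects, so the script no longer needs to be hand-edited when switching between them.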