Yandex SpeechKit allows application developers to use Yandex speech technologies: speech recognition (Speech-to-Text) and speech synthesis (Text-to-Speech).
This is a C# console demo client for the streaming Speech-to-Text functionality.
More information about the service: https://cloud.yandex.com/docs/speechkit/
Demo video: https://youtu.be/3j9-ZWP6bb4
- Register a Yandex Cloud account and create a folder in your tenant: https://cloud.yandex.com/docs/resource-manager/operations/folder/create. Note the ID of your folder.
- Download and install the .NET Core runtime for Windows/macOS/Linux: https://dotnet.microsoft.com/download
- Download, install, and initialize the Yandex Cloud command-line interface (CLI): https://cloud.yandex.com/docs/cli/quickstart#install
- Compile the sources, or download and unzip the compiled client from Releases: https://github.com/MaxKhlupnov/YC.SpeechKit.Streaming.Asr/releases
- Generate an IAM token with the command:
yc iam create-token
- Prepare your audio in Ogg (Opus) or LPCM format.
- Run the client from the command line:
dotnet YC.SpeechKit.Streaming.Asr.dll --iam-token your_iam_token --folder-id your_folder_id --in-file path_to_audio_file --audio-encoding your_file_encoding --sample-rate required_for_lpcm_only
Example for OggOpus: dotnet YC.SpeechKit.Streaming.Asr.dll --iam-token t1.9eu.......A3YAA --folder-id b1g95p77ivsq5c2vub3s --in-file="C:\PROJECTS\Yandex.Cloud\SpeechKit\DataSphere.ogg" --audio-encoding OggOpus
Example for LPCM: dotnet YC.SpeechKit.Streaming.Asr.dll --iam-token t1.9eu.......A3YAA --folder-id b1g95p77ivsq5c2vub3s --in-file="C:\PROJECTS\Yandex.Cloud\SpeechKit\DataSphere.ogg" --audio-encoding Linear16Pcm --sample-rate 16000
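The token and recognition steps can be combined in a small shell wrapper. The following is a minimal sketch, not part of the client: `encoding_args` and `recognize` are hypothetical helper names, and the rule that `.ogg` files contain Opus audio while any other file is 16 kHz LPCM is an assumption made for illustration. It uses only the `yc iam create-token` command and the client flags shown above, and expects to be run from the directory that contains `YC.SpeechKit.Streaming.Asr.dll`.

```shell
#!/bin/sh
# Sketch only: wraps the commands from the steps above.
# encoding_args and recognize are hypothetical helpers, not part of the client.

# Print the encoding flags for a file, chosen by its extension.
# Assumption: *.ogg files hold Opus audio; anything else is 16 kHz LPCM.
encoding_args() {
    case "$1" in
        *.ogg|*.OGG) echo "--audio-encoding OggOpus" ;;
        *)           echo "--audio-encoding Linear16Pcm --sample-rate 16000" ;;
    esac
}

# Usage: recognize path_to_audio_file your_folder_id
recognize() {
    in_file=$1
    folder_id=$2
    # Request a fresh IAM token from the Yandex Cloud CLI; stop if that fails.
    iam_token=$(yc iam create-token) || return 1
    # The encoding_args output is left unquoted on purpose,
    # so that it splits into separate command-line flags.
    dotnet YC.SpeechKit.Streaming.Asr.dll \
        --iam-token "$iam_token" \
        --folder-id "$folder_id" \
        --in-file "$in_file" \
        $(encoding_args "$in_file")
}
```

After sourcing the file, `recognize DataSphere.ogg your_folder_id` would request a token and start recognition with the OggOpus flags. If your LPCM audio uses a different sample rate, adjust the value in `encoding_args`.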