Consider pulling en-US changes #7

Open
wants to merge 10 commits into
base: master
72 changes: 46 additions & 26 deletions README.md
@@ -1,28 +1,6 @@
# LocalSTT

(Jump to [English](#English))

### [Català]

> **Nota: Aquesta aplicació de moment només és una prova de concepte**

LocalSTT és una aplicació per Android que proporciona reconeixement automàtic de la parla sense necessitat de conexió a internet ja que tot el processament és local al mòbil.

Això és possible gràcies a:
- un RecognitionService que utilitza la llibreria de Vosk
- un RecognitionService que utilitza la lliberia de Mozilla Deepspeech
- una Activity que gestiona intents RECOGNIZE_SPEECH entre altres

El codi és actualment una prova de concepte i es basa fortament en els següents projectes:
- [Kõnele](https://github.com/Kaljurand/K6nele)
- [Vosk Android Demo](https://github.com/alphacep/vosk-android-demo)

LocalSTT hauria de funcionar amb la majoria de teclats i aplicacions que implementen la funció de reconeixement de veu a través d'un intent RECOGNIZE_SPEECH o directament fent servir la classe SpeechRecognizer d'Android. Ha estat provada amb èxit fent servir les següent aplicacions en un terminal Android 9:
- [AnySoftKeyboard](https://github.com/AnySoftKeyboard/AnySoftKeyboard)
- [Kõnele](https://github.com/Kaljurand/K6nele)
- [SwiftKey](https://www.swiftkey.com)

Us podeu descarregar un APK que inclou models de Vosk i DeepSpeech pel català [aquí](https://github.com/ccoreilly/LocalSTT/releases/download/2020-12-03/localstt.apk).
(Jump to [Català](#Català))

### [English]

@@ -44,10 +22,52 @@ LocalSTT should work with all keyboards and applications implementing speech rec
- [Kõnele](https://github.com/Kaljurand/K6nele)
- [SwiftKey](https://www.swiftkey.com)

You can download a pre-built binary with Vosk and DeepSpeech models for catalan [here](https://github.com/ccoreilly/LocalSTT/releases/download/2020-12-03/localstt.apk).
You can download a pre-built binary with Vosk models for:
- English: https://github.com/ewheelerinc/LocalSTT/releases / [LocalSTT-en-us.apk](https://github.com/ewheelerinc/LocalSTT/releases/download/2022-01-18-en-US/LocalSTT-en-us.apk)

and also with DeepSpeech models here:
- Catalan: [here](https://github.com/ccoreilly/LocalSTT/releases/download/2020-12-03/localstt.apk).

If you want to use the application with your language just replace the models in `app/src/main/assets/sync/vosk-model/` with a package from https://alphacephei.com/vosk/models and rebuild the application.

#### Build notes:
- `git clone https://github.com/ewheelerinc/LocalSTT.git`
- `./gradlew build`
- `./repack-n-sign.sh ./app/build/outputs/apk/release/app-release-unsigned.apk`
  - You might need to update paths and keys in this script for your use.
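
The model swap described above can be sketched as a small shell helper. This is only an illustration, not part of the repository: the function name `replace_model` and its arguments are hypothetical, and it assumes you have already downloaded and unpacked a model package from https://alphacephei.com/vosk/models into a local directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of LocalSTT): swap the bundled Vosk model
# for another, already unpacked, model directory.
# Usage: replace_model /path/to/unpacked-model [repo-root]
replace_model() {
    src="$1"
    repo="${2:-.}"
    dest="$repo/app/src/main/assets/sync/vosk-model"

    # Refuse to continue if the source is not a directory.
    [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }

    # Remove the old model so stale files do not end up in the APK.
    rm -rf "$dest"
    mkdir -p "$dest"

    # Copy the contents of the new model into place.
    cp -R "$src"/. "$dest"/
}
```

After running it from the repository root (for example `replace_model ~/Downloads/my-unpacked-model`, where the path is a placeholder), rebuild with `./gradlew build` as in the build notes.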

If you want to use the application with your language just replace the models in `app/src/main/assets/sync` and rebuild the application.
#### BUGS:

### Demo
- Does not work with Google's keyboard, Gboard.
- Not all record applications read the voice text properly. There must be another way, and if you know how, it is probably a trivial fix.
- DeepSpeech models were removed because they didn't build. Maybe this can be fixed?

#### Future Work

- Support querying alphacephei.com and support selection and download of optional models. Then this APK can be packaged _without_ a language (much smaller!).

#### Demo

![LocalSTT in action](./demo.gif)

### [Català]

> **Nota: Aquesta aplicació de moment només és una prova de concepte**

LocalSTT és una aplicació per Android que proporciona reconeixement automàtic de la parla sense necessitat de conexió a internet ja que tot el processament és local al mòbil.

Això és possible gràcies a:
- un RecognitionService que utilitza la llibreria de Vosk
- un RecognitionService que utilitza la lliberia de Mozilla Deepspeech
- una Activity que gestiona intents RECOGNIZE_SPEECH entre altres

El codi és actualment una prova de concepte i es basa fortament en els següents projectes:
- [Kõnele](https://github.com/Kaljurand/K6nele)
- [Vosk Android Demo](https://github.com/alphacep/vosk-android-demo)

LocalSTT hauria de funcionar amb la majoria de teclats i aplicacions que implementen la funció de reconeixement de veu a través d'un intent RECOGNIZE_SPEECH o directament fent servir la classe SpeechRecognizer d'Android. Ha estat provada amb èxit fent servir les següent aplicacions en un terminal Android 9:
- [AnySoftKeyboard](https://github.com/AnySoftKeyboard/AnySoftKeyboard)
- [Kõnele](https://github.com/Kaljurand/K6nele)
- [SwiftKey](https://www.swiftkey.com)

Us podeu descarregar un APK que inclou models de Vosk i DeepSpeech pel català [aquí](https://github.com/ccoreilly/LocalSTT/releases/download/2020-12-03/localstt.apk).
6 changes: 4 additions & 2 deletions app/build.gradle
@@ -12,6 +12,9 @@ repositories {
}

android {
lintOptions {
abortOnError false
}
compileSdkVersion 30
defaultConfig {
applicationId "cat.oreilly.localstt"
@@ -36,10 +39,9 @@ dependencies {
implementation 'androidx.appcompat:appcompat:1.2.0'
implementation 'net.java.dev.jna:jna:5.8.0@aar'
implementation 'com.google.code.gson:gson:2.8.7'
implementation 'org.mozilla.deepspeech:libdeepspeech:0.8.2'
implementation 'com.github.gkonovalov:android-vad:1.0.0'
}

ant.importBuild 'assets.xml'
preBuild.dependsOn(list, checksum)
clean.dependsOn(clean_assets)
clean.dependsOn(clean_assets)
21 changes: 2 additions & 19 deletions app/src/main/AndroidManifest.xml
@@ -87,6 +87,7 @@

<service
android:name=".VoskRecognitionService"
android:process=":speechProcess"
android:icon="@drawable/ic_service_trigger"
android:label="@string/vosk_recognition_service"
android:permission="android.permission.RECORD_AUDIO">
@@ -102,23 +103,5 @@
android:name="android.speech"
android:resource="@xml/recognition_service" />
</service>

<service
android:name=".DeepSpeechRecognitionService"
android:icon="@drawable/ic_service_trigger"
android:label="@string/deepspeech_recognition_service"
android:permission="android.permission.RECORD_AUDIO">
<intent-filter>

<!-- The constant value is defined at RecognitionService.SERVICE_INTERFACE. -->
<action android:name="android.speech.RecognitionService" />

<category android:name="android.intent.category.DEFAULT" />
</intent-filter>

<meta-data
android:name="android.speech"
android:resource="@xml/recognition_service" />
</service>
</application>
</manifest>
</manifest>
40 changes: 14 additions & 26 deletions app/src/main/assets/sync/assets.lst
@@ -1,26 +1,14 @@
deepspeech-catala/kenlm.scorer
deepspeech-catala/model.tflite
vosk-catala/README
vosk-catala/am/final.mdl
vosk-catala/am/tree
vosk-catala/conf/mfcc.conf
vosk-catala/conf/model.conf
vosk-catala/graph/Gr.fst
vosk-catala/graph/HCLr.fst
vosk-catala/graph/disambig_tid.int
vosk-catala/graph/phones/align_lexicon.int
vosk-catala/graph/phones/align_lexicon.txt
vosk-catala/graph/phones/disambig.int
vosk-catala/graph/phones/disambig.txt
vosk-catala/graph/phones/optional_silence.csl
vosk-catala/graph/phones/optional_silence.int
vosk-catala/graph/phones/optional_silence.txt
vosk-catala/graph/phones/silence.csl
vosk-catala/graph/phones/word_boundary.int
vosk-catala/graph/phones/word_boundary.txt
vosk-catala/ivector/final.dubm
vosk-catala/ivector/final.ie
vosk-catala/ivector/final.mat
vosk-catala/ivector/global_cmvn.stats
vosk-catala/ivector/online_cmvn.conf
vosk-catala/ivector/splice.conf
vosk-model/README
vosk-model/am/final.mdl
vosk-model/conf/mfcc.conf
vosk-model/conf/model.conf
vosk-model/graph/Gr.fst
vosk-model/graph/HCLr.fst
vosk-model/graph/disambig_tid.int
vosk-model/graph/phones/word_boundary.int
vosk-model/ivector/final.dubm
vosk-model/ivector/final.ie
vosk-model/ivector/final.mat
vosk-model/ivector/global_cmvn.stats
vosk-model/ivector/online_cmvn.conf
vosk-model/ivector/splice.conf
Binary file not shown.

This file was deleted.

Binary file not shown.

This file was deleted.

Binary file removed app/src/main/assets/sync/vosk-catala/.DS_Store
Binary file not shown.
1 change: 0 additions & 1 deletion app/src/main/assets/sync/vosk-catala/README

This file was deleted.

1 change: 0 additions & 1 deletion app/src/main/assets/sync/vosk-catala/README.md5

This file was deleted.

1 change: 0 additions & 1 deletion app/src/main/assets/sync/vosk-catala/am/final.mdl.md5

This file was deleted.

Binary file removed app/src/main/assets/sync/vosk-catala/am/tree
Binary file not shown.
1 change: 0 additions & 1 deletion app/src/main/assets/sync/vosk-catala/am/tree.md5

This file was deleted.

5 changes: 0 additions & 5 deletions app/src/main/assets/sync/vosk-catala/conf/mfcc.conf

This file was deleted.

1 change: 0 additions & 1 deletion app/src/main/assets/sync/vosk-catala/conf/mfcc.conf.md5

This file was deleted.

1 change: 0 additions & 1 deletion app/src/main/assets/sync/vosk-catala/conf/model.conf.md5

This file was deleted.

1 change: 0 additions & 1 deletion app/src/main/assets/sync/vosk-catala/graph/Gr.fst.md5

This file was deleted.

1 change: 0 additions & 1 deletion app/src/main/assets/sync/vosk-catala/graph/HCLr.fst.md5

This file was deleted.

13 changes: 0 additions & 13 deletions app/src/main/assets/sync/vosk-catala/graph/disambig_tid.int

This file was deleted.

This file was deleted.
