Releases · Sharrnah/whispering

28 Dec 20:51

Sharrnah

v1.2.0.6

c44a1ee

v1.2.0.6

Standalone Release File (2.54 GB):

Download Server:

Changelog:

[FEATURE] Added loading state message for loading dialogues.
[FEATURE] Send OCR processed image.
[TASK] Exposed some additional advanced settings, such as TTS rate and pitch, and the Whisper logprob and no_speech thresholds.
[TASK] Small stability improvements

Full Changelog: v1.2.0.5...v1.2.0.6


19 Dec 19:36

Sharrnah

v1.2.0.5

ca4de30

v1.2.0.5

Standalone Release File (2.31 GB):

Download Server:

Changelog:

[TASK] Added available settings values for UI display
[TASK] Added settings update request event
[TASK] Hide processing loader in websocket html clients if Whisper returns without result
[BUGFIX] Fixed fallback of Silero model list requiring internet connection
[FEATURE] Exposed initial_prompt setting for Whisper

Full Changelog: v1.2.0.4...v1.2.0.5

initial_prompt can be used to give Whisper a text style to try to follow.

For example, setting initial_prompt to "Umm, let me think like, hmm... Okay, here's what I'm, like, thinking." will let Whisper transcribe filler words if they appear in the audio.
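
As a rough illustration of how such a prompt could reach the model, the sketch below assumes the open-source `openai-whisper` Python package, whose `transcribe` method accepts an `initial_prompt` keyword. The helper names `build_transcribe_options` and `transcribe_file`, the settings dictionary, and the `"base"` model choice are hypothetical and are not taken from Whispering Tiger's own code.

```python
from typing import Any, Dict, Optional


def build_transcribe_options(settings: Dict[str, Any]) -> Dict[str, Any]:
    """Collect keyword arguments for Whisper's transcribe() call.

    `settings` is a hypothetical settings dictionary. Only a non-empty
    `initial_prompt` is forwarded, so Whisper keeps its default otherwise.
    """
    options: Dict[str, Any] = {}
    prompt: Optional[str] = settings.get("initial_prompt")
    if prompt:
        options["initial_prompt"] = prompt
    return options


def transcribe_file(path: str, settings: Dict[str, Any]) -> str:
    """Transcribe an audio file (requires the `openai-whisper` package)."""
    import whisper  # imported lazily so the helper above works without it

    model = whisper.load_model("base")  # model size is an arbitrary choice
    result = model.transcribe(path, **build_transcribe_options(settings))
    return result["text"]
```

Leaving the prompt empty keeps Whisper's default behaviour, which is why the helper skips empty values instead of passing them through.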


13 Dec 21:05

Sharrnah

v1.2.0.4

61a1fc4

v1.2.0.4

Standalone Release File (2.31 GB):

Download Server:

Changelog:

[TASK] Make most long-running tasks threaded.
[TASK] Only send OCR + Translation results to requesting client
[BUGFIX] Only send updated settings to other clients
[FEATURE] Send stop processing event with no whisper result

Full Changelog: v1.2.0.3...v1.2.0.4

Find the new UI here:
https://github.com/Sharrnah/whispering-ui/releases/latest

After downloading the UI, place the .exe and .ttf files into the root of your Whispering Tiger folder (where the other .bat files, README.md, etc. are located).


12 Dec 22:45

Sharrnah

v1.2.0.3

e629b41

v1.2.0.3

Standalone Release File (2.31 GB):

Download Server:

Changelog:

[TASK] Make A.I. Model device for NLLB200 configurable
[BUGFIX] Add missing default settings
[BUGFIX] Wrong model download links
[BUGFIX] Signal only works in main thread of the main interpreter
- As a result, the A.I. models are now downloaded on startup instead of when a model is requested.

Find the new UI here:
https://github.com/Sharrnah/whispering-ui/releases/latest

After downloading the UI, place the .exe and .ttf files into the root of your Whispering Tiger folder (where the other .bat files, README.md, etc. are located).


11 Dec 21:36

Sharrnah

v1.2.0.2

375c31e

v1.2.0.2

Standalone Release File (2.31 GB):

Download Server:

Changelog:

[BUGFIX] Fixed Silero init error with invalid settings
[TASK] Set OSC IP to a better default
[TASK] Allow phrase_time_limit, pause and energy to be read from settings

Find the new UI here:
https://github.com/Sharrnah/whispering-ui/releases/latest

After downloading the UI, place the .exe and .ttf files into the root of your Whispering Tiger folder (where the other .bat files, README.md, etc. are located).


10 Dec 22:11

Sharrnah

v1.2.0.1

e2bb0af

v1.2.0.1

Standalone Release File (2.31 GB):

Download Server:

Changelog:

[BUGFIX] Some more stability improvements
[BUGFIX] Fixed "Invalid language." error thrown in combination with the new UI.

Find the new UI here:
https://github.com/Sharrnah/whispering-ui/releases/latest

After downloading the UI, place the .exe and .ttf files into the root of your Whispering Tiger folder (where the other .bat files, README.md, etc. are located).


08 Dec 02:22

Sharrnah

v1.2.0.0

c8419a3

v1.2.0.0

Standalone Release File (2.31 GB):

Download Server:

Changelog:

[TASK] Updated libraries
- Including the Whisper project, which now features a large-v2 model.
[BUGFIX] General stability improvements
[BUGFIX] TTS Silero loading on CPU and fallback if CUDA is not available
[BUGFIX] TTS Silero error if internet connection has issues. (Was caused by a forced online check which is now disabled)
[TASK] Added preprocessing of the text sent to Silero. (Numbers can now be spoken, and repeated punctuation marks no longer confuse the TTS.)
[TASK] Websocket remote only shows the working Silero V3 models.
[BUGFIX] fixed issue with multiline text language recognition
[BUGFIX] CLI argument and settings-file options fallback.
[TASK] Improved websocket transfer of specific messages so they are no longer sent to all clients. (Prevents multiple browsers from playing TTS etc.)
[TASK] Some general preparations for the upcoming new UI.
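
The text preprocessing mentioned above could look roughly like the following sketch. It is not the project's actual implementation: the function names are hypothetical, and numbers are spelled out digit by digit in English purely to keep the example dependency-free (a real implementation would more likely use a number-to-words library and handle other languages).

```python
import re

# English words for single digits; an assumption made for this sketch only.
_DIGIT_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]


def collapse_punctuation(text: str) -> str:
    """Reduce runs of the same punctuation mark ("!!!", "...") to one."""
    return re.sub(r"([.!?,;:])\1+", r"\1", text)


def spell_out_digits(text: str) -> str:
    """Replace every ASCII digit with its English word, digit by digit."""
    return re.sub(r"[0-9]",
                  lambda m: " " + _DIGIT_WORDS[int(m.group())] + " ",
                  text)


def preprocess_for_tts(text: str) -> str:
    """Apply both clean-up steps and tidy the resulting whitespace."""
    text = collapse_punctuation(text)
    text = spell_out_digits(text)
    text = re.sub(r"\s+([.!?,;:])", r"\1", text)  # no space before punctuation
    return re.sub(r"\s+", " ", text).strip()
```

For instance, `preprocess_for_tts("Wait!!! I have 42 cats...")` returns `"Wait! I have four two cats."`.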


23 Nov 17:56

Sharrnah

v1.1.0.0

549dea0

v1.1.0.0

Standalone Release File (2.30 GB):

Download Server:

Changelog:

[FEATURE] Added TTS (Text 2 Speech) using Silero
[FEATURE] Added model download retry, fallback and checksum check.
[FEATURE] Added FLAN-T5 conditioning.
[TASK] Code restructuring.
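
A download with retry, fallback and checksum check can be sketched as below. This is a minimal illustration under stated assumptions, not the project's code: the function names, the use of SHA-256, and the injectable `fetch` callable are all hypothetical choices made for the example.

```python
import hashlib
from typing import Callable, Iterable, Optional


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def download_with_fallback(
    urls: Iterable[str],
    expected_sha256: str,
    fetch: Callable[[str], bytes],
    retries: int = 3,
) -> bytes:
    """Try each URL up to `retries` times and verify the checksum.

    `fetch` is any callable that returns the file content for a URL and
    raises OSError on network problems. The first download whose checksum
    matches is returned; otherwise the next attempt or next URL is tried.
    """
    last_error: Optional[Exception] = None
    for url in urls:
        for _ in range(retries):
            try:
                data = fetch(url)
            except OSError as error:
                last_error = error
                continue
            if sha256_hex(data) == expected_sha256:
                return data
            last_error = ValueError(f"checksum mismatch for {url}")
    raise RuntimeError("all download sources failed") from last_error
```

Passing the download function in as `fetch` keeps the retry logic independent of any particular HTTP library; a wrapper around `urllib.request.urlopen(url).read()` would fit, since its `URLError` is a subclass of `OSError`.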

Text 2 Speech example:

fvzyuMpe.mp4


14 Nov 15:05

Sharrnah

v1.0.7.1

8e45875

v1.0.7.1

Standalone Release File (2.30 GB):

Download Server:

Changelog:

[BUGFIX] translate to speaker if flan-t5 question processing is disabled
[TASK] Added OSC-auto-processing option (To toggle OSC temporarily while app is running)

About FLAN-T5:

flan_process_only_questions and flan_whisper_answer can be enabled to have FLAN-T5 answer only spoken questions.
That means the text recognized by Whisper should include a typical question word and a question mark.
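
The kind of check described here might be sketched as follows. The function name, the English-only word list and the exact matching rule are assumptions for illustration; the application's real criteria are not documented in these notes.

```python
import re

# Hypothetical list of typical English question words.
QUESTION_WORDS = {
    "who", "what", "when", "where", "why", "how", "which",
    "is", "are", "do", "does", "can", "could", "would", "should",
}


def looks_like_question(text: str) -> bool:
    """Return True if the text has a question mark and a question word."""
    if "?" not in text:
        return False
    words = set(re.findall(r"[a-z']+", text.lower()))
    return not words.isdisjoint(QUESTION_WORDS)
```

With this rule, "What time is it?" passes, while "What a day." (no question mark) and "Really?" (no question word) do not.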

Since FLAN-T5 can do much more, there might be more possibilities to use this A.I. model in the future.


13 Nov 14:35

Sharrnah

v1.0.7.0

3493c50

v1.0.7.0

Standalone Release File (2.30 GB):

Download Server:

Changelog:

[FEATURE] Added experimental FLAN-T5 A.I., supporting automatic answering and continuation of questions or phrases, spoken or written (see more at https://analyticsindiamag.com/google-ai-introduces-flan-t5-a-new-open-source-language-model/).
[FEATURE] Added LID language classifier for auto-detecting the language of text.
[FEATURE] Added NLLB200 text translator. Supporting around 200 languages in a single model.
[FEATURE] Added config file. (To support more settings without having to add many more command-line flags)
[FEATURE] Added bottom_align HTML parameter to websocket clients. (To make it easier to align streaming overlays at the bottom of the image)
[TASK] Updated dependencies
[CHANGE] (Breaking change if used as command-line flag!) Renamed m2m100_size and m2m100_device to txt_translator_size and txt_translator_device, respectively.

About FLAN-T5:

Since FLAN-T5 can do much more, there might be more possibilities to use this A.I. model in the future.
