Releases · Sharrnah/whispering

06 Apr 22:53

Sharrnah

v1.3.4.1

5f38da2

v1.3.4.1

Standalone Release File (2.55 GB):

Download Server:

Changelog (v1.3.4.1)

[FEATURE] Errors are now sent on stderr in JSON format for the UI to parse
[FEATURE] Added a download function that works in threads
[TASK] Added download arguments to the plugin tts method (breaking change for plugins!)

Full Changelog: v1.3.4.0...v1.3.4.1
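
As an illustration of the stderr change in v1.3.4.1, here is a minimal sketch of one way a backend could emit errors as JSON lines and a UI could pick them out of the stream. The function names and the "type"/"message" fields are assumptions made for this example; the release notes do not specify the actual payload.

```python
import json
import sys


def report_error(message: str, error_type: str = "error", stream=None) -> str:
    """Write one JSON object per line to stderr and return the line written.

    The field names ("type", "message") are hypothetical.
    """
    stream = sys.stderr if stream is None else stream
    line = json.dumps({"type": error_type, "message": message})
    print(line, file=stream, flush=True)
    return line


def parse_error_line(line: str):
    """UI side: return the decoded dict, or None if the line is not a JSON object."""
    try:
        payload = json.loads(line)
    except json.JSONDecodeError:
        return None
    return payload if isinstance(payload, dict) else None
```

Writing one object per line lets the reader skip ordinary stderr output, such as tracebacks, that is not valid JSON.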


01 Apr 16:33

Sharrnah

v1.3.4.0

41df8d8

v1.3.4.0

Standalone Release File (2.55 GB):

Download Server:

Changelog (v1.3.4.0)

[FEATURE] Added support for showing the source and target language of a translation in the OSC prefix
[BUGFIX] Updated CTranslate2 and faster-whisper to the latest versions to fix #10
[BUGFIX] Also use SentencePiece as tokenizer for better multi-sentence translation
[BUGFIX] Fixed error for plugins that define on_enable but not on_disable
[TASK] Renamed txt_ascii to txt_romaji to better reflect what it does
[TASK] Added realtime_temperature_fallback option
[TASK] Moved the ignorelist into an external ignorelist.txt
[TASK] Added Volume direction OSC plugin to the readme
[TASK] Added new command control VRChat Parameters plugin to the readme

Full Changelog: v1.3.3.2...v1.3.4.0


25 Mar 22:40

Sharrnah

v1.3.3.2

96e86cc

v1.3.3.2

Standalone Release File (2.55 GB):

Download Server:

Changelog (v1.3.3.2)

[FEATURE] Added quit signal to quit running backend
[TASK] Improved the CTranslate2-based NLLB-200 text translator for texts containing multiple sentences
[TASK] Updated CTranslate2
[BUGFIX] Make sure audio frames are cleared on new start of recording
[BUGFIX] Fixed the case where the FindWindow Windows API call returns the wrong window handle for a window title (should fix OCR silently not working in these cases)

Full Changelog: v1.3.3.1...v1.3.3.2


24 Mar 00:35

Sharrnah

v1.3.3.1

1fcb5eb

v1.3.3.1

Standalone Release File (2.55 GB):

Download Server:

Changelog (v1.3.3.1)

[FEATURE] Added NLLB-200 using CTranslate2 (the same backend that faster-whisper uses)
[FEATURE] Added new streaming overlay (completely rewritten and looks more like traditional subtitles)
[FEATURE] Added option to set OSC Chat Prefix
[FEATURE] Added mirror setting to html websocket overlays
[TASK] Added option to set NLLB-200 precision.
[TASK] Show typing indicator when speech starts, even without realtime mode
[TASK] Updated some dependencies
[TASK] Allow CTranslate2 to use float16 even on devices without efficient FP16 support
[BUGFIX] Fixed invalid TTS data when using the SSML break tag

Full Changelog: v1.3.3.0...v1.3.3.1


18 Mar 20:36

Sharrnah

v1.3.3.0

debf000

v1.3.3.0

Standalone Release File (2.55 GB):

Download Server:

Changelog (v1.3.3.0)

[FEATURE] Added realtime transcription feature (only available when using VAD)
- [FEATURE] Added optional separate realtime Whisper model (allows using a smaller, faster model for realtime transcriptions; only the final full-clip transcription uses the regularly selected Whisper model)
- [FEATURE] Updated websocket clients to show realtime transcriptions
[TASK] Made phrase_time_limit, pause and energy values configurable at runtime
[TASK] Removed the LLM/FLAN-T5 Large Language Model functions from the main code and split them into a separate plugin (see https://gist.github.com/Sharrnah/eeaf2acda3e92d8eed1747f05a3f4102 )
[FEATURE] Added optional on_enable and on_disable methods for plugins
[TASK] Cleaned up some ARGOS translation remnants in the code
[TASK] Set faster-whisper as the default
[BUGFIX] Reactivated channel downsampling to improve detection when more than one channel is sent

Full Changelog: v1.3.2.2...v1.3.3.0
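
The optional on_enable and on_disable plugin methods added in v1.3.3.0 suggest a host that calls a hook only when the plugin defines it. The sketch below shows one way to do that; SubtitlePlugin and call_optional_hook are hypothetical names for this example and are not taken from the whispering plugin API.

```python
class SubtitlePlugin:
    """Hypothetical plugin that defines on_enable but not on_disable."""

    def __init__(self):
        self.enabled = False

    def on_enable(self):
        self.enabled = True


def call_optional_hook(plugin, hook_name: str) -> bool:
    """Call plugin.<hook_name>() if it exists; return whether it was called."""
    hook = getattr(plugin, hook_name, None)
    if callable(hook):
        hook()
        return True
    # A missing hook is not an error: both methods are optional.
    return False
```

Looking each hook up independently means a plugin that implements only one of the two methods does not raise an AttributeError.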


13 Mar 14:03

Sharrnah

v1.3.2.2

d20a5e4

v1.3.2.2

Standalone Release File (2.55 GB):

Download Server:

Changelog (v1.3.2.2)

[TASK] Added option to set more precision types (for faster-whisper). int8_float16 should further reduce memory footprint and improve speed without sacrificing much precision.
[TASK] Added beam_size option to increase speed further at the cost of quality. The default is 5; a beam_size of 2 or 1 can make it considerably faster.
[TASK] Added cpu_threads and num_workers options. num_workers is not really used yet, but cpu_threads can improve performance when running on CPU if the CPU has enough cores/threads.

Full Changelog: v1.3.2.1...v1.3.2.2
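
To show where the options added in v1.3.2.2 would plug into faster-whisper, here is a minimal sketch. The WhisperSettings class and its defaults (other than beam_size 5, which the changelog states) are assumptions for illustration; how whispering itself passes these values is not described in the release notes. The keyword names match the faster-whisper WhisperModel constructor and its transcribe method.

```python
from dataclasses import dataclass


@dataclass
class WhisperSettings:
    """Hypothetical container for the speed/precision options."""

    compute_type: str = "default"  # e.g. "float16", "int8", "int8_float16"
    beam_size: int = 5             # default stated in the changelog
    cpu_threads: int = 4           # assumed default for this sketch
    num_workers: int = 1           # assumed default for this sketch

    def __post_init__(self):
        if self.beam_size < 1:
            raise ValueError("beam_size must be at least 1")

    def model_kwargs(self) -> dict:
        """Options that apply when the model is loaded."""
        return {
            "compute_type": self.compute_type,
            "cpu_threads": self.cpu_threads,
            "num_workers": self.num_workers,
        }

    def transcribe_kwargs(self) -> dict:
        """Options that apply to each transcription call."""
        return {"beam_size": self.beam_size}


# Usage with faster-whisper (requires the package and a model download):
#   from faster_whisper import WhisperModel
#   settings = WhisperSettings(compute_type="int8_float16", beam_size=2)
#   model = WhisperModel("small", device="cuda", **settings.model_kwargs())
#   segments, info = model.transcribe("clip.wav", **settings.transcribe_kwargs())
```

Splitting the options this way reflects that precision and threading are fixed at load time, while beam_size can change per call.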


12 Mar 01:46

Sharrnah

v1.3.2.1

cf06522

v1.3.2.1 (hotfix)

Standalone Release File (2.55 GB):

Download Server:

Changelog (hotfix):

[BUGFIX] Fixed downloading of faster-whisper models

Changelog (v1.3.2.0)

[FEATURE] Added faster-whisper (smaller memory footprint + can be about 3x faster)
[BUGFIX] Updated whisper with bugfix of repeating sentences
[TASK] Improved on the Plugin system
[TASK] removed ARGOS translate because of incompatibility with faster-whisper
[BUGFIX] lock scikit-image to version 0.19.3 because of build bug in 0.20.0

Full Changelog: v1.3.1.0...v1.3.2.1


11 Mar 21:04

Sharrnah

v1.3.2.0

4e675fb

v1.3.2.0

Standalone Release File (2.55 GB):

Download Server:

This release file fails to download faster-whisper models. See the v1.3.2.1 hotfix.

Changelog:

[FEATURE] Added faster-whisper (smaller memory footprint + can be about 3x faster)
[BUGFIX] Updated whisper with bugfix of repeating sentences
[TASK] Improved on the Plugin system
[TASK] removed ARGOS translate because of incompatibility with faster-whisper
[BUGFIX] lock scikit-image to version 0.19.3 because of build bug in 0.20.0

Full Changelog: v1.3.1.0...v1.3.2.0


06 Mar 18:02

Sharrnah

v1.3.1.0

d592464

v1.3.1.0

Standalone Release File (2.55 GB):

Download Server:

Changelog:

[FEATURE] Added Option for additional VAD Check on full Clip in addition to each frame
[TASK] Reduced default VAD confidence threshold to 0.4
[TASK] Expose FP16 Option for Whisper Model
[TASK] Skip NLLB-200 translation if source and target language are the same
[FEATURE] Simple Plugin System added.
[FEATURE] Proof of concept for additional LLM models.

Full Changelog: v1.3.0.1...v1.3.1.0


09 Jan 21:10

Sharrnah

v1.3.0.1

9bd2136

v1.3.0.1

Standalone Release File (2.55 GB):

Download Server:

Changelog:

[FEATURE] Added VAD (Voice Activity Detection)
[BUGFIX] Fix error on logging special language characters
[TASK] Reverted logprob_threshold and no_speech_threshold to their old defaults because the new values gave less consistent recognition

Full Changelog: v1.2.0.6...v1.3.0.1
