STT (Speech-to-Text Transcriber Tool)

STT is a tool that converts speech in audio files into text. It utilizes whisper.cpp for transcription and ffmpeg for processing various audio/video formats, making it easy to transcribe spoken content into written text.

Quick start:

  1. Download STT from here
  2. Install ffmpeg (required)
  3. Run stt.exe

Development

Follow these steps to set up STT on your local machine:

1. Clone the Repository

Clone the project to your local machine:

git clone https://github.com/miukyo/stt.git
cd stt

2. Install Dependencies

You’ll need ffmpeg and whisper.cpp for this tool to work properly.

  • Install ffmpeg
  • Build whisper.cpp (a prebuilt binary is available in the /lib folder; if you prefer to build it yourself, follow the instructions in the whisper.cpp repository)
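
Before going further, it can help to confirm that both dependencies are actually in place. The snippet below is a minimal sketch, not part of the project: the function name `find_missing_tools` is hypothetical, and it assumes ffmpeg should be reachable on PATH and that the whisper.cpp binary lives somewhere inside a `lib` folder in the current directory.

```python
import shutil
from pathlib import Path


def find_missing_tools(tools=("ffmpeg",), lib_dir="lib"):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    # Assumption: the whisper.cpp binary is kept inside lib_dir.
    if not Path(lib_dir).is_dir():
        problems.append(f"{lib_dir} folder is missing")
    return problems


if __name__ == "__main__":
    for problem in find_missing_tools():
        print("warning:", problem)
```

Run it from the repository root; no output means both checks passed.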

3. Install Python Dependencies

Use pip to install required Python packages:

To run the program:

pip install pywebview

To bundle the program into an executable:

pip install pyinstaller

4. Usage

After installing the dependencies, you can start transcribing your audio/video files.

Run the tool with the following command:

python main.py
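
This README does not document how main.py drives ffmpeg and whisper.cpp, so the following is only a rough sketch of what such a two-step pipeline typically looks like, not the project's actual code. It assumes ffmpeg first converts the input to the 16 kHz, mono, 16-bit PCM WAV that whisper.cpp expects, and that the whisper.cpp command-line binary is then called with its `-m` (model), `-f` (input file) and `-otxt` (write a text file) options. The function names, the binary path and the model path are all hypothetical.

```python
import subprocess


def build_ffmpeg_cmd(src, wav):
    # Convert any audio/video input to 16 kHz, mono, 16-bit PCM WAV.
    return ["ffmpeg", "-y", "-i", str(src),
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", str(wav)]


def build_whisper_cmd(binary, model, wav):
    # -m: model file, -f: input WAV, -otxt: also write the transcript to a text file.
    return [str(binary), "-m", str(model), "-f", str(wav), "-otxt"]


def transcribe(src, wav, binary, model):
    # check=True raises CalledProcessError if either tool exits with an error.
    subprocess.run(build_ffmpeg_cmd(src, wav), check=True)
    subprocess.run(build_whisper_cmd(binary, model, wav), check=True)


if __name__ == "__main__":
    # Hypothetical paths; adjust to the binary and model actually present in lib.
    print(build_ffmpeg_cmd("talk.mp4", "talk.wav"))
    print(build_whisper_cmd("lib/whisper-cli", "lib/ggml-base.bin", "talk.wav"))
```

Building the commands as argument lists, rather than as a single shell string, avoids quoting problems with file names that contain spaces.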

To bundle the program, run the following command:

pyinstaller main.spec

Then copy the lib folder into the dist folder.
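
The copy step can also be scripted so it is not forgotten after a rebuild. This is a minimal sketch, assuming the intended result is a `dist/lib` folder next to the bundled executable; the helper name `copy_lib` is hypothetical.

```python
import shutil
from pathlib import Path


def copy_lib(lib_dir="lib", dist_dir="dist"):
    """Copy lib_dir into dist_dir, returning the destination path."""
    src = Path(lib_dir)
    dst = Path(dist_dir) / src.name
    # dirs_exist_ok (Python 3.8+) lets the copy be repeated after each rebuild.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


if __name__ == "__main__":
    print("copied to", copy_lib())
```

Run it from the repository root after pyinstaller finishes.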

Known Issues

  • This is the initial release. Please report any bugs or issues you encounter.

Contributing

Feel free to fork the repository, submit issues, or create pull requests to improve the tool. Contributions are welcome!

License

This project is licensed under the MIT License - see the LICENSE file for details.
