timlod/speechrecorder

Newer version

The fork at https://github.com/dan-wells/speechrecorder contains many good updates and should be used instead of this first version.

Basic SpeechRecorder clone

rec.py contains a basic SpeechRecorder clone written in Python which should be system-agnostic. Tested using Python 3.7, but 3.6+ should work well.

This code is set up to record Arctic A, but it is easy to adapt it for other texts.

The current version saves only one take per utterance. If you re-record an utterance, the new take overwrites the previous one.
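One way to keep earlier takes is to pick a fresh file name before each save. The following is only a sketch: the helper `next_take_path` and the `<label>_takeN.wav` naming scheme are hypothetical and are not part of rec.py.

```python
from pathlib import Path


def next_take_path(out_dir, label, ext=".wav"):
    """Return the first unused path of the form <label>_takeN<ext> in out_dir.

    Hypothetical helper: the naming scheme is an assumption, not what rec.py does.
    """
    out_dir = Path(out_dir)
    take = 1
    # Count upwards until we find a file name that does not exist yet.
    while (out_dir / f"{label}_take{take}{ext}").exists():
        take += 1
    return out_dir / f"{label}_take{take}{ext}"
```

Saving each recording to the path this helper returns, instead of to a fixed name, would leave existing takes untouched.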

This code is not the prettiest; I just wanted a quick tool that works. The missing bits around it (such as saving multiple takes or loading different utterances) should be easy to add if you know some Python. Currently the tool spins up a separate process that handles audio recording and playback. I tried using a simple thread for this, but PortAudio would then not find the ASIO driver for my audio interface. If anyone can figure out how to make this work with threads, that would be better (no while True: loop needed).

Prerequisites

Python packages:

  • PySoundFile
  • python-sounddevice
    • I recommend using this version (e.g. pip install sounddevice-0.3.15-cp37-cp37m-win_amd64.whl) as it includes a working version of PortAudio

You need to configure the correct Input and Output devices by hand for this tool to work:

In a Python interpreter, run the following commands to list the available audio devices:

import sounddevice as sd
print(sd.query_devices())

From the list, note the indices of the devices you want to use for playback/recording and fill their numbers into the following lines of code in rec.py:

in_device = 10
out_device = 3
CHANNEL = 1

If you're using an audio interface with many input channels and your microphone is plugged into input 8, set CHANNEL = 8.
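If the device indices change between sessions, looking them up by name may be more convenient than reading them off the list by hand. The sketch below assumes each entry of sd.query_devices() behaves like a dict with the keys name, max_input_channels and max_output_channels. The helper `find_device` is hypothetical and not part of rec.py.

```python
def find_device(devices, name_part, kind):
    """Return the index of the first device whose name contains name_part.

    devices: sequence of dicts shaped like the entries of sd.query_devices()
    kind: "input" or "output"; the device must offer at least one such channel
    Hypothetical helper, not part of rec.py.
    """
    key = f"max_{kind}_channels"
    for index, dev in enumerate(devices):
        # Case-insensitive substring match on the device name.
        if name_part.lower() in dev["name"].lower() and dev[key] > 0:
            return index
    raise ValueError(f"no {kind} device matching {name_part!r}")
```

With sounddevice imported as sd, a call such as find_device(sd.query_devices(), "usb", "input") could then supply the value for in_device. The search string "usb" is only a placeholder for part of your own device's name.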

The default records at 16 kHz, as this is the sampling rate used for building the voice in the unit selection assignment. If you need higher- or lower-quality recordings, change fs to the required sampling rate. The default recording format is 32-bit float; sounddevice also allows 16/24-bit recordings. See its documentation if you need to change this.

How to use

Run rec.py with python rec.py.

  • Press up and down to navigate sentences
  • Press space to record
  • Press down while recording to immediately record the next utterance (without stopping in between)
  • When not recording, press p to listen to the recorded audio
  • Press q to quit.

To use a text other than Arctic A, change labels and utts to be Python lists of the same length, containing the identifiers (used for the saved file names) and the utterances, respectively.
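If your prompts are stored in a Festival-style prompt file with one ( label "text" ) entry per line, the two lists can be built automatically. This is a minimal sketch under that format assumption. The function `parse_prompts` and the example labels are hypothetical and not taken from rec.py.

```python
import re

# Assumed line format: ( label "utterance text" )
PROMPT_RE = re.compile(r'^\(\s*(\S+)\s+"(.*)"\s*\)\s*$')


def parse_prompts(lines):
    """Split Festival-style prompt lines into parallel label and utterance lists."""
    labels, utts = [], []
    for line in lines:
        match = PROMPT_RE.match(line.strip())
        if match:  # silently skip blank or malformed lines
            labels.append(match.group(1))
            utts.append(match.group(2))
    return labels, utts


# Example with two made-up prompts; any iterable of lines works,
# including an open text file.
labels, utts = parse_prompts([
    '( my_0001 "The first sentence to read." )',
    '( my_0002 "A second sentence." )',
])
```

Because both lists are filled in the same loop, they always have the same length, which is what the tool expects.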
