A Torch implementation of SampleRNN: An Unconditional End-to-End Neural Audio Generation Model.
Listen to a selection of generated output at the following links:
Feel free to submit links to any interesting output you generate or dataset creation scripts as a pull request.
The following packages are required to run SampleRNN_torch:
- nn
- cunn
- cudnn
- rnn
- optim
- audio
- xlua
- gnuplot
NOTE: Update nn and cudnn even if they were already installed, as fixes have been submitted which affect this project.
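The package list above can be installed in one pass. The following is a minimal sketch, assuming the `luarocks` on your PATH is the one that ships with your Torch installation; the function name `install_torch_packages` is hypothetical and not part of this repository. Re-running `luarocks install` for a package that is already present is expected to fetch its current version, which should also cover the update note for nn and cudnn.

```shell
# Sketch: install (or reinstall) the packages listed above with luarocks.
# Assumption: `luarocks` on PATH belongs to your Torch installation.
# `install_torch_packages` is a hypothetical helper name.
install_torch_packages() {
    for pkg in nn cunn cudnn rnn optim audio xlua gnuplot; do
        # Stop at the first failure so the error is easy to spot.
        luarocks install "$pkg" || return 1
    done
}

# Uncomment to run the installation:
# install_torch_packages
```

The loop stops at the first package that fails to install, so a missing CUDA or cuDNN dependency shows up immediately instead of being buried in later output.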
To retrieve and prepare the piano dataset, as used in the reference implementation, run:
cd datasets/piano/
./create_dataset.sh
Other dataset preparation scripts may be found under datasets/.
Custom datasets may be created by using scripts/generate_dataset.lua to slice multiple audio files into segments for training. The source audio must be placed in datasets/[dataset]/data/.
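As an illustration of the directory layout for a custom dataset, the sketch below copies audio files into datasets/[dataset]/data/. The helper name `stage_audio`, the dataset name `my_dataset`, and the source directory are all hypothetical placeholders. The arguments accepted by scripts/generate_dataset.lua are not described here, so the slicing step itself is left out.

```shell
# Sketch: copy source audio into the layout described above.
# `stage_audio`, "my_dataset" and the source directory are hypothetical.
stage_audio() {
    src_dir="$1"    # directory holding your audio files
    dataset="$2"    # dataset name, used as datasets/[dataset]
    dest="datasets/$dataset/data"
    mkdir -p "$dest" || return 1
    # Copy regular files only; sub-directories are skipped.
    find "$src_dir" -maxdepth 1 -type f -exec cp {} "$dest/" \;
}

# Example (run from the repository root):
# stage_audio ~/recordings my_dataset
```

Once the files are in place, run scripts/generate_dataset.lua from the repository root to slice them into training segments.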
To start a training session, run th train.lua -dataset piano. To view a description of all accepted arguments, run th train.lua -help.
To view the progress of training, run th generate_plots. The loss and gradient norm curves will be saved in sessions/[session]/plots/.
By default, samples are generated at the end of every training epoch, but they can also be generated separately using th train.lua -generate_samples with the session parameter to specify the model.
Multiple samples are generated in batch mode for efficiency; however, generating a single audio sample is faster with th fast_sample.lua. See -help for a description of the arguments.
A pretrained model trained on the piano dataset is available here. Download it, copy it into your sessions/ directory, and then extract it in place.
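The copy-and-extract step might look like the sketch below. It assumes the download is a tar archive (GNU and BSD tar detect compression automatically with `-xf`); the archive format is not stated here, so substitute the appropriate tool if the file is something else. The helper name `install_pretrained` and the archive file name are hypothetical.

```shell
# Sketch: copy a downloaded model archive into sessions/ and unpack it there.
# Assumption: the archive is a tar file; the format is not confirmed.
# `install_pretrained` and the archive name are hypothetical.
install_pretrained() {
    archive="$1"    # path to the downloaded archive
    mkdir -p sessions || return 1
    cp "$archive" sessions/ || return 1
    # Extract inside sessions/ so the session directory lands in place.
    ( cd sessions && tar -xf "$(basename "$archive")" )
}

# Example (run from the repository root):
# install_pretrained ~/Downloads/piano_model.tar.gz
```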
More models will be uploaded soon.
This code is based on the reference implementation in Theano.