This document outlines the training scripts and external resources that accompany the package.
This section lists advanced training scripts that train RNNs on real-world datasets.
- recurrent-language-model.lua: trains a stack of LSTM, GRU, MuFuRu, or Simple RNN layers on the Penn Tree Bank dataset, with or without dropout (a minimal sketch of this kind of stacked LSTM language model follows this list).
- recurrent-visual-attention.lua: training script used in Recurrent Model for Visual Attention. Implements the REINFORCE learning rule to learn an attention mechanism for classifying MNIST digits, sometimes translated. Showcases `nn.RecurrentAttention`, `nn.SpatialGlimpse` and `nn.Reinforce`.
- noise-contrastive-estimate.lua: one of two training scripts used in Language modeling a billion words. Single-GPU script for training recurrent language models on the Google billion words dataset. This example showcases version 2 zero-masking. Version 2 is more efficient than version 1 because the `zeroMask` is interpolated only once.
- multigpu-nce-rnnlm.lua: 4-GPU version of `noise-contrastive-estimate.lua` for training larger multi-GPU models. Two of two training scripts used in Language modeling a billion words. This script trains multi-layer `SeqLSTM` language models on the Google Billion Words dataset. The example uses `MaskZero` to train independent variable-length sequences using the `NCEModule` and `NCECriterion`. This script is our fastest yet, boasting speeds of 20,000 words/second (on an NVIDIA Titan X) with a 2-layer LSTM having 250 hidden units, a batch size of 128 and a sequence length of 100. Note that you will need to have Torch installed with Lua instead of LuaJIT.
- twitter-sentiment-rnn.lua: trains a stack of RNNs on a twitter sentiment analysis task. This is a text classification problem that uses a sequence-to-one architecture, in which only the last RNN's last time-step is used for classification (see the sequence-to-one sketch after this list).
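To make the scripts above more concrete, here is a minimal sketch of the kind of stacked `SeqLSTM` language model that recurrent-language-model.lua and the billion-words scripts build. All sizes and the training step are illustrative assumptions; the real scripts add dropout, NCE, multi-GPU support and command-line options:

```lua
require 'rnn'

-- illustrative hyper-parameters (assumptions, not copied from the scripts)
local vocabsize, hiddensize = 10000, 200
local seqlen, batchsize = 35, 32

-- word embeddings followed by a stack of two SeqLSTM layers,
-- then a per-time-step softmax over the vocabulary
local lm = nn.Sequential()
   :add(nn.LookupTable(vocabsize, hiddensize))  -- seqlen x batchsize -> seqlen x batchsize x hiddensize
   :add(nn.SeqLSTM(hiddensize, hiddensize))     -- first LSTM layer
   :add(nn.SeqLSTM(hiddensize, hiddensize))     -- second LSTM layer
   :add(nn.Sequencer(nn.Linear(hiddensize, vocabsize)))
   :add(nn.Sequencer(nn.LogSoftMax()))

-- one NLL loss per time-step
local crit = nn.SequencerCriterion(nn.ClassNLLCriterion())

-- dummy batch: sequences of word indices, targets are the next word
local input  = torch.LongTensor(seqlen, batchsize):random(1, vocabsize)
local target = torch.LongTensor(seqlen, batchsize):random(1, vocabsize)

local output = lm:forward(input)
local loss = crit:forward(output, target)
lm:zeroGradParameters()
lm:backward(input, crit:backward(output, target))
lm:updateParameters(0.05)  -- vanilla SGD step
```

The NCE scripts follow the same overall layout but replace the final Linear + LogSoftMax + ClassNLLCriterion with `NCEModule` and `NCECriterion`, so that the full softmax over the billion-words vocabulary never has to be computed during training.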
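The sequence-to-one architecture used by twitter-sentiment-rnn.lua (and by sequence-to-one.lua below) can be sketched as follows; the layer sizes and the choice of `nn.RecLSTM` here are assumptions for illustration:

```lua
require 'rnn'

-- illustrative sizes (assumptions)
local vocabsize, embedsize, hiddensize, nclass = 5000, 50, 100, 2

local classifier = nn.Sequential()
   :add(nn.LookupTable(vocabsize, embedsize))   -- seqlen x batchsize -> seqlen x batchsize x embedsize
   :add(nn.SplitTable(1))                       -- -> table of seqlen tensors (batchsize x embedsize)
   :add(nn.Sequencer(nn.RecLSTM(embedsize, hiddensize)))
   :add(nn.SelectTable(-1))                     -- keep only the last time-step's hidden state
   :add(nn.Linear(hiddensize, nclass))          -- classify the whole sequence from that state
   :add(nn.LogSoftMax())

local crit = nn.ClassNLLCriterion()

local seqlen, batchsize = 20, 8
local input  = torch.LongTensor(seqlen, batchsize):random(1, vocabsize)
local target = torch.LongTensor(batchsize):random(1, nclass)

local output = classifier:forward(input)        -- batchsize x nclass log-probabilities
local loss = crit:forward(output, target)
classifier:zeroGradParameters()
classifier:backward(input, crit:backward(output, target))
```

The key point is `nn.SelectTable(-1)`: the RNN is run over every time-step, but only the final hidden state feeds the classifier, so one label is predicted per sequence.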
This section lists simple training scripts that train RNNs on dummy datasets. These scripts showcase the fundamental principles of the package.
- simple-recurrent-network.lua: uses the `nn.LookupRNN` module to instantiate a Simple RNN. Illustrates the first AbstractRecurrent instance in action. It has since been surpassed by the more flexible `nn.Recursor` and `nn.Recurrence`. The `nn.Recursor` class decorates any module to make it conform to the nn.AbstractRecurrent interface. The `nn.Recurrence` class implements the recursion `h[t] <- forward(h[t-1], x[t])`. Together, `nn.Recursor` and `nn.Recurrence` can be used to implement a wide range of experimental recurrent architectures.
- simple-sequencer-network.lua: uses the `nn.Sequencer` module to accept a batch of sequences as `input` of size `seqlen x batchsize x ...`. Both tables and tensors are accepted as input and produce the same type of output (table->table, tensor->tensor). The `Sequencer` class abstracts away the implementation of back-propagation through time. It also provides a `remember(['neither','both'])` method for controlling what the `Sequencer` remembers between iterations (forward, backward, update); see the `remember` sketch after this list.
- simple-recurrence-network.lua: uses the `nn.Recurrence` module to define the `h[t] <- sigmoid(h[t-1], x[t])` Simple RNN. Decorates it using `nn.Sequencer` so that an entire batch of sequences (`input`) can be forward- and backward-propagated per update (a minimal `nn.Recurrence` sketch follows this list).
- simple-bisequencer-network.lua: uses a `nn.BiSequencerLM` and two `nn.LookupRNN` modules to implement a simple bi-directional language model.
- simple-bisequencer-network-variable.lua: uses `nn.RecLSTM`, `nn.LookupTableMaskZero`, `nn.ZipTable`, `nn.MaskZero` and `nn.MaskZeroCriterion` to implement a simple bi-directional LSTM language model. This example uses version 1 zero-masking, where the `zeroMask` is automatically interpolated from the `input` (see the zero-masking sketch after this list).
- sequence-to-one.lua: a simple sequence-to-one example that uses `Recurrence` to build an RNN and `SelectTable(-1)` to select the last time-step for discriminating the sequence (see the sequence-to-one sketch above).
- encoder-decoder-coupling.lua: uses two stacks of `nn.SeqLSTM` to implement an encoder and decoder. The final hidden state of the encoder initializes the hidden state of the decoder. Example of sequence-to-sequence learning.
- nested-recurrence-lstm.lua: demonstrates how RNNs can be nested to form complex RNNs.
- recurrent-time-series.lua: demonstrates how to train a simple RNN to do multi-variate time-series prediction.
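Following up on simple-recurrence-network.lua, here is a minimal sketch of how an `h[t] <- sigmoid(W*x[t] + U*h[t-1])` Simple RNN can be assembled from `nn.Recurrence` and decorated with `nn.Sequencer`. The step module and sizes are illustrative assumptions, not copied from the script:

```lua
require 'rnn'

local inputsize, hiddensize = 10, 20

-- step module: receives {x[t], h[t-1]} and returns h[t]
local stepmodule = nn.Sequential()
   :add(nn.ParallelTable()
      :add(nn.Linear(inputsize, hiddensize))    -- W * x[t]
      :add(nn.Linear(hiddensize, hiddensize)))  -- U * h[t-1]
   :add(nn.CAddTable())
   :add(nn.Sigmoid())

-- Recurrence feeds {input[t], output[t-1]} into the step module and keeps track
-- of the previous output between time-steps (outputSize = hiddensize,
-- nInputDim = 1 because each sample's input x[t] is a 1D vector)
local rnn = nn.Recurrence(stepmodule, hiddensize, 1)

-- Sequencer forwards/backwards a whole sequence (here a table of steps) per call
local seqrnn = nn.Sequencer(rnn)

local seqlen, batchsize = 5, 3
local inputs = {}
for t = 1, seqlen do inputs[t] = torch.randn(batchsize, inputsize) end

local outputs = seqrnn:forward(inputs)  -- table of seqlen tensors of size batchsize x hiddensize
```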
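The `remember()` behaviour mentioned for simple-sequencer-network.lua can be illustrated with a short example; the sizes are arbitrary:

```lua
require 'rnn'

local lstm = nn.Sequencer(nn.RecLSTM(10, 10))

-- 'neither' : the hidden state is forgotten between calls to forward,
-- so each batch of sequences is treated independently
lstm:remember('neither')
local out1 = lstm:forward(torch.randn(5, 3, 10))  -- seqlen x batchsize x inputsize

-- 'both' : the hidden state at the end of one call is carried over to the next
-- call, for both training and evaluation; useful when a long sequence is fed
-- in consecutive chunks
lstm:remember('both')
local out2 = lstm:forward(torch.randn(5, 3, 10))
local out3 = lstm:forward(torch.randn(5, 3, 10))  -- continues from out2's final state

lstm:forget()  -- explicitly reset the hidden state
```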
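Finally, a hedged sketch of version 1 zero-masking as used by simple-bisequencer-network-variable.lua: variable-length sequences are padded with index 0, and the masking modules infer the `zeroMask` from the input itself. The constructor signatures shown here (the nInputDim arguments and the `maskZero(1)` convenience method) are assumptions based on the version 1 API, and the sizes and data are illustrative:

```lua
require 'rnn'

local vocabsize, hiddensize = 100, 16

local lm = nn.Sequential()
   -- index 0 is reserved for padding; LookupTableMaskZero maps it to a zero embedding
   :add(nn.Sequencer(nn.LookupTableMaskZero(vocabsize, hiddensize)))
   -- maskZero(1) zeros the LSTM output wherever the input embedding is all zeros,
   -- i.e. at padded time-steps (version 1: the mask is inferred from the input)
   :add(nn.Sequencer(nn.RecLSTM(hiddensize, hiddensize):maskZero(1)))
   :add(nn.Sequencer(nn.MaskZero(nn.Linear(hiddensize, vocabsize), 1)))
   :add(nn.Sequencer(nn.MaskZero(nn.LogSoftMax(), 1)))

-- the criterion counterpart ignores masked (padded) time-steps in the loss
local crit = nn.SequencerCriterion(nn.MaskZeroCriterion(nn.ClassNLLCriterion(), 1))

-- two sequences of different lengths, left-padded with zeros to seqlen = 3
local input = torch.LongTensor{{0, 7}, {0, 2}, {5, 9}}  -- seqlen x batchsize = 3 x 2
local output = lm:forward(input)                        -- padded steps produce all-zero rows
```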
This section lists external resources related to the package.
- rnn-benchmarks: benchmarks comparing Torch (using this library), Theano and TensorFlow.
- dataload: a collection of Torch dataset loaders.
- A brief (1 hour) overview of Torch7, which includes some details about the rnn package (at the end), is available via this NVIDIA GTC Webinar video. In any case, this presentation gives a nice overview of Logistic Regression, Multi-Layer Perceptrons, Convolutional Neural Networks and Recurrent Neural Networks using Torch7.