Simple Network Builder
The behavior of the simple network builder is controlled by the SimpleNetworkBuilder block of the options. When an option is omitted, its default value is assumed. We first provide a concise example and then list all control parameters and options.
SimpleNetworkBuilder = [
# 2 inputs, 2 hidden layers with 50 element nodes each, 2 outputs
layerSizes = 2:50*2:2
trainingCriterion = "CrossEntropyWithSoftmax"
evalCriterion = "ErrorPrediction"
layerTypes = "Sigmoid"
applyMeanVarNorm = true
]
In the example above, 'trainingCriterion' and 'layerTypes' could be omitted because they are set to their default values. The following parameters are available:
- initValueScale: the value for scaling the range of the random numbers used for initialization. The default is 1. If the model parameters are initialized using the uniform distribution, the random number range is adjusted to [-0.05 * initValueScale, 0.05 * initValueScale]. If the model parameters are initialized using the Gaussian distribution, the standard deviation is adjusted to 0.2 * initValueScale * fanout^(-1/2).
- layerTypes: the type of nonlinear operation in the hidden layers. Valid values are Sigmoid (default), Tanh, and RectifiedLinear.
- uniformInit: determines whether to use the uniform distribution to initialize model parameters. Valid values are true (default) and false (use the Gaussian distribution to initialize model parameters).
- applyMeanVarNorm: whether to apply mean/variance normalization to the input. Valid values are true and false (default).
- addDropoutNodes: whether to add drop-out nodes. The default is false. If set to true, a drop-out node is applied to the input node and to the output of every hidden layer.
- layerSizes: specifies the dimensions of the layers. For instance, layerSizes=128:10:200:4000 describes a neural network with two hidden layers. The first hidden layer has a dimension of 10, and the second hidden layer has a dimension of 200. The input and output layers have dimensions of 128 and 4000, respectively.
- trainingCriterion: the criterion used for training. The default is CrossEntropyWithSoftmax. Alternatives are SquareError, CrossEntropy, and ClassBasedCrossEntropyWithSoftmax. ClassBasedCrossEntropyWithSoftmax is for class-based training, which is useful when the output dimension is large and the outputs therefore need to be split into classes to speed up training and evaluation.
- evalCriterion: the criterion used for evaluation. The valid values are the same as for trainingCriterion.
- lookupTableOrder: specifies the order of context expansion in the lookupNode. The default value is 1. Setting it to a value such as 3 expands the input dimension in a context-dependent way by an order of 3. For example, if the input observation has a dimension of 20, setting this value to 3 sets the input node dimension to 60.
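The arithmetic behind layerSizes, initValueScale, and lookupTableOrder can be sketched in Python. This is an illustration only, not CNTK code: the function names are hypothetical, the dim*count repetition syntax is inferred from the 50*2 entry in the example above, and fanout is supplied by the caller because this page does not define how it is computed.

```python
def expand_layer_sizes(spec):
    """Expand a layerSizes string into a list of layer dimensions.

    Assumption: "50*2" means "two consecutive layers of dimension 50",
    as suggested by the comment in the example configuration.
    """
    sizes = []
    for part in spec.split(":"):
        if "*" in part:
            dim, count = part.split("*")
            sizes.extend([int(dim)] * int(count))
        else:
            sizes.append(int(part))
    return sizes


def init_spread(init_value_scale=1.0, uniform_init=True, fanout=None):
    """Return the initialization spread described for initValueScale.

    Uniform case: the (low, high) range of the random numbers.
    Gaussian case: the standard deviation; fanout must then be given.
    """
    if uniform_init:
        bound = 0.05 * init_value_scale
        return (-bound, bound)
    if fanout is None:
        raise ValueError("fanout is required for Gaussian initialization")
    return 0.2 * init_value_scale * fanout ** -0.5


def lookup_input_dim(observation_dim, lookup_table_order=1):
    """Input node dimension after context expansion by lookupTableOrder."""
    return observation_dim * lookup_table_order


print(expand_layer_sizes("2:50*2:2"))        # [2, 50, 50, 2]
print(expand_layer_sizes("128:10:200:4000")) # [128, 10, 200, 4000]
print(lookup_input_dim(20, 3))               # 60
```

The first and last elements of the expanded list are the input and output dimensions; everything in between is a hidden layer.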
For recurrent neural networks (RNNs), there are additional parameters.
- recurrentLayer: specifies the layers that contain self-recurrent connections. By default there is no recurrent layer. Use the syntax n1:n2:n3 to specify that layers n1, n2, and n3 have recurrent connections.
- defaultHiddenActivity: the default hidden layer activity value used by the delay node when accessing values before the first observation. The default value is 0.1.
- rnnType: the type of predefined network. Valid values are:
  - SIMPLENET: the feed-forward neural network. This is the default network type.
  - SIMPLERNN: the simple RNN, which may be a deep RNN in which several layers have recurrent loops.
  - CLASSLM: the class-based simple RNN. It uses sparse input, sparse parameters, and sparse output. This is often used for language modeling tasks.
  - LBLM: the log-bilinear neural network.
  - LSTM: the long short-term memory neural network.
  - CLASSLSTM: the class-based long short-term memory neural network. It uses sparse input, sparse parameters, and sparse output. This is often used for language modeling tasks.
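As a rough illustration of how the recurrent options fit together, the following Python sketch parses a recurrentLayer value and checks rnnType against the values listed above. The helper names are hypothetical and this is not how CNTK itself reads its configuration. In particular, this page does not say whether layer indices start at 0 or 1, or whether rnnType is matched case-sensitively, so the sketch only converts the indices to integers and compares the type name exactly.

```python
# Network types listed in this section.
KNOWN_RNN_TYPES = {
    "SIMPLENET", "SIMPLERNN", "CLASSLM", "LBLM", "LSTM", "CLASSLSTM",
}


def parse_recurrent_layers(spec):
    """Parse "n1:n2:n3" into [n1, n2, n3]; an empty value means no recurrent layer."""
    if not spec:
        return []
    return [int(n) for n in spec.split(":")]


def check_rnn_options(rnn_type="SIMPLENET", recurrent_layer=""):
    """Validate rnnType and return it with the parsed recurrent layer indices."""
    if rnn_type not in KNOWN_RNN_TYPES:
        raise ValueError(f"unknown rnnType: {rnn_type!r}")
    return rnn_type, parse_recurrent_layers(recurrent_layer)


print(check_rnn_options("LSTM", "1"))  # ('LSTM', [1])
print(check_rnn_options())             # ('SIMPLENET', [])
```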