LSTM RNN Notes
Daniel Shiffman edited this page Nov 22, 2016
- Based on RecurrentJS
- Based on Keras LSTM examples and LSTM IMDB Movie Review Tutorial by Josiah Olson
- TensorFlow installation (the commands below are for Python 2 on Mac OS X, CPU only; for other operating systems and GPU training, see this link)
$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-0.11.0-py2-none-any.whl
$ sudo pip install --upgrade $TF_BINARY_URL
$ sudo pip install keras
- A Return to Machine Learning by Kyle McDonald
- Nature of Code Neural Networks
- The Unreasonable Effectiveness of Recurrent Neural Networks
- Understanding LSTM Networks
- RecurrentJS (JavaScript)
- TensorFlow -- Google's open source machine learning framework (C++/Python)
- Keras -- High-level wrapper for machine learning (works on top of TensorFlow)
- Run Keras models in the browser
- Torch-RNN (Lua)
- RNN: Recurrent Neural Network
- LSTM: Long Short-Term Memory
- Supervised Learning: training with "known" data, i.e. examples for which the correct output is already known
- Epoch: single pass through the entire training set
- Model: The results of a training process (can be saved for later use).
- Word (or char) vector: You can’t feed a string as training input to a neural network. A "word vector" is a way of representing text data that a neural network can understand. There isn't one way of doing this, but the end result is a big array of values describing every word (or char) from a given corpus and its likelihood of appearing with other words around it.
- Prediction: The output a trained neural network produces for a given input, including inputs it never saw during training.
- Learning Rate: A value that tells the neural network how much to change its weights in response to errors. Early in training it should learn fast, but as it gets better that learning should "slow down."
- Perplexity: A measurement of how uncertain the model is: how much is it guessing? Lower is better. A perplexity of 1 is no guessing, a perplexity of 10 is like guessing between 10 equally likely options.
- Temperature: Affects the randomness of predictions. High temperatures (1.0+) will produce more unexpected outcomes, but you'll see more "errors". Low temperatures will produce more expected outcomes, but with a lot of repetition and a bias toward the most common words.
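
The perplexity and temperature entries above can be made concrete with a little arithmetic. What follows is a minimal sketch in plain Python 3 (standard library only), not code taken from RecurrentJS, Keras, or Torch-RNN. The function names `apply_temperature`, `sample_index`, and `perplexity` are hypothetical, and the sketch assumes the network has already produced a list of probabilities, one per possible next character or word.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a probability distribution by a temperature and renormalize.

    Assumption: `probs` is a list of probabilities that sums to 1.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Dividing log-probabilities by the temperature sharpens the
    # distribution when temperature < 1 and flattens it when > 1.
    logits = [math.log(max(p, 1e-12)) / temperature for p in probs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, temperature=1.0, rng=random):
    """Draw one index at random from the temperature-adjusted distribution."""
    adjusted = apply_temperature(probs, temperature)
    return rng.choices(range(len(adjusted)), weights=adjusted, k=1)[0]


def perplexity(true_token_probs):
    """Exponential of the average negative log-probability.

    Assumption: `true_token_probs` holds the probability the model
    assigned to each correct next token in some held-out text.
    """
    nll = -sum(math.log(p) for p in true_token_probs) / len(true_token_probs)
    return math.exp(nll)


if __name__ == "__main__":
    probs = [0.7, 0.2, 0.1]
    print(apply_temperature(probs, 0.2))  # mass concentrates on the first option
    print(apply_temperature(probs, 2.0))  # distribution moves toward uniform
    print(sample_index(probs, temperature=0.5))
    print(perplexity([0.1] * 5))  # close to 10: like guessing among 10 options
```

A model that always gives the correct token probability 0.1 has a perplexity of about 10, which matches the "guessing between 10 options" reading. Sampling with a low temperature picks the most likely option almost every time, which is where the repetition comes from.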