This Jupyter notebook is a tutorial and demonstration implementation of Alex Graves's paper "Adaptive Computation Time for Recurrent Neural Networks". The code is based on a previous PyTorch implementation by Jason Phang.
The notebook connects the formulas in the paper to the code that implements them by building a training pipeline on a small but meaningful dataset.
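
As a first taste of that formula-to-code mapping, here is a minimal sketch of the paper's halting rule for a single input step, written in plain Python rather than PyTorch so it runs without dependencies. In the paper, the network keeps updating its state until the accumulated halting units reach 1 - eps, or until a maximum number of updates is reached; the final update receives the remainder so that the weights sum to 1. The function name `halting_weights`, its arguments, and the example values are hypothetical and chosen for illustration; this is not the notebook's actual code.

```python
def halting_weights(halting_units, eps=0.01):
    """Sketch of the ACT halting rule for one input step.

    halting_units: sigmoid outputs h_1..h_M, one per candidate update,
                   where M = len(halting_units) acts as the update limit.
    Returns (weights, n_updates, remainder).
    """
    if not halting_units:
        raise ValueError("need at least one halting unit")

    cumulative = 0.0  # sum of h_1 .. h_{n-1}
    last = len(halting_units)
    for n, h in enumerate(halting_units, start=1):
        # Halt once the running sum reaches 1 - eps, or at the update limit.
        if cumulative + h >= 1.0 - eps or n == last:
            remainder = 1.0 - cumulative  # R = 1 - sum of earlier halting units
            weights = list(halting_units[:n - 1]) + [remainder]
            return weights, n, remainder
        cumulative += h


# Illustrative values only.
weights, n_updates, remainder = halting_weights([0.1, 0.3, 0.7, 0.9])
ponder_cost = n_updates + remainder  # the paper's per-step ponder cost N + R
```

The returned weights are what the paper uses to average the intermediate states and outputs, and the ponder cost is the quantity penalized in the loss to discourage unnecessary updates.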