By: Marc de la Barrera i Bardalet, Tim de Silva
`nndp` provides a framework for solving finite horizon dynamic programming problems using neural networks, implemented using the JAX functional programming paradigm and Haiku. This solution technique, introduced and described in detail by Duarte, Fonseca, Goodman, and Parker (2021), applies to problems of the following form:

$$V(s_0)=\max_{a_t\in\Gamma(s_t)}E_0\left[\sum_{t=0}^{T}u(s_t,a_t)\right]$$

$$s_{t+1}=m(s_t,a_t,\epsilon_{t+1}),\qquad s_0\sim F(\cdot)$$
The state vector is denoted by $s_t$ and the action vector by $a_t$. We parametrize the policy function that maps $s_t$ into $a_t$ with a neural network, whose parameters are trained to maximize the objective above. The user must define the following functions (illustrative sketches follow the list):
- `u(state, action)`: reward function when $s_t=$ `state` and $a_t=$ `action`.
- `m(key, state, action)`: state evolution equation giving $s_{t+1}$ when $s_t=$ `state` and $a_t=$ `action`. `key` is a JAX RNG key used to simulate any shocks present in the model.
- `Gamma(state)`: defines the set of possible actions, $a_t$, at $s_t=$ `state`.
- `F(key, N)`: samples `N` observations from the distribution of $s_0$. `key` is a JAX RNG key used to simulate any shocks present in the model.
- `policy(state, params, nn)`: defines how the output of a Haiku neural network, `nn`, with parameters, `params`, is mapped into an action at $s_t=$ `state`.
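As a concrete illustration, here is a minimal sketch of these five functions for a toy consumption-saving problem with state $s_t=(t,w_t)$ (time and wealth) and action $c_t$ (consumption). The functional forms, the bounds convention used in `Gamma`, and the assumption that `nn` supports `nn.apply(params, state)` are all illustrative choices, not requirements of `nndp`:

```python
import jax
import jax.numpy as jnp

def u(state, action):
    # Reward: log utility of consumption.
    return jnp.log(action[0])

def m(key, state, action):
    # State evolution: time advances by one; savings earn a gross
    # return and a lognormal income shock (drawn via `key`) is added.
    t, wealth = state[0], state[1]
    shock = jnp.exp(jax.random.normal(key))
    next_wealth = 1.04 * (wealth - action[0]) + shock
    return jnp.array([t + 1.0, next_wealth])

def Gamma(state):
    # Feasible actions: consumption between 0 and current wealth
    # (encoded here, for illustration, as per-action lower/upper bounds).
    return jnp.array([[0.0, state[1]]])

def F(key, N):
    # Initial states: t = 0 and lognormally distributed wealth.
    wealth0 = jnp.exp(jax.random.normal(key, (N, 1)))
    return jnp.hstack([jnp.zeros((N, 1)), wealth0])

def policy(state, params, nn):
    # Squash the network output into (0, 1) and consume that share
    # of wealth, which keeps the action inside Gamma(state).
    share = jax.nn.sigmoid(nn.apply(params, state))
    return share * state[1]
```

Writing `policy` to map the raw network output into the feasible set (here via a sigmoid) is a common way to enforce $a_t\in\Gamma(s_t)$ throughout training.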
We provide an example application to the income fluctuations problem in `docs/source/notebooks/income_fluctuations/main.ipynb` to illustrate how this framework can be used.
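The `policy` function above expects a Haiku network. As a sketch, assuming the network is created with `hk.without_apply_rng(hk.transform(...))` so that `nn.apply(params, state)` takes no RNG key, a suitable `nn` could be built as follows (the MLP architecture is an arbitrary illustrative choice):

```python
import haiku as hk
import jax
import jax.numpy as jnp

def forward(state):
    # A small MLP mapping the state vector to a single raw output.
    return hk.nets.MLP([32, 32, 1])(state)

# Transform the function into an (init, apply) pair; dropping the
# apply-time RNG lets `policy` call `nn.apply(params, state)`.
nn = hk.without_apply_rng(hk.transform(forward))
params = nn.init(jax.random.PRNGKey(0), jnp.zeros(2))
```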
`nndp` requires JAX and Haiku to be installed. To install with `pip`, run `pip install nndp`.
Duarte, Victor, Julia Fonseca, Aaron Goodman, and Jonathan A. Parker (2021), "Simple Allocation Rules and Optimal Portfolio Choice Over the Lifecycle," Working Paper.