TensorDebugger (TDB) is a visual debugger for deep learning. It extends TensorFlow (Google's Deep Learning framework) with breakpoints and real-time visualization of the data flowing through the computational graph.
Specifically, TDB is the combination of a Python library and a Jupyter notebook extension, built around Google's TensorFlow framework. Together, these extend TensorFlow with the following features:
- Breakpoints: Set breakpoints on Ops and Tensors in the graph. Graph execution is paused on breakpoints and resumed by the user (via tdb.c()). Debugging features can be used with or without the visualization frontend.
- Arbitrary Summary Plots: Real-time visualization of high-level information (e.g. histograms, gradient magnitudes, weight saturation) while the network is being trained. Supports arbitrary, user-defined plot functions.
- Flexible: Mix user-defined Python and plotting functions with TensorFlow Nodes. These take in tf.Tensors and output placeholder nodes to be plugged into TensorFlow nodes. The diagram below illustrates how TDB nodes can be mixed with the TensorFlow graph.
Modern machine learning models are parametrically complex and require considerable intuition to fine-tune properly.
In particular, Deep Learning methods are powerful but hard to interpret with regard to their capabilities and learned representations.
Can we enable better understanding of how neural nets learn, without having to change model code or sacrifice performance? Can I finish my thesis on time?
TDB addresses these challenges by providing run-time visualization tools for neural nets. Real-time visual debugging allows training bugs to be detected sooner, thereby reducing the iteration time needed to build the right model.
To install the Python library, run:
pip install tfdebugger
To install the Jupyter Notebook extension, run the following in a Python terminal (you will need to have IPython or Jupyter installed):
import zipfile
import notebook.nbextensions
try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2

SOURCE_URL = 'https://github.com/ericjang/tdb/releases/download/tdb_ext_v0.1/tdb_ext.zip'
# Download the extension archive and unpack it into the current directory
urlretrieve(SOURCE_URL, 'tdb_ext.zip')
with zipfile.ZipFile('tdb_ext.zip', "r") as z:
    z.extractall("")

# Register the unpacked extension for the current user
notebook.nbextensions.install_nbextension('tdb_ext',user=True)
To get started, check out the MNIST Visualization Demo. More examples and visualizations to come soon.
status,result=tdb.debug(evals,feed_dict=None,breakpoints=None,break_immediately=False,session=None)
debug() behaves just like TensorFlow's Session.run(). If a breakpoint is hit, status is set to 'PAUSED' and result is set to None. Otherwise, status is set to 'FINISHED' and result is set to a list of evaluated values.
status,result=tdb.c()
Continues execution of a paused session, until the next breakpoint or end. Behaves like debug().
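As a rough illustration of the status protocol described above, the sketch below starts a session with debug() and keeps calling c() until the status is no longer 'PAUSED'. The helper name run_until_finished and the FakeDebugger class are hypothetical and not part of TDB; the fake only mimics the documented (status, result) return values so that the example runs without TensorFlow installed.

```python
def run_until_finished(debugger, evals, feed_dict=None, breakpoints=None):
    """Start a debug session and resume it until it reports 'FINISHED'.

    `debugger` must expose debug() and c() as documented above.
    Returns the final result and the number of times execution paused.
    """
    pauses = 0
    status, result = debugger.debug(evals, feed_dict=feed_dict,
                                    breakpoints=breakpoints)
    while status == 'PAUSED':
        pauses += 1  # a breakpoint was hit; inspect state here if desired
        status, result = debugger.c()
    return result, pauses


class FakeDebugger:
    """Hypothetical stand-in for the tdb module (NOT TDB's implementation)."""

    def __init__(self, n_breakpoints, final_values):
        self._remaining = n_breakpoints
        self._final = final_values

    def debug(self, evals, feed_dict=None, breakpoints=None,
              break_immediately=False, session=None):
        return self.c()

    def c(self):
        # Pretend to hit one breakpoint per call until none are left.
        if self._remaining > 0:
            self._remaining -= 1
            return 'PAUSED', None
        return 'FINISHED', list(self._final)


result, pauses = run_until_finished(FakeDebugger(2, [5]), evals=['d'])
print(result, pauses)
```

In real use you would presumably pass the imported tdb module itself as the debugger argument, since it exposes debug() and c() as module-level functions.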
status,result=tdb.s()
Evaluates the next node, then pauses immediately to await user input. Unless we have reached the end of the execution queue, status will remain 'PAUSED'. result is set to the value of the node we just evaluated.
q=tdb.get_exe_queue()
Return value: list of remaining nodes to be evaluated, in order.
val=tdb.get_value(node)
Returns the value of an evaluated node, given either its string name or a tf.Tensor.
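The stepping functions above can be combined into a simple trace loop. The following is a minimal sketch, assuming that debug() called with break_immediately=True returns 'PAUSED' before any node is evaluated and that s() reports a status other than 'PAUSED' once the queue is exhausted; neither detail is spelled out above. The names step_through and FakeStepper are hypothetical, and the fake simply replays a fixed list of (name, value) pairs so that the example runs without TensorFlow.

```python
def step_through(debugger, evals):
    """Single-step a session, logging (nodes remaining, value) per step."""
    status, _ = debugger.debug(evals, break_immediately=True)
    log = []
    while status == 'PAUSED':
        remaining = len(debugger.get_exe_queue())
        status, value = debugger.s()
        log.append((remaining, value))
    return status, log


class FakeStepper:
    """Hypothetical stand-in for the tdb module (NOT TDB's implementation)."""

    def __init__(self, nodes):
        self._queue = list(nodes)   # (name, value) pairs still to evaluate
        self._values = {}           # values of nodes evaluated so far

    def debug(self, evals, feed_dict=None, breakpoints=None,
              break_immediately=False, session=None):
        return 'PAUSED', None

    def s(self):
        name, value = self._queue.pop(0)
        self._values[name] = value
        status = 'PAUSED' if self._queue else 'FINISHED'
        return status, value

    def get_exe_queue(self):
        return [name for name, _ in self._queue]

    def get_value(self, node):
        return self._values[node]


stepper = FakeStepper([('a', 2), ('b', 3), ('c', 5), ('d', -5)])
status, log = step_through(stepper, evals=['d'])
print(status, log)
print(stepper.get_value('c'))
```

Logging the queue length before each step is one cheap way to see how far execution has progressed; with the real library, get_value could then be used to look up any node that has already been evaluated.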
TDB supports two types of custom Ops: PythonOps and PlotOps.
Here is an example of mixing tdb.PythonOps with TensorFlow.
Define the following function:
import tensorflow as tf
import tdb

def myadd(ctx,a,b):
    # ctx is the PythonOp instance this function belongs to
    return a+b

a=tf.constant(2)
b=tf.constant(3)
c=tdb.python_op(myadd,inputs=[a,b],outputs=[tf.placeholder(tf.int32)]) # a+b
d=tf.neg(c) # tf.neg was renamed tf.negative in TensorFlow 1.0
status,result=tdb.debug([d], feed_dict=None, breakpoints=None, break_immediately=False)
When myadd gets evaluated, ctx is the instance of the PythonOp that it belongs to. You can use ctx to store state information (e.g. to accumulate loss history).
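To make the role of ctx concrete without requiring TensorFlow, here is a small sketch of a function written in the same style as myadd that keeps state between calls. The function running_mean is hypothetical, and a types.SimpleNamespace stands in for the PythonOp instance that TDB would pass as ctx; this is an assumption made purely for illustration.

```python
from types import SimpleNamespace


def running_mean(ctx, value):
    """Return the mean of every value seen so far, storing history on ctx."""
    if not hasattr(ctx, 'history'):
        ctx.history = []          # first call: create the state
    ctx.history.append(value)
    return sum(ctx.history) / len(ctx.history)


# Stand-in for the PythonOp instance; the same object is reused on each call,
# which is what lets the history accumulate.
ctx = SimpleNamespace()
means = [running_mean(ctx, v) for v in (4.0, 2.0, 3.0)]
print(means)
```

Because the state lives on ctx rather than in a global variable, two ops built from the same function would each keep their own history, provided each receives its own ctx.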
PlotOps are a special instance of PythonOp that send graphical output to the frontend.
This only works with Matplotlib at the moment, but other plotting backends (Seaborn, Bokeh, Plotly) are coming soon.
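import matplotlib.pyplot as plt

def watch_loss(ctx,loss):
    # ctx persists between calls, so the history grows with each evaluation
    if not hasattr(ctx, 'loss_history'):
        ctx.loss_history=[]
    ctx.loss_history.append(loss)
    plt.plot(ctx.loss_history)
    plt.ylabel('loss')

# loss is the loss tensor of your model
ploss=tdb.plot_op(watch_loss,inputs=[loss])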
Refer to the MNIST Visualization Demo for more examples. You can also find more examples in the tests/ directory.
TDB is not part of TensorFlow, but it is built on top of it.
TDB is especially useful at the model prototyping stage and for verifying correctness in an intuitive manner. It is also useful for high-level visualization of hidden layers during training.
TensorBoard is a suite of visualization tools included with TensorFlow. Both TDB and TensorBoard attach auxiliary nodes to the TensorFlow graph in order to inspect data.
TensorBoard cannot be used concurrently with running a TensorFlow graph; log files must be written first. TDB interfaces directly with the execution of a TensorFlow graph, and allows for stepping through execution one node at a time.
Out of the box, TensorBoard currently only supports logging for a few predefined data formats.
TDB is to TensorBoard as GDB is to printf. Both are useful in different contexts.
Apache 2.0