Brief Description of MaTEx TensorFlow (Old)
Charles Siegel edited this page Mar 28, 2017 · 2 revisions
MaTEx-TensorFlow is a set of scripts that provide a distributed-memory implementation of TensorFlow using MPI.
The script is available in src/deeplearning/tensorflow_old/mpi-tensorflow.py
MaTEx TensorFlow supports data parallelism: the input dataset is split across compute nodes, and the model/gradients are averaged at each step using an MPI all-to-all reduction. For more details, see the arXiv paper "Distributed TensorFlow with MPI" (https://arxiv.org/abs/1603.02339).
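The data-parallel scheme described above can be illustrated with a small single-process sketch. This is not the code in mpi-tensorflow.py: it uses no MPI or TensorFlow, simulates the compute nodes as list entries, and every function name here (split_dataset, local_gradient, allreduce_average, train) is hypothetical. It assumes a toy one-parameter model y = weight * x and equal-sized shards, and only shows the pattern of splitting the input, computing local gradients, and averaging them across all ranks at each step.

```python
def split_dataset(samples, num_ranks):
    """Give each simulated rank a contiguous shard of the input samples."""
    shard = (len(samples) + num_ranks - 1) // num_ranks
    return [samples[r * shard:(r + 1) * shard] for r in range(num_ranks)]


def local_gradient(weight, shard):
    """Gradient of the mean squared error of y = weight * x on one shard."""
    return sum(2.0 * (weight * x - y) * x for x, y in shard) / len(shard)


def allreduce_average(values):
    """Stand-in for an MPI all-to-all reduction: every rank receives the mean."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def train(samples, num_ranks=4, steps=50, lr=0.01):
    """Run synchronous data-parallel gradient descent over simulated ranks."""
    shards = split_dataset(samples, num_ranks)
    weights = [0.0] * num_ranks  # each rank holds its own copy of the model
    for _ in range(steps):
        # Each rank computes a gradient on its own shard only.
        grads = [local_gradient(w, s) for w, s in zip(weights, shards)]
        # All ranks exchange gradients and end up with the same average.
        averaged = allreduce_average(grads)
        # Identical updates keep the model copies in sync across ranks.
        weights = [w - lr * g for w, g in zip(weights, averaged)]
    return weights


if __name__ == "__main__":
    data = [(float(x), 3.0 * x) for x in range(1, 9)]  # true weight is 3.0
    print(train(data))
```

In a real MPI run, each rank would hold only its own shard and model copy, and the averaging step would be a collective call rather than a local list operation.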
Getting Started on MaTEx-TensorFlow
- Required Software
- Installing MaTEx-TensorFlow on CPU Clusters
- Installing MaTEx-TensorFlow on GPU Clusters
- MaTEx-TensorFlow on Older glibc (v < 2.19)
- DataSet Reader
- Testing Scripts
- Performance
- Running on PNNL Systems
- Running on NERSC Systems
- Restarting the MaTEx TensorFlow environment