T-Phenotype: Discovering Phenotypes of Predictive Temporal Patterns in Disease Progression (AISTATS2023)
Source code for the T-Phenotype approach proposed in the paper "T-Phenotype: Discovering Phenotypes of Predictive Temporal Patterns in Disease Progression".
The simplest way to install is through pip:
pip install git+https://github.com/yvchao/tphenotype
# Alternatively:
# pip install git+https://github.com/vanderschaarlab/tphenotype
To run the experiments, directly clone this repository via the following command.
git clone git@github.com:yvchao/tphenotype.git
# Alternatively:
# git clone git@github.com:vanderschaarlab/tphenotype.git
# Navigate into the repo:
cd tphenotype
# pip-install in editable mode:
pip install -e .
The following pip extras are available:
- benchmarks (pip install -e .[benchmarks]): adds the dependencies needed to run the benchmarks. Install this extra to replicate the benchmark results. The external benchmarks require TensorFlow 1.x, which is installed only if your Python version is <= 3.8 (TensorFlow 1.x is not compatible with newer Python versions). Otherwise, these benchmarks cannot be run.
- dev (pip install -e .[dev]): adds the benchmarks extra plus additional development-related dependencies.
For full details, see the [options.extras_require] section in setup.cfg.
In order to use CUDA, make sure your virtual environment (or conda environment) has the appropriate CUDA binaries. See PyTorch Get Started for details.
The rest of this section is only relevant to the benchmarks or dev installation extras with CUDA.
The benchmarks (and dev) extras will install tensorflow==1.15.5 if your Python version is <= 3.8, as this is needed by the external benchmarks. It is tricky to make TF1 work with CUDA, and you may find it easier to just use the CPU for these benchmarks.
To make TF1 CUDA compatible, you will need to check version compatibility, e.g. here. Since TF1 is an old library, it is not officially supported on most modern CUDA devices. However, NVIDIA maintains a compatible version of TF1. The benchmarks and dev extras will install it if your environment has Python 3.8, as this is the only Python version for which binaries are available. Specifically, nvidia-pyindex and nvidia-tensorflow[horovod] will be installed.
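The Python-version rules above can be checked before installing with a short sketch like the one below. The helper name tf1_install_plan is hypothetical and not part of tphenotype; it only restates the two conditions given in this README and does not inspect setup.cfg.

```python
import sys


def tf1_install_plan(version_info=None):
    """Restate this README's Python-version rules for the TF1-based extras.

    Hypothetical helper for illustration; not part of tphenotype.
    """
    major, minor = (version_info or sys.version_info)[:2]
    return {
        # tensorflow==1.15.5 is installed only for Python <= 3.8.
        "tensorflow_1x": (major, minor) <= (3, 8),
        # NVIDIA's TF1 build is installed only for Python 3.8.
        "nvidia_tensorflow": (major, minor) == (3, 8),
    }


if __name__ == "__main__":
    print(tf1_install_plan())
```

If tensorflow_1x is False for your interpreter, the external benchmarks cannot be run in that environment.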
To check CUDA has worked for TF1, run the following in Python:
import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
If no errors appear and a GPU device (if available) is listed at the end of the output, the setup succeeded.
Certain issues may arise, for instance a version mismatch between the CUDA binaries used by PyTorch and those used by tensorflow. You will need to check the "NVIDIA CUDA Runtime" compatibility here and install the version of nvidia-tensorflow that matches the CUDA binaries used by torch. For example, if your installation of torch uses CUDA 11.8, you will need to install nvidia-tensorflow v22.12, like so:
pip install "nvidia-tensorflow[horovod]==1.15.5+nv22.12"
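To find out which CUDA version your torch installation uses before picking an nvidia-tensorflow release, a minimal sketch such as the following may help. The function name torch_cuda_version is an illustrative assumption; torch.version.cuda is the PyTorch attribute that reports the CUDA version the installed build was compiled against.

```python
def torch_cuda_version():
    """Return the CUDA version PyTorch was built with, or None."""
    try:
        import torch
    except ImportError:
        return None  # PyTorch is not installed in this environment
    # A string such as "11.8", or None for builds without CUDA support.
    return torch.version.cuda


if __name__ == "__main__":
    print("CUDA version used by torch:", torch_cuda_version())
```

The reported version still has to be matched by hand against the "NVIDIA CUDA Runtime" compatibility information mentioned above.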
Three datasets are used in the experiments.
- Synthetic data: provided in this repo as data/synthetic/data_mixed.npz; can be generated by running data/synthetic/data_generation.ipynb.
- PhysioNet ICU data: publicly available at PhysioNet.
- ADNI data: can be downloaded from loni.
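
The names of the arrays stored in the synthetic dataset are not documented in this README, so the sketch below lists them rather than assuming them. It relies only on the fact that an .npz file is a zip archive of .npy members; the helper name list_npz_arrays is hypothetical. With NumPy installed, numpy.load(path).files returns the same names.

```python
import zipfile


def list_npz_arrays(path):
    """Return the names of the arrays stored in an .npz archive."""
    with zipfile.ZipFile(path) as archive:
        # Each array is stored as a member named "<array name>.npy".
        return sorted(
            name[: -len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        )
```

For example, calling list_npz_arrays("data/synthetic/data_mixed.npz") from the repository root should show what the synthetic dataset contains.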
There are three major parts of the experiment.
- notebooks/benchmark/: run bash run_experiment.sh from within notebooks/benchmark/ and then summary.ipynb to generate benchmark results on the three datasets.
- notebooks/case_study/: run experiment_adni.ipynb to generate the major results in the main manuscript.
- notebooks/appendix/: run the four notebooks to generate the remaining results included in the appendix.
- The exact hyperparameters used in the paper experiments can be found here. Place these under notebooks/benchmark/hyperparam_selection/ before running run_experiment.sh to use them in the benchmark experiments.
- Some experiments are sensitive to the specific hardware and sampling order (in particular, the "ICU" experiments); while the exact results may differ somewhat when running in your local environment, the argument in the paper is unaffected.
If you find the software useful, please consider citing the following paper:
@inproceedings{tphenotype2023,
title={T-Phenotype: Discovering Phenotypes of Predictive Temporal Patterns in Disease Progression},
author={Qin, Yuchao and van der Schaar, Mihaela and Lee, Changhee},
booktitle={Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS) 2023},
year={2023}
}