Verifying the Union of Manifolds Hypothesis for Image Data

Code accompanying the ICLR 2023 paper:

Verifying the Union of Manifolds Hypothesis for Image Data
Bradley Brown, Anthony L. Caterini, Brendan Leigh Ross, Jesse C. Cresswell, Gabriel Loaiza-Ganem
[arXiv, Video]

Requirements

A conda environment with all dependencies to run the repo can be created with:

conda env create -f env.yaml
conda activate uomh

Using the Repo

This repo contains all the code required to reproduce the generative modelling results from the paper. Specifically, it includes code to train a variety of deep generative models (e.g. VAEs, WAEs) using disconnected (as introduced in our paper), conditional, and vanilla training. Additionally, two-step models (as introduced in the Two Step Paper) can be trained by combining a generalized autoencoder (e.g. VAE, WAE) with a density estimator (e.g. NF, VAE).

Training Models

The following scripts are used to train models:

single_main.py: Used to train a non-disconnected, non-two-step model.
main.py: Used to train a non-disconnected, two-step model.
single_cluster_main.py: Used to train a disconnected, non-two-step model.
cluster_main.py: Used to train a disconnected, two-step model.
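
The four scripts cover every combination of two choices: disconnected or not, and two-step or not. A minimal Python sketch of that mapping follows. The helper `select_script` and the `SCRIPTS` table are hypothetical and not part of the repo; only the script names come from the list above.

```python
# Hypothetical lookup table: (disconnected, two_step) -> training script.
# Only the script names are taken from this README.
SCRIPTS = {
    (False, False): "single_main.py",
    (False, True): "main.py",
    (True, False): "single_cluster_main.py",
    (True, True): "cluster_main.py",
}


def select_script(disconnected: bool, two_step: bool) -> str:
    """Return the script that trains the requested kind of model."""
    return SCRIPTS[(disconnected, two_step)]


print(select_script(disconnected=True, two_step=False))  # single_cluster_main.py
```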

Command line arguments can be found in the respective files. Notable arguments include:

dataset : str: which dataset to train on, options are mnist, fashion-mnist, svhn, cifar10, cifar100.
run-name : str: name of folder to store the run.
model : str: type of model for non-two-step runs, options are vae, avb, ae, wae, bigan, gan, flow, ebm, and arm.
gae-model : str: type of generalized autoencoder for two-step runs, options are vae, avb, ae, wae.
de-model : str: type of density estimator for two-step runs, options are vae, avb, flow, ebm.
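
To illustrate how these arguments fit together, here is a small argparse sketch. It is not the repo's actual parser: `build_parser` is a hypothetical helper, and the real scripts may define defaults, required flags, or further arguments differently. Only the flag names and the listed options are taken from this README.

```python
import argparse

# Option lists copied from the argument descriptions above.
DATASETS = ["mnist", "fashion-mnist", "svhn", "cifar10", "cifar100"]
MODELS = ["vae", "avb", "ae", "wae", "bigan", "gan", "flow", "ebm", "arm"]
GAE_MODELS = ["vae", "avb", "ae", "wae"]
DE_MODELS = ["vae", "avb", "flow", "ebm"]


def build_parser(two_step: bool) -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags described in this README."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset", type=str, choices=DATASETS)
    parser.add_argument("--run-name", type=str)
    if two_step:
        # Two-step runs pick a generalized autoencoder and a density estimator.
        parser.add_argument("--gae-model", type=str, choices=GAE_MODELS)
        parser.add_argument("--de-model", type=str, choices=DE_MODELS)
    else:
        # Non-two-step runs pick a single model.
        parser.add_argument("--model", type=str, choices=MODELS)
    return parser


args = build_parser(two_step=True).parse_args(
    ["--dataset", "mnist", "--gae-model", "vae", "--de-model", "flow"]
)
print(args.dataset, args.gae_model, args.de_model)  # mnist vae flow
```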

The config folder contains many more configuration options, organized as follows:

├── config
│   ├── cluster_config.py # config for clustering/id estimation
│   ├── density_estimator.py # config for density estimator
│   ├── generalized_autoencoder.py # config for gae
│   ├── shared_config.py # config for general training

To change the default configuration options in these files, append --CONFIG_TYPE CONFIG_NAME=CONFIG_VALUE to the training command, where CONFIG_TYPE is chosen as follows:

If CONFIG_NAME is defined in cluster_config.py in the config directory, then CONFIG_TYPE is cluster-config.
Else, if not training a two-step model, CONFIG_TYPE is config.
Otherwise, CONFIG_TYPE is gae-config, de-config, or shared-config if CONFIG_NAME is defined in generalized_autoencoder.py, density_estimator.py, or shared_config.py, respectively.
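
The rules for choosing CONFIG_TYPE can be sketched as a small Python function. This is an illustration of the decision order only, not code from the repo. The function `resolve_config_type` and the option sets are hypothetical; `clustered_id_samples_save` and `use_lr_scheduler` are taken from the example commands in this README, and the other option names are placeholders.

```python
# Placeholder option names. Only the first two appear in this README's example
# commands; the remaining names are hypothetical stand-ins.
CLUSTER_OPTIONS = {"clustered_id_samples_save"}
GAE_OPTIONS = {"use_lr_scheduler"}
DE_OPTIONS = {"hypothetical_de_option"}
SHARED_OPTIONS = {"hypothetical_shared_option"}


def resolve_config_type(config_name: str, two_step: bool) -> str:
    """Return the CONFIG_TYPE flag name (without the leading dashes)."""
    # Rule 1: options defined in cluster_config.py always use cluster-config.
    if config_name in CLUSTER_OPTIONS:
        return "cluster-config"
    # Rule 2: non-two-step runs use the generic config flag.
    if not two_step:
        return "config"
    # Rule 3: two-step runs pick the flag by the file defining the option.
    if config_name in GAE_OPTIONS:
        return "gae-config"
    if config_name in DE_OPTIONS:
        return "de-config"
    if config_name in SHARED_OPTIONS:
        return "shared-config"
    raise KeyError(f"unknown option: {config_name}")


print(resolve_config_type("use_lr_scheduler", two_step=True))   # gae-config
print(resolve_config_type("use_lr_scheduler", two_step=False))  # config
```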

Example commands:

./single_main.py --dataset mnist --model flow --run-name runs/flow_mnist_run
./single_cluster_main.py --dataset svhn --model vae --run-name runs/flow_svhn_cluster_run/1 --config scale_data=True
./cluster_main.py --dataset cifar10 --model wae --cluster-config clustered_id_samples_save=./save_here --gae-config use_lr_scheduler=True
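
When launching many runs, it can help to assemble such commands programmatically. The sketch below rebuilds the second example command from its parts. `build_command` is a hypothetical helper, not part of the repo, and it assumes only the flag layout shown in the examples above.

```python
import shlex


def build_command(script, dataset, model, run_name=None, overrides=()):
    """Assemble an argument list in the style of the example commands.

    `overrides` is a sequence of (config_type, name, value) tuples, each
    rendered as `--CONFIG_TYPE CONFIG_NAME=CONFIG_VALUE`.
    """
    cmd = [f"./{script}", "--dataset", dataset, "--model", model]
    if run_name is not None:
        cmd += ["--run-name", run_name]
    for config_type, name, value in overrides:
        cmd += [f"--{config_type}", f"{name}={value}"]
    return cmd


cmd = build_command(
    "single_cluster_main.py",
    dataset="svhn",
    model="vae",
    run_name="runs/flow_svhn_cluster_run/1",
    overrides=[("config", "scale_data", True)],
)
# shlex.join renders the list as a shell-safe string for logging or copy-paste.
print(shlex.join(cmd))
```

The resulting list can be passed directly to `subprocess.run(cmd)`, which avoids shell quoting issues.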

Repo Organization

The following tree shows the important files and folders in the repo:

├── two_step_zoo
│   ├── clusterer # code for clustering datasets for disconnected models
│   ├── datasets  # dataset code
│   ├── density_estimator # code for ebms, autoregressive models and flows
│   ├── evaluators # code for evaluation metrics
│   ├── generalized_autoencoder # code for vae, wae, gans and bigans
│   ├── id_estimator # code for estimating intrinsic dimension on datasets / dataset clusters
│   ├── trainers # training code (including two-step and clustered models)
│   ├── neural_networks.py # neural network modules
│   ├── two_step.py # wrapper for models
│   ├── writer.py # handles logging / checkpointing
├── {main/single_main/single_cluster_main/cluster_main}.py
├── env.yaml

Acknowledgements

This repo is an extension of this codebase.
Intrinsic dimension estimation results were obtained using this codebase.
Our classifier experiments are based on this codebase.

BibTeX

@inproceedings{brown2022verifying,
  title={Verifying the Union of Manifolds Hypothesis for Image Data},
  author={Brown, Bradley CA and Caterini, Anthony L and Ross, Brendan Leigh and Cresswell, Jesse C and Loaiza-Ganem, Gabriel},
  booktitle={ICLR},
  year={2023}
}
