torchtree

torchtree is a program designed for developing and inferring phylogenetic models. Implemented in Python, it leverages PyTorch for automatic differentiation. The suite of inference algorithms encompasses variational inference, Hamiltonian Monte Carlo, maximum a posteriori, and Markov chain Monte Carlo.

For a comprehensive assessment of torchtree's performance and use cases, please see our evaluation repository, torchtree-experiments, where torchtree was rigorously tested on various datasets and benchmarked for accuracy and speed.

Getting Started
- Dependencies
- Installation
Quick start
Documentation
Plug-ins

Getting Started

Dependencies

Installation

Use an Anaconda environment (Optional)

conda env create -f environment.yml
conda activate torchtree

To install the latest stable version, run:

pip install torchtree

To build torchtree from source, run:

git clone https://github.com/4ment/torchtree
pip install torchtree/

Check install

torchtree --help
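The same check can be done from Python. The sketch below is only an illustration: it assumes the installed distribution is named torchtree (as in the pip command above) and that the two executables mentioned in this README, torchtree and torchtree-cli, are expected on the PATH. The function name check_install is made up for this example.

```python
import shutil
from importlib import metadata


def check_install(package="torchtree", commands=("torchtree", "torchtree-cli")):
    """Report the installed package version and the location of each command."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # the package is not installed in this environment
    # shutil.which returns the executable's path, or None if it is not on PATH
    paths = {command: shutil.which(command) for command in commands}
    return {"version": version, "commands": paths}


if __name__ == "__main__":
    print(check_install())
```

A version of None, or a command mapped to None, suggests that the active environment is not the one torchtree was installed into.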

Documentation

For detailed information on how to use torchtree and its features, please refer to the official documentation and API reference.

Quick start

torchtree requires a JSON configuration file describing the models and algorithms. This file can be generated with torchtree-cli, a command-line tool. Splitting the work into two steps lets the user adjust values in the configuration file, such as hyperparameters, before running the inference.
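Because the configuration is plain JSON, values can also be adjusted programmatically between the two steps. The sketch below is generic and makes no claim about the actual layout of a torchtree configuration: the fragment it edits and the key name iterations are hypothetical, and find_key and set_at are helper names invented for this example.

```python
import json


def find_key(node, key, path=()):
    """Yield (path, value) for every occurrence of `key` in nested JSON data."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield path + (name,), value
            yield from find_key(value, key, path + (name,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_key(value, key, path + (index,))


def set_at(node, path, value):
    """Overwrite the entry reached by following `path` from the root."""
    for step in path[:-1]:
        node = node[step]
    node[path[-1]] = value


# Hypothetical fragment; a real file would come from torchtree-cli.
config = json.loads('{"steps": [{"name": "a", "iterations": 1000}]}')
# Collect the matches first so the structure is not edited while being walked.
for path, value in list(find_key(config, "iterations")):
    set_at(config, path, value * 2)
print(json.dumps(config, indent=2))
```

For a real file, json.load and json.dump on the generated configuration would replace the inline string.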

torchtree-cli implements several subcommands, each corresponding to a different type of inference algorithm. A list of available subcommands can be obtained by running torchtree-cli --help.

The following subcommands are available:

advi: Automatic differentiation variational inference
hmc: Hamiltonian Monte Carlo
map: Maximum a posteriori
mcmc: Markov chain Monte Carlo

Each subcommand (algorithm) takes a different set of arguments, which can be listed by running torchtree-cli <subcommand> --help.

torchtree-cli requires an alignment file in FASTA format and a tree file in either Newick or NEXUS format. While torchtree uses the DendroPy library to parse and manipulate phylogenetic trees, it is recommended to use a Newick file due to the numerous variations of the NEXUS format.

Let's explore a few examples of how to use these programs using an influenza A virus dataset containing 69 DNA sequences. The alignment and tree files are located in the data directory.
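Before generating a configuration file, it can be worth confirming that the FASTA file really is an alignment, meaning that every sequence has the same length. The following is a minimal standard-library sketch and is not part of torchtree; read_fasta and check_alignment are names chosen for this example.

```python
def read_fasta(lines):
    """Return a dict mapping each FASTA header to its concatenated sequence."""
    sequences = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            if name in sequences:
                raise ValueError(f"duplicate sequence name: {name}")
            sequences[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            sequences[name].append(line)  # sequences may span several lines
    return {key: "".join(parts) for key, parts in sequences.items()}


def check_alignment(sequences):
    """Return (number of sequences, alignment length); raise if lengths differ."""
    if not sequences:
        raise ValueError("no sequences found")
    lengths = {len(sequence) for sequence in sequences.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences have different lengths: {sorted(lengths)}")
    return len(sequences), lengths.pop()
```

Passing an open file handle works because read_fasta only iterates over lines, for example `with open("data/fluA.fa") as handle: print(check_alignment(read_fasta(handle)))`. For the dataset described above, the first number should be 69.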

1 - Generating a configuration file

Some examples of models using variational inference:

Unrooted tree with GTR+W4 model

W4 refers to a site model with 4 rate categories derived from a discretized Weibull distribution. This is similar to the more commonly used discretized Gamma distribution site model.

torchtree-cli advi -i data/fluA.fa -t data/fluA.tree -m GTR -C 4 > fluA.json
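To give a sense of what a discretized Weibull site model looks like, the sketch below computes K rate categories from a Weibull distribution. It assumes one common discretization scheme: each category is represented by the median of an equal-probability bin, and the rates are rescaled so that their mean is 1. This is an illustration under those assumptions and not necessarily the exact procedure torchtree implements; weibull_rates is a name invented here.

```python
import math


def weibull_rates(shape, categories=4, scale=1.0):
    """Median rate of each equal-probability category, rescaled to mean 1."""
    rates = []
    for i in range(categories):
        # Midpoint of the i-th probability bin, e.g. 1/8, 3/8, 5/8, 7/8 for K=4
        p = (2 * i + 1) / (2 * categories)
        # Weibull quantile function: scale * (-ln(1 - p)) ** (1 / shape)
        rates.append(scale * (-math.log(1.0 - p)) ** (1.0 / shape))
    mean = sum(rates) / categories
    return [rate / mean for rate in rates]


if __name__ == "__main__":
    # A smaller shape gives more rate variation among sites.
    for shape in (0.5, 1.0, 5.0):
        print(shape, [round(rate, 3) for rate in weibull_rates(shape)])
```

One practical difference from the Gamma distribution is that the Weibull quantile function has a closed form, so the category rates are a simple differentiable function of the shape parameter.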

Time tree with strict clock and constant coalescent model

torchtree-cli advi -i data/fluA.fa -t data/fluA.tree -m JC69 --clock strict --coalescent constant > fluA.json

2 - Running torchtree

Running torchtree on the configuration file generates sample.csv and sample.trees files containing parameter and tree samples drawn from the variational distribution:

torchtree fluA.json
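The two steps can also be driven from a Python script, which is convenient when several models are run in a loop. This sketch only uses the commands and flags shown in the examples above (advi, -i, -t, -m, -C); the function names advi_command and run_pipeline are made up for this example, and both executables are assumed to be on the PATH.

```python
import subprocess


def advi_command(alignment, tree, model, categories=None):
    """Build a torchtree-cli advi command from the flags used in this README."""
    command = ["torchtree-cli", "advi", "-i", alignment, "-t", tree, "-m", model]
    if categories is not None:
        command += ["-C", str(categories)]
    return command


def run_pipeline(alignment, tree, model, config_path, categories=None):
    """Write the configuration to config_path, then run torchtree on it."""
    command = advi_command(alignment, tree, model, categories)
    with open(config_path, "w") as handle:
        # Equivalent to the shell redirection `> fluA.json`
        subprocess.run(command, stdout=handle, check=True)
    # check=True raises CalledProcessError if either program exits with an error
    subprocess.run(["torchtree", config_path], check=True)
```

Calling `run_pipeline("data/fluA.fa", "data/fluA.tree", "GTR", "fluA.json", categories=4)` from the repository root would mirror the GTR+W4 example followed by torchtree fluA.json.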

torchtree plug-in

torchtree can be easily extended without modifying the code base thanks to its modular implementation. Some examples of plug-ins:

A GitHub template is available to assist in the development of a plug-in, and it is highly recommended to use it. This template provides a structured starting point, ensuring consistency and compatibility with torchtree while streamlining the development process.

How to cite

If you use torchtree, please consider citing:


@misc{fourment2024torchtree,
      title={torchtree: flexible phylogenetic model development and inference using {PyTorch}}, 
      author={Mathieu Fourment and Matthew Macaulay and Christiaan J Swanepoel and Xiang Ji and Marc A Suchard and Frederick A Matsen IV},
      year={2024},
      eprint={2406.18044},
      archivePrefix={arXiv},
      primaryClass={q-bio.PE},
      url={https://arxiv.org/abs/2406.18044}
}

License

Distributed under the GPLv3 License. See LICENSE for more information.

Acknowledgements

torchtree makes use of the following libraries and tools, which are under their own respective licenses:
