Hyperbolic embeddings for approximating phylogenetic posteriors.
This code is under active research & development and not intended for production use for phylogenetic inference.
First install the Python version of hydra+ from https://github.com/mattapow/hydraPlus.
Use python version 3.7 in order to use the bito package (for more models and to make use of BEAGLE).
conda create --name dodonaphy python=3.9
conda activate dodonaphy
Pip needs numpy and cython for the setup:
pip install cython
pip install numpy
Then install the Dodonaphy package locally using pip:
pip install -e .
(Optional, requires pytest) Once the package is installed, run the tests with pytest:
pytest -o "testpaths=test"
The basic idea is to embed genetic data as points in hyperbolic space and connect them to form a tree, then perform Bayesian inference (MCMC or variational inference) in the embedding space. Gradient ascent is performed using PyTorch.
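To make the embedding idea concrete, here is a minimal sketch of how pairwise hyperbolic distances between embedded tips could be computed. It assumes the Poincaré ball model with curvature -1, which may differ from the model Dodonaphy uses internally. The function names and the tip coordinates are hypothetical and are not part of the Dodonaphy API. A distance matrix like this is the kind of input a tree connection method such as neighbour joining works from.

```python
import math


def poincare_distance(x, y):
    """Hyperbolic distance between two points inside the open unit ball.

    Assumes the Poincare ball model with curvature -1.
    """
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    sq_norm_x = sum(a * a for a in x)
    sq_norm_y = sum(b * b for b in y)
    denom = (1.0 - sq_norm_x) * (1.0 - sq_norm_y)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


def pairwise_distances(points):
    """Symmetric matrix of hyperbolic distances between all pairs of points."""
    n = len(points)
    dists = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = poincare_distance(points[i], points[j])
            dists[i][j] = d
            dists[j][i] = d
    return dists


# Three hypothetical taxa embedded in three dimensions.
tips = [[0.5, 0.0, 0.0], [-0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]
dist_matrix = pairwise_distances(tips)
```

Points near the boundary of the ball are far apart in hyperbolic distance even when they look close in Euclidean terms, which is what lets a low-dimensional embedding represent tree-like distances.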
Run Dodonaphy using
dodo [OPTIONS]
Example data is provided in the example directory. Perform MCMC with a Gamma-Dirichlet prior on the example data as follows.
dodo --infer mcmc --path_root example --prior gammadir --epochs 10000
This starts from the neighbour joining tree. To start from a better tree, use:
dodo --infer mcmc --path_root example --prior gammadir --epochs 10000 --start start.nex --suffix good_start
This creates a directory contained in mcmc/up_nj/d3_k0, signifying the embedding method "up", the tree connection method "nj" (neighbour joining), three dimensions, and a log negative curvature of 0, i.e. a curvature of -1.
The output directory contains:
samples.t, containing sampled trees from the posterior
locations.csv, containing the tip embedding locations
mcmc.log, with information about the MCMC run
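The relationship between the directory name and the curvature can be sketched as follows. This assumes that the "log negative curvature" k is a natural logarithm, so the curvature is -exp(k), and that names follow the pattern seen in d3_k0. The helper parse_embedding_dir is hypothetical and is not part of Dodonaphy.

```python
import math
import re


def parse_embedding_dir(name):
    """Decode a directory name such as 'd3_k0' into (dimension, curvature).

    Assumes the pattern d<dimension>_k<log negative curvature>.
    """
    match = re.fullmatch(r"d(\d+)_k(-?\d+(?:\.\d+)?)", name)
    if match is None:
        raise ValueError(f"unrecognised directory name: {name!r}")
    dim = int(match.group(1))
    log_neg_curvature = float(match.group(2))
    # Assumption: natural log, so curvature = -exp(k).
    curvature = -math.exp(log_neg_curvature)
    return dim, curvature


dim, curvature = parse_embedding_dir("d3_k0")
```

For d3_k0 this gives three dimensions and a curvature of -1, matching the description above.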
At the command line, type
dodo --help
to see the options.