7. nodetraits
To run the analysis on your dataset and compute posterior probabilities, the executables `nodetraits` and `readnodetraits` from BayesCode require three files:
- A phylogenetic tree in Newick format, with branch lengths in number of substitutions per site (neutral markers).
- A file containing the mean trait values for each species.
- A file containing the variation within species for each trait and the genetic variation within species (neutral markers).
The phylogenetic tree must be in Newick format, with branch lengths in number of substitutions per site (neutral markers).
The file containing mean trait values for each species must be a tab-delimited file with the following format:
TaxonName | Body_mass | Brain_mass |
---|---|---|
Panthera_tigris | 12.26 | 5.676 |
Pithecia_pithecia | 7.256 | 3.436 |
Colobus_angolensis | 9.176 | 4.284 |
Saimiri_boliviensis | 6.845 | 3.279 |
The columns are:
- TaxonName: the name of the taxon matching the name in the alignment and the tree.
- As many columns as traits, without spaces or special characters in the trait name.
- The values can be `NaN` to indicate that the trait is not available for that taxon.
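The format rules above can be checked before launching a run. The following is a minimal sketch and is not part of BayesCode: the function names `newick_leaves` and `check_traits_table` are hypothetical, the allowed trait-name pattern (letters, digits, underscore) is an assumption about what "no spaces or special characters" means, the branch lengths in the example tree are made up, and the regex-based leaf extraction does not handle quoted Newick labels.

```python
import csv
import io
import re

# Assumed reading of "no spaces or special characters" in trait names.
TRAIT_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def newick_leaves(newick):
    """Return leaf names: labels that directly follow '(' or ','."""
    return set(re.findall(r"[(,]\s*([^():,;\s]+)", newick))


def check_traits_table(handle, tree_leaves):
    """Return a list of problems found in a tab-delimited mean-trait table."""
    problems = []
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if header[0] != "TaxonName":
        problems.append("first column must be TaxonName")
    for name in header[1:]:
        if not TRAIT_NAME.match(name):
            problems.append(f"bad trait name: {name!r}")
    for row in reader:
        taxon, values = row[0], row[1:]
        if taxon not in tree_leaves:
            problems.append(f"{taxon} is not a leaf of the tree")
        for name, value in zip(header[1:], values):
            try:
                float(value)  # float("NaN") parses, so NaN marks a missing value
            except ValueError:
                problems.append(f"{taxon}/{name}: {value!r} is not a number or NaN")
    return problems


# Toy inputs: taxon names come from the table above, branch lengths are invented.
tree = "((Panthera_tigris:0.1,Pithecia_pithecia:0.2):0.05,Saimiri_boliviensis:0.3);"
table = (
    "TaxonName\tBody_mass\tBrain_mass\n"
    "Panthera_tigris\t12.26\t5.676\n"
    "Pithecia_pithecia\t7.256\tNaN\n"
    "Colobus_angolensis\t9.176\t4.284\n"
)
print(check_traits_table(io.StringIO(table), newick_leaves(tree)))
# ['Colobus_angolensis is not a leaf of the tree']
```

For a real dataset, pass an open file handle instead of `io.StringIO` and read the tree string from the tree file.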
The file containing trait variation for each species must be a tab-delimited file with the following format:
TaxonName | Nucleotide_diversity | Body_mass_variance | Body_mass_heritability | Brain_mass_variance | Brain_mass_heritability |
---|---|---|---|---|---|
Pithecia_pithecia | 0.0016 | 0.22871 | 0.2 | 0.00737 | 0.2 |
Colobus_angolensis | 0.0017 | 0.00393 | 0.2 | 0.00416 | 0.2 |
Saimiri_boliviensis | 0.0013 | 0.00022 | 0.2 | 0.00045 | 0.2 |
Pygathrix_nemaeus | 0.0016 | 0.00347 | 0.2 | 0.00097 | 0.2 |
The columns are:
- TaxonName: the name of the taxon matching the name in the alignment and the tree.
- Nucleotide_diversity: the nucleotide diversity within species (neutral markers), cannot be `NaN`.
- As many columns as traits, without spaces or special characters in the trait name.
- TraitName_variance: the phenotypic variance of the trait within species, can be `NaN` to indicate that the trait variance is not available for that taxon.
- TraitName_heritability (optional): the heritability of the trait within species, between 0 and 1, cannot be `NaN`.
- The columns with the suffixes `_variance` and `_heritability` are repeated for each trait.
- TraitName_heritability_lower (optional): the lower bound of the heritability of the trait within species, between 0 and 1, cannot be `NaN`.
- TraitName_heritability_upper (optional): the upper bound of the heritability of the trait within species, between 0 and 1, cannot be `NaN`.
- If the columns with the suffixes `_heritability_lower` and `_heritability_upper` are present, the heritability is randomly drawn from a uniform distribution between the lower and upper bounds.
- If the column with the suffix `_heritability` is present, its value is taken as is.
If the genetic variance (instead of phenotypic variance) is available for a trait, the heritability can be omitted and will automatically be set to 1.0.
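The rules for choosing a heritability value can be summarized in a short sketch. This is an illustration under stated assumptions, not BayesCode's own code: the function `resolve_heritability` is hypothetical, a row of the file is represented as a plain dict of strings, and checking the bounds columns before the fixed-value column is an assumption, since the text does not say which takes precedence when both are present.

```python
import random


def resolve_heritability(row, trait, rng=random):
    """Pick the heritability of one trait from one row (dict: column -> str)."""
    lower_key = f"{trait}_heritability_lower"
    upper_key = f"{trait}_heritability_upper"
    fixed_key = f"{trait}_heritability"
    if lower_key in row and upper_key in row:
        lower, upper = float(row[lower_key]), float(row[upper_key])
        # NaN fails every comparison, so it is rejected here as well.
        if not 0.0 <= lower <= upper <= 1.0:
            raise ValueError(f"{trait}: bounds must satisfy 0 <= lower <= upper <= 1")
        return rng.uniform(lower, upper)
    if fixed_key in row:
        h2 = float(row[fixed_key])
        if not 0.0 <= h2 <= 1.0:
            raise ValueError(f"{trait}: heritability must be between 0 and 1")
        return h2
    # No heritability column: the variance is treated as genetic variance.
    return 1.0


# Row adapted from the example table; the Brain_mass heritability column
# is dropped here on purpose to show the default.
row = {
    "TaxonName": "Pithecia_pithecia",
    "Nucleotide_diversity": "0.0016",
    "Body_mass_variance": "0.22871",
    "Body_mass_heritability": "0.2",
    "Brain_mass_variance": "0.00737",
}
print(resolve_heritability(row, "Body_mass"))   # 0.2
print(resolve_heritability(row, "Brain_mass"))  # 1.0
```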
The file `data/body_size/mammals.male.tsv` contains the mean trait values for each species, the file `data/body_size/mammals.male.var_trait.tsv` contains the variation within species for each trait and the genetic variation within species (neutral markers), and the file `data/body_size/mammals.male.tree` contains the phylogenetic tree.
`nodetraits` is run with the following command:
nodetraits -t data/body_size/mammals.male.tree --traitsfile data/body_size/mammals.male.tsv --until 2000 run_mammals_male
Then the chain `run_mammals_male` is used to compute the posterior distribution of the ratio of between-species variation over within-species variation with `readnodetraits`:
readnodetraits --burnin 1000 --var_within data/body_size/mammals.male.var_trait.tsv --output results_mammals_male.tsv run_mammals_male
The file `data_empirical/chain_name.ratio.tsv` contains the posterior mean of the ratio of between-species variation over within-species variation, the 95% and 99% credible intervals, and the posterior probability that the ratio is greater than 1.
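When several datasets are analysed, the two steps can be scripted. The sketch below only assembles the two command lines from the flags shown in this section (`-t`, `--traitsfile`, `--until`, `--burnin`, `--var_within`, `--output`). The helper `build_commands` and its parameter names are hypothetical, and nothing is executed unless the commented-out `subprocess.run` lines are enabled.

```python
import shlex


def build_commands(tree, traits, var_within, chain, output, until=2000, burnin=1000):
    """Return the nodetraits and readnodetraits argument lists for one dataset."""
    run = ["nodetraits", "-t", tree, "--traitsfile", traits,
           "--until", str(until), chain]
    read = ["readnodetraits", "--burnin", str(burnin),
            "--var_within", var_within, "--output", output, chain]
    return run, read


run, read = build_commands(
    tree="data/body_size/mammals.male.tree",
    traits="data/body_size/mammals.male.tsv",
    var_within="data/body_size/mammals.male.var_trait.tsv",
    chain="run_mammals_male",
    output="results_mammals_male.tsv",
)
print(shlex.join(run))
print(shlex.join(read))

# To actually launch the two steps (requires the BayesCode executables on PATH):
# import subprocess
# subprocess.run(run, check=True)
# subprocess.run(read, check=True)
```

Passing the arguments as lists rather than a single shell string avoids quoting problems if a path contains spaces.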
To obtain the ratio (without the posterior credible intervals and probability) by maximum likelihood, the following Python script can be used:
python3 utils/neutrality_index.py --tree data/body_size/mammals.male.tree --traitsfile data/body_size/mammals.male.tsv --var_within data/body_size/mammals.male.var_trait.tsv --output results_ML_mammals_male.tsv