Leighton Pritchard edited this page Apr 28, 2017 · 12 revisions

This page contains notes on the planned future development of pyani.

Index

Interface

pyani is currently run by calling either the average_nucleotide_identity.py or the genbank_get_genomes_by_taxon.py script with a combination of arguments. The average_nucleotide_identity.py script in particular has arguments that either perform a stage of the overall analysis or prevent a stage from executing. I would like to change this interface to a pyani.py COMMAND OPTIONS structure, similar to git and other tools.

More specifically, I would like to enable operations such as:

  • pyani.py download -t 931 -o my_organism: download all NCBI assemblies under taxon 931 to the directory my_organism
  • pyani.py index my_organism: generate MD5 or other hashes for each genome in the directory my_organism
  • pyani.py anim my_organism -o my_organism_ANIm --scheduler SGE: conduct ANIm analysis on the genomes in the directory my_organism
  • pyani.py anib my_organism -o my_organism_ANIb --scheduler SGE: conduct ANIb analysis on the genomes in the directory my_organism
  • pyani.py render my_organism_ANIm --gmethod seaborn: draw graphical output for the ANIm analysis in the directory my_organism_ANIm
  • pyani.py classify my_organism_ANIm: conduct classification analysis of ANIm results in the directory my_organism_ANIm
  • pyani.py db --setdb my_db: specify the database (sqlite3?) to hold comparison data
  • pyani.py db --update my_organism_ANIm: update the current comparison data database with the results contained in my_organism_ANIm; this might be useful after a partial run or failure.
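
One way the COMMAND OPTIONS structure above could be laid out is with argparse subparsers from the Python standard library. This is only a rough sketch, not an agreed design: the function name build_parser, the attribute names (taxon, outdir, indir and so on), and the choice of which options are required are all assumptions made for illustration. Only the subcommand names and flags themselves are taken from the list above.

```python
import argparse


def build_parser():
    """Build a git-style parser: pyani.py COMMAND OPTIONS.

    Hypothetical sketch: attribute names and required/optional choices
    are assumptions, not part of any existing pyani code.
    """
    parser = argparse.ArgumentParser(prog="pyani.py")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.required = True  # exit with a usage error if no COMMAND is given

    # download: fetch all NCBI assemblies under a taxon into a directory
    p_download = subparsers.add_parser("download")
    p_download.add_argument("-t", dest="taxon", required=True)
    p_download.add_argument("-o", dest="outdir", required=True)

    # index: generate hashes for each genome in a directory
    p_index = subparsers.add_parser("index")
    p_index.add_argument("indir")

    # anim / anib: the two analyses take the same arguments in the list above
    for name in ("anim", "anib"):
        p_ani = subparsers.add_parser(name)
        p_ani.add_argument("indir")
        p_ani.add_argument("-o", dest="outdir", required=True)
        p_ani.add_argument("--scheduler", default=None)

    # render: draw graphical output for an analysis directory
    p_render = subparsers.add_parser("render")
    p_render.add_argument("indir")
    p_render.add_argument("--gmethod", default=None)

    # classify: classification analysis of an analysis directory
    p_classify = subparsers.add_parser("classify")
    p_classify.add_argument("indir")

    # db: choose the database, or update it from a results directory
    p_db = subparsers.add_parser("db")
    p_db.add_argument("--setdb", default=None)
    p_db.add_argument("--update", default=None)

    return parser
```

With this layout, build_parser().parse_args(["anim", "my_organism", "-o", "my_organism_ANIm", "--scheduler", "SGE"]) yields a namespace whose command attribute is "anim", so a top-level script could dispatch on args.command to a function for each stage.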