Metagenomic pipeline to bin viruses and detect their bacterial hosts based on HiC data and short-read assembly.

ABignaud/MetaVir

MetaVir

Metagenomic pipeline for phage bacterial host detection and phage binning, based on metaHiC data and short-read metagenomic assembly.

The goal of this pipeline is to detect the bacterial hosts of phages and to build phage MAGs based on the host detected for each contig and on a classical binning of the contigs using their coverage and sequences.

THIS MODULE IS NOW INCLUDED IN METATOR AND IS NO LONGER MAINTAINED HERE: https://github.com/koszullab/metaTOR

Installation

Requirements

  • Python 3.6 or later is required.
  • The following libraries are required but will be installed automatically by the pip installation: checkv, docopt, networkx, numpy, pandas, pyfastx.
  • The following software and database must be installed separately if you use the pip installation: metabat2 and the checkV database.

Using pip

pip3 install metavir

Or, to use the latest version:

git clone https://github.com/ABignaud/MetaVir.git
pip3 install -e ./MetaVir

Installing metabat2 and the checkV database is necessary to use the binning module. To install metabat2, follow the instructions here; to install and/or update the checkV database, instructions are available here. To use MetaVir, the environment variable CHECKVDB must be set, either manually or through the checkV workflows.
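A quick way to confirm these prerequisites before launching the binning module is a small check script. The sketch below is only an illustration and is not part of MetaVir: the function name `check_binning_prerequisites` is hypothetical, and it assumes the metabat2 executable is called `metabat2` and that CHECKVDB should point to a directory.

```python
import os
import shutil
from pathlib import Path


def check_binning_prerequisites(env=None):
    """Return a list of human-readable problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = []

    # CHECKVDB must be set; we assume it should point to a directory.
    db = env.get("CHECKVDB")
    if not db:
        problems.append("CHECKVDB is not set")
    elif not Path(db).is_dir():
        problems.append(f"CHECKVDB does not point to a directory: {db}")

    # Assumption: the metabat2 executable is named 'metabat2' and is on PATH.
    if shutil.which("metabat2", path=env.get("PATH")) is None:
        problems.append("metabat2 executable not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in check_binning_prerequisites():
        print("WARNING:", problem)
```

Running the script prints one warning per missing prerequisite and nothing when everything is in place.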

Using docker container

git clone https://github.com/ABignaud/MetaVir.git
cd MetaVir
docker build --tag metavir .
docker run metavir {host|binning} [parameters]

A docker image will be available soon.

Usage

metavir {host|binning} [parameters]

There are two main steps in the MetaVir pipeline, which must be run in the following order:

  • host : Detect the bacterial host from a metaHiC network binned by metaTOR, given a list of annotated phages.
  • binning : Build phage MAGs based on metagenomic binning using metabat2 and the host detection from the metaHiC data.

There are a number of other, optional, miscellaneous actions:

  • pipeline : Run both steps sequentially.
  • version : display current version number.
  • help : display help message.
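When driving MetaVir from a script, it can help to validate the action name before launching anything. The following is a minimal sketch, not MetaVir's own API: the helper `metavir_command` is hypothetical, it only knows the action names listed above, and it passes any parameters through untouched without assuming what they are.

```python
STEPS = ("host", "binning")  # main steps, in the order they must be run
OTHER_ACTIONS = ("pipeline", "version", "help")


def metavir_command(action, parameters=()):
    """Build the argument list for one metavir call (hypothetical helper)."""
    if action not in STEPS + OTHER_ACTIONS:
        raise ValueError(f"unknown metavir action: {action!r}")
    # Parameters are forwarded as-is; this sketch does not know them.
    return ["metavir", action, *parameters]


if __name__ == "__main__":
    print(" ".join(metavir_command("version")))
```

The returned list can be handed to `subprocess.run(cmd, check=True)`, so that a failure of the host step stops the script before binning is attempted.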

Output files

host

  • host.tsv: Table with two columns: the phage contig name and the bacterial MAG host.
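As an illustration of how this table might be consumed downstream, the sketch below groups phage contigs by their assigned host. It assumes the file is tab-separated, has no header line, and lists the contig name first and the MAG second; check these assumptions against an actual host.tsv. The function name `phages_by_host` is hypothetical.

```python
import csv
from collections import defaultdict


def phages_by_host(path):
    """Map each bacterial MAG to the list of phage contigs assigned to it."""
    hosts = defaultdict(list)
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            contig, mag = row[0], row[1]
            hosts[mag].append(contig)
    return dict(hosts)


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        for mag, contigs in phages_by_host(sys.argv[1]).items():
            print(mag, len(contigs), sep="\t")
```

If the real file does carry a header line, it would show up as one spurious host entry and should be skipped before grouping.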

binning

  • phages_data.tsv: Clustering output summary from the binning.
  • phages_binned.fa: Fasta file with the sequences of the binned phage MAGs. Each entry represents one phage MAG, and contigs are delimited by 180bp "N" spacers.
  • checkV_contigs: Directory with checkV output of phage contigs.
  • checkV_bins: Directory with checkV output of phage bins.
  • Some plots:
    • barplot_phage_bins_size_distribution.png
    • pie_phage_bins_size_distribution.png
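Because the contigs of each phage MAG in phages_binned.fa are joined by 180bp "N" spacers, the individual contig sequences can be recovered by splitting on that spacer. The sketch below uses only the standard library; the names `read_fasta` and `split_bin` are hypothetical, and it assumes the spacers are written as uppercase "N" and that contigs do not themselves start or end with "N".

```python
import re

SPACER = re.compile("N{180}")  # the 180bp "N" spacer between contigs


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_bin(sequence):
    """Split one phage MAG sequence into its contig sequences."""
    return [part for part in SPACER.split(sequence) if part]


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        for name, seq in read_fasta(sys.argv[1]):
            print(name, len(split_bin(seq)), sep="\t")
```

Run on phages_binned.fa, this prints each phage MAG name with the number of contigs it contains. Runs of "N" shorter than 180 are left inside the contig they belong to.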

References

MetaHiC phage-bacteria infection network reveals active cycling phages of the healthy human gut, Martial Marbouty, Agnès Thierry, Gaël A Millot, Romain Koszul, 2021

Contact

Authors

Research lab

Spatial Regulation of Genomes (Institut Pasteur, Paris)
