
Paleovirology Projects

Robert J. Gifford edited this page Dec 17, 2024 · 23 revisions

Overview

The paleovirology GLUE projects focus on endogenous viral elements (EVEs), remnants of ancient viruses integrated into host genomes. Reconstructing the evolutionary history of EVEs poses significant challenges. Many are highly degraded and divergent, complicating efforts to infer ancestral open reading frames (ORFs) and build sequence alignments. Identifying orthologous EVEs across host genomes can be equally difficult, particularly when EVE integrations occur in regions rich in repetitive DNA.

To address these challenges, the paleovirology GLUE projects aim to preserve and expand upon work characterizing EVE loci, constructing alignments, and inferring evolutionary relationships. These projects are implemented as extensions to virus diversity-focused GLUE projects for their corresponding exogenous viruses. By incorporating EVEs into the alignment hierarchies maintained in these base projects, the evolutionary relationships between orthologous EVEs and their related viral lineages can be explored. Ortholog-level alignments, in particular, facilitate the reconstruction of ancestral virus sequences from the genomic "fossils" left by EVEs. Additionally, all paleovirology GLUE projects adopt a consistent nomenclature that encodes information about EVE taxonomy, host distribution, and orthologous relationships, providing a standardized foundation for comparative analyses across host species and viral families.



Contents

GLUE projects developed in the Gifford Lab that have a paleovirology focus:



ERVdb

Background

A defining feature of retroviruses is their unique replication strategy, which involves the reverse transcription of the viral RNA genome into DNA and its integration into the host cell's nuclear genome as a "provirus." While most retroviral infections occur in somatic cells, occasional infections of germline cells (such as sperm, eggs, or early embryos) result in the viral DNA being passed down through generations as part of the host genome. These inherited sequences are known as endogenous retroviruses (ERVs).

Once incorporated into the germline, ERVs can expand within the host genome through processes such as reinfection of germline cells and retrotransposition. This proliferation often results in multi-copy ERV lineages, with tens to thousands of related sequences scattered across the genome. Although many ERV insertions are lost over time due to genetic drift or purifying selection, some become fixed within populations, leaving a lasting genomic footprint. Today, ERVs make up an estimated 5-10% of vertebrate genomes, offering an unparalleled molecular fossil record of the long-term evolutionary interplay between retroviruses and their hosts.

Beyond their evolutionary significance, ERVs have played a key role in shaping vertebrate genomes and host physiology. For instance, ERVs have been implicated in critical processes such as placentation, antiviral immunity, and the regulation of gene expression. Their dual role as genomic fossils and functional genomic elements makes ERVs a unique subject of study, offering insights into both host-virus co-evolution and the evolutionary innovations driven by viral sequences.

Scope and History

Retroviruses are a long-standing research focus in the Gifford Lab, and the development of GLUE was motivated in large part by the need for a systematic approach to organizing and analyzing ERV data. The design of GLUE was also informed by the overlap between the techniques required for ERV studies and those used in broader viral analyses, such as genomic epidemiology.

By fostering a flexible framework for shared methodologies, ERVdb aims to facilitate the collaborative study of endogenous retroviruses across the many different analysis contexts in which they are relevant.

Features

  • Comprehensive Reference Sequence Set: Incorporates reference sequence data for diverse retroviruses, including exogenous retroviruses and representative endogenous retroviruses (ERVs).

  • RT Alignment: Features a codon-based alignment of reverse transcriptase (RT) sequences, spanning a wide diversity of retroviruses and ERV lineages. This alignment is maintained in a version-controlled manner via GitHub, enabling reproducible updates and edits.

  • Reproducible RT Phylogeny: Implements a reproducible workflow in GLUE for constructing phylogenies of reverse transcriptase (RT) sequences, forming the basis for exploring retrovirus and ERV diversity and evolutionary relationships.
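
As a rough illustration of what "codon-based" means for an alignment like the RT alignment above, the sketch below translates one row of a nucleotide alignment codon by codon. It assumes that gaps are placed in whole triplets so that the reading frame is preserved. The function name and the example sequences are hypothetical, and this is not GLUE's own implementation.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate_codon_alignment(aligned_nt):
    """Translate one row of a codon alignment (hypothetical helper).

    A whole-codon gap ("---") becomes a single "-" in the protein row.
    Anything else that is not a standard codon (partial gaps, ambiguity
    codes) is reported as "X", which flags likely frameshifts in
    degraded EVE sequences.
    """
    if len(aligned_nt) % 3 != 0:
        raise ValueError("codon alignment length must be a multiple of three")
    protein = []
    for i in range(0, len(aligned_nt), 3):
        codon = aligned_nt[i:i + 3].upper()
        if codon == "---":
            protein.append("-")
        else:
            protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)

# Made-up example rows, not real RT sequences.
print(translate_codon_alignment("ATGGAT---TAC"))  # MD-Y
print(translate_codon_alignment("ATGGA----TAC"))  # MX-Y
```

Keeping gaps in frame means every nucleotide column maps onto a defined codon position, so degraded ERV copies can be compared with intact exogenous references at both nucleotide and amino acid level.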

Extension Project Overview

Property            Description
Scope               Retroviruses (family Retroviridae - endogenous and exogenous)
Development Period  2024-present
Lead Developers     Robert J. Gifford
Main Objectives     Comparative genomics, Molecular epidemiology
Data Sources        NCBI Nucleotide, NCBI Genomes via the DIGS Tool
Associated Tools    BLAST+, MAFFT, RAxML
Offline Project     GitHub (Private)
Online Access       None as yet
Status              Under development
Documentation       None yet


Parvovirus-GLUE-EVE

Background

Parvovirus-GLUE-EVE is an extension to Parvovirus-GLUE, developed to facilitate studies of endogenous parvoviral elements (EPVs).

Scope and History

This extension was developed as part of an investigation into EPV diversity. This study was performed in the context of a Gates Foundation-funded initiative to explore the potential of EPV integration sites as genomic safe harbours for AAV gene therapy.

Features

  • EPV consensus: Incorporates consensus sequences for EPV loci that occur as orthologs in multiple species.

  • Comprehensive EPV Reference Sequence Set: Includes a comprehensive set of EPV loci.

  • EPV Alignments: Curates a hierarchically linked set of multiple sequence alignments comprehensively representing homology between EPV and parvovirus sequences.

  • Reproducible Phylogenetic Reconstruction: Implements a reproducible process for reconstructing parvovirus/EPV phylogenies across a range of taxonomic levels.
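
The idea of a consensus sequence for an orthologous locus can be sketched as a majority-rule vote over the columns of an ortholog-level alignment. The example below is a minimal illustration under that assumption. The function name, the tie-handling rule, and the toy sequences are hypothetical, and the source does not state how the project's consensus sequences are actually derived.

```python
from collections import Counter

def majority_consensus(aligned, gap="-", ambiguous="N"):
    """Majority-rule consensus of equal-length aligned sequences (hypothetical helper)."""
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned):
        # Ignore gaps so that a deletion in one degraded copy does not
        # erase a position that the other orthologs still retain.
        counts = Counter(base.upper() for base in column if base != gap)
        if not counts:
            consensus.append(gap)
            continue
        ranked = counts.most_common(2)
        # A tie between the two most common bases is reported as ambiguous.
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            consensus.append(ambiguous)
        else:
            consensus.append(ranked[0][0])
    return "".join(consensus)

# Toy orthologous copies of one locus from four host species (made-up data).
orthologs = [
    "ATGGC-TTA",
    "ATGGCATTA",
    "ATGACATTG",
    "ATG-CATTA",
]
print(majority_consensus(orthologs))  # ATGGCATTA
```

A majority-rule consensus is only a crude stand-in for an ancestral sequence. Phylogeny-aware ancestral reconstruction weights each ortholog by its position in the host tree rather than counting every copy equally.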

Extension Project Overview

Property            Description
Scope               Endogenous viral elements (EVEs) derived from parvoviruses (family Parvoviridae)
Development Period  2020-2022
Lead Developers     Robert J. Gifford
Main Objectives     Comparative genomics, Molecular epidemiology
Data Sources        NCBI
Associated Tools    BLAST+, MAFFT, RAxML
Offline Project     GitHub
Online Access       None as yet
Status              Mature. Actively being developed
Documentation       Included in Parvovirus-GLUE User Guide


Flavivirus-GLUE-EVE

Background

Flavivirus-GLUE-EVE is an extension to Flavivirus-GLUE, developed to facilitate studies of endogenous flavivirids (EFVs).

Scope and History

This extension was developed as part of an investigation into EFV diversity.

Features

  • EFV consensus: Incorporates consensus sequences for EFV loci that occur as orthologs in multiple species.

  • Comprehensive EFV Reference Sequence Set: Includes a comprehensive set of EFV loci.

  • EFV Alignments: Curates a hierarchically linked set of multiple sequence alignments comprehensively representing homology between EFV and exogenous flavivirid sequences.

  • Reproducible Phylogenetic Reconstruction: Implements a reproducible process for reconstructing flavivirid/EFV phylogenies across a range of taxonomic levels.

Extension Project Overview

Property            Description
Scope               Endogenous viral elements (EVEs) derived from flavivirids (family Flaviviridae)
Development Period  2020-2022
Lead Developers     Robert J. Gifford
Main Objectives     Comparative genomics, Molecular epidemiology
Data Sources        NCBI
Associated Tools    BLAST+, RAxML
Offline Project     GitHub
Online Access       None as yet
Status              Mature. Not currently being developed
Documentation       Included in Flavivirus-GLUE User Guide


Hepadnavirus-GLUE-EVE

Background

Hepadnavirus-GLUE-EVE is an extension to Hepadnavirus-GLUE, developed to facilitate studies of endogenous hepadnaviral elements (eHBVs).

Scope and History

This extension was developed as part of an investigation into eHBV diversity.

Features

  • eHBV consensus: Incorporates consensus sequences for eHBV loci that occur as orthologs in multiple species.

  • Comprehensive eHBV Reference Sequence Set: Includes a comprehensive set of eHBV loci.

  • eHBV Alignments: Curates a hierarchically linked set of multiple sequence alignments comprehensively representing homology between eHBV and hepadnavirus sequences.

  • Reproducible Phylogenetic Reconstruction: Implements a reproducible process for reconstructing hepadnavirus/eHBV phylogenies across a range of taxonomic levels.

Extension Project Overview

Property            Description
Scope               Endogenous viral elements (EVEs) derived from hepadnaviruses (family Hepadnaviridae)
Development Period  2020-2022
Lead Developers     Robert J. Gifford
Main Objectives     Comparative genomics, Molecular epidemiology
Data Sources        NCBI
Associated Tools    BLAST+, MAFFT, RAxML
Offline Project     GitHub
Online Access       None as yet
Status              Mature. Actively being developed
Documentation       Included in Hepadnavirus-GLUE User Guide


Filovirus-GLUE-EVE

Background

Filovirus-GLUE-EVE is an extension to Filovirus-GLUE, developed to facilitate studies of endogenous viral elements (EVEs) derived from filoviruses.

Scope and History

This extension was developed as part of an investigation into EVE diversity.

Features

  • EVE consensus: Incorporates consensus sequences for EVE loci that occur as orthologs in multiple species.

  • Comprehensive EVE Reference Sequence Set: Includes a comprehensive set of EVE loci.

  • EVE Alignments: Curates a hierarchically linked set of multiple sequence alignments comprehensively representing homology between EVE and filovirus sequences.

  • Reproducible Phylogenetic Reconstruction: Implements a reproducible process for reconstructing filovirus/EVE phylogenies across a range of taxonomic levels.

Extension Project Overview

Property            Description
Scope               Endogenous viral elements (EVEs) derived from filoviruses (family Filoviridae)
Development Period  2020-2022
Lead Developers     Robert J. Gifford
Main Objectives     Comparative genomics, Molecular epidemiology
Data Sources        NCBI
Associated Tools    BLAST+, MAFFT, RAxML
Offline Project     GitHub
Online Access       None as yet
Status              Mature. Actively being developed
Documentation       Included in Filovirus-GLUE User Guide


CRESS-GLUE-EVE

Background

CRESS-GLUE-EVE is an extension to CRESS-GLUE, developed to facilitate studies of endogenous viral elements (EVEs) derived from circular Rep-encoding single-stranded DNA (CRESS DNA) viruses (phylum Cressdnaviricota).

Scope and History

This extension was developed as part of an investigation into CRESS DNA virus diversity.

Features

  • EVE consensus: Incorporates consensus sequences for EVE loci that occur as orthologs in multiple species.

  • Comprehensive EVE Reference Sequence Set: Includes a comprehensive set of EVE loci.

  • EVE Alignments: Curates a hierarchically linked set of multiple sequence alignments comprehensively representing homology between EVE and CRESS DNA virus sequences.

  • Reproducible Phylogenetic Reconstruction: Implements a reproducible process for reconstructing CRESS DNA virus/EVE phylogenies across a range of taxonomic levels.

Extension Project Overview

Property            Description
Scope               Endogenous viral elements (EVEs) derived from circular Rep-encoding single-stranded DNA (CRESS DNA) viruses (phylum Cressdnaviricota)
Development Period  2020-2022
Lead Developers     Robert J. Gifford
Main Objectives     Comparative genomics, Molecular epidemiology
Data Sources        NCBI
Associated Tools    BLAST+, MAFFT, RAxML
Offline Project     GitHub
Online Access       None as yet
Status              Mature. Actively being developed
Documentation       Included in CRESS-GLUE User Guide


Lentivirus-GLUE-ERV

Background

Lentivirus-GLUE is a specialized resource designed to support comparative genomic and evolutionary analysis of lentiviruses. This extension project adds endogenous lentivirus sequences.

Scope and History

Lentivirus-GLUE-ERV was developed as part of an initiative, carried out in 2022, to comprehensively map endogenous lentivirus sequences in published mammal genome sequences.

Features

  • Comprehensive Genomic Database: Integrates data from all known lentivirus ERV sequences.

Extension Project Overview

Property            Description
Scope               Endogenous lentiviruses (genus Lentivirus)
Development Period  2018-2024
Lead Developers     Robert J. Gifford
Main Objectives     Comparative genomics, Molecular epidemiology
Data Sources        NCBI
Associated Tools    BLAST+, RAxML
Offline Project     GitHub
Online Access       None as yet
Status              Mature. Actively being developed
Documentation       GitHub Wiki


Deltaretrovirus-GLUE-ERV

Background

Deltaretrovirus-GLUE-ERV is an extension to Deltaretrovirus-GLUE. Deltaretroviruses are a group of retroviruses that infect mammals. They occur only very rarely as endogenous sequences.

Scope and History

Deltaretrovirus-GLUE-ERV was developed to support paleovirological investigations of deltaretrovirus-derived ERVs.

Features

  • Comprehensive Genomic Database: Integrates data from all known endogenous deltaretrovirus sequences.

Extension Project Overview

Property            Description
Scope               Endogenous deltaretroviruses (genus Deltaretrovirus)
Development Period  2018-2024
Lead Developers     Robert J. Gifford
Main Objectives     Comparative genomics, Molecular epidemiology
Data Sources        NCBI
Associated Tools    BLAST+, RAxML
Offline Project     GitHub
Online Access       None as yet
Status              Mature. Not currently being developed
Documentation       GitHub Wiki (Under Construction)

