Paleovirology Projects
The paleovirology GLUE projects focus on endogenous viral elements (EVEs), remnants of ancient viruses integrated into host genomes. Reconstructing the evolutionary history of EVEs poses significant challenges. Many are highly degraded and divergent, complicating efforts to infer ancestral open reading frames (ORFs) and build sequence alignments. Identifying orthologous EVEs across host genomes can be equally difficult, particularly when EVE integrations occur in regions rich in repetitive DNA.
To address these challenges, the paleovirology GLUE projects aim to preserve and expand upon work characterizing EVE loci, constructing alignments, and inferring evolutionary relationships. These projects are implemented as extensions to virus diversity-focused GLUE projects for their corresponding exogenous viruses. By incorporating EVEs into the alignment hierarchies maintained in these base projects, the evolutionary relationships between orthologous EVEs and their related viral lineages can be explored. Ortholog-level alignments, in particular, facilitate the reconstruction of ancestral virus sequences from the genomic "fossils" left by EVEs. Additionally, all paleovirology GLUE projects adopt a consistent nomenclature that encodes information about EVE taxonomy, host distribution, and orthologous relationships, providing a standardized foundation for comparative analyses across host species and viral families.
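As an illustration of how a structured identifier can carry taxonomy, host distribution, and orthology information at once, here is a minimal Python sketch. The three-part layout used below (`<element type>-<virus group>.<locus number>-<host group>`), the example ID, and the names `EveId` and `parse_eve_id` are all assumptions made for this example; they are not taken from the projects' documented naming scheme, which should be consulted for the real rules.

```python
import re
from typing import NamedTuple


class EveId(NamedTuple):
    """Parsed parts of a hypothetical EVE identifier."""
    element_type: str   # e.g. "EPV" (kind of endogenous element)
    virus_group: str    # e.g. "dependo" (taxonomic classifier)
    locus_number: int   # shared by all orthologous copies of one locus
    host_group: str     # host taxon in which the locus is found


# Hypothetical layout: TYPE-group.number-Host (an assumption, see lead-in)
_PATTERN = re.compile(
    r"^(?P<etype>[A-Za-z]+)-(?P<group>[A-Za-z]+)\.(?P<num>\d+)-(?P<host>[A-Za-z]+)$"
)


def parse_eve_id(identifier: str) -> EveId:
    """Split an identifier into its components, or raise ValueError."""
    match = _PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a recognised EVE identifier: {identifier!r}")
    return EveId(
        element_type=match["etype"],
        virus_group=match["group"],
        locus_number=int(match["num"]),
        host_group=match["host"],
    )


parsed = parse_eve_id("EPV-dependo.1-Mammalia")
print(parsed.virus_group, parsed.locus_number, parsed.host_group)
```

Under this assumed layout, two sequences from different species that share the same virus group and locus number would be treated as orthologous copies of one integration event, which is the kind of comparison a consistent nomenclature makes straightforward.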
GLUE projects developed in the Gifford Lab that have a paleovirology focus:
- Endogenous Retroviruses: ERVdb
- Parvoviruses: Parvovirus-GLUE-EVE
- Flaviviruses: Flavivirus-GLUE-EVE
- Hepadnaviruses: Hepadnavirus-GLUE-EVE
- CRESS DNA viruses: CRESS-GLUE-EVE
- Filoviruses: Filovirus-GLUE-EVE
- Lentiviruses: Lentivirus-GLUE-ERV
- Deltaretroviruses: Deltaretrovirus-GLUE-ERV
A defining feature of retroviruses is their unique replication strategy, which involves the reverse transcription of the viral RNA genome into DNA and its integration into the host cell's nuclear genome as a "provirus." While most retroviral infections occur in somatic cells, occasional infections of germline cells (such as sperm, eggs, or early embryos) result in the viral DNA being passed down through generations as part of the host genome. These inherited sequences are known as endogenous retroviruses (ERVs).
Once incorporated into the germline, ERVs can expand within the host genome through processes such as reinfection of germline cells and retrotransposition. This proliferation often results in multi-copy ERV lineages, with tens to thousands of related sequences scattered across the genome. Although many ERV insertions are lost over time due to genetic drift or purifying selection, some become fixed within populations, leaving a lasting genomic footprint. Today, ERVs make up an estimated 5-10% of vertebrate genomes, offering an unparalleled molecular fossil record of the long-term evolutionary interplay between retroviruses and their hosts.
Beyond their evolutionary significance, ERVs have played a key role in shaping vertebrate genomes and host physiology. For instance, ERVs have been implicated in critical processes such as placentation, antiviral immunity, and the regulation of gene expression. Their dual role as genomic fossils and functional genomic elements makes ERVs a unique subject of study, offering insights into both host-virus co-evolution and the evolutionary innovations driven by viral sequences.
Retroviruses are a long-standing research focus in the Gifford Lab, and the motivation for developing GLUE was in large part driven by the need for a systematic approach to organizing and analyzing ERV data. The design of GLUE was also informed by the overlap between the techniques required for ERV studies and those used in broader viral analyses, such as genomic epidemiology.
By fostering a flexible framework for shared methodologies, ERVdb aims to facilitate the collaborative study of endogenous retroviruses across the many different analysis contexts in which they are relevant.
- Comprehensive Reference Sequence Set: Incorporates reference sequence data for diverse retroviruses, including exogenous retroviruses and representative endogenous retroviruses (ERVs).
- RT Alignment: Features a codon-based alignment of reverse transcriptase (RT) sequences, spanning a wide diversity of retroviruses and ERV lineages. This alignment is maintained in a version-controlled manner via GitHub, enabling reproducible updates and edits.
- Reproducible RT Phylogeny: Implements a reproducible workflow in GLUE for constructing phylogenies of reverse transcriptase (RT) sequences, forming the basis for exploring retrovirus and ERV diversity and evolutionary relationships.
Property | Description |
---|---|
Scope | Retroviruses (family Retroviridae - endogenous and exogenous) |
Development Period | 2024-present |
Lead Developers | Robert J. Gifford |
Main Objectives | Comparative genomics, Molecular epidemiology |
Data Sources | NCBI Nucleotide, NCBI Genomes via the DIGS Tool |
Associated Tools | BLAST+, MAFFT, RAxML |
Offline Project | GitHub (Private) |
Online Access | None as yet |
Status | Under development |
Documentation | None yet |
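To make the idea of a codon-based alignment more concrete, the following Python sketch checks whether one row of a nucleotide alignment keeps its gaps in whole-codon units. It assumes that "codon-based" means the alignment starts in frame and that gaps are placed as complete triplets, which is the usual sense of the term but is not spelled out above. The function names are hypothetical and this is not GLUE's own implementation.

```python
def split_codons(aligned_row: str) -> list[str]:
    """Cut an aligned nucleotide row into consecutive triplets."""
    if len(aligned_row) % 3 != 0:
        raise ValueError("aligned row length must be a multiple of three")
    return [aligned_row[i:i + 3] for i in range(0, len(aligned_row), 3)]


def is_frame_preserving(aligned_row: str, gap: str = "-") -> bool:
    """Return True if every triplet is either fully gapped or gap-free.

    A triplet with one or two gap characters means an indel has broken
    the reading frame at that position in the alignment.
    """
    for codon in split_codons(aligned_row):
        if codon.count(gap) not in (0, 3):
            return False
    return True


print(is_frame_preserving("ATG---GCA"))  # gaps fill one whole codon
print(is_frame_preserving("ATG-GCA--"))  # gaps split across codons
```

For degraded ERV sequences, rows that fail a check like this are often the ones carrying frameshifting indels, so a flag of this kind can help decide which sequences need manual attention before they are added to an alignment.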
Parvovirus-GLUE-EVE is an extension to Parvovirus-GLUE, developed to facilitate studies of endogenous parvoviral elements (EPVs).
This extension was developed as part of an investigation into EPV diversity. The study was performed in the context of a Gates Foundation-funded initiative to explore the potential of EPV integration sites as genomic safe harbours for AAV gene therapy.
- EPV consensus: Incorporates consensus sequences for EPV loci that occur as orthologs in multiple species.
- Comprehensive EPV Reference Sequence Set: Includes a comprehensive set of EPV loci.
- EPV Alignments: Curates a hierarchically linked set of multiple sequence alignments comprehensively representing homology between EPV and parvovirus sequences.
- Reproducible Phylogenetic Reconstruction: Implements a reproducible process for reconstructing parvovirus/EPV phylogenies across a range of taxonomic levels.
Property | Description |
---|---|
Scope | Endogenous viral elements (EVEs) derived from parvoviruses (family Parvoviridae) |
Development Period | 2020-2022 |
Lead Developers | Robert J. Gifford |
Main Objectives | Comparative genomics, Molecular epidemiology |
Data Sources | NCBI |
Associated Tools | BLAST+, MAFFT, RAxML |
Offline Project | GitHub |
Online Access | None as yet |
Status | Mature. Actively being developed |
Documentation | Included in Parvovirus-GLUE User Guide |
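The ortholog consensus sequences listed among the features above can be pictured as a column-wise vote over aligned orthologous copies of a locus. The Python sketch below shows a simple majority-rule version. It is an illustration under stated assumptions, not the method used in the project: the function name `majority_consensus`, the choice to ignore gaps when voting, and the use of `N` for ties are all choices made for this example.

```python
from collections import Counter


def majority_consensus(aligned: list[str], gap: str = "-", ambiguous: str = "N") -> str:
    """Build a majority-rule consensus from equal-length aligned sequences.

    Gaps are ignored when voting, so a deletion in one degraded copy does
    not erase a residue that other orthologs still retain. Columns that
    are all gaps stay gapped, and tied columns get the ambiguity symbol.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")

    consensus = []
    for column in zip(*aligned):
        counts = Counter(base.upper() for base in column if base != gap)
        if not counts:
            consensus.append(gap)
            continue
        ranked = counts.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            consensus.append(ambiguous)  # no single most common residue
        else:
            consensus.append(ranked[0][0])
    return "".join(consensus)


# Three made-up orthologous copies of one locus, already aligned
orthologs = ["ATGC-A", "ATGCTA", "ACG-TA"]
print(majority_consensus(orthologs))
```

Ignoring gaps during the vote reflects the goal of recovering an ancestral sequence from copies that have each lost different parts over time; a stricter approach could instead require a minimum number of non-gap residues per column.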
Flavivirus-GLUE-EVE is an extension to Flavivirus-GLUE, developed to facilitate studies of endogenous flavivirids (EFVs).
This extension was developed as part of an investigation into EFV diversity.
- EFV consensus: Incorporates consensus sequences for EFV loci that occur as orthologs in multiple species.
- Comprehensive EFV Reference Sequence Set: Includes a comprehensive set of EFV loci.
- EFV Alignments: Curates a hierarchically linked set of multiple sequence alignments comprehensively representing homology between EFV and exogenous flavivirid sequences.
- Reproducible Phylogenetic Reconstruction: Implements a reproducible process for reconstructing flavivirid/EFV phylogenies across a range of taxonomic levels.
Property | Description |
---|---|
Scope | Endogenous viral elements (EVEs) derived from flavivirids (family Flaviviridae) |
Development Period | 2020-2022 |
Lead Developers | Robert J. Gifford |
Main Objectives | Comparative genomics, Molecular epidemiology |
Data Sources | NCBI |
Associated Tools | BLAST+, RAxML |
Offline Project | GitHub |
Online Access | None as yet |
Status | Mature. Not currently being developed |
Documentation | Included in Flavivirus-GLUE User Guide |
Hepadnavirus-GLUE-EVE is an extension to Hepadnavirus-GLUE, developed to facilitate studies of endogenous hepadnaviral elements (eHBVs).
This extension was developed as part of an investigation into eHBV diversity.
- eHBV consensus: Incorporates consensus sequences for eHBV loci that occur as orthologs in multiple species.
- Comprehensive eHBV Reference Sequence Set: Includes a comprehensive set of eHBV loci.
- eHBV Alignments: Curates a hierarchically linked set of multiple sequence alignments comprehensively representing homology between eHBV and hepadnavirus sequences.
- Reproducible Phylogenetic Reconstruction: Implements a reproducible process for reconstructing hepadnavirus/eHBV phylogenies across a range of taxonomic levels.
Property | Description |
---|---|
Scope | Endogenous viral elements (EVEs) derived from hepadnaviruses (family Hepadnaviridae) |
Development Period | 2020-2022 |
Lead Developers | Robert J. Gifford |
Main Objectives | Comparative genomics, Molecular epidemiology |
Data Sources | NCBI |
Associated Tools | BLAST+, MAFFT, RAxML |
Offline Project | GitHub |
Online Access | None as yet |
Status | Mature. Actively being developed |
Documentation | Included in Hepadnavirus-GLUE User Guide |
Filovirus-GLUE-EVE is an extension to Filovirus-GLUE, developed to facilitate studies of endogenous viral elements (EVEs) derived from filoviruses.
This extension was developed as part of an investigation into EVE diversity.
- EVE consensus: Incorporates consensus sequences for EVE loci that occur as orthologs in multiple species.
- Comprehensive EVE Reference Sequence Set: Includes a comprehensive set of EVE loci.
- EVE Alignments: Curates a hierarchically linked set of multiple sequence alignments comprehensively representing homology between EVE and filovirus sequences.
- Reproducible Phylogenetic Reconstruction: Implements a reproducible process for reconstructing filovirus/EVE phylogenies across a range of taxonomic levels.
Property | Description |
---|---|
Scope | Endogenous viral elements (EVEs) derived from filoviruses (family Filoviridae) |
Development Period | 2020-2022 |
Lead Developers | Robert J. Gifford |
Main Objectives | Comparative genomics, Molecular epidemiology |
Data Sources | NCBI |
Associated Tools | BLAST+, MAFFT, RAxML |
Offline Project | GitHub |
Online Access | None as yet |
Status | Mature. Actively being developed |
Documentation | Included in Filovirus-GLUE User Guide |
CRESS-GLUE-EVE is an extension to CRESS-GLUE, developed to facilitate studies of endogenous viral elements (EVEs) derived from circular Rep-encoding single-stranded DNA (CRESS DNA) viruses (phylum Cressdnaviricota).
This extension was developed as part of an investigation into CRESS DNA virus diversity.
- EVE consensus: Incorporates consensus sequences for EVE loci that occur as orthologs in multiple species.
- Comprehensive EVE Reference Sequence Set: Includes a comprehensive set of EVE loci.
- EVE Alignments: Curates a hierarchically linked set of multiple sequence alignments comprehensively representing homology between EVE and CRESS DNA virus sequences.
- Reproducible Phylogenetic Reconstruction: Implements a reproducible process for reconstructing CRESS DNA virus/EVE phylogenies across a range of taxonomic levels.
Property | Description |
---|---|
Scope | Endogenous viral elements (EVEs) derived from circular Rep-encoding single-stranded DNA (CRESS DNA) viruses (phylum Cressdnaviricota) |
Development Period | 2020-2022 |
Lead Developers | Robert J. Gifford |
Main Objectives | Comparative genomics, Molecular epidemiology |
Data Sources | NCBI |
Associated Tools | BLAST+, MAFFT, RAxML |
Offline Project | GitHub |
Online Access | None as yet |
Status | Mature. Actively being developed |
Documentation | Included in CRESS-GLUE User Guide |
Lentivirus-GLUE is a specialized resource designed to support comparative genomic and evolutionary analysis of lentiviruses. This extension project adds endogenous lentivirus sequences.
Lentivirus-GLUE-EVE was developed as part of an initiative, carried out in 2022, to comprehensively map endogenous lentivirus sequences in published mammalian genome sequences.
- Comprehensive Genomic Database: Integrates data from all known lentivirus ERV sequences.
Property | Description |
---|---|
Scope | Endogenous lentiviruses (genus Lentivirus) |
Development Period | 2018-2024 |
Lead Developers | Robert J. Gifford |
Main Objectives | Comparative genomics, Molecular epidemiology |
Data Sources | NCBI |
Associated Tools | BLAST+, RAxML |
Offline Project | GitHub |
Online Access | None as yet |
Status | Mature. Actively being developed |
Documentation | GitHub Wiki |
This project is an extension to Deltaretrovirus-GLUE. Deltaretroviruses are a group of retroviruses that infect mammals, and they occur only very rarely as endogenous sequences.
Deltaretrovirus-GLUE-EVE was developed to support paleovirological investigations of Deltaretrovirus-derived ERVs.
- Comprehensive Genomic Database: Integrates data from all known endogenous deltaretrovirus sequences.
Property | Description |
---|---|
Scope | Endogenous deltaretroviruses (genus Deltaretrovirus) |
Development Period | 2018-2024 |
Lead Developers | Robert J. Gifford |
Main Objectives | Comparative genomics, Molecular epidemiology |
Data Sources | NCBI |
Associated Tools | BLAST+, RAxML |
Offline Project | GitHub |
Online Access | None as yet |
Status | Mature. Not currently being developed |
Documentation | GitHub Wiki (Under Construction) |
GLUE by Robert J. Gifford Lab.
For questions, issues, or feedback, please open an issue on the GitHub repository.