---
title: 'Kindel: indel-aware consensus for nucleotide sequence alignments'
tags:
  - bioinformatics
  - sequence analysis
  - genome assembly
authors:
  - name: Bede Constantinides
    orcid: 0000-0002-3480-3819
    affiliation: 1
  - name: David L. Robertson
    orcid: 0000-0001-6338-0221
    affiliation: 2
affiliations:
  - name: Evolution and Genomic Sciences, University of Manchester, Manchester, UK
    index: 1
  - name: MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
    index: 2
date: 26 May 2017
bibliography: paper.bib
---

Summary

Kindel is a collection of tools for inferring a consensus sequence from an alignment of nucleotide sequences in Sequence Alignment/Map (SAM) format [@samtools] in the presence of substitutions, insertions and deletions (indels). In regions where reads deviate sufficiently from the reference sequence, partially unaligned sequence context is used to perform local reassembly. In this way, Kindel generates a data-specific reference sequence that maximises overall read-reference similarity. Although OCOCO [@ococo] implements an elegant streaming approach to consensus inference, like other approaches it fails to reconcile indels.
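To make the idea of indel-aware consensus concrete, the following is a minimal sketch and not Kindel's implementation. It assumes reads are supplied as hypothetical `(pos, cigar, seq)` tuples with a 0-based leftmost position, tallies bases, deletions and insertions from the SAM CIGAR string, and takes a simple majority vote at each reference position. The function names `pileup` and `consensus`, the `min_depth` parameter and the majority rule are all assumptions made for illustration. Soft-clipped bases are skipped, so the local reassembly described above is not attempted.

```python
import re
from collections import Counter, defaultdict

# SAM CIGAR operations: length followed by one operation character
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def pileup(reads, ref_len):
    """Tally per-position observations from (pos, cigar, seq) tuples.

    pos is the 0-based leftmost reference position (an assumption of this
    sketch; SAM itself stores 1-based positions).
    """
    bases = [Counter() for _ in range(ref_len)]  # bases, plus '-' for deletions
    insertions = defaultdict(Counter)  # ref index the insertion precedes -> sequences
    for pos, cigar, seq in reads:
        r, q = pos, 0  # reference and query cursors
        for length, op in CIGAR_OP.findall(cigar):
            n = int(length)
            if op in "M=X":  # aligned bases consume reference and query
                for i in range(n):
                    bases[r + i][seq[q + i]] += 1
                r += n
                q += n
            elif op == "I":  # insertion consumes query only
                insertions[r][seq[q:q + n]] += 1
                q += n
            elif op == "D":  # deletion consumes reference only
                for i in range(n):
                    bases[r + i]["-"] += 1
                r += n
            elif op == "N":  # skipped reference region
                r += n
            elif op == "S":  # soft clip: ignored in this sketch
                q += n
            # H and P consume neither reference nor query
    return bases, insertions


def consensus(bases, insertions, min_depth=1):
    """Majority-vote consensus; positions below min_depth become 'N'."""
    out = []
    for i, counts in enumerate(bases):
        depth = sum(counts.values())
        if depth < min_depth:
            out.append("N")
            continue
        ins = insertions.get(i)
        if ins:
            ins_seq, ins_count = ins.most_common(1)[0]
            if ins_count > depth / 2:  # insertion supported by most covering reads
                out.append(ins_seq)
        base = counts.most_common(1)[0][0]
        if base != "-":  # a majority deletion emits nothing
            out.append(base)
    return "".join(out)


# Toy alignment against an 8-base reference
reads = [
    (0, "8M", "ACGTACGT"),
    (0, "3M1D4M", "ACGACGT"),
    (0, "3M1D2M2I2M", "ACGACTTGT"),
    (4, "2S2M2I2M", "NNACTTGT"),
    (4, "2M2I2M", "ACTTGT"),
]
print(consensus(*pileup(reads, 8)))  # ACGACTTGT
```

In this toy alignment the deletion at position 3 and the `TT` insertion before position 6 are each supported by a majority of covering reads, so both are carried into the consensus.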

Kindel was developed for inferring the consensus of highly diverse populations of RNA viruses such as hepatitis C virus and HIV, and has been tested with deep-sequenced hepatitis C alignments generated by BWA-MEM [@bwa] and Segemehl [@segemehl]. Furthermore, Kindel may be used to quantify and visualise subconsensus variation in allele frequencies across a reference sequence, facilitating comparison of intrapatient population state among multiple individuals and/or timepoints. Kindel is implemented as a Python 3 package with a command line interface.
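Subconsensus variation can be illustrated with a short sketch that converts per-position base counts into allele frequencies and reports sites where the second most common allele exceeds a threshold. This is an illustrative assumption rather than Kindel's own code or interface: the names `allele_frequencies` and `minor_allele_sites`, the `min_freq` default of 0.05 and the example counts are all hypothetical.

```python
from collections import Counter


def allele_frequencies(counts):
    """Convert a Counter of base counts at one position into frequencies."""
    depth = sum(counts.values())
    if depth == 0:
        return {}
    return {base: n / depth for base, n in counts.items()}


def minor_allele_sites(pileup, min_freq=0.05):
    """Return (position, base, frequency) for the second most common allele
    at each position where its frequency is at least min_freq."""
    sites = []
    for pos, counts in enumerate(pileup):
        freqs = allele_frequencies(counts)
        if len(freqs) < 2:
            continue  # no variation observed at this position
        ranked = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)
        base, freq = ranked[1]
        if freq >= min_freq:
            sites.append((pos, base, freq))
    return sites


# Hypothetical base counts at three reference positions
example = [
    Counter(A=100),
    Counter(C=90, T=10),
    Counter(G=98, A=2),
]
print(minor_allele_sites(example))  # [(1, 'T', 0.1)]
```

Running the same calculation on alignments from different individuals or timepoints would give frequency profiles that can be compared position by position.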

Usage overview.

Local reassembly of clip-dominant regions (CDRs). Leveraging partially aligned reads, consensus can be accurately inferred across unrepresentative and poorly covered regions of reference sequence.

References