This repo contains the files needed to recreate all of the analysis and figures presented in our pre-print:
Altered Subgenomic RNA Abundance Provides Unique Insight into SARS-CoV-2 B.1.1.7 Infections
Matthew D Parker1,2, Hazel Stewart3, Ola M. Shehata4, Benjamin B. Lindsey5,6, Dhruv R Shah6, Sharon Hsu2,6, Alexander J Keeley5,6, David G Partridge5, Shay Leary7, Alison Cope5, Amy State5, Katie Johnson5, Nasar Ali5, Rasha Raghei5, Joe Heffer8, Nikki Smith6, Peijun Zhang6, Marta Gallis6, Stavroula F Louka6, Hailey R Hornsby6, Hatoon Alamri4, Max Whiteley6, Benjamin H Foulkes6, Stella Christou6, Paige Wolverson6, Manoj Pohare6, Samantha E Hansford6, Luke R Green6, Cariad Evans5, Mohammad Raza5, Dennis Wang1,2,9, Andrew E Firth3, James R Edgar3, Silvana Gaudieri9,10,11, Simon Mallal10,11, The COVID-19 Genomics UK (COG-UK) consortium†, Mark O. Collins4, Andrew A Peden4, Thushan I de Silva5,6*
1 Sheffield Biomedical Research Centre, The University of Sheffield, Sheffield, UK
2 Sheffield Bioinformatics Core, The University of Sheffield, Sheffield, UK
3 Department of Pathology, University of Cambridge, Cambridge, UK
4 Department of Biomedical Science, The University of Sheffield, Western Bank, Sheffield, UK
5 Sheffield Teaching Hospitals NHS Foundation Trust, Sheffield, UK
6 The Florey Institute for Host-Pathogen Interactions & Department of Infection, Immunity and Cardiovascular Disease, Medical School, University of Sheffield, Sheffield, UK
7 Institute for Immunology and Infectious Diseases, Murdoch University, Murdoch, Western Australia, Australia
8 IT Services, The University of Sheffield, Sheffield, UK
9 Department of Computer Science, The University of Sheffield, Sheffield, UK
10 Division of Infectious Diseases, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
11 School of Human Sciences, University of Western Australia, Crawley, Western Australia, Australia
†Full list of consortium names and affiliations located in the supplementary material
*Corresponding Author
B.1.1.7 (alpha) lineage SARS-CoV-2 is more transmissible, may lead to greater clinical severity, and results in modest reductions in antibody neutralisation. Subgenomic RNA (sgRNA) is produced by discontinuous transcription of the SARS-CoV-2 genome. Applying our tool (periscope) to ARTIC Network Oxford Nanopore Technologies genomic sequencing data from 4400 SARS-CoV-2 positive clinical samples, we show that normalised sgRNA is significantly increased in B.1.1.7 infections (n=879) compared with the previously dominant circulating lineage in the UK, B.1.177 (n=943). This increase is independent of genomic reads, E cycle threshold and days since symptom onset at sampling. A noncanonical sgRNA which could represent ORF9b is found in 98.4% of B.1.1.7 SARS-CoV-2 infections compared with only 13.8% of other lineages, with a 16-fold increase in median sgRNA abundance. We demonstrate that ORF9b protein levels are increased 6-fold in B.1.1.7 compared to a B lineage virus in vitro. We hypothesise that increased ORF9b in B.1.1.7 is a direct consequence of a triple nucleotide mutation in nucleocapsid (28280:GAT>CTA, D3L) creating a transcription regulatory-like sequence complementary to a region 3’ of the genomic leader. These findings provide a unique insight into the biology of B.1.1.7 and support monitoring of sgRNA profiles to evaluate emerging potential variants of concern.