Using within-host sequencing data to understand and predict RNA viral infections
- Funded by UK Research and Innovation (UKRI)
- Total publications: 0
Grant number: 2873841
Key facts
Disease
COVID-19
Start & end year
2023-2027
Known Financial Commitments (USD)
$0
Funder
UK Research and Innovation (UKRI)
Principal Investigator
N/A
Research Location
United Kingdom
Lead Research Institution
UNIVERSITY OF OXFORD
Research Priority Alignment
N/A
Research Category
Pathogen: natural history, transmission and diagnostics
Research Subcategory
Pathogen genomics, mutations and adaptations
Special Interest Tags
Data Management and Data Sharing
Study Type
Non-Clinical
Clinical Trial Details
N/A
Broad Policy Alignment
Pending
Age Group
Not Applicable
Vulnerable Population
Not applicable
Occupations of Interest
Not applicable
Abstract
SARS-CoV-2 evolution is characterised by the emergence of highly divergent variants of concern (VOCs) that carry a large number of non-synonymous mutations compared with other circulating lineages. Most SARS-CoV-2 infections are acute. Transmission typically occurs through a narrow bottleneck: only a small number of viral particles pass from one host to another, which limits the initial genetic diversity of the virus within a new infection. Hence, little diversity accumulates during acute infections, and the origins of such divergent lineages remain largely unclear. One hypothesis that has gained traction is that within-host evolution, especially in chronically infected individuals, plays a crucial role in the evolutionary dynamics of this single-stranded RNA virus. It is supported by the accelerated mutation rates and lineage-defining mutations observed in chronically infected individuals. Strides have been made in predicting future escape variants, which carry mutations that allow the virus to evade immune responses. These variants are shaped by selection pressures within hosts, yet little attention has been paid to within-host diversity and what it can tell us about viral adaptation under these pressures.
Overview of the project goal
This research aims to improve our understanding of chronic infection prevalence and transmission dynamics, aiding the prediction of escape variants and guiding variant-specific vaccine development. We have access to the Office for National Statistics COVID-19 Infection Survey data, which contains over 125,000 SARS-CoV-2 sequences, including sequences from multiple time points during an infection. This doctoral project aims to develop quality-control methods for sub-consensus-level sequences, so that inferences reflect true biological within-host diversity rather than sequencing artifacts.
By combining viral sequences with regular testing data, the project will characterise the temporal dynamics of within-host viral diversity throughout an infection and predict time since infection (TSI) from viral genetic factors. It may be possible to link this dataset to the UK's Test and Trace app data, which could improve TSI estimates and our understanding of infection timelines. Lastly, the project will assess whether low-frequency viral variants predict the emergence of globally prevalent mutations, such as those seen in VOCs. This project falls within the EPSRC Biological Informatics research area.
Aims
AIM 1: establishing a new methodology and tool for identifying artifactual minor alleles in sub-consensus-level sequences of viral genomes from a population-based study. Unsupervised clustering algorithms will integrate study-level metadata and contextual genomic data to distinguish artifactual from biological genetic variation, based on the patterns expected under evolutionary processes.
AIM 2: developing a framework for predicting the time since infection from viral genetic factors at the sub-consensus level. This will include estimating the TSI associated with sequences from a large study, curating a set of predictors (e.g. within-host nucleotide diversity, Shannon entropy) from the viral genetic data, correcting for the artifactual sites identified in AIM 1, and developing a predictive model trained to predict TSI.
AIM 3: determining whether the appearance of intrahost single nucleotide variants within individuals can help predict future escape variants. This may take into consideration the point during an infection at which these variants appear, as estimated in AIM 2.
Conclusion
This proposed doctoral project will leverage an existing comprehensive SARS-CoV-2 dataset to bridge significant gaps in our current knowledge of the within-host evolution of RNA viruses and to provide tools applicable to managing current and future RNA viral outbreaks.
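As an illustration of the two example predictors named in AIM 2 (Shannon entropy and within-host nucleotide diversity), the following is a minimal sketch of how each could be computed from per-site base counts. It is not the project's method: the function names, the input layout (one list of A/C/G/T read counts per genome position) and the choice of an unbiased pairwise-difference estimator for diversity are all assumptions made for illustration, and no artifact filtering of the kind described in AIM 1 is applied.

```python
import math
from typing import List, Sequence


def shannon_entropy(counts: Sequence[float]) -> float:
    """Shannon entropy (in bits) of the allele distribution at one site.

    `counts` may be read counts or frequencies; they are normalised here.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    entropy = 0.0
    for c in counts:
        p = c / total
        if p > 0:  # zero-frequency alleles contribute nothing
            entropy -= p * math.log2(p)
    return entropy


def site_nucleotide_diversity(counts: Sequence[int]) -> float:
    """Probability that two reads drawn without replacement differ at one site.

    Assumed estimator: 1 - sum(c * (c - 1)) / (n * (n - 1)), with n the depth.
    """
    n = sum(counts)
    if n < 2:
        return 0.0  # diversity is undefined below two reads; treated as zero
    same = sum(c * (c - 1) for c in counts)
    return 1.0 - same / (n * (n - 1))


def mean_nucleotide_diversity(sites: List[Sequence[int]]) -> float:
    """Average per-site diversity over all supplied genome positions."""
    if not sites:
        raise ValueError("at least one site is required")
    return sum(site_nucleotide_diversity(s) for s in sites) / len(sites)


if __name__ == "__main__":
    # Hypothetical counts (A, C, G, T) at three positions in one sample.
    sample = [[98, 2, 0, 0], [0, 100, 0, 0], [50, 0, 50, 0]]
    print([round(shannon_entropy(s), 3) for s in sample])
    print(round(mean_nucleotide_diversity(sample), 4))
```

In a TSI model these per-sample summaries would be only two of many candidate features, and their values depend strongly on sequencing depth and on which minor alleles are retained after quality control.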