Illuminating the Dark RNA Virome through ultra-deep homology search
- Funded by Canadian Institutes of Health Research (CIHR)
- Total publications: 0 publications
Grant number: 478741
Key facts
Disease
Disease X
Start year
2023
Known Financial Commitments (USD)
$168,764.09
Funder
Canadian Institutes of Health Research (CIHR)
Principal Investigator
Babaian Artem
Research Location
Canada
Lead Research Institution
University of Toronto
Research Priority Alignment
N/A
Research Category
Pathogen: natural history, transmission and diagnostics
Research Subcategory
Diagnostics
Special Interest Tags
Data Management and Data Sharing
Innovation
Study Type
Non-Clinical
Clinical Trial Details
N/A
Broad Policy Alignment
Pending
Age Group
Not Applicable
Vulnerable Population
Not applicable
Occupations of Interest
Not applicable
Abstract
SARS-CoV-2 has killed millions and cost the economy trillions. To pre-empt future pandemics, we will radically increase the sensitivity of the computational tools used to detect RNA viruses. Illuminating the full diversity of Earth's viruses supports Canadians' health directly, by empowering a new generation of unbiased viral diagnostics, and indirectly, through better viral surveillance across all facets of One Health (monitoring of crops, wildlife, and our environments for pathogens). At most 0.1% of Earth's viruses are known. Our ignorance is due to 1) inadequate sampling of Earth's biodiversity; and 2) poor computational sensitivity for detecting viruses. To overcome each of these limits, we propose combining two state-of-the-art technologies in a dual ultra-massive (Serratus) / ultra-sensitive (AlphaFold2) experiment. Serratus is a cloud-computing platform that allows us to analyze the global collection of DNA/RNA sequencing data. Collectively, the public sequencing data gathered over the past 15 years captures 10M+ samples (60M+ gigabytes) at a cost of $10+ billion. Previously, with Serratus, we analyzed 5.4 million samples and discovered >130,000 new species of RNA viruses (only 15,000 were known previously). That analysis was based on standard methods, which have less than 50% sensitivity for detecting novel viruses; we call the viruses they miss Dark RNA viruses. To identify the missing 50% of Dark RNA viruses, we will deploy an AI tool called AlphaFold2, which enables vastly more sensitive protein structure analysis, albeit at ~10,000x the computational cost. Thus we will "deeply" analyze 50,000 datasets to find examples of Dark RNA viruses, and in turn perform a "broad" re-analysis of all (now 8M+) public datasets. Illuminating the RNA virome has the potential to uncover viruses associated with human diseases long believed to have an infectious cause: (re)consider Parkinson's, Crohn's, arthritis, or cancers. The goal of this project is to catalyse a new generation of infectious disease research.