Illuminating the Dark RNA Virome through ultra-deep homology search

  • Funded by Canadian Institutes of Health Research (CIHR)
  • Total publications: 0 publications

Grant number: 478741

Key facts

  • Disease

    Disease X
  • Start Year

    2023
  • Known Financial Commitments (USD)

    $168,764.09
  • Funder

    Canadian Institutes of Health Research (CIHR)
  • Principal Investigator

    Babaian Artem
  • Research Location

    Canada
  • Lead Research Institution

    University of Toronto
  • Research Priority Alignment

    N/A
  • Research Category

    Pathogen: natural history, transmission and diagnostics

  • Research Subcategory

    Diagnostics

  • Special Interest Tags

    Data Management and Data Sharing; Innovation

  • Study Type

    Non-Clinical

  • Clinical Trial Details

    N/A

  • Broad Policy Alignment

    Pending

  • Age Group

    Not Applicable

  • Vulnerable Population

    Not applicable

  • Occupations of Interest

    Not applicable

Abstract

SARS-CoV-2 has killed millions and cost the economy trillions. To pre-empt future pandemics, we will radically increase the sensitivity of the computational tools used to detect RNA viruses. Illuminating the full diversity of Earth's viruses supports Canadians' health directly, by empowering a new generation of unbiased viral diagnostics, and indirectly, through better viral surveillance across all facets of One Health (monitoring of crops, wildlife, and our environments for pathogens). At most 0.1% of Earth's viruses are known. Our ignorance is due to 1) inadequate sampling of Earth's biodiversity; and 2) poor computational sensitivity for detecting viruses. To overcome each of these limits, we propose combining two state-of-the-art technologies in a dual ultra-massive (Serratus) / ultra-sensitive (AlphaFold2) experiment. Serratus is a cloud-computing platform that allows us to analyze the global collection of DNA/RNA sequencing data. Collectively, the public sequencing data gathered over the past 15 years captures 10M+ samples (60M+ gigabytes) at a cost of $10+ billion. Previously, with Serratus, we analyzed 5.4 million samples and discovered >130,000 new species of RNA viruses (only 15,000 were known previously). This was based on standard methods, which have less than 50% sensitivity for detecting novel viruses, which we call Dark RNA viruses. To identify the missing 50% of Dark RNA viruses, we will deploy an AI tool called AlphaFold2, which enables vastly more sensitive protein structure analysis, albeit at ~10,000x the computational cost. Thus, we will "deeply" analyze 50,000 datasets to find examples of Dark RNA viruses, and in turn perform a "broad" re-analysis of all (now 8M+) public datasets. Illuminating the RNA virome has the potential to uncover viruses associated with human diseases long believed to have an infectious cause: (re)consider Parkinson's, Crohn's, arthritis, or cancers. The goal of this project is to catalyse a new generation of infectious disease research.