Mining Diagnostics Sequences for SARS-CoV-2 Using Variation-Aware, Graph-Based Machine Learning Approaches Applied to SARS-CoV-1, SARS-CoV-2, and MERS Datasets

Grant number: unknown

Key facts

  • Disease

    COVID-19
  • Start Year

    -99
  • Known Financial Commitments (USD)

    $0
  • Funder

    C3.ai DTI
  • Principal Investigator

    Prof Nancy Amato, Prof Lawrence Rauchwerger, Assistant Prof Todd Treangen
  • Research Location

    United States of America
  • Lead Research Institution

    University of Illinois, Rice University
  • Research Priority Alignment

    N/A
  • Research Category

    Pathogen: natural history, transmission and diagnostics

  • Research Subcategory

    Diagnostics

  • Special Interest Tags

    N/A

  • Study Type

    Unspecified

  • Clinical Trial Details

    N/A

  • Broad Policy Alignment

    Pending

  • Age Group

    Not Applicable

  • Vulnerable Population

    Not applicable

  • Occupations of Interest

    Not applicable

Abstract

On March 11, 2020, the WHO determined that an outbreak of a novel coronavirus that had begun in Wuhan, China had reached pandemic status. Deep meta-transcriptomic RNA sequencing of bronchoalveolar lavage fluid samples from COVID-19 patients hospitalized in Wuhan in late December 2019 revealed sequence similarity to SARS-like coronaviruses. Their genus, Betacoronavirus, includes the viral etiologic agent of the 2002-2003 SARS outbreak in humans (SARS-CoV-1). Rapid and precise bacterial and viral diagnostics are extremely important in multiple clinical settings, ranging from routine visits to rapid epidemic response. This need is especially pressing given the current COVID-19 outbreak, caused by the SARS-CoV-2 coronavirus. The goal of this project is to use human and viral whole transcriptome analysis (RNA-Seq) and genomic datasets to identify SARS-CoV-2 "within host" polymorphisms that may interfere with diagnostic platforms, and to develop novel, graph-based approaches to study co-occurrence patterns for both consensus-level and low-frequency variants. We will compare these results to SARS-CoV-1 and MERS genomic data to glean population-level differences and elucidate biologically relevant differences specific to SARS-CoV-2, allowing for sensitive and accurate identification and transmission analysis of SARS-CoV-2.