Bioinformatics Framework for Wastewater-based Surveillance of Infectious Diseases

  • Funded by National Institutes of Health (NIH)
  • Total publications: 0 publications

Grant number: unknown


Key facts

  • Disease

  • Start & end year

  • Known Financial Commitments (USD)

  • Funder

    National Institutes of Health (NIH)
  • Principal Investigator

  • Research Location

    United States of America, Americas
  • Lead Research Institution

  • Research Category

    Epidemiological studies

  • Research Subcategory

    Disease susceptibility

  • Special Interest Tags


  • Study Subject


  • Clinical Trial Details


  • Broad Policy Alignment


  • Age Group


  • Vulnerable Population


  • Occupations of Interest



Project Summary

SARS-CoV-2 has caused a global pandemic, with 4.2M cases and 290K deaths worldwide (as of May 12, 2020). In the United States, there are over 1.3M cases and 81K deaths. Locally, Arizona has over 11K cases and 562 deaths. In response to this public health emergency, several studies have been published that describe patient characteristics in terms of signs, symptoms, and clinical endpoints. In addition, epidemiologists and infectious disease researchers have utilized next-generation sequencing technology to produce complete genomes of the virus for clinical and epidemiologic investigation. Genomic epidemiology has enabled scientists to understand localized transmission while determining geographic sources of introductions from different states and countries. However, most of the sequencing for SARS-CoV-2 (as well as for other viruses) is performed outside of state or local health departments, at organizations such as the Centers for Disease Control and Prevention (CDC), universities, or private labs. It can then be difficult to link the pathogen, once sequenced, back to the data collected by the health department for case investigation. This can inhibit genomic epidemiology when there is no link between sequences of viral isolates and epidemiologic case data.

There is limited research on how to link pathogen sequences to epidemiologic case data, especially for COVID-19. Thus, despite the abundance of clinical and epidemiologic data collected during this pandemic, more informatics research is needed to understand how to link viral genetic and epidemiological data and demonstrate the value of this for disease surveillance.

The goal of this supplement is to link epidemiologic data from COVID-19-positive patients in Arizona with viral genetics from sequenced isolates to better understand the relationship between viral genetics and epidemiologic and clinical phenotypes. We will accomplish this by utilizing Arizona's disease surveillance system and available sequences and metadata that are published in online nucleic acid databases. We will use different probabilistic matching strategies to link the two different sources (Aim 1) and then use Bayesian phylogenetics and phylogeography to study clustering of epidemiologic cases (Aim 2). Epidemiologists can use these findings to gain an understanding of how local viruses genetically cluster in relation to specific epidemiologic and clinical cases. While disease severity is dependent on individual immune response and environmental factors, linking viral genetic data to the proper epidemiologic case could also support hypothesis generation for future reverse genetics and immunological studies in animal models.
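
The summary does not say which probabilistic matching strategies Aim 1 would use. As one illustration only, the sketch below applies a Fellegi-Sunter style log-likelihood weight to link sequence metadata to surveillance case records. Everything specific in it is an assumption made for the example: the record fields (county, collection date, age group, sex), the m/u agreement probabilities, the one-day date tolerance, and the acceptance threshold are hypothetical and are not taken from the project.

```python
import math
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """Hypothetical shared fields of a case record or a sequence's metadata."""
    record_id: str
    county: str
    collection_date: date
    age_group: str
    sex: str


# Assumed (m, u) pairs per field, illustrative values only:
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
FIELD_PARAMS = {
    "county": (0.95, 0.10),
    "collection_date": (0.90, 0.02),
    "age_group": (0.90, 0.15),
    "sex": (0.98, 0.50),
}


def field_agrees(field: str, case: Record, seq: Record) -> bool:
    """Exact comparison, except dates, which may differ by one day (assumed)."""
    if field == "collection_date":
        return abs((case.collection_date - seq.collection_date).days) <= 1
    return getattr(case, field) == getattr(seq, field)


def match_weight(case: Record, seq: Record) -> float:
    """Sum of log2 likelihood ratios over all compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        if field_agrees(field, case, seq):
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


def link_records(cases, sequences, threshold=6.0):
    """Map each sequence ID to its best-scoring case ID.

    A link is kept only if the best score reaches the (assumed) threshold
    and is strictly better than the runner-up.
    """
    links = {}
    for seq in sequences:
        scored = sorted(
            ((match_weight(case, seq), case.record_id) for case in cases),
            reverse=True,
        )
        if not scored or scored[0][0] < threshold:
            continue
        if len(scored) > 1 and scored[0][0] == scored[1][0]:
            continue  # ambiguous: two cases tie for the best score
        links[seq.record_id] = scored[0][1]
    return links
```

In practice the m and u probabilities would be estimated from the data (for example with an expectation-maximization procedure) rather than fixed by hand, and pairs scoring between a lower and an upper threshold would usually go to manual review instead of being discarded.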