Statistical innovation to integrate sequences and phenotypes for scalable phylodynamic inference

  • Funded by National Institutes of Health (NIH)
  • Total publications: 0 publications

Grant number: 5R01AI153044-04

Key facts

  • Disease

    Lassa Haemorrhagic Fever, COVID-19
  • Start & end year

    2021 to 2025
  • Known Financial Commitments (USD)

    $459,019
  • Funder

    National Institutes of Health (NIH)
  • Principal Investigator

    PROFESSOR Marc Suchard
  • Research Location

    United States of America
  • Lead Research Institution

    UNIVERSITY OF CALIFORNIA LOS ANGELES
  • Research Priority Alignment

    N/A
  • Research Category

    Pathogen: natural history, transmission and diagnostics

  • Research Subcategory

    Pathogen genomics, mutations and adaptations

  • Special Interest Tags

    N/A

  • Study Type

    Non-Clinical

  • Clinical Trial Details

    N/A

  • Broad Policy Alignment

    Pending

  • Age Group

    Not Applicable

  • Vulnerable Population

    Not applicable

  • Occupations of Interest

    Not applicable

Abstract

This proposal targets the design, development and distribution of Bayesian statistical methods and software to study the historical and real-time emergence of rapidly evolving pathogens, such as Ebola, human immunodeficiency, influenza, Lassa, SARS-CoV-2, West Nile, yellow fever and Zika viruses. The proposal exploits novel scalable data integration to equip us for large-scale epidemics and pandemics and to help inform actionable public health policy. Our multidisciplinary team brings expertise across statistical thinking, data science, evolutionary biology and infectious diseases to leverage advancing sequencing technology and high-throughput biological experimentation that can characterize thousands of pathogen genomes, phenotype measurements, and ecological and clinical information from a single outbreak. Our chief innovations are threefold. First, we will invent and implement scalable Bayesian phylodynamic techniques to integrate phenotypic measurements and study their correlated evolution with disease spread. Second, we will foster biologically rich evolutionary models to map and understand heterogeneity in disease evolution through new efficient algorithms. Third, we will develop high-dimensional and mixed-type phenotype models to link concerted viral genotype-phenotype changes using massively parallel computing. Although no competing software exists to integrate phenotype and sequence data at this scale, we will compare restricted cases of our models on reduced datasets against current state-of-the-art approaches, using real-world examples to evaluate both the improvement in computational performance and the bias that these restrictions introduce. This proposal will deliver low-level toolbox libraries and user-friendly software for deployment across a rapidly expanding range of large-scale problems in statistics and medicine.