Scalable Inference in Statistical Models of Viral Evolution and Human Health

  • Funded by National Institutes of Health (NIH)
  • Total publications: 0

Grant number: 1F31AI154824-01A1


Key facts

  • Disease

    Lassa Haemorrhagic Fever, Ebola
  • Start & end year

    2021
    2023
  • Known Financial Commitments (USD)

    $37,853
  • Funder

    National Institutes of Health (NIH)
  • Principal Investigator

    PHD STUDENT Gabriel Hassler
  • Research Location

    United States of America
  • Lead Research Institution

    UNIVERSITY OF CALIFORNIA LOS ANGELES
  • Research Priority Alignment

    N/A
  • Research Category

    Epidemiological studies

  • Research Subcategory

    Disease surveillance & mapping

  • Special Interest Tags

    Data Management and Data Sharing

  • Study Type

    Non-Clinical

  • Clinical Trial Details

    N/A

  • Broad Policy Alignment

    Pending

  • Age Group

    Not Applicable

  • Vulnerable Population

    Not applicable

  • Occupations of Interest

    Not applicable

Abstract

Despite global public health advances, viruses remain a major threat to human health both in the United States and internationally. Recent and continuing outbreaks of SARS-CoV-2, Ebola, Zika, Lassa fever, and Chikungunya, as well as persistent epidemics such as HIV, have emphasized the need to understand viral evolution and virus-host interactions during epidemics. Phylogenetic statistical models of viral evolution offer a powerful tool for studying the interplay between viral genetics and environmental or host factors. However, current phylogenetic models are often too inflexible to realistically model these relationships, and those that are flexible enough are computationally intractable for even moderately sized data sets. This project aims to develop new statistical models that are both flexible enough to model complex biological relationships and scalable to large data sets of viral and host traits. The first aim is to develop more efficient and less biased statistical methods for estimating the heritability of viral phenotypes (e.g. viral load, host CD4 T-cell count, replicative capacity). Current statistical practices typically produce biased heritability estimates and are intractable for large data sets. This project seeks to extend state-of-the-art inference techniques to model the heritability of viral phenotypes (enabling both unbiased and efficient inference) and to apply these new methods to better estimate the heritability of viral load in HIV-1. The second aim seeks to develop statistical methods for studying complex, high-dimensional viral phenotypes such as infection severity, which cannot be captured with a single measurement. These phenotypes are difficult to quantify due to their inherent complexity, confounding rigorous efforts at, say, identifying unusually virulent viral clades. While phylogenetic factor analysis enables identification and quantification of high-dimensional phenotypes, it scales poorly to large data sets.
We propose new inference techniques that address these scalability problems and allow previously intractable analyses. We plan to apply these new methods to study patterns of virulence in Ebola and Lassa fever and to identify unusually virulent viral strains. Additionally, these methods are well suited to identifying epistatic interactions between viral mutations and phenotypes of interest, and we plan to explore these interactions in HIV, Zika, and Chikungunya viruses. The third aim is to develop new statistical models specifically designed to predict outcomes of viral infections from viral sequence data. To accommodate the flexibility these models require, we develop new inference strategies that are both highly generalizable (i.e. they do not rely on strict assumptions in existing models) and computationally efficient. Strong predictive performance would enable researchers or clinicians to predict clinically relevant outcomes using viral sequences, which could help inform treatment. We will evaluate these methods using the Ebola and Lassa fever data mentioned above.