Efficient Statistical Learning Methods for Personalized Medicine Using Large Scale Biomedical Data

  • Funded by National Institutes of Health (NIH)
  • Total publications: 0 publications

Grant number: unknown

Key facts

  • Disease

    COVID-19
  • Start & end year

    2018
    2022
  • Known Financial Commitments (USD)

    $331,147
  • Funder

    National Institutes of Health (NIH)
  • Principal Investigator

    DONGLIN ZENG
  • Research Location

    United States of America
  • Lead Research Institution

    University of North Carolina at Chapel Hill
  • Research Priority Alignment

    N/A
  • Research Category

    Epidemiological studies

  • Research Subcategory

    Impact/effectiveness of control measures

  • Special Interest Tags

    Data Management and Data Sharing

  • Study Type

    Non-Clinical

  • Clinical Trial Details

    N/A

  • Broad Policy Alignment

    Pending

  • Age Group

    Unspecified

  • Vulnerable Population

    Unspecified

  • Occupations of Interest

    Unspecified

Abstract

Project Summary: Coronavirus disease 2019 (COVID-19) has created a major public health crisis around the world. The novel coronavirus was observed to have a long incubation period and to be extremely infectious during this period. No proven effective treatment or vaccine is available. Massive public interventions have been implemented in many countries and states in the United States (US) at different phases of the outbreak with varying combinations of social distancing, mobility restriction and population behavioral change. Decisions on how to implement these interventions (e.g., when to impose and relax mitigation measures) rely on important statistics of COVID epidemiology (e.g., effective reproduction number) that characterize and predict the course of the COVID-19 outbreak. However, there is a lack of robust and parsimonious models of the COVID epidemic that can accurately reflect the heterogeneity between susceptible populations and regions (e.g., demographics, healthcare capacity, social and economic determinants). There is no rigorous study to guide precision public health interventions that are tailored to a population or region depending on its characteristics. Furthermore, due to the non-randomized nature of public health interventions, it is critical to account for biases and confounding when comparing mitigation measures of COVID-19 across regions. To address these challenges, this project develops robust and generalizable analytic methods to evaluate public health interventions and assess individual patient risks of COVID-19 infection and complications. In Aim 1, we will develop dynamic and robust statistical models to predict the disease epidemic. The models will estimate the date of the first unknown infection case and the instantaneous effective reproduction number, and will account for the incubation period of the COVID-19 virus.
Furthermore, heterogeneity in the population's demographics, social and economic indicators, healthcare capacity and geographic locations will be incorporated to reflect their impacts on the COVID epidemic. Under a longitudinal quasi-experimental design, we will provide valid inference for comparing public health interventions implemented in different regions while accounting for confounding bias. Multiple sources of data from different states in the US will be analyzed to empirically test which states' response strategies are more effective and in which subpopulations. In Aim 2, we will focus on developing a precise risk assessment tool for individual COVID-19 patients using electronic health records (EHRs) collected at New York Presbyterian Hospital in New York City, an epicenter of COVID-19. We will engineer features of patients' pre-existing conditions associated with severe COVID complications, recovery, or death. More importantly, we will engineer features that represent proxies of virus exposures from patients' geographic information. We will use machine learning techniques to create quantitative summaries of patient prognosis (e.g., transitioning to serious clinical stages, discharge, death). We will use internal cross-validation and external calibration to validate developed algorithms. The project will generate evidence to guide precision public health intervention, optimal patient care, and efficient healthcare resource allocation in anticipation of a second wave of the COVID epidemic and in preparation for other infectious disease outbreaks.
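
Illustrative note: the abstract does not specify how the instantaneous effective reproduction number in Aim 1 is estimated. As a rough sketch only, the widely used ratio estimator divides the new cases on day t by the "infection pressure" from earlier cases, weighted by an assumed serial interval distribution. The function name, the toy case counts, and the serial interval below are hypothetical and are not taken from the project, whose models also handle incubation periods, regional heterogeneity, and confounding.

```python
from typing import List, Optional


def instantaneous_rt(
    incidence: List[float], serial_interval: List[float]
) -> List[Optional[float]]:
    """Ratio estimator R_t = I_t / sum_s w_s * I_{t-s}.

    serial_interval[s - 1] is the assumed probability w_s that a secondary
    case appears s days after its infector (an illustrative assumption);
    the weights are normalized here so they sum to 1.
    """
    total = sum(serial_interval)
    w = [p / total for p in serial_interval]

    estimates: List[Optional[float]] = []
    for t in range(len(incidence)):
        # Infection pressure: earlier cases weighted by how likely they are
        # to be generating new cases on day t. Early days see only part of
        # the serial interval, so their estimates are biased upward.
        pressure = sum(
            w[s - 1] * incidence[t - s]
            for s in range(1, min(t, len(w)) + 1)
        )
        # No earlier cases means the ratio is undefined.
        estimates.append(incidence[t] / pressure if pressure > 0 else None)
    return estimates


# Toy data: cases double daily and every transmission takes exactly one day,
# so each case produces two new cases.
print(instantaneous_rt([1, 2, 4, 8, 16], [1.0]))
# → [None, 2.0, 2.0, 2.0, 2.0]
```

A value above 1 indicates a growing outbreak and a value below 1 a shrinking one, which is why this statistic is used when deciding to impose or relax mitigation measures.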