Applying a Targeted Machine Learning and Causal Inference Approach to Analyzing Long-Term Sequelae of COVID-19 Infection Through the National COVID Cohort Collaborative.
- Funded by National Institutes of Health (NIH)
- Total publications: 0 publications
Grant number: 1K01AI182501-01
Key facts
Disease
COVID-19
Start & end year
2024-2029
Known Financial Commitments (USD)
$134,923
Funder
National Institutes of Health (NIH)
Principal Investigator
POSTDOCTORAL SCHOLAR AND INSTRUCTOR Zachary Butzin-Dozier
Research Location
United States of America
Lead Research Institution
UNIVERSITY OF CALIFORNIA BERKELEY
Research Priority Alignment
N/A
Research Category
Pathogen: natural history, transmission and diagnostics
Research Subcategory
Immunity
Special Interest Tags
N/A
Study Type
Non-Clinical
Clinical Trial Details
N/A
Broad Policy Alignment
Pending
Age Group
Unspecified
Vulnerable Population
Unspecified
Occupations of Interest
Unspecified
Abstract
PROJECT SUMMARY / ABSTRACT
Candidate: I am an epidemiologist in the Division of Biostatistics at the University of California, Berkeley School of Public Health, and I completed my Ph.D. in Epidemiology in August 2022 at UC Berkeley. Since my graduation, I have worked with the Center for Targeted Machine Learning and Causal Inference (CTML) to apply cutting-edge biostatistical and causal inference methods to pressing COVID-19 research questions using data from the National COVID Cohort Collaborative (N3C). I led a group of CTML epidemiologists and biostatisticians in the NIH Long COVID Computational Challenge (L3C) competition, where we were honored with third place for our ensemble machine learning model that accurately predicted the risk of Long COVID diagnosis based on individual electronic health record (EHR) data in N3C. I aim to become a leader in the application of innovative biostatistical, causal inference, and machine learning methods to impactful research questions related to infectious disease epidemiology.
Environment: In order to attain my career goals, my training and mentorship plan will focus on recent advances in biostatistics, causal inference, and data science methods (Targeted Machine Learning) as well as immunology and infectious disease epidemiology. I have assembled an interdisciplinary team of expert biostatisticians, epidemiologists, and clinicians who will support my training. Alan Hubbard (primary mentor) and Mark van der Laan (co-mentor) will provide expert guidance and mentorship on biostatistics, data science, and causal inference. Rena Patel (co-mentor) and Jack Colford (scientific advisor) will provide mentorship and guidance in infectious disease epidemiology and immunology.
Research: Researchers and clinicians have made enormous progress in understanding, preventing, and treating acute COVID-19 infection, but there is considerable uncertainty regarding the factors associated with long-term sequelae of COVID-19 infection. Although vaccination is a key strategy for COVID-19 epidemic control, little is known regarding the role of COVID-19 vaccination timing relative to COVID-19 infection (i.e., up-to-date vaccinations and boosters) in preventing long-term sequelae of infection, and the lack of objective Long COVID biomarkers hampers our ability to evaluate, prevent, and treat Long COVID.
In Aim 1, I will evaluate the relationship between vaccination timing and Long COVID diagnosis in order to determine an optimized vaccination schedule to minimize Long COVID. In Aim 2, I will assess the relationship between COVID-19 vaccination timing and individual long-term sequelae of COVID-19 infection. In Aim 3, I will assess mediation of the relationship between acute COVID-19 infection and Long COVID via interleukin 6 (IL-6) to evaluate a biological mechanism of interest. I will apply Targeted Machine Learning methods to achieve these aims, which will prepare me for an R01-level application to apply these methods to research questions in infectious disease epidemiology.