CAREER: New Statistical Approaches for Studying Evolutionary Processes: Inference, Attribution and Computation
- Funded by National Science Foundation (NSF)
- Total publications: 0 publications
Grant number: 2143242
Key facts
Disease
COVID-19
Start & end year
2022-2027
Known Financial Commitments (USD)
$233,818
Funder
National Science Foundation (NSF)
Principal Investigator
Julia Palacios
Research Location
United States of America
Lead Research Institution
Stanford University
Research Priority Alignment
N/A
Research Category
Pathogen: natural history, transmission and diagnostics
Research Subcategory
Pathogen genomics, mutations and adaptations
Special Interest Tags
N/A
Study Type
Non-Clinical
Clinical Trial Details
N/A
Broad Policy Alignment
Pending
Age Group
Not Applicable
Vulnerable Population
Not applicable
Occupations of Interest
Not applicable
Abstract
This award is funded in whole or in part under the American Rescue Plan Act of 2021 (Public Law 117-2). Statistical inference from a sample of molecular sequences such as DNA poses a series of fundamental challenges. These challenges include complex modeling of the sample's ancestry and past evolutionary history, as well as large and noisy data. Because of the ongoing large-scale growth of genetic data, current methods cannot handle the amount of data available, and researchers are forced to down-sample the data or to infer parameters from insufficient summary statistics. This research project will address the need for optimally designed coalescent modeling for inference from modern molecular data. The coalescent is a probability model on genealogies, that is, the trees which represent the ancestry of the sample. Coalescent models are used for inferring parameters of scientific relevance such as effective population size, migration patterns and selection. The research goals of this project are to expand the class of coalescent models and to design novel, efficient statistical algorithms, allowing us to address many practical problems that advance science. Furthermore, the outcomes of the project will foster the development of new statistical theory and tractable methods that contribute to biological solutions. This project also outlines an active plan for a broad range of educational and outreach activities that will broaden participation in statistical sciences and foster a more inclusive atmosphere in science. The undergraduate and graduate students involved in the project will be offered a unique opportunity for interdisciplinary hands-on research training at the interface of statistical sciences and biology, allowing them to contribute to progress in evolutionary biology, molecular biology, population genetics, phylogenetics, cancer genomics, probabilistic modeling, statistical inference, and related fields.
The PI will actively participate in multiple outreach activities such as the Stanford undergraduate summer research program, which will allow for recruiting a more diverse pool of future data scientists and for fostering a more inclusive climate in science. The research findings of the project will serve as the foundation for a new program in statistical genetics and will be integrated into undergraduate and graduate courses.
Concretely, this project will expand the class of coalescent models and provide a suite of new algorithmic and statistical approaches by exploiting a metric notion of genealogies, lumpability of Markov chains and divide-and-conquer strategies. The specific aims are to (1) develop coalescent models that incorporate various sampling schemes and biological processes such as dynamic population structures, recombination and strong selection; (2) develop a metric framework for coalescent theory and applications; (3) develop scalable strategies for Bayesian inference of evolutionary parameters; and (4) implement and validate these methods and use them to analyze molecular sequences of infectious disease agents such as SARS-CoV-2, ancient and modern human DNA samples, and single-cell variation in cancer. Furthermore, the project will actively contribute to broadening participation in statistical sciences on multiple fronts, from team-based interdisciplinary research training to community outreach.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.