CAREER: Aequitas: A comprehensive machine learning framework to decode health disparities
- Funded by National Science Foundation (NSF)
- Total publications: 0 publications
Grant number: 2145411
Key facts
Disease
COVID-19
Start & end year
2022-2027
Known Financial Commitments (USD)
$499,985
Funder
National Science Foundation (NSF)
Principal Investigator
Joyce Ho
Research Location
United States of America
Lead Research Institution
Emory University
Research Priority Alignment
N/A
Research Category
Health Systems Research
Research Subcategory
Health information systems
Special Interest Tags
Data Management and Data Sharing
Study Type
Clinical
Clinical Trial Details
Not applicable
Broad Policy Alignment
Pending
Age Group
Not Applicable
Vulnerable Population
Not applicable
Occupations of Interest
Not applicable
Abstract
Addressing variability in health outcomes is important for improving the health and well-being of every person. Contextual characteristics of people's lives can be primary drivers of the disparities observed in health. Unfortunately, rigorous scientific approaches capable of modeling health outcomes are challenged by many issues related to data collection, including data privacy, insufficient sample size, heterogeneous data, missing data, multimodal data, and varying data quality across modalities. This project aims to develop a machine learning platform that can integrate a variety of data sources without requiring extensive annotation efforts. The platform will also enable modeling of health outcomes across multiple data sites while protecting patient privacy. The project will ultimately make it easier for researchers to identify relevant characteristics of health using existing public health surveillance data in concert with modern data sources, to inform the next steps for improving health outcomes. The project also proposes educational activities centered around raising awareness of disparities research and broadening participation in computing.
The project will advance health outcomes research by taking a comprehensive and holistic approach to unpacking the impact of health factors on patient outcomes. The project will address three fundamental and interrelated research challenges: (1) automated schema integration methods to enable collaborative standardization of data across multiple agencies; (2) broadening the landscape of federated learning to encompass multimodal and multi-level models that incorporate factors derived from non-traditional data sources; and (3) new human-guided summarization models that leverage existing expert knowledge to reduce the annotation burden. The three research aims will be complemented by an extensive evaluation plan that includes collaboration with public health experts and physicians focused on advancing health equity.
The project will demonstrate the utility and feasibility of the machine learning platform on diabetes, cardiovascular disease, and COVID-19 outcomes. This research effort will fill a critical computational gap toward optimal healthcare for all by harnessing data across multiple data sites and diverse data sources. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.