Deep phenotyping in Electronic Health Records for Genomic Medicine

SUMMARY

Sharable, innovative, and scalable methods for abstracting relevant characteristic patient phenotypes from electronic health records (EHRs) and for systematically understanding disease relationships are critical for accomplishing precise disease diagnoses and personalized disease prevention and treatment for patients.

As of May 28, 2020, there are 5,716,271 confirmed 2019 Novel Coronavirus (COVID-19) cases worldwide, including 1,699,933 cases in the United States, and 356,124 deaths across over 200 countries, areas, and territories, including 100,442 deaths in the United States, with the numbers continually climbing. The pandemic has had profound economic, social, and public health impact. As Columbia University Irving Medical Center (CUIMC) has been fighting the virus on the frontline in the epicenter of New York City and treating more than 4,100 SARS-CoV-2 positive patients, we aim to address the urgent COVID-19 public health need by developing sharable phenotyping methods to identify and characterize COVID-19 cases using our EHR data and multiple data standards, including the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) and the Human Phenotype Ontology (HPO), and to generate novel knowledge about COVID-19, such as its risk factors, disease subtypes, and temporal clinical courses.

Our specific aims for this supplement are as follows:

Extension to the original Aim 1: Develop and validate scalable and sharable approaches to abstracting characteristic phenotypes of COVID-19 from both structured and unstructured EHR data and to standardize the concept representations of these EHR phenotypes using widely adopted data standards, including the OMOP CDM, HPO, SNOMED-CT, UMLS, and RxNorm.

Extension to the original Aim 3: Develop and validate methods for temporal phenotyping for COVID-19 and methods for identifying disease subtypes of varying clinical outcomes among heterogeneous populations using deep characteristic EHR phenotypes of COVID-19.

We will disseminate the resulting methods and knowledge to the broad scientific communities and the nation. We will also leverage this supplement to create research and training opportunities for postdocs and graduate students from biomedical informatics, data science, and computer science, advancing interdisciplinary collaborations in data science and biomedical informatics to combat COVID-19 and other health problems.
