Deep Mining With The Covid-19 Data Warehouse
- Funded by Luxembourg National Research Fund
- Total publications: 0
Grant number: unknown
Key facts
Disease
COVID-19
Known Financial Commitments (USD)
$22,788
Funder
Luxembourg National Research Fund
Principal Investigator
Christoph Schommer
Research Location
Luxembourg
Lead Research Institution
University of Luxembourg
Research Priority Alignment
N/A
Research Category
13
Research Subcategory
N/A
Special Interest Tags
Data Management and Data Sharing
Study Type
Non-Clinical
Clinical Trial Details
N/A
Broad Policy Alignment
Pending
Age Group
Not Applicable
Vulnerable Population
Not applicable
Occupations of Interest
Not applicable
Abstract
At a time when COVID-19 is attracting worldwide attention, the quantity and variety of data are increasing dramatically. The result is data lakes, in which raw data appears in different formats and at different levels of quality. In the case of COVID-19, the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) has compiled a number of data sources, including data from the World Health Organization and others. The published data is largely time-series data covering worldwide mortality rates and infected and recovered cases of COVID-19 for more than 200 countries. The COVID-19 Open Research Dataset Challenge (CORD-19) is a resource of almost 60,000 scholarly articles, more than 75% of which are full-text articles. These are only two examples of publicly available data sets that aim to support a comprehensible analysis of the entire development of the disease. The decisive problem, however, is that the heterogeneity, diversity, and partly unstructured nature of the data make a deep analysis more difficult rather than easier. With this in view, DEEPHOUSE has two central goals. First, we consolidate the available text data and time-series data in a COVID-19 data warehouse, e.g., along multidimensional axes (time, place, and topic), by applying appropriate data integration techniques. Second, we build an extensible web-based platform that demonstrates the successful discovery of time-related sequences or time series, for example by visualizing or tracking topics over time. Because the warehouse is data-driven, the DEEPHOUSE methodology is transferable to other diseases.
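The idea of organizing data along the axes of time, place, and topic can be illustrated with a minimal Python sketch. It is not the DEEPHOUSE implementation, which the abstract does not describe. The `Fact` record, the `roll_up` helper, the measure names, and all sample rows are hypothetical and were made up for illustration only.

```python
from collections import defaultdict
from typing import NamedTuple


class Fact(NamedTuple):
    """One row of a hypothetical fact table with three dimension axes."""
    date: str      # time axis (ISO date)
    place: str     # place axis (country code)
    topic: str     # topic axis (free-form label)
    measure: str   # what is being counted
    value: int


def roll_up(facts, axes, measure):
    """Sum one measure over all facts, grouped by the chosen axes."""
    totals = defaultdict(int)
    for fact in facts:
        if fact.measure == measure:
            key = tuple(getattr(fact, axis) for axis in axes)
            totals[key] += fact.value
    return dict(totals)


# Placeholder rows, invented for this sketch (not real COVID-19 figures).
facts = [
    Fact("2020-03-01", "LU", "epidemiology", "confirmed_cases", 1),
    Fact("2020-03-02", "LU", "epidemiology", "confirmed_cases", 2),
    Fact("2020-03-01", "DE", "epidemiology", "confirmed_cases", 10),
    Fact("2020-03-01", "LU", "vaccines", "articles", 3),
    Fact("2020-03-02", "LU", "vaccines", "articles", 4),
]

# Aggregate time-series counts by place, and article counts by topic and date.
cases_by_place = roll_up(facts, ["place"], "confirmed_cases")
articles_by_topic_date = roll_up(facts, ["topic", "date"], "articles")
```

Grouping on `["topic", "date"]` gives the kind of per-topic series that tracking topics over time would start from, while grouping on `["place"]` collapses the time axis entirely.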