MRI: Acquisition of a Massive Database to Accelerate Data Science Discovery

This project is jointly funded by the Major Research Instrumentation (MRI) program and the Established Program to Stimulate Competitive Research (EPSCoR). The project funds construction of DataMountain, a massive database cluster for high performance computing at the University of Vermont (UVM). The large-memory machine will enhance the Vermont Advanced Computing Core, a virtual laboratory supporting the research of over 500 scientists in the state of Vermont. With so many fields transitioning from data-scarce to data-rich environments, many important research areas will benefit from this new machine, including research into addiction, mental illness, climate change, drug discovery, food systems, and the spread of online misinformation. DataMountain will provide fast access to enormous datasets, supporting several projects that require computational power and speed to effectively analyze, describe, and explain rapidly growing datasets.

DataMountain will increase the memory of the largest random access memory machine available for computational research at UVM by nearly two orders of magnitude. This will accelerate large-scale data-driven research that requires rapid reading and writing, and will facilitate a broad and diverse set of important scientific investigations that are not possible with the existing hardware. It will also enhance the functionality of the high performance computing clusters BlueMoon and DeepGreen, which are dedicated to parallel processing and machine learning respectively. For example, the machine will allow interactive access to over 50 terabytes of social media data through http://storywrangling.org and http://hedonometer.org for timely analysis of changes related to the COVID-19 pandemic in population-scale physical and mental health data. DataMountain will also allow massive increases in the spatial and temporal resolution of computational chemistry simulations performed for the data-driven design of next-generation antimicrobial peptides to combat antibiotic resistance, and it will enable exploration of petabytes of fMRI, genetic, task performance, and survey data associated with 10,000 adolescents across the United States over the next decade. Finally, the machine will accelerate research using unmanned aerial surveillance imaging for tree canopy assessments, facilitate network science modeling of the global agricultural diversity of crops and nutritional outcomes, and help quantify the impacts of the COVID-19 pandemic on food insecurity.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

1 Publication linked via Europe PMC

Computational timeline reconstruction of the stories surrounding Trump: Story turbulence, narrative control, and collective chronopathy.
