MRI: Acquisition of a Massive Database to Accelerate Data Science Discovery

  • Funded by National Science Foundation (NSF)
  • Total publications: 1 publication

Grant number: 2117345

Key facts

  • Disease

    COVID-19
  • Start & end year

    2021
    2023
  • Known Financial Commitments (USD)

    $725,016
  • Funder

    National Science Foundation (NSF)
  • Principal Investigator

    Christopher Danforth
  • Research Location

    United States of America
  • Lead Research Institution

    University of Vermont & State Agricultural College
  • Research Priority Alignment

    N/A
  • Research Category

    Secondary impacts of disease, response & control measures

  • Research Subcategory

    Other secondary impacts

  • Special Interest Tags

    Data Management and Data Sharing

  • Study Type

    Non-Clinical

  • Clinical Trial Details

    N/A

  • Broad Policy Alignment

    Pending

  • Age Group

    Not applicable

  • Vulnerable Population

    Not applicable

  • Occupations of Interest

    Not applicable

Abstract

This project is jointly funded by the Major Research Instrumentation and the Established Program to Stimulate Competitive Research (EPSCoR) programs. The project funds construction of DataMountain, a massive database cluster for high performance computing at the University of Vermont (UVM). The large-memory machine will enhance the Vermont Advanced Computing Core, a virtual laboratory supporting the research of over 500 scientists in the state of Vermont. With so many fields transitioning from data-scarce to data-rich environments, many important research areas will benefit from this new machine, including research into addiction, mental illness, climate change, drug discovery, food systems, and the spread of online misinformation. DataMountain will allow for fast access to enormous datasets, supporting several projects that require computational power and speed to effectively analyze, describe, and explain rapidly growing datasets.

DataMountain will increase by nearly two orders of magnitude the largest random access memory machine available for computational research at UVM, accelerating large-scale data-driven research requiring rapid reading and writing, and facilitating a broad and diverse set of important scientific investigations not currently possible given the existing hardware. It will also enhance the functionality of the high performance computing clusters BlueMoon and DeepGreen, which are dedicated to parallel processing and machine learning respectively. For example, the machine will allow for interactive access to over 50 terabytes of social media data through http://storywrangling.org and http://hedonometer.org for timely analysis of changes related to the COVID-19 pandemic in population-scale physical and mental health data. In addition, DataMountain will allow for massive increases in the spatial and temporal resolution of computational chemistry simulations being performed for data-driven design of next-generation antimicrobial peptides to combat antibiotic resistance. DataMountain will also enable exploration of petabytes of fMRI, genetic, task performance, and survey data associated with 10,000 adolescents across the United States over the next decade. In addition, the machine will accelerate research using unmanned aerial surveillance imaging for tree canopy assessments, facilitate network science modeling of agricultural diversity of crops and nutritional outcomes globally, and help quantify the impacts of the COVID-19 pandemic on food insecurity.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Publications linked via Europe PMC

Computational timeline reconstruction of the stories surrounding Trump: Story turbulence, narrative control, and collective chronopathy.