ATD: New Algorithms for Inference and Predictions on Large Geospatial Datasets

  • Funded by National Science Foundation (NSF)
  • Total publications: 2

Grant number: 2124222

Key facts

  • Disease

    COVID-19
  • Start & end year

    2021
    2024
  • Known Financial Commitments (USD)

    $200,000
  • Funder

    National Science Foundation (NSF)
  • Principal Investigator

    Sayar Karmakar
  • Research Location

    United States of America
  • Lead Research Institution

    University of Florida
  • Research Priority Alignment

    N/A
  • Research Category

    Epidemiological studies

  • Research Subcategory

    N/A

  • Special Interest Tags

    Data Management and Data Sharing

  • Study Type

    Non-Clinical

  • Clinical Trial Details

    N/A

  • Broad Policy Alignment

    Pending

  • Age Group

    Not Applicable

  • Vulnerable Population

    Not applicable

  • Occupations of Interest

    Not applicable

Abstract

The goal of this project is to accurately model, monitor, and forecast large-scale, publicly available geo-spatial datasets. In particular, we will focus on different adversarial phenomena with complex interactions between the space and time dimensions. The year 2020 left a significant mark on modern history with the devastating COVID-19 pandemic, which affected every part of the world. It gave rise to large spatio-temporal datasets with interesting time dynamics, shaped by different government strategies and vaccination rates in different locations. Understanding these spatio-temporal trends from the data and accurately forecasting what the future holds could be key to mitigating contagious diseases. This type of spatio-temporal data is present in many other important applications as well. For example, one key challenge for law enforcement agencies is to learn efficiently and effectively from both historical and incoming crime log data so as to optimize resource allocation. Other significant examples of spatio-temporal data arise in understanding environmental variables, studying brain images and different nodes over a period of time, understanding traffic flow, and inspecting satellite images over a long time horizon. Although parts of the work are specific to particular applications, the appeal of this proposal is that it builds a comprehensive and inclusive framework in which existing multivariate methods will be curated to highlight how space and time interact with each other. The project will provide research training opportunities for graduate students.

This project will focus on the following methodological aspects: (i) estimating suitable varying coefficient models with proper simultaneous confidence bands for the coefficients, to help reveal how these vary over time and space, and then, where plausible, choosing simpler models for these coefficients over time and space; (ii) identifying important covariates related to human dynamics, strategic adoption, and other external interventions, and how they affect these variables over time and space; and (iii) providing accurate yet robust forecasts for both short and long time horizons. The existing literature on spatio-temporal data either assumes a very specific model or builds a comparative framework of the spatial distribution at different time stamps, and thus ignores possible non-linear and non-separable interactions between space and time. This project uses some recent developments in multivariate time series and extends them to a spatio-temporal setting to address such generality. Since geo-spatial data are prohibitively large, the project also leverages recent significant advances in the high-dimensional statistics literature and proposes new methods that can incorporate very general space-time dependence. The new methods will be tested on a wide array of spatio-temporal datasets and are expected to yield new insights into how these complex stochastic processes are spread over space and time.
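As a rough illustration of what a varying coefficient model in aspect (i) can look like, the sketch below fits y_t = x_t' beta(t/n) + e_t with a generic kernel-weighted least squares estimator on simulated data. It is not the project's method: the abstract does not specify an estimator, and this version varies over time only, with no spatial dimension and no simultaneous confidence bands. The function name `tv_coefficients`, the Epanechnikov kernel, the bandwidth, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def tv_coefficients(X, y, bandwidth):
    """Local-constant kernel estimate of beta(t/n) at every time point.

    Hypothetical helper for illustration; assumes equally spaced observations.
    """
    n, p = X.shape
    u = np.arange(n) / n                      # rescaled time in [0, 1)
    betas = np.empty((n, p))
    for i in range(n):
        # Epanechnikov kernel weights centred at rescaled time u[i]
        z = (u - u[i]) / bandwidth
        w = np.where(np.abs(z) <= 1, 0.75 * (1 - z**2), 0.0)
        Xw = X * w[:, None]
        # Solve the weighted normal equations (X' W X) beta = X' W y
        betas[i] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
    return betas

# Simulated data (an assumption): constant intercept, slope that varies smoothly in time
rng = np.random.default_rng(0)
n = 400
u = np.arange(n) / n
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_slope = np.sin(2 * np.pi * u)
y = 1.0 + true_slope * X[:, 1] + 0.1 * rng.normal(size=n)

betas = tv_coefficients(X, y, bandwidth=0.1)

# Compare estimated and true slope away from the boundaries, where the kernel window is full
max_err = np.max(np.abs(betas[40:-40, 1] - true_slope[40:-40]))
print(f"max interior slope error: {max_err:.3f}")
```

Plotting `betas[:, 1]` against `u` shows whether a coefficient looks roughly constant or linear in time, which is the kind of evidence one might use before choosing a simpler model for it.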

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Publications linked via Europe PMC

An NLP-assisted Bayesian time-series analysis for prevalence of Twitter cyberbullying during the COVID-19 pandemic.

A model-free approach to do long-term volatility forecasting and its variants.