RAPID: Augmented Intelligence for Accelerating Covid-Related Scientific Discovery

The project will develop new artificial intelligence (AI) methods to augment the productivity of biomedical researchers and accelerate scientific discovery in the context of the COVID-19 pandemic. We will issue weekly updates to our widely used CORD-19 and SciSight resources, which are critical to researchers studying SARS-CoV-2 and have already been downloaded over 100,000 times by other researchers. We will also extend these resources in two ways to make them more useful to doctors and researchers. First, we will automatically generate one-sentence summaries of each paper to speed sensemaking of the rapidly changing literature. Second, we will automatically extract a broad range of entities (such as disease symptoms and research challenges) and relations to improve filtering and search.

To generate one-sentence summaries of research papers, we will train an abstractive BART model using two novel techniques: 1) co-training on the auxiliary task of title prediction, and 2) fine-tuning on a set of one-sentence summaries that we will generate by crowd-sourcing edits to peer-review comments taken from sites such as OpenReview. We will test our one-sentence summary generation with a combination of automated (ROUGE) metrics and user preference. To increase the number of entities and relations extracted from research papers, we will bootstrap with data-programming techniques and then apply graph-neural-network methods. We will evaluate our progress using a combination of expert-annotated data and held-out information from relevant knowledge bases.
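
As an illustration of the automated metric mentioned above, the following is a minimal sketch of a ROUGE-1 F1 score, which measures unigram overlap between a generated summary and a reference summary. It assumes lowercased whitespace tokenization with no stemming or stopword handling; the function name `rouge1_f1` is hypothetical, and the project's actual evaluation setup is not specified here and may differ.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary.

    Assumption: tokens are lowercased, whitespace-separated words.
    Standard ROUGE toolkits apply additional normalization.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

A score near 1 indicates that the generated sentence shares most of its words with the reference. Because word overlap does not capture whether a summary is useful to a reader, the abstract pairs this kind of metric with user preference.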

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
