Author

Junyi Jessy Li

Bio: Junyi Jessy Li is an academic researcher from the University of Texas at Austin. The author has contributed to research on topics including Computer science and Sentence. The author has an h-index of 15 and has co-authored 77 publications receiving 727 citations. Previous affiliations of Junyi Jessy Li include the University of Illinois at Chicago and the University of Pennsylvania.

Papers published on a yearly basis

Papers
Proceedings ArticleDOI
01 Jul 2018
TL;DR: This paper presents a corpus of 5,000 richly annotated abstracts of medical articles describing clinical randomized controlled trials, including demarcations of text spans that describe the Patient population enrolled, the Interventions studied and to what they were Compared, and the Outcomes measured.
Abstract: We present a corpus of 5,000 richly annotated abstracts of medical articles describing clinical randomized controlled trials. Annotations include demarcations of text spans that describe the Patient population enrolled, the Interventions studied and to what they were Compared, and the Outcomes measured (the ‘PICO’ elements). These spans are further annotated at a more granular level, e.g., individual interventions within them are marked and mapped onto a structured medical vocabulary. We acquired annotations from a diverse set of workers with varying levels of expertise and cost. We describe our data collection process and the corpus itself in detail. We then outline a set of challenging NLP tasks that would aid searching of the medical literature and the practice of evidence-based medicine.
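As a rough illustration of what such span annotations look like, here is a minimal sketch of one possible in-memory representation; the class name, field names, element labels, and concept ID below are assumptions made for illustration, not the corpus's actual schema.

```python
# Illustrative sketch only: one possible representation of PICO span annotations.
# Field names, element labels, and the concept ID are assumptions, not the
# corpus's actual schema.
from dataclasses import dataclass
from typing import Optional

@dataclass
class PICOSpan:
    element: str                      # e.g. "Population", "Intervention", "Comparator", "Outcome"
    start: int                        # character offsets into the abstract text
    end: int
    text: str
    concept_id: Optional[str] = None  # optional link to a structured medical vocabulary entry

abstract_text = "Patients with type 2 diabetes were randomized to metformin or placebo."

def make_span(element, text, concept_id=None):
    start = abstract_text.find(text)
    return PICOSpan(element, start, start + len(text), text, concept_id)

spans = [
    make_span("Population", "Patients with type 2 diabetes"),
    make_span("Intervention", "metformin", concept_id="CONCEPT-PLACEHOLDER"),
    make_span("Comparator", "placebo"),
]
for s in spans:
    print(f"{s.element:>12}: {s.text} [{s.start}:{s.end}]")
```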

153 citations

Journal ArticleDOI
TL;DR: It is shown that humans overwhelmingly prefer GPT-3 summaries and that these do not suffer from common dataset-specific issues such as poor factuality; moreover, neither reference-based nor reference-free automatic metrics can reliably evaluate zero-shot summaries.
Abstract: The recent success of prompting large language models like GPT-3 has led to a paradigm shift in NLP research. In this paper, we study its impact on text summarization, focusing on the classic benchmark domain of news summarization. First, we investigate how GPT-3 compares against fine-tuned models trained on large summarization datasets. We show that not only do humans overwhelmingly prefer GPT-3 summaries, prompted using only a task description, but these also do not suffer from common dataset-specific issues such as poor factuality. Next, we study what this means for evaluation, particularly the role of gold standard test sets. Our experiments show that both reference-based and reference-free automatic metrics cannot reliably evaluate GPT-3 summaries. Finally, we evaluate models on a setting beyond generic summarization, specifically keyword-based summarization, and show how dominant fine-tuning approaches compare to prompting. To support further research, we release: (a) a corpus of 10K generated summaries from fine-tuned and prompt-based models across 4 standard summarization benchmarks, (b) 1K human preference judgments comparing different systems for generic- and keyword-based summarization.
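To make the prompting setup concrete, here is a minimal sketch of task-description-only (zero-shot) summarization, assuming the pre-1.0 `openai` Python client and the GPT-3 model `text-davinci-002`; the prompt wording is an illustrative assumption, not the paper's exact prompt.

```python
# Zero-shot, task-description-only news summarization with GPT-3.
# Assumes the legacy (pre-1.0) openai client; the prompt wording is illustrative,
# not the exact prompt used in the paper.
import openai

openai.api_key = "YOUR_API_KEY"

def summarize(article, num_sentences=3):
    prompt = (
        f"Summarize the article below in {num_sentences} sentences.\n\n"
        f"Article: {article}\n\nSummary:"
    )
    response = openai.Completion.create(
        model="text-davinci-002",
        prompt=prompt,
        max_tokens=256,
        temperature=0.0,
    )
    return response["choices"][0]["text"].strip()
```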

99 citations

Proceedings Article
25 Jan 2015
TL;DR: A practical system for predicting sentence specificity that uses only features requiring minimal processing and is trained in a semi-supervised manner; the authors show that specificity is a useful indicator for finding sentences that need to be simplified and a useful objective for simplification.
Abstract: Recent studies have demonstrated that specificity is an important characterization of texts potentially beneficial for a range of applications such as multi-document news summarization and analysis of science journalism. The feasibility of automatically predicting sentence specificity from a rich set of features has also been confirmed in prior work. In this paper we present a practical system for predicting sentence specificity which exploits only features that require minimum processing and is trained in a semi-supervised manner. Our system outperforms the state-of-the-art method for predicting sentence specificity and does not require part of speech tagging or syntactic parsing as the prior methods did. With the tool that we developed — SPECITELLER — we study the role of specificity in sentence simplification. We show that specificity is a useful indicator for finding sentences that need to be simplified and a useful objective for simplification, descriptive of the differences between original and simplified sentences.
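The core idea of a lightweight specificity predictor can be sketched as a classifier over cheap surface features; the sketch below is not SPECITELLER itself, and the features, the tiny training set, and the plain supervised setup are stand-ins for the paper's semi-supervised system.

```python
# Toy specificity classifier over cheap surface features (1 = specific, 0 = general).
# Not SPECITELLER: features and training data here are illustrative stand-ins.
import re
import numpy as np
from sklearn.linear_model import LogisticRegression

def surface_features(sentence):
    tokens = sentence.split()
    return [
        len(tokens),                                        # sentence length
        sum(t[0].isupper() for t in tokens[1:]),            # capitalized non-initial words
        len(re.findall(r"\d", sentence)),                   # digit count
        sum(len(t) for t in tokens) / max(len(tokens), 1),  # mean word length
    ]

sentences = [
    "The company reported revenue of $4.2 billion in the third quarter of 2015.",
    "Things have generally been going well lately.",
    "Researchers enrolled 312 patients at 5 hospitals in Texas.",
    "People often disagree about policy.",
]
labels = [1, 0, 1, 0]

clf = LogisticRegression().fit(np.array([surface_features(s) for s in sentences]), labels)
test = "The drug reduced systolic blood pressure by 12 mmHg."
print(clf.predict_proba([surface_features(test)])[0, 1])  # estimated P(specific)
```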

89 citations

Proceedings ArticleDOI
01 Jan 2017
TL;DR: This work evaluates a suite of methods for aggregating sequential crowd labels to infer a single best set of consensus annotations, and for using crowd annotations as training data for a model that predicts sequences in unannotated text.
Abstract: Despite sequences being core to NLP, scant work has considered how to handle noisy sequence labels from multiple annotators for the same text. Given such annotations, we consider two complementary tasks: (1) aggregating sequential crowd labels to infer a best single set of consensus annotations; and (2) using crowd annotations as training data for a model that can predict sequences in unannotated text. For aggregation, we propose a novel Hidden Markov Model variant. To predict sequences in unannotated text, we propose a neural approach using Long Short Term Memory. We evaluate a suite of methods across two different applications and text genres: Named-Entity Recognition in news articles and Information Extraction from biomedical abstracts. Results show improvement over strong baselines. Our source code and data are available online.
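For contrast with the proposed HMM variant, the simplest aggregation strategy is a per-token majority vote; the sketch below shows only that baseline, with made-up annotator labels.

```python
# Per-token majority vote over sequence labels from multiple annotators.
# This is only the simplest baseline, not the HMM variant proposed in the paper;
# the tokens and annotator labels below are made up.
from collections import Counter

def aggregate_majority(annotations):
    """annotations: list of equal-length label sequences, one per annotator."""
    consensus = []
    for position_labels in zip(*annotations):
        label, _ = Counter(position_labels).most_common(1)[0]
        consensus.append(label)
    return consensus

tokens = ["Aspirin", "reduced", "pain", "in", "elderly", "patients"]
annotations = [
    ["B-INT", "O", "B-OUT", "O", "B-POP", "I-POP"],
    ["B-INT", "O", "O",     "O", "B-POP", "I-POP"],
    ["B-INT", "O", "B-OUT", "O", "O",     "B-POP"],
]
print(list(zip(tokens, aggregate_majority(annotations))))
```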

81 citations

Proceedings ArticleDOI
01 Sep 2016
TL;DR: This work examines how elementary discourse units (EDUs) from Rhetorical Structure Theory can be used to extend extractive summarizers to produce a wider range of human-like summaries, and demonstrates that EDU segmentation is effective in preserving human-labeled summarization concepts within sentences.
Abstract: Although human-written summaries of documents tend to involve significant edits to the source text, most automated summarizers are extractive and select sentences verbatim. In this work we examine how elementary discourse units (EDUs) from Rhetorical Structure Theory can be used to extend extractive summarizers to produce a wider range of human-like summaries. Our analysis demonstrates that EDU segmentation is effective in preserving human-labeled summarization concepts within sentences and also aligns with near-extractive summaries constructed by news editors. Finally, we show that using EDUs as units of content selection instead of sentences leads to stronger summarization performance in near-extractive scenarios, especially under tight budgets.
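As a sketch of the content-selection idea, EDUs can be selected greedily under a word budget; the segmentation, scoring function, and example text below are placeholders rather than the paper's system.

```python
# Greedy selection of EDUs under a word budget. The pre-segmented EDUs and the
# scoring function are placeholders; this illustrates the idea, not the paper's system.
def greedy_select(edus, score, budget):
    """Pick high-scoring EDUs whose total length stays within `budget` words."""
    chosen, used = [], 0
    for edu in sorted(edus, key=score, reverse=True):
        length = len(edu.split())
        if used + length <= budget:
            chosen.append(edu)
            used += length
    return [e for e in edus if e in chosen]  # restore source order

edus = [
    "The mayor announced a new transit plan on Monday,",
    "which is expected to cost $2 billion,",
    "drawing criticism from opposition councillors.",
]
summary = greedy_select(edus, score=lambda e: len(e.split()), budget=15)
print(" ".join(summary))
```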

58 citations


Cited by

Journal ArticleDOI
TL;DR: A review of The Nature and Origins of Mass Opinion by John Zaller (1992), whose model of mass opinion formation offers readers an introduction to the prevailing theory of opinion formation.
Abstract: Originally published in Contemporary Psychology: APA Review of Books, 1994, Vol 39(2), 225. Reviews the book, The Nature and Origins of Mass Opinion by John Zaller (1992). The author's commendable effort to specify a model of mass opinion formation offers readers an introduction to the prevailing vi

3,150 citations

Proceedings ArticleDOI
01 Nov 2019
TL;DR: SciBERT leverages unsupervised pretraining on a large multi-domain corpus of scientific publications to improve performance on downstream scientific NLP tasks and demonstrates statistically significant improvements over BERT.
Abstract: Obtaining large-scale annotated data for NLP tasks in the scientific domain is challenging and expensive. We release SciBERT, a pretrained language model based on BERT (Devlin et al., 2018) to address the lack of high-quality, large-scale labeled scientific data. SciBERT leverages unsupervised pretraining on a large multi-domain corpus of scientific publications to improve performance on downstream scientific NLP tasks. We evaluate on a suite of tasks including sequence tagging, sentence classification and dependency parsing, with datasets from a variety of scientific domains. We demonstrate statistically significant improvements over BERT and achieve new state-of-the-art results on several of these tasks. The code and pretrained models are available at https://github.com/allenai/scibert/.
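The released checkpoint can be loaded directly through Hugging Face transformers; the snippet below shows generic feature extraction with the public `allenai/scibert_scivocab_uncased` model, not any specific fine-tuned downstream task.

```python
# Load the released SciBERT checkpoint and encode a scientific sentence.
# Generic feature extraction only; the downstream tasks in the paper add
# task-specific heads and fine-tune on labeled data.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
model = AutoModel.from_pretrained("allenai/scibert_scivocab_uncased")

inputs = tokenizer("We present a corpus of 5,000 annotated RCT abstracts.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, num_tokens, 768)
```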

1,864 citations

Posted Content
TL;DR: A neural network model with a novel intra-attention that attends over the input and the continuously generated output separately, and a new training method that combines standard supervised word prediction with reinforcement learning (RL), producing higher-quality summaries.
Abstract: Attentional, RNN-based encoder-decoder models for abstractive summarization have achieved good performance on short input and output sequences. For longer documents and summaries, however, these models often include repetitive and incoherent phrases. We introduce a neural network model with a novel intra-attention that attends over the input and continuously generated output separately, and a new training method that combines standard supervised word prediction and reinforcement learning (RL). Models trained only with supervised learning often exhibit "exposure bias" - they assume ground truth is provided at each step during training. However, when standard word prediction is combined with the global sequence prediction training of RL the resulting summaries become more readable. We evaluate this model on the CNN/Daily Mail and New York Times datasets. Our model obtains a 41.16 ROUGE-1 score on the CNN/Daily Mail dataset, an improvement over previous state-of-the-art models. Human evaluation also shows that our model produces higher quality summaries.
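The combination of supervised word prediction and RL described above is commonly written as a weighted mixture of the two losses; the notation below (γ as the mixing weight) is an assumption made for illustration, following the general description in the abstract.

```latex
% Sketch of a mixed training objective: L_ml is the usual maximum-likelihood
% (teacher-forced) loss, L_rl a policy-gradient loss with a summary-level reward
% such as ROUGE, and gamma a mixing weight (notation assumed for illustration).
L_{\text{mixed}} = \gamma \, L_{\text{rl}} + (1 - \gamma) \, L_{\text{ml}}
```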

1,119 citations