Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
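A minimal sketch of the text-to-text format described above, assuming the released T5 checkpoints are loaded through the Hugging Face transformers library (not the authors' released codebase): every task, from translation to classification, is identified only by a short text prefix, and every output, even a class label, is generated as text.

```python
# Sketch: the unified text-to-text format with a pre-trained T5 checkpoint,
# loaded via Hugging Face transformers (an assumption; the paper's own code
# release is a separate codebase). Task prefixes tell one model which
# problem to solve; all outputs are produced as plain text.
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

prompts = [
    "translate English to German: The house is wonderful.",
    "summarize: state authorities dispatched emergency crews tuesday to survey the damage ...",
    "cola sentence: The course is jumping well.",  # acceptability classification, answered as text
]
for prompt in prompts:
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=50)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```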



Citations
Proceedings Article

Explaining Answers with Entailment Trees.

TL;DR: The authors generate explanations in the form of multistep entailment trees: trees of multi-premise entailment steps that lead from known facts, through intermediate conclusions, to the hypothesis of interest (namely the question plus its answer).
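Read as a data structure, such an entailment tree has known facts at its leaves, intermediate conclusions at internal nodes, and the hypothesis at the root. The sketch below is a hypothetical Python rendering of that shape, with an invented toy example; it is not the cited paper's code or data.

```python
# Hypothetical sketch of an entailment tree as a data structure (not the
# cited paper's implementation): leaves are known facts, internal nodes are
# intermediate conclusions, and the root is the hypothesis.
from dataclasses import dataclass, field
from typing import List

@dataclass
class EntailmentNode:
    statement: str                                   # the conclusion or fact at this node
    premises: List["EntailmentNode"] = field(default_factory=list)  # empty for known facts

# Invented toy example:
fact1 = EntailmentNode("An apple is a kind of fruit.")
fact2 = EntailmentNode("Fruits contain seeds.")
intermediate = EntailmentNode("An apple contains seeds.", premises=[fact1, fact2])
hypothesis = EntailmentNode(
    "Eating an apple means eating something that contains seeds.",
    premises=[intermediate, EntailmentNode("Eating a food means consuming what it contains.")],
)
```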
Proceedings Article (DOI)

Media File Descriptor Using Deep Learning

TL;DR: The Media File Descriptor, as described in this paper, is a generalized platform that quickly produces a summary of any media file (audio, video, or image), so that the user has adequate information about the file beforehand and can decide whether it is relevant to what they are looking for.
Journal Article (DOI)

Predict-then-Decide: A Predictive Approach for Wait or Answer Task in Dialogue Systems

TL;DR: In this article, the authors propose a predictive approach named Predict-then-Decide (PTD) to solve the wait-or-answer task in dialogue systems, in which the decision model makes its choice with the assistance of two ancillary prediction models: a user prediction model and an agent prediction model.
Proceedings Article

Exploring Multitask Learning for Low-Resource Abstractive Summarization

TL;DR: This paper explored the effect of using multitask learning for abstractive summarization in the context of small training corpora and found that a model trained in a multitask setting outperforms a model that is trained only for summarization, with no additional summarization data.
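As a rough sketch of what such a multitask setting can look like with a text-to-text model (assuming the Hugging Face transformers library and a toy in-memory dataset; this is not the cited paper's implementation), a small summarization set is mixed with a larger auxiliary task and a single shared model is trained on the union, with task prefixes keeping the tasks distinct.

```python
# Sketch of multitask fine-tuning for low-resource summarization with a
# text-to-text model (toy data, not the cited paper's code): examples from
# different tasks are mixed into one stream and trained with one model.
import torch
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Each task keeps its own prefix so the shared model can tell them apart.
summarization = [("summarize: the quick brown fox jumped over the lazy dog ...", "a fox story")]
auxiliary = [("cola sentence: the book was read by me .", "acceptable")]
mixed = summarization + auxiliary  # in practice, tasks are sampled proportionally

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
for source, target in mixed:
    batch = tokenizer(source, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids
    loss = model(input_ids=batch.input_ids,
                 attention_mask=batch.attention_mask,
                 labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```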
Posted Content

SynthBio: A Case Study in Human-AI Collaborative Curation of Text Datasets.

TL;DR: This paper used a large language model to provide seed generations to human raters, thereby turning dataset authoring from a writing task into an editing task, and applied this method to curate SynthBio, a new evaluation set for WikiBio composed of structured attribute lists describing fictional individuals, mapped to natural language biographies.
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.