Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Open Access Journal Article

Colin Raffel, +8 more

- 01 Jan 2020 -

Journal of Machine Learning Research

- Vol. 21, Iss: 140, pp 1-67

TLDR

This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.

Abstract:

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
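The text-to-text format mentioned in the abstract can be pictured with a small sketch. This is an illustration only, not the authors' released code: the function name `to_text_pair`, the dictionary keys, and the label mapping are hypothetical, and the task prefixes are modeled on the style of examples given in the paper. The idea is that every task, including classification, becomes a pair of strings, so one model and one training objective can serve all of them.

```python
from typing import Dict, Tuple

# Hypothetical label names for a binary acceptability task (assumption).
ACCEPTABILITY_LABELS = {0: "not acceptable", 1: "acceptable"}


def to_text_pair(task: str, example: Dict) -> Tuple[str, str]:
    """Cast one labeled example to an (input text, target text) pair."""
    if task == "translation":
        # The task is signaled by a plain-text prefix on the input.
        return "translate English to German: " + example["en"], example["de"]
    if task == "summarization":
        return "summarize: " + example["document"], example["summary"]
    if task == "acceptability":
        # A class label is emitted as a literal string, not a class index.
        target = ACCEPTABILITY_LABELS[example["label"]]
        return "cola sentence: " + example["sentence"], target
    raise ValueError(f"unknown task: {task}")


pair = to_text_pair(
    "acceptability",
    {"sentence": "The course is jumping well.", "label": 0},
)
print(pair)
```

Because inputs and targets are both plain strings, adding a task in this sketch means adding one more branch with its own prefix; the model interface does not change.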

Citations

Proceedings Article

COM2SENSE: A Commonsense Reasoning Benchmark with Complementary Sentences

Shikhar Singh, +6 more

TL;DR: In this article, the authors introduce a commonsense reasoning benchmark dataset comprising natural language true/false statements, with each sample paired with its complementary counterpart, resulting in 4k sentence pairs.

Proceedings Article

proScript: Partially Ordered Scripts Generation

Keisuke Sakaguchi, +5 more

TL;DR: This paper used pre-trained neural language models to generate high-quality scripts, at varying levels of granularity, for a wide range of everyday scenarios (e.g., bake a cake).

Book Chapter

Generating Empathetic Responses with a Pre-trained Conversational Model

Jackylyn Beredo, +3 more

TL;DR: In this article, a pre-trained neural conversational language model named DialoGPT and a new collection of empathetic dialogues tagged with emotions are used to investigate the model's ability to learn and generate more empathetic responses.

Proceedings Article

PASS: Perturb-and-Select Summarizer for Product Reviews

Nadav Oved, +1 more

TL;DR: The Perturb-and-Select Summarizer (PASS) employs a large pre-trained Transformer-based model that follows a few-shot fine-tuning scheme.

Posted Content

Retrieval-guided Counterfactual Generation for QA

Bhargavi Paranjape, +2 more

- 14 Oct 2021 -

arXiv: Computation and Language

TL;DR: This paper developed a Retrieve-Generate-Filter (RGF) technique to create counterfactual evaluation and training data with minimal human supervision, using an open-domain QA framework and question generation model trained on original task data.
