Open Access Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.

Abstract:
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
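The text-to-text casting described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' released code; the task prefixes below follow the conventions reported for T5, but treat the exact strings and the helper `to_text_to_text` as illustrative assumptions.

```python
def to_text_to_text(task: str, **fields) -> dict:
    """Cast a task instance into an (input_text, target_text) string pair,
    so that every task is handled by the same text-in, text-out model."""
    if task == "translation":
        inp = f"translate English to German: {fields['source']}"
        tgt = fields["target"]
    elif task == "summarization":
        inp = f"summarize: {fields['document']}"
        tgt = fields["summary"]
    elif task == "classification":  # e.g. MNLI-style entailment
        inp = f"mnli premise: {fields['premise']} hypothesis: {fields['hypothesis']}"
        tgt = fields["label"]  # even class labels are emitted as plain text
    else:
        raise ValueError(f"unknown task: {task}")
    return {"input_text": inp, "target_text": tgt}

example = to_text_to_text(
    "classification",
    premise="A cat sat on the mat.",
    hypothesis="An animal is on the mat.",
    label="entailment",
)
```

The key design point is that classification targets are generated as literal text (e.g. the word "entailment"), so no task-specific output heads are needed.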
Citations
Proceedings Article
What Makes Good In-Context Examples for GPT-3?
TL;DR: This paper studies how the choice of in-context examples affects GPT-3's few-shot performance and proposes retrieving examples that are semantically similar to the test input, yielding substantial gains over random selection.
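The retrieval idea in the entry above can be sketched as nearest-neighbour search over a pool of labeled examples. This is a hedged illustration, not the paper's implementation: the paper retrieves with sentence encoders, while here a simple bag-of-words cosine similarity stands in so the sketch is self-contained.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_examples(query: str, pool: list[str], k: int = 2) -> list[str]:
    """Return the k pool examples most similar to the query."""
    q = Counter(query.lower().split())
    scored = sorted(pool, key=lambda x: cosine(q, Counter(x.lower().split())), reverse=True)
    return scored[:k]

pool = [
    "the movie was fantastic -> positive",
    "terrible acting and plot -> negative",
    "stock prices rose sharply -> business",
]
best = retrieve_examples("the film was fantastic", pool, k=1)
# best == ["the movie was fantastic -> positive"]
```

The retrieved examples would then be concatenated into the prompt ahead of the test input; swapping the bag-of-words similarity for a learned sentence encoder is the paper's actual contribution.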
Proceedings Article
SemEval-2021 Task 4: Reading Comprehension of Abstract Meaning.
TL;DR: The SemEval-2021 shared task 4: Reading Comprehension of Abstract Meaning (ReCAM) as discussed by the authors was designed to help evaluate the ability of machines to represent and understand abstract concepts.
Proceedings Article
Efficient Meta Lifelong-Learning with Limited Memory
TL;DR: This paper proposed a meta-lifelong framework that combines three common principles of lifelong learning methods, achieving state-of-the-art performance on text classification and question answering benchmarks while mitigating catastrophic forgetting and negative transfer, which prior methods suffer from simultaneously.
Proceedings Article
CoTexT: Multi-task Learning with Code-Text Transformer
TL;DR: CoTexT as discussed by the authors is a transformer-based encoder-decoder model that learns the representative context between natural language (NL) and programming language (PL) using self-supervision.
Proceedings Article
Learning to Decompose and Organize Complex Tasks
TL;DR: This paper proposes a novel end-to-end pipeline that consumes a complex task and induces a dependency graph from unstructured text to represent sub-tasks and their relationships; the proposed models significantly outperform a state-of-the-art text generator.