Open Access Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.

Abstract:
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
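The unifying idea in the abstract is that every task is expressed as feeding the model a text string and training it to emit a text string, with a task prefix identifying the problem. A minimal sketch of this conversion (the `to_text_to_text` helper and example field names are illustrative, not from the released code; the prefixes follow the convention described in the paper):

```python
def to_text_to_text(task, example):
    """Cast a task-specific example into an (input_text, target_text) pair.

    Every task, whether generative or discriminative, becomes plain
    string-to-string prediction; classification targets are the label
    *names* as text, not integer ids.
    """
    if task == "translation":
        return ("translate English to German: " + example["en"], example["de"])
    if task == "summarization":
        return ("summarize: " + example["document"], example["summary"])
    if task == "classification":
        # e.g. CoLA acceptability: target is the string "acceptable"/"unacceptable"
        return ("cola sentence: " + example["sentence"], example["label"])
    raise ValueError("unknown task: %s" % task)

# One example per task, all reduced to the same string-pair interface:
pairs = [
    to_text_to_text("summarization",
                    {"document": "Long article text.", "summary": "Short summary."}),
    to_text_to_text("classification",
                    {"sentence": "The cat sat.", "label": "acceptable"}),
]
```

Because every task shares this interface, one model, one loss (maximum-likelihood text generation), and one decoding procedure cover all of them.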
Citations
Posted Content
Language Model is All You Need: Natural Language Understanding as Question Answering
TL;DR: This work maps Natural Language Understanding (NLU) problems to Question Answering (QA) problems and shows that in low-data regimes this approach offers significant improvements over other approaches to NLU.
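The core recasting is simple: an NLU example such as intent classification is rephrased as a question over the utterance, so a QA model can answer it. A minimal sketch of this mapping (the function name, question template, and field names are hypothetical, not from the cited paper):

```python
def nlu_as_qa(utterance, candidate_intents):
    """Reframe intent classification as a QA instance: the utterance becomes
    the context and the candidate labels are enumerated in the question."""
    question = ("Which of the following intents does the user express: "
                + ", ".join(candidate_intents) + "?")
    return {"question": question, "context": utterance}

qa_instance = nlu_as_qa("Book me a flight to Oslo tomorrow",
                        ["book_flight", "cancel_booking", "check_status"])
```

Because the labels appear as answer candidates in natural language, a pre-trained QA model can be applied with little or no task-specific data, which is where the low-data gains come from.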
Proceedings Article
Measuring Massive Multitask Language Understanding
TL;DR: This article proposed a new test to measure a text model's multitask accuracy, covering 57 tasks including elementary mathematics, US history, computer science, law, and more, and found that attaining high accuracy requires extensive world knowledge and problem-solving ability.
Proceedings ArticleDOI
Unsupervised Text Style Transfer with Padded Masked Language Models
TL;DR: The experiments on sentence fusion and sentiment transfer demonstrate that Masker performs competitively in a fully unsupervised setting, and in low-resource settings, it improves supervised methods' accuracy by over 10 percentage points when pre-training them on silver training data generated by Masker.
Proceedings ArticleDOI
Continual lifelong learning in natural language processing: a survey
TL;DR: Continual learning aims to enable information systems to learn from a continuous data stream across time. However, it is difficult for existing deep learning architectures to learn a new task without largely forgetting previously acquired knowledge.
Proceedings ArticleDOI
Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond
TL;DR: Rotom is a meta-learning framework that automatically learns a policy for combining examples from different data augmentation (DA) operators, whereby the hyper-parameter space is combinatorially reduced; its evaluation covers entity matching and text classification.