Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new ``Colossal Clean Crawled Corpus'', we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
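To make the text-to-text framing concrete, the sketch below (not taken from the paper or this page) runs a few tasks through a publicly released T5 checkpoint via the Hugging Face Transformers library; the library calls, the "t5-small" model name, and the task prefixes are assumptions about the public release rather than content of this abstract. The point is only that translation, summarization, and classification all share one interface: a string goes in and a string comes out.

    # Minimal, illustrative sketch of the text-to-text idea, assuming the
    # Hugging Face "transformers" library and the public "t5-small" checkpoint.
    from transformers import T5ForConditionalGeneration, T5TokenizerFast

    tokenizer = T5TokenizerFast.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    # Every task is cast as "text in, text out" by prepending a task prefix.
    examples = [
        "translate English to German: The house is wonderful.",
        "summarize: Transfer learning, where a model is first pre-trained on a "
        "data-rich task before being fine-tuned on a downstream task, has "
        "emerged as a powerful technique in natural language processing.",
        "cola sentence: The books was on the table.",  # acceptability judgment
    ]

    for text in examples:
        inputs = tokenizer(text, return_tensors="pt")
        output_ids = model.generate(**inputs, max_length=60)
        print(tokenizer.decode(output_ids[0], skip_special_tokens=True))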



Citations
Posted Content

Error Detection in Large-Scale Natural Language Understanding Systems Using Transformer Models

TL;DR: In this article, the authors combine utterance encodings from a RoBERTa model with the N-best hypotheses produced by the production system and fine-tune end-to-end in a multi-task setting, using a small dataset of human-annotated utterances with domain classification errors.
Proceedings ArticleDOI

RewardsOfSum: Exploring Reinforcement Learning Rewards for Summarisation

TL;DR: This article proposes two reward functions for abstractive summarization: the first, referred to as RwB-Hinge, dynamically selects the samples for the gradient update, while the second, nicknamed RISK, leverages a small pool of strong candidates to inform the reward.
Proceedings ArticleDOI

MCL@IITK at SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation using Augmented Data, Signals, and Transformers

TL;DR: This article used transformer-based language models to detect, as a sentence-pair classification task, whether a given word common to both sentences evokes the same meaning, achieving the best performance in SemEval-2021 Task 2.
Posted Content

Unsupervised and Distributional Detection of Machine-Generated Text.

TL;DR: The authors proposed a method to detect machine-generated documents by leveraging repeated higher-order n-grams, which they show appear more often in machine-generated text than in human-written text.
Posted Content

How not to Lie with a Benchmark: Rearranging NLP Leaderboards.

TL;DR: In this paper, the authors examine the overall scoring methods of popular NLP benchmarks and re-rank the models by the geometric and harmonic means (appropriate for averaging rates) of their reported results.
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.