Open Access Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
TL;DR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.

Abstract:
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
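The "everything is text-to-text" idea in the abstract can be sketched concretely: each task becomes a pair of strings, with a task prefix on the input and the label or target rendered as literal text. The helper below is an illustrative sketch, not the authors' released code; the prefixes follow the style described in the paper (e.g. "translate English to German:", "cola sentence:"), but the function name and example dictionaries are hypothetical.

```python
# Hypothetical sketch of casting heterogeneous NLP tasks into a single
# text-to-text format, as the T5 framework does. Every task is reduced to
# (input_text, target_text): even classification labels become words.

def to_text_to_text(task: str, example: dict) -> tuple[str, str]:
    """Convert a labeled example into an (input_text, target_text) pair."""
    if task == "translation":
        return (f"translate English to German: {example['en']}", example["de"])
    if task == "classification":
        # Class labels are emitted as literal words, e.g. "acceptable".
        return (f"cola sentence: {example['sentence']}", example["label"])
    if task == "summarization":
        return (f"summarize: {example['document']}", example["summary"])
    raise ValueError(f"unknown task: {task}")

inp, tgt = to_text_to_text(
    "classification",
    {"sentence": "The book was read by me.", "label": "acceptable"},
)
print(inp)  # cola sentence: The book was read by me.
print(tgt)  # acceptable
```

Because every task shares this string-in/string-out interface, one encoder-decoder model with one training objective can be pre-trained and fine-tuned across all of them.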
Citations
Proceedings Article
Character-level Representations Improve DRS-based Semantic Parsing Even in the Age of BERT
TL;DR: The authors combine character-level and contextual language model representations to improve performance on Discourse Representation Structure parsing, using a new method of analysis based on semantic tags to demonstrate that the character-level representations improve performance across a subset of selected semantic phenomena.
Proceedings Article
Language Model Transformers as Evaluators for Open-domain Dialogues.
TL;DR: This work investigates whether language models (LMs) based on transformer neural networks can indicate the quality of a conversation, and demonstrates a positive correlation between the language models' outputs and human evaluators' scores.
Posted Content
OpenPrompt: An Open-source Framework for Prompt-learning
TL;DR: OpenPrompt is a toolkit for prompt-learning over pre-trained language models (PLMs) that can combine different PLMs, task formats, and prompting modules in a unified paradigm.
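The "template plus verbalizer" decomposition that prompt-learning toolkits like OpenPrompt organize can be illustrated with a minimal sketch. The names below (`TEMPLATE`, `wrap`, `verbalize`) are hypothetical and do not reflect OpenPrompt's actual API; they only show the two moving parts: a template that wraps the raw input around a mask slot, and a verbalizer that maps class labels to words a masked language model can predict.

```python
# Hypothetical sketch of manual prompt-learning components (not
# OpenPrompt's real API): a template turns an input into a cloze-style
# prompt, and a verbalizer maps each class label to a label word.

TEMPLATE = "{text} It was {mask}."
VERBALIZER = {"positive": "great", "negative": "terrible"}

def wrap(text: str, mask_token: str = "[MASK]") -> str:
    """Insert the raw input into the cloze template."""
    return TEMPLATE.format(text=text, mask=mask_token)

def verbalize(label: str) -> str:
    """Map a class label to the word the PLM should predict at the mask."""
    return VERBALIZER[label]

print(wrap("The movie was fun."))  # The movie was fun. It was [MASK].
print(verbalize("positive"))       # great
```

In a real pipeline, the PLM's probability for each label word at the mask position is compared, so classification reduces to the model's pre-training objective of filling in masked tokens.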
Proceedings Article
OpenPrompt: An Open-source Framework for Prompt-learning
TL;DR: Ding et al. presented OpenPrompt as a system demonstration at the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations.
Posted Content
A Baseline Analysis for Podcast Abstractive Summarization
TL;DR: A baseline analysis of podcast summarization using the Spotify Podcast Dataset provided by TREC 2020 is presented to help researchers understand current state-of-the-art pre-trained models and hence build a foundation for creating better models.