Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR
This article introduces a unified framework that converts all text-based language problems into a text-to-text format and systematically compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors across dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
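The text-to-text framing is easy to demonstrate. Below is a minimal sketch, assuming the released checkpoints as distributed through the Hugging Face transformers library (the library calls and the "t5-small" checkpoint name are assumptions about packaging, not the paper's own code): every task, whether summarization, grammaticality classification, or translation, is cast as plain string-to-string generation, with the task identified by a short text prefix.

# Minimal sketch of the text-to-text framing, assuming the Hugging Face
# transformers distribution of the released T5 checkpoints.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task -- summarization, classification, translation -- becomes
# "text string in, text string out", selected by a task prefix.
examples = [
    "summarize: Transfer learning, where a model is first pre-trained "
    "on a data-rich task before being fine-tuned on a downstream task, "
    "has emerged as a powerful technique in NLP.",
    "cola sentence: The course is jumping well.",  # grammaticality judgment
    "translate English to German: That is good.",
]

for text in examples:
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Because every task shares the same input/output interface, the same model, training objective, and decoding procedure can be reused across all of them.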



Citations
Posted Content

Prefix-Tuning: Optimizing Continuous Prompts for Generation

TL;DR: This article proposed prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks that keeps the language model's parameters frozen and instead optimizes a small, continuous, task-specific vector (the prefix), allowing subsequent tokens to attend to this prefix as if it were a sequence of virtual tokens.
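The mechanism can be sketched in a few lines. The simplified example below (an illustration, not the authors' code) freezes a GPT-2 model from the Hugging Face transformers library and optimizes only a small matrix of continuous prefix embeddings prepended at the embedding layer; note that the actual method injects trainable prefix activations into every attention layer, not just at the input.

# Simplified prefix-tuning sketch (assumption: the prefix is prepended at
# the embedding layer; the paper's method conditions every attention layer).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Freeze all language-model parameters; only the prefix is trained.
for param in model.parameters():
    param.requires_grad = False

prefix_len = 10
prefix = torch.nn.Parameter(
    torch.randn(1, prefix_len, model.config.n_embd) * 0.02
)
optimizer = torch.optim.AdamW([prefix], lr=5e-4)

# Toy table-to-text training example.
ids = tokenizer("name | population -> The table lists populations.",
                return_tensors="pt").input_ids
token_embeds = model.transformer.wte(ids)
# Subsequent tokens attend to the prefix as if it were virtual tokens.
inputs_embeds = torch.cat([prefix, token_embeds], dim=1)
labels = torch.cat(
    [torch.full((1, prefix_len), -100), ids], dim=1  # no loss on the prefix
)
loss = model(inputs_embeds=inputs_embeds, labels=labels).loss
loss.backward()
optimizer.step()

Only the prefix parameters receive gradient updates, so a separate lightweight prefix can be stored per task while the large frozen model is shared.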
Posted Content

How Context Affects Language Models' Factual Predictions

TL;DR: This paper reports that augmenting pre-trained language models with relevant retrieved context dramatically improves their factual predictions, and that the resulting system, despite being unsupervised, is competitive with a supervised machine reading baseline.
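As a toy illustration of this setup (not the paper's code), the cloze-style probe below queries a masked language model once without context and once with a supporting passage prepended; the passage is hard-coded here for simplicity, whereas the paper studies genuinely retrieved and generated contexts.

# Toy context-augmentation sketch for a cloze-style factual query
# (assumption: the "retrieved" passage is hard-coded, not from a retriever).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

query = "the theory of relativity was developed by [MASK]."

# Without context, the model relies on its parametric knowledge alone.
print(fill_mask(query)[0]["token_str"])

# With a supporting passage prepended, the model can read the answer
# off the context instead.
context = ("albert einstein published the theory of relativity "
           "in the early twentieth century. ")
print(fill_mask(context + query)[0]["token_str"])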
Journal Article (DOI)

Creating and detecting fake reviews of online products

TL;DR: This research addresses both the creation and the detection of fake reviews, showing that a machine classifier can detect machine-generated fake reviews near-perfectly, whereas human raters exhibit significantly lower accuracy and agreement than the tested algorithms.
Proceedings Article (DOI)

Connecting the Dots: A Knowledgeable Path Generator for Commonsense Question Answering.

TL;DR: This paper augments a general commonsense QA framework with a knowledgeable path generator: by extrapolating over existing paths in a knowledge graph with a state-of-the-art language model, the generator learns to connect a pair of entities in text with a dynamic, and potentially novel, multi-hop relational path.
Posted Content

Measuring Association Between Labels and Free-Text Rationales.

TL;DR: It is demonstrated that pipeline models for faithful rationalization, which perform well on information-extraction-style tasks, do not work as well on "reasoning" tasks that require free-text rationales, while state-of-the-art T5-based joint models exhibit desirable properties for explaining commonsense question answering and natural language inference.
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.