Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
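The text-to-text framing described in the abstract can be illustrated with a short sketch. The snippet below uses the Hugging Face transformers port of T5; the model name "t5-small" and the task prefixes ("translate English to German:", "summarize:", "cola sentence:") come from the public T5 release. This is a minimal usage sketch, not the authors' original training code.

```python
# Minimal sketch of T5's text-to-text framing via the Hugging Face port.
# Assumes `transformers` and `sentencepiece` are installed.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is cast as text in, text out: a task prefix tells the model
# which problem to solve, and the answer is decoded as plain text.
prompts = [
    "translate English to German: The house is wonderful.",
    "summarize: Transfer learning, where a model is first pre-trained on a "
    "data-rich task before being fine-tuned on a downstream task, has "
    "emerged as a powerful technique in natural language processing.",
    # Classification is also text generation: the model emits
    # "acceptable" or "unacceptable" as its output string.
    "cola sentence: The books was on the table.",
]

for prompt in prompts:
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The task prefix is the design choice that makes the unification work: a single model, objective, and decoding procedure cover translation, summarization, and classification, with only the input string changing per task.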



Citations
Posted Content

Text-to-Text Multi-view Learning for Passage Re-ranking

TL;DR: This article proposed a text-to-text multi-view learning framework that incorporates an additional view, the text generation view, into a typical single-view passage ranking model, improving ranking performance over its single-view counterpart.
Posted Content

Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction

TL;DR: This paper proposed a dual-channel span pruning strategy that incorporates supervision from the Aspect Term Extraction (ATE) and Opinion Term Extraction (OTE) tasks, which not only improves computational efficiency but also distinguishes opinion and target spans more accurately.
Proceedings ArticleDOI

Improving Social Meaning Detection with Pragmatic Masking and Surrogate Fine-Tuning

TL;DR: The authors propose pragmatic masking and surrogate fine-tuning as two complementary strategies that exploit social cues to drive pre-trained representations toward a broad set of concepts useful for a wide range of social meaning tasks.
Proceedings Article

Automatic Text Evaluation through the Lens of Wasserstein Barycenters.

TL;DR: In this paper, a new metric, BaryScore, is proposed to evaluate text generation based on deep contextualized embeddings (e.g., BERT, RoBERTa, ELMo).
Proceedings Article

ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning.

TL;DR: This article presented ExplaGraphs, a new generative and structured commonsense-reasoning task (and an associated dataset) of explanation graph generation for stance prediction: given a belief and an argument, a model has to predict whether the argument supports or counters the belief, and also generate a commonsense-augmented graph that serves as a non-trivial, complete, and unambiguous explanation for the predicted stance.
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.