Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
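To make the text-to-text format concrete, below is a minimal sketch, assuming the publicly released T5 checkpoints and the Hugging Face transformers library rather than the authors' original codebase: each task, whether translation, summarization, or classification, is phrased as an input string with a task prefix, and the model produces its answer as a string.

# Minimal sketch of the text-to-text idea using a released T5 checkpoint via
# Hugging Face transformers (assumption: not the paper's own training code).
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Task prefixes such as "translate English to German:" and "summarize:" follow
# the convention described in the paper; "cola sentence:" casts a
# classification task as generation, so the model emits the label as text.
examples = [
    "translate English to German: The house is wonderful.",
    "summarize: Transfer learning, where a model is first pre-trained on a "
    "data-rich task before being fine-tuned on a downstream task, has emerged "
    "as a powerful technique in natural language processing.",
    "cola sentence: The course is jumping well.",
]

for text in examples:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Running this prints a German translation, a short summary, and an acceptability label, all as plain text, illustrating how a single model and objective cover tasks that would otherwise require task-specific output layers.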



Citations
Proceedings Article (DOI)

Element Intervention for Open Relation Extraction

TL;DR: This paper revisits the procedure of Open Relation Extraction (OpenRE) from a causal view, formulating OpenRE with a structural causal model and identifying that its errors stem from spurious correlations from entities and context to the relation type.
Posted Content

BERT & Family Eat Word Salad: Experiments with Text Understanding

TL;DR: This paper studies the response of large models from the BERT family to incoherent inputs that should confuse any model that claims to understand natural language, and shows that if models are explicitly trained to recognize invalid inputs, they can be robust to such attacks without a drop in performance.
Proceedings Article (DOI)

Controlling Industrial Robots with High-Level Verbal Commands

TL;DR: In this paper, a pre-trained language model is fine-tuned to translate verbal instructions into robot tasks, outperforming other semantic parsing methods, and the system handles, through dialogue, a variety of exceptions that arise during human-robot interaction, including unknown tasks, user interruptions, and changes in the world state.
Posted Content

HiTab: A Hierarchical Table Dataset for Question Answering and Natural Language Generation.

TL;DR: The HiTab dataset provides fine-grained annotations for both entity and quantity alignment, which help models largely reduce spurious predictions in the QA task and help NLG models generate better results in a conditional generation setting.
Posted Content

Augmented Natural Language for Generative Sequence Labeling

TL;DR: This paper proposed a generative framework for joint sequence labeling and sentence-level classification using a single, shared natural language output space, which achieved state-of-the-art performance on several NER tasks.
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.