Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
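As a concrete illustration of the text-to-text format, the sketch below casts a few tasks as plain text in, plain text out using a released T5 checkpoint. The Hugging Face transformers library and the t5-small checkpoint are assumptions made for illustration, not the paper's own training code; the task prefixes follow the conventions described in the paper.

# A minimal sketch of the text-to-text framing: every task becomes
# "task prefix + input text" in, and the answer is generated as text out.
# Uses the Hugging Face `transformers` library and the public t5-small
# checkpoint (assumptions for illustration).
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

examples = [
    # Translation: the prefix names the language pair.
    "translate English to German: The house is wonderful.",
    # Summarization: the same model, a different prefix.
    "summarize: Transfer learning, where a model is first pre-trained on a "
    "data-rich task before being fine-tuned on a downstream task, has emerged "
    "as a powerful technique in natural language processing.",
    # Classification (CoLA acceptability): the label is also generated as text.
    "cola sentence: The course is jumping well.",
]

for text in examples:
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Because the targets are also text (e.g. "acceptable" or "unacceptable" for CoLA), the same maximum-likelihood objective and decoder serve every task.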



Citations
Proceedings Article

RetroNLU: Retrieval Augmented Task-Oriented Semantic Parsing

TL;DR: This paper extended a sequence-to-sequence model with a retrieval component that retrieves existing similar samples and presents them to the model as additional context, and analyzed the quality, sensitivity, and performance of the nearest-neighbor retrieval component for semantic parses of varied utterance complexity.
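The sketch below illustrates the general retrieval-augmented input construction described in this TL;DR; it is not the RetroNLU authors' code. The TF-IDF nearest-neighbor index, the toy utterance/parse pairs, and the "|" / "->" input format are all assumptions made for illustration.

# Hedged sketch: retrieve the most similar training utterance and prepend
# its utterance/parse pair to the new input so a seq2seq parser can
# condition on it as additional context.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Hypothetical training pairs of (utterance, semantic parse).
train = [
    ("play some jazz music", "[IN:PLAY_MUSIC [SL:GENRE jazz ] ]"),
    ("wake me up at 7 am", "[IN:CREATE_ALARM [SL:DATE_TIME at 7 am ] ]"),
    ("what is the weather tomorrow", "[IN:GET_WEATHER [SL:DATE_TIME tomorrow ] ]"),
]

vectorizer = TfidfVectorizer()
index = NearestNeighbors(n_neighbors=1).fit(
    vectorizer.fit_transform([u for u, _ in train]).toarray()
)

def build_input(utterance: str) -> str:
    """Return 'utterance | retrieved_utterance -> retrieved_parse' as model input."""
    _, idx = index.kneighbors(vectorizer.transform([utterance]).toarray())
    nn_utt, nn_parse = train[idx[0][0]]
    return f"{utterance} | {nn_utt} -> {nn_parse}"

print(build_input("play some rock music"))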
Book Chapter

High Performance Computing for Understanding Natural Language

TL;DR: This chapter gives an overview of state-of-the-art natural language processing problems, algorithms, models, and libraries, and details a few specific applications that use pre-training or self-supervised learning on large amounts of text data for language understanding.
Proceedings Article

K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce

TL;DR: The authors proposed a knowledge-injected pre-trained language model based on the encoder-decoder transformer that learns a diverse set of domain-specific knowledge and can be transferred to both natural language understanding and generation tasks.
Journal Article

CINS: Comprehensive Instruction for Few-Shot Learning in Task-Oriented Dialog Systems

TL;DR: The authors proposed Comprehensive Instruction (CINS), which augments pre-trained language models with extra task-specific instructions to better exploit their power for few-shot learning in task-oriented dialog.
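A hypothetical sketch of instruction-augmented inputs in this spirit; the instruction wording and task names below are illustrative, not the CINS schema.

# Hedged sketch: prepend a natural-language, task-specific instruction to the
# dialog input before feeding it to a pre-trained language model.
INSTRUCTIONS = {
    "intent_detection": "Decide which intent the user utterance expresses.",
    "state_tracking": "Extract the slot values mentioned in the dialog so far.",
}

def build_prompt(task: str, dialog: str) -> str:
    """Combine the task instruction with the dialog input."""
    return f"{INSTRUCTIONS[task]} Input: {dialog}"

print(build_prompt("intent_detection", "I'd like to book a table for two tonight."))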
Posted Content

Exploring Text-transformers in AAAI 2021 Shared Task: COVID-19 Fake News Detection in English

TL;DR: This article proposed an ensemble of different pre-trained language models such as BERT, RoBERTa, and ERNIE, trained with strategies including warm-up, learning-rate scheduling, and k-fold cross-validation.
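The sketch below shows the general k-fold ensembling recipe mentioned in this TL;DR, with a plain scikit-learn classifier standing in for the fine-tuned transformers; the toy data and the probability-averaging scheme are assumptions, not the authors' pipeline.

# Hedged sketch: train one classifier per fold and average the predicted
# probabilities on the test set. A TF-IDF + logistic regression pipeline
# stands in for fine-tuned BERT/RoBERTa/ERNIE models.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline

texts = ["claim one ...", "claim two ...", "claim three ...", "claim four ..."]
labels = np.array([0, 1, 0, 1])  # 0 = real, 1 = fake (toy data)
test_texts = ["an unseen claim ..."]

kfold = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
test_probs = []
for train_idx, _ in kfold.split(texts, labels):
    clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
    clf.fit([texts[i] for i in train_idx], labels[train_idx])
    test_probs.append(clf.predict_proba(test_texts))

# Ensemble by averaging fold-level probabilities.
avg = np.mean(test_probs, axis=0)
print("fake-news probability:", avg[0, 1])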
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.