Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
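As a concrete illustration of the text-to-text framing described in the abstract, the sketch below shows how a pre-trained T5 checkpoint maps a task prefix plus input text to output text. It is a minimal sketch assuming the Hugging Face transformers library and the public "t5-small" checkpoint; the prefix convention ("translate English to German:") follows the paper.

```python
# Minimal sketch of the text-to-text framing: every task becomes
# "input text in, output text out"; only the task prefix changes.
# Assumes the Hugging Face `transformers` library and "t5-small".
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Swapping the prefix (e.g. "summarize:" or "cola sentence:") reuses the same model and decoding loop for summarization or classification, which is the unification the paper studies.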



Citations
Posted Content

CIDER: Commonsense Inference for Dialogue Explanation and Reasoning

TL;DR: The CIDER dataset contains dyadic dialogues annotated with explanations in the form of implicit and explicit knowledge triplets, inferred through contextual commonsense inference and categorized by the type of commonsense knowledge present.
Posted Content

Semantic Relations and Deep Learning

TL;DR: A new Chapter 5 of the book discusses relation classification and extraction in the deep-learning paradigm, which arose after the first edition appeared.
Proceedings ArticleDOI

Keyword Augmentation via Generative Methods

TL;DR: In this paper, the authors propose a keyword augmentation method based on a generative seq2seq model and a trie-based search mechanism, which can generate high-quality keywords for any product or product list; an illustrative sketch of the trie-constrained idea follows.
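The summary above names a trie-based search mechanism for constraining a seq2seq generator. The sketch below is a hypothetical illustration of that general idea (the Trie class and allowed_next helper are invented names, not the authors' code): decoding is restricted at each step to token continuations that stay inside a trie of acceptable keywords.

```python
# Hypothetical sketch: a trie over token-id sequences, used to restrict
# each decoding step to continuations of known-good keywords.
class Trie:
    def __init__(self):
        self.children = {}
        self.is_end = False

    def insert(self, token_ids):
        node = self
        for t in token_ids:
            node = node.children.setdefault(t, Trie())
        node.is_end = True

    def allowed_next(self, prefix_ids):
        """Token ids that may legally follow `prefix_ids`."""
        node = self
        for t in prefix_ids:
            if t not in node.children:
                return []  # prefix left the trie: no legal continuation
            node = node.children[t]
        return list(node.children)

# Usage: build the trie from tokenized candidate keywords, then plug
# `allowed_next` into beam search as a prefix constraint (e.g. via
# `prefix_allowed_tokens_fn` in Hugging Face `generate`).
trie = Trie()
trie.insert([101, 7, 42])   # hypothetical token ids for one keyword
trie.insert([101, 7, 99])
print(trie.allowed_next([101, 7]))  # -> [42, 99]
```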
Book ChapterDOI

Using Presentation Slides and Adjacent Utterances for Post-editing of Speech Recognition Results for Meeting Recordings

TL;DR: This article proposes a method for automatically post-editing ASR results using the presentation slides that meeting participants use and the utterances adjacent to a target utterance; the method can be applied to arbitrary speech recognition engines.

Dialogue response generation via contrastive latent representation learning

TL;DR: This paper proposes an utterance-level contrastive learning model that encodes predictive information in each context representation for its corresponding response, learning a representative latent space of the sentence distribution, which is otherwise hard to control during generation.
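For readers unfamiliar with the technique named above, the following is a generic InfoNCE-style utterance-level contrastive loss in PyTorch. It is an illustrative sketch of contrastive learning between paired context and response encodings (in-batch negatives), not the paper's exact objective.

```python
# Illustrative InfoNCE-style contrastive loss between paired context and
# response encodings; positives lie on the diagonal, in-batch others are
# negatives. Generic sketch, assuming PyTorch.
import torch
import torch.nn.functional as F

def utterance_contrastive_loss(context_emb, response_emb, temperature=0.1):
    """context_emb, response_emb: (batch, dim) encodings of paired turns."""
    c = F.normalize(context_emb, dim=-1)
    r = F.normalize(response_emb, dim=-1)
    logits = c @ r.t() / temperature       # (batch, batch) similarity matrix
    targets = torch.arange(c.size(0))      # positive pair is the diagonal
    return F.cross_entropy(logits, targets)

# Usage sketch with random stand-in encodings:
loss = utterance_contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))
print(loss.item())
```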
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.