Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new ``Colossal Clean Crawled Corpus'', we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
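A minimal sketch of the text-to-text framing described in the abstract, using the publicly released T5 checkpoints through the Hugging Face transformers library; the `t5-small` checkpoint, the example inputs, and the generation settings are illustrative choices rather than the paper's exact setup.

```python
# A minimal sketch: every task becomes "input text -> output text".
# Uses the released T5 checkpoints via Hugging Face transformers;
# "t5-small" and the example sentences are illustrative choices.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Different NLP problems, all expressed as plain text with a task prefix.
examples = [
    "translate English to German: The house is wonderful.",              # translation
    "summarize: Transfer learning, where a model is first pre-trained "
    "on a data-rich task before being fine-tuned on a downstream task, "
    "has emerged as a powerful technique in NLP.",                       # summarization
    "cola sentence: The book fell of the table quickly slowly.",         # acceptability classification
]

for text in examples:
    inputs = tokenizer(text, return_tensors="pt")
    # The model always produces text, so translations, summaries, and
    # classification labels are all decoded the same way.
    output_ids = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Because every task shares the same text-in, text-out interface, the same model, loss, and decoding procedure can be reused across translation, summarization, and classification.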


Citations
Proceedings ArticleDOI

Character-level Representations Improve DRS-based Semantic Parsing Even in the Age of BERT

TL;DR: The authors combine character-level and contextual language model representations to improve performance on Discourse Representation Structure parsing, and use a new method of analysis based on semantic tags to show that the character-level representations improve performance on a subset of selected semantic phenomena.
Proceedings ArticleDOI

Language Model Transformers as Evaluators for Open-domain Dialogues.

TL;DR: This work investigates whether language models (LM) based on transformer neural networks can indicate the quality of a conversation, and demonstrates a positive correlation between the language models' outputs and human evaluators' scores.
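The entry above does not spell out the scoring procedure; one common way to turn a transformer language model into a dialogue evaluator is to use the negative log-likelihood it assigns to a conversation as a quality proxy. The sketch below assumes GPT-2 as the language model, which may differ from the cited work.

```python
# Illustrative sketch (not necessarily the cited paper's exact method):
# score a dialogue by the average negative log-likelihood a pretrained LM
# assigns to it, so more "natural" conversations receive lower scores.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def lm_quality_score(dialogue: str) -> float:
    """Average per-token negative log-likelihood; lower suggests more fluent text."""
    inputs = tokenizer(dialogue, return_tensors="pt")
    with torch.no_grad():
        # With labels == input_ids, the returned loss is the mean NLL per token.
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return loss.item()

print(lm_quality_score("A: How are you today? B: I'm doing well, thanks for asking."))
print(lm_quality_score("A: How are you today? B: Purple monkey dishwasher seventeen."))
```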
Posted Content

OpenPrompt: An Open-source Framework for Prompt-learning

TL;DR: OpenPrompt as discussed by the authors is a toolkit for prompt learning over pre-trained language models (PLMs), which can combine different PLMs, task formats, and prompting modules in a unified paradigm.
Proceedings ArticleDOI

OpenPrompt: An Open-source Framework for Prompt-learning

TL;DR: Ding et al. as discussed by the authors presented OpenPrompt as a system demonstration at the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022): System Demonstrations.
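As a rough illustration of the prompt-learning paradigm the toolkit above unifies (a pretrained language model combined with a template and a verbalizer), the following sketch performs prompt-based sentiment classification with a masked language model. This is not OpenPrompt's actual API; the template, label words, and BERT checkpoint are assumptions made for illustration.

```python
# Conceptual sketch of prompt-learning: wrap the input in a template,
# let a masked LM fill the blank, and map label words back to classes.
import torch
from transformers import BertForMaskedLM, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

template = "{text} Overall, it was {mask}."                 # prompting template (assumed)
verbalizer = {"positive": "great", "negative": "terrible"}  # label word per class (assumed)

def classify(text: str) -> str:
    prompt = template.format(text=text, mask=tokenizer.mask_token)
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    # The predicted class is the label whose word the masked LM prefers.
    scores = {label: logits[tokenizer.convert_tokens_to_ids(word)].item()
              for label, word in verbalizer.items()}
    return max(scores, key=scores.get)

print(classify("An absolutely delightful film from start to finish."))
```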
Posted Content

A Baseline Analysis for Podcast Abstractive Summarization

TL;DR: A baseline analysis of podcast summarization using the Spotify Podcast Dataset provided by TREC 2020, intended to help researchers understand how current state-of-the-art pre-trained models perform on this task and to provide a foundation for building better ones.
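In the spirit of such a baseline, the sketch below runs an off-the-shelf abstractive summarizer over a podcast transcript snippet; the BART checkpoint and the naive truncation of long episodes are assumptions, not the cited paper's exact configuration.

```python
# Illustrative baseline: apply a pretrained abstractive summarizer to a
# podcast transcript. Checkpoint and truncation strategy are assumptions.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

transcript = (
    "Welcome back to the show. Today we're talking about how large "
    "pre-trained language models are changing natural language processing, "
    "and what that means for anyone building search or recommendation systems..."
)

# Full episodes exceed the encoder's context window, so a naive baseline
# summarizes only the truncated opening of the transcript.
summary = summarizer(transcript, max_length=60, min_length=15, truncation=True)
print(summary[0]["summary_text"])
```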
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.