Open Access Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
TL;DR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.

Abstract:
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
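The "everything is text-to-text" idea in the abstract can be sketched concretely: each task becomes a pair of strings, with a task prefix on the input and the label or target rendered as literal text. The helper below is an illustrative sketch, not the authors' released code; the prefixes follow the style described in the paper (e.g. "translate English to German:", "cola sentence:"), but the function name and example dictionaries are hypothetical.

```python
# Hypothetical sketch of casting heterogeneous NLP tasks into a single
# text-to-text format, as the T5 framework does. Every task is reduced to
# (input_text, target_text): even classification labels become words.

def to_text_to_text(task: str, example: dict) -> tuple[str, str]:
    """Convert a labeled example into an (input_text, target_text) pair."""
    if task == "translation":
        return (f"translate English to German: {example['en']}", example["de"])
    if task == "classification":
        # Class labels are emitted as literal words, e.g. "acceptable".
        return (f"cola sentence: {example['sentence']}", example["label"])
    if task == "summarization":
        return (f"summarize: {example['document']}", example["summary"])
    raise ValueError(f"unknown task: {task}")

inp, tgt = to_text_to_text(
    "classification",
    {"sentence": "The book was read by me.", "label": "acceptable"},
)
print(inp)  # cola sentence: The book was read by me.
print(tgt)  # acceptable
```

Because every task shares this string-in/string-out interface, one encoder-decoder model with one training objective can be pre-trained and fine-tuned across all of them.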
Citations
Proceedings Article
Character-level Representations Improve DRS-based Semantic Parsing Even in the Age of BERT
TL;DR: The authors combine character-level and contextual language model representations to improve performance on Discourse Representation Structure parsing, using a new method of analysis based on semantic tags to demonstrate that the character-level representations improve performance across a subset of selected semantic phenomena.
Proceedings Article
Language Model Transformers as Evaluators for Open-domain Dialogues.
TL;DR: This work investigates whether language models (LMs) based on transformer neural networks can indicate the quality of a conversation, and demonstrates a positive correlation between the language models' outputs and human evaluators' scores.
Posted Content
OpenPrompt: An Open-source Framework for Prompt-learning
TL;DR: OpenPrompt is a toolkit for prompt-learning over pre-trained language models (PLMs) that can combine different PLMs, task formats, and prompting modules in a unified paradigm.
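The "template plus verbalizer" decomposition that prompt-learning toolkits like OpenPrompt organize can be illustrated with a minimal sketch. The names below (`TEMPLATE`, `wrap`, `verbalize`) are hypothetical and do not reflect OpenPrompt's actual API; they only show the two moving parts: a template that wraps the raw input around a mask slot, and a verbalizer that maps class labels to words a masked language model can predict.

```python
# Hypothetical sketch of manual prompt-learning components (not
# OpenPrompt's real API): a template turns an input into a cloze-style
# prompt, and a verbalizer maps each class label to a label word.

TEMPLATE = "{text} It was {mask}."
VERBALIZER = {"positive": "great", "negative": "terrible"}

def wrap(text: str, mask_token: str = "[MASK]") -> str:
    """Insert the raw input into the cloze template."""
    return TEMPLATE.format(text=text, mask=mask_token)

def verbalize(label: str) -> str:
    """Map a class label to the word the PLM should predict at the mask."""
    return VERBALIZER[label]

print(wrap("The movie was fun."))  # The movie was fun. It was [MASK].
print(verbalize("positive"))       # great
```

In a real pipeline, the PLM's probability for each label word at the mask position is compared, so classification reduces to the model's pre-training objective of filling in masked tokens.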
Proceedings Article
OpenPrompt: An Open-source Framework for Prompt-learning
TL;DR: Ding et al. presented OpenPrompt as a system demonstration at the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations.
Posted Content
A Baseline Analysis for Podcast Abstractive Summarization
TL;DR: A baseline analysis of podcast summarization using the Spotify Podcast Dataset provided by TREC 2020 is presented to help researchers understand current state-of-the-art pre-trained models and hence build a foundation for creating better models.