Open Access Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
TLDR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
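The unified framework described in the abstract casts every task as mapping an input string to an output string, with a task-specific prefix telling the model which task to perform; even classification targets become literal text labels. A minimal sketch of that framing (the helper function and prefix table here are illustrative, not the paper's released code):

```python
# Sketch of the text-to-text framing: every task becomes
# "task prefix + input text" -> "output text", so a single
# sequence-to-sequence model can handle all of them.
TASK_PREFIXES = {
    "translate_en_de": "translate English to German: ",
    "summarize": "summarize: ",
    "cola": "cola sentence: ",  # grammatical-acceptability classification
}

def to_text_to_text(task: str, text: str) -> str:
    """Prepend the task's prefix to form the model's input string."""
    return TASK_PREFIXES[task] + text

# Translation: the target is the translated sentence itself.
print(to_text_to_text("translate_en_de", "That is good."))
# Classification: the target is a text label such as "acceptable",
# decoded as ordinary output tokens rather than a class index.
print(to_text_to_text("cola", "The course is jumping well."))
```

Because both inputs and targets are plain text, the same maximum-likelihood training objective and decoding procedure apply to every task; only the prefix and the target strings change.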
Citations
Posted Content
Relation-Guided Pre-Training for Open-Domain Question Answering.
TL;DR: Zhang et al. proposed a relation-guided pre-training (RGPT-QA) framework to improve generalization performance on questions involving long-tail relations.
Posted Content
Revisiting Linformer with a modified self-attention with linear complexity.
TL;DR: In this article, the authors proposed an alternative self-attention method with linear complexity in time and space that is independent of the projection mapping dimension and can be applied to images as well as audio.
Posted Content
More Than Reading Comprehension: A Survey on Datasets and Metrics of Textual Question Answering.
Yang Bai, Daisy Zhe Wang
TL;DR: In this article, the authors survey 47 textual QA benchmark datasets, propose a new taxonomy from an application point of view, and summarize 8 evaluation metrics for textual question answering tasks.
Proceedings ArticleDOI
COPER: a Query-adaptable Semantics-based Search Engine for Persian COVID-19 Articles
TL;DR: In this paper, a search engine consisting of a ranker and a re-ranker is used to sift through Persian COVID-19 articles and rank them given a user's query.
Posted Content
Analysis and Evaluation of Language Models for Word Sense Disambiguation
TL;DR: This paper provided an in-depth quantitative and qualitative analysis of the BERT model with respect to lexical ambiguity, finding that BERT can accurately capture high-level sense distinctions even when only a limited number of examples is available for each word sense.