Open Access Journal Article

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TL;DR
This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
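
As a concrete illustration of the text-to-text framing described in the abstract, the sketch below runs a few tasks through the publicly released T5 checkpoints via the Hugging Face transformers library. The library, the "t5-small" checkpoint, and the generation settings are choices made here for brevity; the task prefixes follow the convention described in the paper.

```python
# Minimal sketch: casting different NLP tasks into one text-to-text interface
# using the public T5 checkpoints ("t5-small" is used only to keep it light).
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is expressed as "input text -> output text"; the task is
# signalled by a natural-language prefix, as in the paper.
examples = [
    "translate English to German: The house is wonderful.",
    "summarize: Transfer learning, where a model is first pre-trained on a "
    "data-rich task before being fine-tuned on a downstream task, has emerged "
    "as a powerful technique in natural language processing.",
    "cola sentence: The course is jumping well.",  # acceptability judgment
]

for text in examples:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=50)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```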



Citations
Posted Content

Beneath the Tip of the Iceberg: Current Challenges and New Directions in Sentiment Analysis Research

TL;DR: Sentiment analysis has come a long way since it was introduced as a task nearly 20 years ago, and it now has widespread commercial applications in domains such as marketing, risk management, market research, and politics.
Posted Content

Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets

TL;DR: In this paper, the authors manually audit the quality of 205 language-specific corpora released with five major public datasets (CCAligned, ParaCrawl, WikiMatrix, OSCAR, mC4) and audit the correctness of language codes in a sixth (JW300).
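
The audit in the paper above was carried out by hand. Purely as an illustrative complement, the sketch below shows one automated sanity check of the kind such an audit might motivate: comparing a corpus's declared language code against a language identifier on a sample of lines. The langdetect package and the audit_sample helper are assumptions for this example, not part of the cited work's methodology.

```python
# Toy sketch (not the cited paper's manual audit): flag sampled sentences whose
# automatically detected language disagrees with the declared language code.
from langdetect import detect, DetectorFactory

DetectorFactory.seed = 0  # make langdetect deterministic

def audit_sample(sample_sentences, declared_code):
    """Return the fraction of sentences whose detected language matches the
    declared ISO 639-1 code, plus the mismatching sentences."""
    mismatches = []
    for sentence in sample_sentences:
        try:
            predicted = detect(sentence)
        except Exception:  # very short or non-linguistic lines
            predicted = "und"
        if predicted != declared_code:
            mismatches.append((predicted, sentence))
    agreement = 1 - len(mismatches) / max(len(sample_sentences), 1)
    return agreement, mismatches

agreement, bad = audit_sample(
    ["The house is wonderful.", "Das Haus ist wunderbar."], declared_code="en"
)
print(f"agreement with declared code: {agreement:.0%}")
```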
Posted Content

LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding

TL;DR: LayoutLMv2 pre-trains text, layout, and image in a single multi-modal framework, leveraging new model architectures and pre-training tasks.
Proceedings Article

Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction

TL;DR: This paper proposes a dual-channel span pruning strategy that incorporates supervision from the Aspect Term Extraction (ATE) and Opinion Term Extraction (OTE) tasks, which not only improves computational efficiency but also distinguishes opinion and target spans more accurately.
Posted Content

Neural Passage Retrieval with Improved Negative Contrast.

TL;DR: This paper explores the effects of negative sampling in dual-encoder models used to retrieve passages for automatic question answering, and establishes a new state of the art on two of the open-domain question answering datasets evaluated.
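
For readers unfamiliar with the setup, dual-encoder retrievers of this kind are typically trained with a contrastive loss in which other passages in the batch act as negatives; the choice and quality of those negatives is what the paper above studies. The sketch below is a generic in-batch-negatives loss in PyTorch, illustrative only and not the specific negative-contrast recipe of the cited paper.

```python
# Generic in-batch-negatives contrastive loss for a dual-encoder retriever.
import torch
import torch.nn.functional as F

def in_batch_negative_loss(question_vecs, passage_vecs, temperature=0.05):
    """question_vecs, passage_vecs: [batch, dim]; row i of each is a positive
    pair, and every other passage in the batch serves as a negative."""
    q = F.normalize(question_vecs, dim=-1)
    p = F.normalize(passage_vecs, dim=-1)
    scores = q @ p.T / temperature        # [batch, batch] similarity matrix
    targets = torch.arange(q.size(0))     # positives sit on the diagonal
    return F.cross_entropy(scores, targets)

# Hypothetical usage with random vectors standing in for real encoder outputs.
loss = in_batch_negative_loss(torch.randn(8, 128), torch.randn(8, 128))
print(f"contrastive loss: {loss.item():.3f}")
```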
Trending Questions (1)
What are the limitations of transfer learning with a unified text-to-text transformer?

The paper does not mention the limitations of transfer learning with a unified text-to-text transformer.