Open Access Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu +8 more
TL;DR: This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract:
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
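The text-to-text casting described in the abstract can be sketched as a simple formatting step: every task, whether classification, regression, or generation, becomes an (input string, target string) pair. The task prefixes below follow the conventions reported for T5 (e.g., "translate English to German:", "cola sentence:"); the helper function itself is a hypothetical illustration, not code from the released repository.

```python
# Minimal sketch of T5-style text-to-text task casting. Each task is mapped
# to an input string (with a task prefix) and a target string, so one
# seq2seq model can handle all of them. Prefixes follow the T5 paper's
# conventions; the function and its signature are illustrative assumptions.

def to_text_to_text(task: str, **fields) -> tuple[str, str]:
    """Return (input_text, target_text) for a few example tasks."""
    if task == "translation":
        return (f"translate English to German: {fields['source']}",
                fields["target"])
    if task == "summarization":
        return (f"summarize: {fields['document']}", fields["summary"])
    if task == "cola":
        # Classification: the target is a literal label word.
        return (f"cola sentence: {fields['sentence']}",
                "acceptable" if fields["label"] else "unacceptable")
    if task == "stsb":
        # Regression: the similarity score is rounded to the nearest 0.2
        # and rendered as a string, so the model still emits text.
        return (f"stsb sentence1: {fields['s1']} sentence2: {fields['s2']}",
                f"{round(fields['score'] * 5) / 5:.1f}")
    raise ValueError(f"unknown task: {task}")

inp, tgt = to_text_to_text("cola",
                           sentence="The book was read by me.", label=1)
print(inp)  # cola sentence: The book was read by me.
print(tgt)  # acceptable
```

Because inputs and targets are plain strings, a single encoder-decoder model with one maximum-likelihood objective can be trained and evaluated on all of these tasks without task-specific heads.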
Citations
Posted Content
BARTScore: Evaluating Generated Text as Text Generation
TL;DR: The authors conceptualized the evaluation of generated text as a text generation problem, modeled it using pre-trained sequence-to-sequence models, and proposed BARTScore, a metric with a number of variants that can be flexibly applied in an unsupervised fashion to evaluate text from different perspectives (e.g., informativeness, fluency, or factuality).
Posted Content
SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer
TL;DR: This paper proposed a soft prompt transfer approach that learns task-specific soft prompts to condition a frozen language model to perform downstream tasks, significantly boosting the performance of Prompt Tuning across many tasks.
Posted Content
Dialogue State Tracking with a Language Model using Schema-Driven Prompting
TL;DR: In this paper, the authors introduce a new variation of the language modeling approach that uses schema-driven prompting to provide task-aware history encoding for both categorical and non-categorical slots.
Journal Article DOI
Aggregating Time Series and Tabular Data in Deep Learning Model for University Students’ GPA Prediction
Harjanto Prabowo, Alam Ahmad Hidayat, Tjeng Wawan Cenggoro, Reza Rahutomo, Kartika Purwandari, Bens Pardamean +5 more
TL;DR: In this article, a dual-input deep learning model that is able to simultaneously process time-series and tabular data for predicting student GPA was proposed. The proposed model achieved the best performance among all tested models, with 0.4142 MSE (Mean Squared Error) and 0.418 MAE (Mean Absolute Error) for GPA on a 4.0 scale.
Proceedings Article DOI
Cross-Task Generalization via Natural Language Crowdsourcing Instructions
TL;DR: In this paper, the authors introduce NATURAL INSTRUCTIONS, a dataset of 61 distinct tasks, their human-authored instructions, and 193k task instances (input-output pairs), obtained from the crowdsourcing instructions used to create existing NLP datasets and mapped to a unified schema.