Open Access Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu +8 more
TL;DR: This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract:
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
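The text-to-text casting described in the abstract can be sketched as a simple formatting step: every task, whether classification, regression, or generation, becomes an (input string, target string) pair. The task prefixes below follow the conventions reported for T5 (e.g., "translate English to German:", "cola sentence:"); the helper function itself is a hypothetical illustration, not code from the released repository.

```python
# Minimal sketch of T5-style text-to-text task casting. Each task is mapped
# to an input string (with a task prefix) and a target string, so one
# seq2seq model can handle all of them. Prefixes follow the T5 paper's
# conventions; the function and its signature are illustrative assumptions.

def to_text_to_text(task: str, **fields) -> tuple[str, str]:
    """Return (input_text, target_text) for a few example tasks."""
    if task == "translation":
        return (f"translate English to German: {fields['source']}",
                fields["target"])
    if task == "summarization":
        return (f"summarize: {fields['document']}", fields["summary"])
    if task == "cola":
        # Classification: the target is a literal label word.
        return (f"cola sentence: {fields['sentence']}",
                "acceptable" if fields["label"] else "unacceptable")
    if task == "stsb":
        # Regression: the similarity score is rounded to the nearest 0.2
        # and rendered as a string, so the model still emits text.
        return (f"stsb sentence1: {fields['s1']} sentence2: {fields['s2']}",
                f"{round(fields['score'] * 5) / 5:.1f}")
    raise ValueError(f"unknown task: {task}")

inp, tgt = to_text_to_text("cola",
                           sentence="The book was read by me.", label=1)
print(inp)  # cola sentence: The book was read by me.
print(tgt)  # acceptable
```

Because inputs and targets are plain strings, a single encoder-decoder model with one maximum-likelihood objective can be trained and evaluated on all of these tasks without task-specific heads.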
Citations
Posted Content
BARTScore: Evaluating Generated Text as Text Generation
TL;DR: The authors conceptualized the evaluation of generated text as a text generation problem, modeled it using pre-trained sequence-to-sequence models, and proposed BARTScore, a metric with a number of variants that can be flexibly applied in an unsupervised fashion to evaluate text from different perspectives (e.g., informativeness, fluency, or factuality).
Posted Content
SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer
TL;DR: This paper proposed a soft prompt transfer approach that learns task-specific soft prompts to condition a frozen language model to perform downstream tasks, significantly boosting the performance of Prompt Tuning across many tasks.
Posted Content
Dialogue State Tracking with a Language Model using Schema-Driven Prompting
TL;DR: In this paper, the authors introduce a new variation of the language modeling approach that uses schema-driven prompting to provide task-aware history encoding for both categorical and non-categorical slots.
Journal Article DOI
Aggregating Time Series and Tabular Data in Deep Learning Model for University Students’ GPA Prediction
Harjanto Prabowo, Alam Ahmad Hidayat, Tjeng Wawan Cenggoro, Reza Rahutomo, Kartika Purwandari, Bens Pardamean +5 more
TL;DR: In this article, a dual-input deep learning model that is able to simultaneously process time-series and tabular data for predicting student GPA was proposed. The proposed model achieved the best performance among all tested models, with 0.4142 MSE (Mean Squared Error) and 0.418 MAE (Mean Absolute Error) for GPA on a 4.0 scale.
Proceedings Article DOI
Cross-Task Generalization via Natural Language Crowdsourcing Instructions
TL;DR: In this paper, the authors introduce NATURAL INSTRUCTIONS, a dataset of 61 distinct tasks, their human-authored instructions, and 193k task instances (input-output pairs), obtained from the crowdsourcing instructions used to create existing NLP datasets and mapped to a unified schema.