Open Access Proceedings Article

A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios.

TLDR
This survey gives a structured overview of methods that enable learning when training data is sparse, including mechanisms for creating additional labeled data, such as data augmentation and distant supervision, as well as transfer learning settings that reduce the need for target supervision.
Abstract
Deep neural networks and huge language models are becoming omnipresent in natural language applications. As they are known for requiring large amounts of training data, there is a growing body of work to improve the performance in low-resource settings. Motivated by the recent fundamental changes towards neural models and the popular pre-train and fine-tune paradigm, we survey promising approaches for low-resource natural language processing. After a discussion about the different dimensions of data availability, we give a structured overview of methods that enable learning when training data is sparse. This includes mechanisms to create additional labeled data like data augmentation and distant supervision as well as transfer learning settings that reduce the need for target supervision. A goal of our survey is to explain how these methods differ in their requirements as understanding them is essential for choosing a technique suited for a specific low-resource setting. Further key aspects of this work are to highlight open issues and to outline promising directions for future research.
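
To make the "create additional labeled data" idea concrete, below is a minimal, hypothetical sketch of token-level data augmentation via random word dropout; the function name and parameters are illustrative and not taken from the survey.

```python
import random

def augment_by_word_dropout(sentence, drop_prob=0.1, seed=0):
    """Create an extra training example by randomly dropping tokens.

    A toy form of data augmentation: the label of the original example is
    assumed to survive small perturbations of the input text.
    """
    rng = random.Random(seed)
    tokens = sentence.split()
    kept = [t for t in tokens if rng.random() > drop_prob] or tokens
    return " ".join(kept)

# One labeled sentence yields several noisy copies that reuse its label.
original = "the service was friendly and the food arrived quickly"
augmented = [augment_by_word_dropout(original, drop_prob=0.2, seed=s) for s in range(3)]
```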



Citations
Proceedings Article

Self-Training with Weak Supervision

TL;DR: This work develops ASTRA, a weak-supervision framework that leverages all the available data for a given task; a rule-attention network (the teacher) learns how to aggregate the student's pseudo-labels with weak rule labels, conditioned on their fidelity and the underlying context of each instance.
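
As a rough illustration of the aggregation idea (not the actual ASTRA implementation), the sketch below combines weak rule votes with the student's predicted distribution, using fixed reliability weights as stand-ins for the learned rule-attention scores; all names are hypothetical.

```python
import numpy as np

def aggregate_pseudo_label(rule_votes, rule_weights, student_probs, student_weight=1.0):
    """Combine weak rule votes with a student model's pseudo-label.

    rule_votes    : class index per rule, or None when a rule abstains
    rule_weights  : per-rule reliability scores (stand-ins for learned attention)
    student_probs : the student's predicted class distribution
    Returns a soft label that would be used to re-train the student.
    """
    scores = student_weight * np.asarray(student_probs, dtype=float)
    for vote, weight in zip(rule_votes, rule_weights):
        if vote is not None:          # skip abstaining rules
            scores[vote] += weight
    return scores / scores.sum()

# Two rules vote for class 1, one abstains; the student slightly prefers class 0.
soft_label = aggregate_pseudo_label(
    rule_votes=[1, None, 1],
    rule_weights=[0.7, 0.5, 0.9],
    student_probs=[0.55, 0.45],
)
```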
Journal Article

mGPT: Few-Shot Learners Go Multilingual

TL;DR: This paper introduces two autoregressive GPT-like models with 1.3 billion and 13 billion parameters, trained on 60 languages from 25 language families using Wikipedia and the Colossal Clean Crawled Corpus, and trains small versions of the model to choose the best multilingual tokenization strategy.
Journal Article

Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning

TL;DR: This state-of-the-art report investigates recent developments and applications of NNLG from a multidimensional view, covering critical perspectives such as multimodality, multilinguality, controllability, and learning strategies.
Posted Content

Neural Machine Translation for Low-Resource Languages: A Survey.

TL;DR: This paper presents a detailed survey of research advancements in low-resource-language NMT (LRL-NMT), along with a quantitative analysis aimed at identifying the most popular solutions.
Book Chapter

ZeroBERTo: Leveraging Zero-Shot Text Classification by Topic Modeling

TL;DR: ZeroBERTo leverages an unsupervised clustering step to obtain a compressed data representation before the zero-shot classification task, yielding better performance on long inputs and shorter execution times.
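
A hedged sketch of the cluster-then-classify idea rather than the authors' code: documents are first grouped without supervision (here, TF-IDF plus k-means), and a zero-shot classifier then labels one representative per cluster; it assumes scikit-learn and the Hugging Face `zero-shot-classification` pipeline, and the documents and labels are made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from transformers import pipeline

docs = ["the match ended with a late goal", "shares fell after the earnings call",
        "the new phone ships with a faster chip", "the striker signed a new contract"]
candidate_labels = ["sports", "business", "technology"]

# Step 1: compress the corpus into a handful of clusters (unsupervised).
vectors = TfidfVectorizer().fit_transform(docs)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

# Step 2: zero-shot classify one representative document per cluster,
# then propagate that label to every member of the cluster.
classifier = pipeline("zero-shot-classification")  # defaults to an NLI-based model
cluster_label = {}
for c in set(clusters):
    representative = next(d for d, k in zip(docs, clusters) if k == c)
    result = classifier(representative, candidate_labels=candidate_labels)
    cluster_label[c] = result["labels"][0]          # highest-scoring label

predictions = [cluster_label[c] for c in clusters]
```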
References
Proceedings Article

Attention is All you Need

TL;DR: This paper proposes a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.
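
The central building block is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V; the NumPy sketch below implements just that operation (single head, no masking), not the full architecture.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)          # (batch, q_len, k_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)            # row-wise softmax
    return weights @ V                                        # (batch, q_len, d_v)

# Toy shapes: batch of 1, 4 query positions, 6 key/value positions, model dim 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(1, 4, 8))
K = rng.normal(size=(1, 6, 8))
V = rng.normal(size=(1, 6, 8))
out = scaled_dot_product_attention(Q, K, V)   # shape (1, 4, 8)
```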
Journal ArticleDOI

Generative Adversarial Nets

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than from G.
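
A minimal PyTorch sketch of the adversarial training loop on toy 2-D data, assuming the standard binary cross-entropy objective; the architectures, data, and hyperparameters are placeholders, not the paper's setup.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))   # generator
D = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))   # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(200):
    real = torch.randn(64, 2) * 0.5 + 2.0        # samples from the "data" distribution
    noise = torch.randn(64, 8)

    # Train D: real samples get label 1, generated samples label 0.
    fake = G(noise).detach()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Train G: fool D into assigning label 1 to generated samples.
    loss_g = bce(D(G(noise)), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```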
Proceedings Article

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can then be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
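
A short fine-tuning sketch using the Hugging Face Transformers API: the pretrained encoder is loaded with a fresh classification head (the single additional output layer) and trained end-to-end on labeled examples; the texts, labels, and hyperparameters here are purely illustrative.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

texts = ["the plot was gripping", "a dull and lifeless film"]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=labels)   # loss covers encoder + the new output layer
outputs.loss.backward()
optimizer.step()
```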
Journal Article

A Survey on Transfer Learning

TL;DR: This survey discusses the relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning, sample selection bias, and covariate shift.
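
As one concrete instance of the transfer settings the survey relates, the sketch below freezes a pretrained source-task encoder and trains only a new head on the target task; the class and argument names are hypothetical.

```python
import torch.nn as nn

class TransferClassifier(nn.Module):
    """Reuse a pretrained encoder for a new target task (feature-extraction transfer)."""

    def __init__(self, pretrained_encoder, hidden_dim, num_target_classes):
        super().__init__()
        self.encoder = pretrained_encoder
        for p in self.encoder.parameters():   # freeze knowledge from the source task
            p.requires_grad = False
        self.head = nn.Linear(hidden_dim, num_target_classes)  # trained on target data only

    def forward(self, features):
        return self.head(self.encoder(features))
```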
Posted Content

RoBERTa: A Robustly Optimized BERT Pretraining Approach

TL;DR: It is found that BERT was significantly undertrained and, when trained more carefully, can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE, and SQuAD.