A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios
Michael A. Hedderich, Lukas Lange, Heike Adel, Jannik Strötgen, Dietrich Klakow
pp. 2545–2568
TLDR
A structured overview is given of methods that enable learning when training data is sparse, including mechanisms to create additional labeled data, such as data augmentation and distant supervision, as well as transfer learning settings that reduce the need for target supervision.

Abstract:
Deep neural networks and huge language models are becoming omnipresent in natural language applications. As they are known for requiring large amounts of training data, there is a growing body of work to improve the performance in low-resource settings. Motivated by the recent fundamental changes towards neural models and the popular pre-train and fine-tune paradigm, we survey promising approaches for low-resource natural language processing. After a discussion about the different dimensions of data availability, we give a structured overview of methods that enable learning when training data is sparse. This includes mechanisms to create additional labeled data like data augmentation and distant supervision as well as transfer learning settings that reduce the need for target supervision. A goal of our survey is to explain how these methods differ in their requirements as understanding them is essential for choosing a technique suited for a specific low-resource setting. Further key aspects of this work are to highlight open issues and to outline promising directions for future research.
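As a toy illustration of one class of techniques the abstract mentions, the following is a minimal, assumed sketch of label-preserving data augmentation via random word swaps (the sentence, function name, and parameters are illustrative, not taken from the survey):

```python
import random

# Minimal sketch of token-level data augmentation (random word swap), one of
# the label-preserving mechanisms surveyed for creating additional labeled
# training data. The example sentence and the seed are arbitrary choices.
def random_swap(tokens, n_swaps=1, seed=0):
    """Return a copy of tokens with n_swaps random position pairs swapped."""
    rng = random.Random(seed)
    augmented = list(tokens)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(augmented)), 2)
        augmented[i], augmented[j] = augmented[j], augmented[i]
    return augmented

sentence = "the service at this restaurant was great".split()
augmented = random_swap(sentence, n_swaps=2)
# The augmented sentence keeps exactly the original words (and thus its
# label, e.g. "positive"); only their order changes.
print(" ".join(augmented))
```

Because the transformation only reorders tokens, the original label can be reused for the new instance, which is what makes such augmentation attractive when labeled data is scarce.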
Citations
Proceedings Article
Self-Training with Weak Supervision
TL;DR: This work develops ASTRA, a weak supervision framework that leverages all the available data for a given task, with a rule attention network (teacher) that learns how to aggregate student pseudo-labels with weak rule labels, conditioned on their fidelity and the underlying context of an instance.
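A hedged sketch of this kind of label aggregation (the function, class indices, and fixed weights are illustrative assumptions, not ASTRA's actual learned attention mechanism): weak rule votes are combined with the student's class distribution using per-rule weights standing in for learned fidelity scores.

```python
def aggregate(student_probs, fired_rules):
    """Combine a student's class distribution with weighted weak-rule votes.

    student_probs: list of class probabilities from the student model.
    fired_rules: list of (class_index, weight) pairs; each weight stands in
                 for a learned attention/fidelity score for that rule.
    """
    combined = list(student_probs)
    for cls, weight in fired_rules:
        combined[cls] += weight
    total = sum(combined)
    # Renormalize into a pseudo-label distribution for self-training.
    return [p / total for p in combined]

# Two weak rules vote for class 1 while the student slightly prefers class 0;
# the aggregated pseudo-label ends up favoring class 1.
pseudo_label = aggregate([0.6, 0.4], [(1, 1.0), (1, 0.5)])
print(pseudo_label)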
Journal Article
mGPT: Few-Shot Learners Go Multilingual
Oleh Shliazhko, Alena Fenogenova, Maria Tikhonova, Vladislav Mikhailov, A. Kozlova, Tatiana Shavrina
TL;DR: This paper introduces two autoregressive GPT-like models with 1.3 billion and 13 billion parameters, trained on 60 languages from 25 language families using Wikipedia and the Colossal Clean Crawled Corpus, and trains small versions of the model to choose the optimal multilingual tokenization strategy.
Journal Article
Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning
Erkut Erdem, Menekşe Kuyu, Semih Yagcioglu, Anette Frank, Letitia Parcalabescu, B. Plank, Andrii Babii, Oleksii Turuta, Aykut Erdem, Iacer Calixto, Elena Lloret, Elena Apostol, Ciprian-Octavian Truica, Branislava Šandrih, Sanda Martincic Ipsic, Gábor Berend, Albert Gatt, Gražina Korvel, et al.
TL;DR: This state-of-the-art report investigates the recent developments and applications of NNLG in its full extent from a multidimensional view, covering critical perspectives such as multimodality, multilinguality, controllability and learning strategies.
Posted Content
Neural Machine Translation for Low-Resource Languages: A Survey.
Surangika Ranathunga, En-Shiun Annie Lee, Marjana Prifti Skenduli, Ravi Shekhar, Mehreen Alam, Rishemjit Kaur
TL;DR: A detailed survey of research advancements in low-resource language NMT (LRL-NMT), along with a quantitative analysis aimed at identifying the most popular solutions is presented in this paper.
Book Chapter
ZeroBERTo: Leveraging Zero-Shot Text Classification by Topic Modeling
Alexandre Alcoforado, Thomas Palmeira Ferraz, R. Gerber, Enzo Bustos, A. Oliveira, Bruno Veloso, A. H. R. Costa, et al.
TL;DR: ZeroBERTo leverages an unsupervised clustering step to obtain a compressed data representation before the classification task, yielding better performance on long inputs and shorter execution times.
References
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: This paper proposes the Transformer, a network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.
Journal Article
Generative Adversarial Nets
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
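The adversarial interplay between G and D can be illustrated numerically; a minimal sketch (with assumed toy probabilities, not the paper's code) of the value function that D tries to maximize, log D(x) + log(1 - D(G(z))):

```python
import math

# Toy evaluation of the GAN value function for fixed discriminator outputs.
# d_real = D(x) on a real sample, d_fake = D(G(z)) on a generated sample.
def gan_value(d_real, d_fake):
    return math.log(d_real) + math.log(1.0 - d_fake)

# A confident discriminator scores real data high and fake data low...
confident = gan_value(d_real=0.9, d_fake=0.1)
# ...while a discriminator fooled by an improved generator attains less,
# which is exactly the pressure the generator exerts during training.
fooled = gan_value(d_real=0.6, d_fake=0.5)
print(confident > fooled)
```

The generator's training objective is to drive D(G(z)) upward, pushing this value down; the two models improve by competing over it.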
Proceedings Article
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Journal Article
A Survey on Transfer Learning
Sinno Jialin Pan, Qiang Yang
TL;DR: The relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as covariate shift are discussed.
Posted Content
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov
TL;DR: This replication study finds that BERT was significantly undertrained and, with improved training, can match or exceed the performance of every model published after it; the best configuration achieves state-of-the-art results on GLUE, RACE, and SQuAD.