Multi-Domain Neural Machine Translation through Unsupervised Adaptation
M. Amin Farajian, Marco Turchi, Matteo Negri, Marcello Federico
- pp 127-137
TLDR
This work explores an efficient instance-based adaptation method that, by exploiting the similarity between the training instances and each test sentence, dynamically sets the hyperparameters of the learning algorithm and updates the generic model on-the-fly.
Abstract
We investigate the application of Neural Machine Translation (NMT) under the following three conditions posed by real-world application scenarios. First, we operate with an input stream of sentences coming from many different domains and with no predefined order. Second, the sentences are presented without domain information. Third, the input stream should be processed by a single generic NMT model. To tackle the weaknesses of current NMT technology in this unsupervised multi-domain setting, we explore an efficient instance-based adaptation method that, by exploiting the similarity between the training instances and each test sentence, dynamically sets the hyperparameters of the learning algorithm and updates the generic model on-the-fly. The results of our experiments with multi-domain data show that local adaptation outperforms not only the original generic NMT system, but also a strong phrase-based system and even single-domain NMT models specifically optimized on each domain and applicable only by violating two of our aforementioned assumptions.
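The adaptation loop described in the abstract can be sketched as follows: for each test sentence, retrieve the most similar training pairs and scale the fine-tuning hyperparameters by the retrieval similarity. This is a minimal illustration only; the similarity measure (word-level Jaccard), the threshold, and the scaling rule below are assumptions, not the paper's exact settings.

```python
def jaccard(a, b):
    """Word-level Jaccard similarity between two sentences (an assumed stand-in
    for the paper's similarity function)."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def retrieve(test_sentence, training_pool, threshold=0.3):
    """Return (source, target, similarity) triples above a similarity threshold,
    most similar first."""
    scored = [(src, tgt, jaccard(test_sentence, src)) for src, tgt in training_pool]
    return sorted([t for t in scored if t[2] >= threshold],
                  key=lambda t: t[2], reverse=True)

def adapt_hyperparams(similarity, base_lr=0.01, max_epochs=10):
    """Higher similarity -> stronger, longer local update (an assumed rule:
    the paper sets hyperparameters dynamically, but not necessarily this way)."""
    return {"lr": base_lr * similarity,
            "epochs": max(1, round(max_epochs * similarity))}

# Toy training pool of (source, target) pairs.
pool = [("the cat sat on the mat", "le chat est assis sur le tapis"),
        ("stock prices fell sharply", "les cours ont fortement chute")]
matches = retrieve("the cat sat on a mat", pool)
top_sim = matches[0][2]
params = adapt_hyperparams(top_sim)
# `params` would then drive a brief fine-tuning pass on `matches`
# before translating the test sentence, after which the generic model is restored.
```

The key design point is that when no training instance is sufficiently similar, retrieval returns nothing and the generic model is used unchanged, which is what keeps the method safe on out-of-domain input.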
Citations
Proceedings ArticleDOI
Simple, Scalable Adaptation for Neural Machine Translation
TL;DR: The proposed approach consists of injecting tiny task-specific adapter layers into a pre-trained model, which adapt the model to multiple individual tasks simultaneously, paving the way towards universal machine translation.
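The adapter idea in this TL;DR can be sketched numerically: a tiny bottleneck (down-project, nonlinearity, up-project) is inserted into a frozen pre-trained layer with a residual connection, so only the adapter's few parameters are trained per task. Dimensions and the zero-initialization below are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def adapter(h, W_down, W_up):
    """Bottleneck adapter: down-project + ReLU, up-project, residual add."""
    z = np.maximum(h @ W_down, 0.0)  # down-projection then ReLU
    return h + z @ W_up              # up-projection plus residual connection

d_model, d_bottleneck = 8, 2  # toy sizes; real adapters sit inside each layer
W_down = rng.normal(scale=0.01, size=(d_model, d_bottleneck))
W_up = np.zeros((d_bottleneck, d_model))  # zero init -> adapter starts as identity

h = rng.normal(size=(1, d_model))  # stand-in for a frozen layer's hidden state
out = adapter(h, W_down, W_up)     # equals h before any adapter training
```

Zero-initializing the up-projection makes the adapted model exactly reproduce the pre-trained model at the start of fine-tuning, which is why adapters can be added without disturbing existing tasks.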
Book
Neural Machine Translation
TL;DR: A comprehensive treatment of the topic, ranging from introduction to neural networks, computation graphs, description of the currently dominant attentional sequence-to-sequence model, recent refinements, alternative architectures and challenges.
Proceedings Article
A Survey of Domain Adaptation for Neural Machine Translation
Chenhui Chu, Rui Wang +1 more
TL;DR: A comprehensive survey of the state-of-the-art domain adaptation techniques for NMT is given, which leverage both out-of-domain parallel corpora as well as monolingual corpora for in-domain translation.
Posted Content
Nearest Neighbor Machine Translation
TL;DR: This work introduces k-nearest-neighbor machine translation (kNN-MT), which predicts tokens with a nearest neighbor classifier over a large datastore of cached examples, using representations from a neural translation model for similarity search.
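The prediction step this TL;DR describes can be sketched as follows: a datastore maps cached decoder states (keys) to the tokens that followed them (values); at decode time the current state queries its nearest neighbors, their tokens form a distribution, and that distribution is interpolated with the base model's. The toy vectors, vocabulary size, and interpolation weight below are illustrative assumptions.

```python
import numpy as np

def knn_distribution(query, keys, values, vocab_size, k=2, temperature=1.0):
    """Distance-weighted distribution over the vocabulary from the k nearest keys."""
    dists = np.linalg.norm(keys - query, axis=1)   # L2 distance to every cached state
    nearest = np.argsort(dists)[:k]                # indices of the k closest keys
    weights = np.exp(-dists[nearest] / temperature)
    weights /= weights.sum()                       # softmax over negative distances
    p = np.zeros(vocab_size)
    for idx, w in zip(nearest, weights):
        p[values[idx]] += w                        # mass goes to each neighbor's token
    return p

# Toy datastore: 2-d "decoder states" paired with the next-token ids seen in training.
keys = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
values = np.array([3, 3, 7])
query = np.array([1.0, 0.05])  # current decoder state

p_knn = knn_distribution(query, keys, values, vocab_size=10)
p_model = np.full(10, 0.1)     # stand-in for the base model's distribution
lam = 0.5                      # interpolation weight (an assumed value)
p_final = lam * p_knn + (1 - lam) * p_model
```

Because both nearest neighbors here stored token 3, the retrieved distribution concentrates all its mass there and the interpolated prediction follows the datastore, which is how kNN-MT adapts without updating model weights.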
References
Book
Deep Learning
TL;DR: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts; it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and video games.
Proceedings ArticleDOI
Bleu: a Method for Automatic Evaluation of Machine Translation
TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
Proceedings Article
Neural Machine Translation by Jointly Learning to Align and Translate
TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Proceedings ArticleDOI
Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation
Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio
TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.