Open Access · Proceedings Article (DOI)

Multi-Domain Neural Machine Translation through Unsupervised Adaptation

TLDR
This work explores an efficient instance-based adaptation method that, by exploiting the similarity between the training instances and each test sentence, dynamically sets the hyperparameters of the learning algorithm and updates the generic model on-the-fly.
Abstract
We investigate the application of Neural Machine Translation (NMT) under the following three conditions posed by real-world application scenarios. First, we operate with an input stream of sentences coming from many different domains and with no predefined order. Second, the sentences are presented without domain information. Third, the input stream should be processed by a single generic NMT model. To tackle the weaknesses of current NMT technology in this unsupervised multi-domain setting, we explore an efficient instance-based adaptation method that, by exploiting the similarity between the training instances and each test sentence, dynamically sets the hyperparameters of the learning algorithm and updates the generic model on-the-fly. The results of our experiments with multi-domain data show that local adaptation outperforms not only the original generic NMT system, but also a strong phrase-based system and even single-domain NMT models specifically optimized on each domain and applicable only by violating two of our aforementioned assumptions.
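In outline, the method retrieves the training sentences closest to each incoming test sentence, derives the fine-tuning hyperparameters from the retrieval score, and applies a throwaway update to the generic model before translating. The following is a minimal sketch of that loop, assuming a model object exposing hypothetical copy()/train_step()/translate() methods; the n-gram Jaccard score and the similarity-to-hyperparameter mapping are illustrative stand-ins, not the paper's exact choices.

```python
def ngram_set(sentence, n=2):
    toks = sentence.split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)} | {(t,) for t in toks}

def similarity(a, b):
    # Jaccard overlap between n-gram sets; a stand-in for the paper's
    # retrieval score.
    sa, sb = ngram_set(a), ngram_set(b)
    return len(sa & sb) / max(1, len(sa | sb))

def adapt_and_translate(model, train_pairs, test_src, top_k=5):
    # 1) Retrieve the training instances most similar to the test sentence.
    neighbours = sorted(train_pairs, key=lambda p: similarity(p[0], test_src),
                        reverse=True)[:top_k]
    best = similarity(neighbours[0][0], test_src) if neighbours else 0.0
    # 2) Set the hyperparameters from the retrieval score: the closer the
    #    neighbours, the more aggressive the update (illustrative mapping).
    lr, epochs = 0.01 * best, max(1, round(5 * best))
    # 3) Fine-tune a throwaway copy of the generic model on the neighbours,
    #    translate, then discard the adapted weights (hypothetical interface).
    local = model.copy()
    for _ in range(epochs):
        for src, tgt in neighbours:
            local.train_step(src, tgt, learning_rate=lr)
    return local.translate(test_src)
```

Because the adapted weights are discarded after every sentence, the generic model never drifts, and no domain label is ever required.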



Citations
Proceedings Article (DOI)

Simple, Scalable Adaptation for Neural Machine Translation

TL;DR: The proposed approach consists of injecting tiny task-specific adapter layers into a pre-trained model, which adapt the model to multiple individual tasks simultaneously, paving the way towards universal machine translation.
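The adapter idea is easy to picture in code. Below is a minimal PyTorch sketch in the spirit of that recipe: a layer-normalized bottleneck with a residual connection, inserted into a frozen pre-trained network; the dimensions are illustrative.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: a small residual block injected after a frozen
    pre-trained layer; only these few parameters are trained per task."""
    def __init__(self, d_model: int, bottleneck: int = 64):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Down-project, apply a non-linearity, up-project, then add back
        # the input (residual connection).
        return x + self.up(torch.relu(self.down(self.norm(x))))
```

Since the pre-trained weights stay frozen, one small set of adapter weights can be stored per task at a fraction of the full model size.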
Book

Neural Machine Translation

TL;DR: A comprehensive treatment of the topic, ranging from an introduction to neural networks and computation graphs, through the currently dominant attentional sequence-to-sequence model, to recent refinements, alternative architectures, and challenges.
Proceedings Article

A Survey of Domain Adaptation for Neural Machine Translation

TL;DR: A comprehensive survey of state-of-the-art domain adaptation techniques for NMT, which leverage both out-of-domain parallel corpora and monolingual corpora for in-domain translation.
Posted Content

Nearest Neighbor Machine Translation

TL;DR: This work introduces k-nearest-neighbor machine translation (kNN-MT), which predicts tokens with a nearest neighbor classifier over a large datastore of cached examples, using representations from a neural translation model for similarity search.
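Concretely, kNN-MT interpolates the model's next-token distribution with a distribution built from the retrieved neighbours. A minimal numpy sketch, assuming a pre-built datastore of (decoder-state, next-token) pairs; the distance temperature and interpolation weight are illustrative hyperparameters.

```python
import numpy as np

def knn_mt_probs(model_probs, query, keys, values, k=8, temp=10.0, lam=0.5):
    # Squared L2 distance from the current decoder state (query) to every
    # cached key in the datastore.
    d = ((keys - query) ** 2).sum(axis=1)
    nn_idx = np.argsort(d)[:k]
    # Each neighbour votes for its stored next token, weighted by
    # softmax(-distance / temp).
    w = np.exp(-d[nn_idx] / temp)
    w /= w.sum()
    knn_probs = np.zeros_like(model_probs)
    for weight, tok in zip(w, values[nn_idx]):
        knn_probs[tok] += weight
    # The final prediction interpolates the two distributions.
    return lam * knn_probs + (1.0 - lam) * model_probs
```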
References
Book

Deep Learning

TL;DR: Deep learning is a form of machine learning that enables computers to learn from experience and to understand the world in terms of a hierarchy of concepts; it is used in applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and video games.
Proceedings Article (DOI)

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposes a method for automatic machine translation evaluation that is quick, inexpensive, and language-independent, correlates highly with human evaluation, and has little marginal cost per run.
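For reference, the core of BLEU is clipped n-gram precision combined with a brevity penalty. A compact sentence-level sketch follows (uniform n-gram weights, a small floor instead of proper smoothing; production evaluations use corpus-level implementations such as sacreBLEU).

```python
import math
from collections import Counter

def bleu(candidate, references, max_n=4):
    cand = candidate.split()
    refs = [r.split() for r in references]
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        # Clip each n-gram count by its maximum count in any single reference
        # ("modified precision" guards against degenerate repetition).
        max_ref = Counter()
        for ref in refs:
            for g, c in Counter(tuple(ref[i:i + n])
                                for i in range(len(ref) - n + 1)).items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in c_ngrams.items())
        total = max(1, sum(c_ngrams.values()))
        log_prec += math.log(max(clipped, 1e-9) / total) / max_n
    # Brevity penalty: candidates shorter than the closest reference are
    # penalized, since precision alone would reward very short outputs.
    ref_len = min((abs(len(r) - len(cand)), len(r)) for r in refs)[1]
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / max(1, len(cand)))
    return bp * math.exp(log_prec)
```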
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture; the paper proposes to extend it by letting the model automatically (soft-)search for the parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
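The "soft-search" is additive attention: score every source annotation against the current decoder state, normalize with a softmax, and feed the weighted sum back as the context vector. A numpy sketch, with Wa, Ua, and va standing in for the learned attention parameters:

```python
import numpy as np

def additive_attention(dec_state, enc_states, Wa, Ua, va):
    # e_j = va^T tanh(Wa s + Ua h_j): one relevance score per source position.
    scores = np.tanh(dec_state @ Wa.T + enc_states @ Ua.T) @ va
    # A softmax turns the scores into alignment probabilities.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # The context vector is the expectation over source annotations.
    return weights @ enc_states, weights
```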
Proceedings Article (DOI)

Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
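In training terms, "maximizing the conditional probability" means minimizing the per-token cross-entropy of the target sequence under the decoder. A PyTorch sketch, with shapes as noted; the pad_id convention is an assumption.

```python
import torch.nn.functional as F

def seq2seq_nll(decoder_logits, target_ids, pad_id=0):
    # decoder_logits: (batch, tgt_len, vocab); target_ids: (batch, tgt_len).
    # Minimizing this sum of -log p(y_t | y_<t, x) maximizes log p(y | x).
    return F.cross_entropy(
        decoder_logits.transpose(1, 2),  # cross_entropy expects (N, C, ...)
        target_ids,
        ignore_index=pad_id,             # skip padded target positions
    )
```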