Open Access Journal Article

Context Gates for Neural Machine Translation

TL;DR
The authors propose context gates that dynamically control the ratios at which source and target contexts contribute to the generation of target words, enhancing both the adequacy and fluency of NMT through more careful control of the information flow from contexts.
Abstract
In neural machine translation (NMT), generation of a target word depends on both source and target contexts. We find that source contexts have a direct impact on the adequacy of a translation, while target contexts affect its fluency. Intuitively, generation of a content word should rely more on the source context and generation of a function word should rely more on the target context. Due to the lack of effective control over the influence of source and target contexts, conventional NMT tends to yield fluent but inadequate translations. To address this problem, we propose context gates, which dynamically control the ratios at which source and target contexts contribute to the generation of target words. In this way, we can enhance both the adequacy and fluency of NMT with more careful control of the information flow from contexts. Experiments show that our approach significantly improves upon a standard attention-based NMT system by +2.3 BLEU points.
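As a rough illustration of the idea in the abstract, the sketch below shows how such a gate could be computed and used to blend the two context streams. It is a minimal NumPy sketch assuming a GRU-style decoder; the parameter names (W_z, U_z, C_z, etc.) and the exact way the gated sum feeds the decoder are assumptions, not the paper's precise formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def context_gate(prev_word_emb, prev_state, source_context, p):
    """Element-wise gate z in (0, 1): how much the source context (adequacy)
    versus the target-side context (fluency) should contribute."""
    z = sigmoid(p["W_z"] @ prev_word_emb
                + p["U_z"] @ prev_state
                + p["C_z"] @ source_context)
    return z

def gated_decoder_input(prev_word_emb, prev_state, source_context, p):
    """Blend the two information streams before the decoder update:
    z scales the source context, (1 - z) scales the target-side inputs."""
    z = context_gate(prev_word_emb, prev_state, source_context, p)
    target_part = (1.0 - z) * (p["W"] @ prev_word_emb + p["U"] @ prev_state)
    source_part = z * (p["C"] @ source_context)
    return target_part + source_part  # fed into the decoder's nonlinearity
```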



Citations
Proceedings Article

Globally Coherent Text Generation with Neural Checklist Models

TL;DR: The neural checklist model is presented: a recurrent neural network that models global coherence by storing and updating an agenda of text strings that should be mentioned somewhere in the output; it demonstrates high coherence with greatly improved semantic coverage of the agenda.
Proceedings Article

Visualizing and Understanding Neural Machine Translation

TL;DR: This work proposes to use layer-wise relevance propagation (LRP) to compute the contribution of each contextual word to arbitrary hidden states in the attention-based encoder-decoder framework and shows that visualization with LRP helps to interpret the internal workings of NMT and analyze translation errors.
Proceedings Article

Exploiting Cross-Sentence Context for Neural Machine Translation

TL;DR: This article proposes a cross-sentence context-aware approach and investigates the influence of historical contextual information on the performance of neural machine translation (NMT) in Chinese-English translation.
Journal Article

Learning to Remember Translation History with a Continuous Cache

TL;DR: The authors propose to augment NMT models with a very lightweight cache-like memory network that stores recent hidden representations as translation history; the probability distribution over generated words is updated online based on the translation history retrieved from the memory.
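A minimal sketch of how such a cache-like memory could work, assuming the keys are recent context/hidden vectors, the values are the words generated with them, and the cache distribution is interpolated with the NMT distribution via a fixed weight lam; these choices are illustrative assumptions, not the cited paper's exact design.

```python
import numpy as np
from collections import deque

class ContinuousCache:
    def __init__(self, size=200):
        self.keys = deque(maxlen=size)    # recent hidden/context vectors
        self.values = deque(maxlen=size)  # word ids generated with them

    def add(self, key, word_id):
        self.keys.append(key)
        self.values.append(word_id)

    def cache_distribution(self, query, vocab_size, temperature=1.0):
        """Turn similarity between the query and cached keys into a
        probability distribution over the vocabulary."""
        probs = np.zeros(vocab_size)
        if not self.keys:
            return probs
        sims = np.array([query @ k for k in self.keys]) / temperature
        weights = np.exp(sims - sims.max())
        weights /= weights.sum()
        for w, word_id in zip(weights, self.values):
            probs[word_id] += w
        return probs

def mix(model_probs, cache_probs, lam=0.2):
    # Interpolate the NMT distribution with the cache distribution.
    return (1.0 - lam) * model_probs + lam * cache_probs
```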
Proceedings Article

Modeling Source Syntax for Neural Machine Translation

TL;DR: The authors propose three different kinds of encoders to incorporate source syntax into NMT: 1) a Parallel RNN encoder that learns word and label annotation vectors in parallel, 2) a Hierarchical RNN encoder that learns both word and label annotation vectors in a two-level hierarchy, and 3) a Mixed RNN encoder that learns over sequences in which words and syntactic labels are interleaved.
References
Journal Article

Long Short-Term Memory

TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
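For reference, the sketch below shows a single step of the now-standard LSTM formulation (including the forget gate, which was added after the original 1997 paper); parameter shapes and names are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W (4d x input), U (4d x d) and b (4d,) stack the
    input, forget, output and candidate parameters; the cell state c is
    the 'constant error carousel' that keeps gradients from vanishing."""
    d = h_prev.shape[0]
    gates = W @ x + U @ h_prev + b
    i = sigmoid(gates[0:d])        # input gate
    f = sigmoid(gates[d:2*d])      # forget gate
    o = sigmoid(gates[2*d:3*d])    # output gate
    g = np.tanh(gates[3*d:4*d])    # candidate cell update
    c = f * c_prev + i * g         # additive cell update
    h = o * np.tanh(c)
    return h, c
```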
Proceedings Article

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
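A minimal sketch of corpus-level BLEU as described above (modified n-gram precision combined with a brevity penalty), assuming one reference per hypothesis and pre-tokenized input; real evaluations should use a standard implementation such as sacrebleu.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i+n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4):
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
            # "Modified" precision: clip each n-gram count by its reference count.
            matches[n-1] += sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
            totals[n-1] += sum(hyp_ngrams.values())
    precisions = [m / t if t else 0.0 for m, t in zip(matches, totals)]
    if min(precisions) == 0.0:
        return 0.0
    # Brevity penalty discourages overly short translations.
    bp = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```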
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and the paper proposes to extend it by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
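A minimal sketch of the additive "soft-search" (attention) scoring described above; the parameter names (W_a, U_a, v_a) follow common usage and are assumptions rather than the paper's exact notation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def soft_search(decoder_state, encoder_states, W_a, U_a, v_a):
    """Score every source annotation against the current decoder state and
    return their weighted sum as the context vector for the next word."""
    # encoder_states: (src_len, hidden); decoder_state: (hidden,)
    scores = np.array([v_a @ np.tanh(W_a @ decoder_state + U_a @ h)
                       for h in encoder_states])
    weights = softmax(scores)           # soft alignment probabilities
    context = weights @ encoder_states  # (hidden,)
    return context, weights
```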
Proceedings Article

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Proceedings Article

Sequence to Sequence Learning with Neural Networks

TL;DR: The authors use a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.