Open Access Proceedings Article

Exploiting Cross-Sentence Context for Neural Machine Translation

TL;DR: This article proposed a cross-sentence context-aware approach and investigated the influence of historical contextual information on the performance of neural machine translation (NMT) in Chinese-English translation.
Abstract
In translation, considering the document as a whole can help to resolve ambiguities and inconsistencies. In this paper, we propose a cross-sentence context-aware approach and investigate the influence of historical contextual information on the performance of neural machine translation (NMT). First, this history is summarized in a hierarchical way. We then integrate the historical representation into NMT via two strategies: 1) a warm-start of encoder and decoder states, and 2) an auxiliary context source for updating decoder states. Experimental results on a large Chinese-English translation task show that our approach significantly improves upon a strong attention-based NMT system by up to +2.1 BLEU points.
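To make the two strategies concrete, below is a minimal PyTorch sketch of the hierarchical history summarizer, assuming GRU encoders at both the word and sentence level; module and variable names are illustrative, not the authors' released code.

```python
# A minimal sketch (not the authors' implementation) of the hierarchical
# history summarizer described in the abstract, assuming GRU encoders.
import torch
import torch.nn as nn

class CrossSentenceContext(nn.Module):
    """Summarizes the K previous source sentences hierarchically:
    a word-level GRU encodes each sentence into a vector, then a
    sentence-level GRU compresses those vectors into one history vector."""
    def __init__(self, emb_dim, hid_dim):
        super().__init__()
        self.word_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.sent_rnn = nn.GRU(hid_dim, hid_dim, batch_first=True)

    def forward(self, prev_sents):
        # prev_sents: list of K embedded sentences, each (1, len_k, emb_dim)
        sent_vecs = [self.word_rnn(s)[1][-1] for s in prev_sents]  # K x (1, hid_dim)
        stacked = torch.stack(sent_vecs, dim=1)                    # (1, K, hid_dim)
        _, h = self.sent_rnn(stacked)
        return h[-1]                                               # (1, hid_dim) summary

# Strategy 1 (warm-start): project the summary and use it as the initial
# encoder/decoder state instead of a zero vector.
# Strategy 2 (auxiliary context): feed the summary as an extra input at
# every decoder step, alongside the usual attention context.
```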

Citations
Proceedings Article

Context-Aware Neural Machine Translation Learns Anaphora Resolution

TL;DR: The authors introduced a context-aware NMT model to control the flow of information from the extended context to the translation model, which can be used to improve pronoun translation in English-Russian subtitles.
Posted Content

Multilingual Denoising Pre-training for Neural Machine Translation

TL;DR: This paper proposed mBART, a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages using the BART objective.
Proceedings Article

Evaluating Discourse Phenomena in Neural Machine Translation

TL;DR: This paper investigated the performance of multi-encoder NMT models trained on subtitles for English to French and found that decoding the concatenation of the previous and current sentence leads to good performance.
Proceedings Article

Improving the Transformer Translation Model with Document-Level Context

TL;DR: This work extends the Transformer model with a new context encoder to represent document-level context, which is then incorporated into the original encoder and decoder, and introduces a two-step training method to take full advantage of abundant sentence-level parallel corpora and limited document-level parallel corpora.
Proceedings Article

Search Engine Guided Neural Machine Translation

TL;DR: Empirical evaluation on three language pairs shows that the proposed approach significantly outperforms the baseline, and that the improvement grows when more relevant sentence pairs are retrieved.
References
Proceedings Article

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
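Since several results above are reported in BLEU, a self-contained sketch of the metric as this paper defines it may help: clipped (modified) n-gram precisions combined over n with a brevity penalty. For real evaluations an established implementation is the safer choice; this is only illustrative.

```python
# Illustrative single-reference, sentence-level BLEU: modified n-gram
# precision with clipped counts, geometric mean over n, brevity penalty.
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """candidate, reference: token lists; returns a score in [0, 1]."""
    if not candidate:
        return 0.0
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i + n])
                       for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n])
                      for i in range(len(reference) - n + 1))
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        # Tiny floor avoids log(0) when no n-gram matches (crude smoothing).
        log_prec_sum += math.log(max(clipped, 1e-9) / total)
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(candidate) > len(reference) else \
         math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_prec_sum / max_n)
```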
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and the paper proposes to extend it by allowing the model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
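The (soft-)search this paper introduces is commonly implemented as additive attention. Below is a small numpy sketch of one decoding step, using roughly the paper's notation; the shapes and the exact parameterization here are assumptions for illustration.

```python
# Additive ("Bahdanau") attention for one decoder step, in plain numpy.
import numpy as np

def additive_attention(s_prev, H, W_a, U_a, v_a):
    """s_prev: (d,) previous decoder state; H: (T, d) source annotations;
    W_a, U_a: (d, d) learned projections; v_a: (d,) learned scoring vector."""
    scores = np.tanh(s_prev @ W_a + H @ U_a) @ v_a   # (T,) alignment energies
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                         # softmax over source positions
    return weights @ H                               # (d,) context vector
```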
Proceedings Article

Sequence to Sequence Learning with Neural Networks

TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.
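A minimal PyTorch sketch of that setup (layer sizes are illustrative, not the paper's configuration) makes the design visible: the entire source is compressed into the encoder's final states, which then seed the decoder.

```python
# Minimal LSTM encoder-decoder; sizes are illustrative only.
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size, emb=256, hid=512, layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.LSTM(emb, hid, layers, batch_first=True)
        self.decoder = nn.LSTM(emb, hid, layers, batch_first=True)
        self.out = nn.Linear(hid, vocab_size)

    def forward(self, src, tgt):
        # The encoder compresses the whole source into its final (h, c) states...
        _, state = self.encoder(self.embed(src))
        # ...which initialize the decoder: the fixed-size bottleneck that
        # the attention mechanism above was introduced to remove.
        dec_out, _ = self.decoder(self.embed(tgt), state)
        return self.out(dec_out)
```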
Proceedings ArticleDOI

Effective Approaches to Attention-based Neural Machine Translation

TL;DR: This paper examines a global approach that always attends to all source words and a local one that looks at only a subset of source words at a time, demonstrating the effectiveness of both approaches on the WMT English-German translation tasks in both directions.
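A rough numpy sketch of the contrast follows, using dot-product scoring (one of several scoring functions the paper considers); the window position p is simply given here, whereas the paper also learns to predict it.

```python
# Global vs. local attention for one decoder state h_t, in plain numpy.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def global_attention(h_t, H):
    # Score every source state, then take the weighted sum.
    return softmax(H @ h_t) @ H            # H: (T, d), h_t: (d,)

def local_attention(h_t, H, p, D):
    # Only score a window of width 2D+1 centered at source position p.
    lo, hi = max(0, p - D), min(len(H), p + D + 1)
    window = H[lo:hi]
    return softmax(window @ h_t) @ window
```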
Posted Content

A Neural Conversational Model

TL;DR: This paper presents a simple approach to conversational modeling that uses the recently proposed sequence-to-sequence framework and can extract knowledge from both a domain-specific dataset and a large, noisy, general-domain dataset of movie subtitles.