Open Access · Proceedings Article (DOI)

Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation

TL;DR

This paper proposes a way to address the significant drop in translation quality when translating long sentences: automatically segment an input sentence into phrases that the neural network translation model can translate easily.
Abstract
Cho et al. (2014a) have shown that the recently introduced neural network translation systems suffer from a significant drop in translation quality when translating long sentences, unlike existing phrase-based translation systems. In this paper, we propose a way to address this issue by automatically segmenting an input sentence into phrases that can be easily translated by the neural network translation model. Once each segment has been independently translated by the neural machine translation model, the translated segments are concatenated to form the final translation. Empirical results show a significant improvement in translation quality for long sentences.
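A minimal sketch of the segment-translate-concatenate pipeline described above; segment_sentence and translate_segment are hypothetical stand-ins for the paper's confidence-based segmenter and trained NMT model, not the authors' code:

def translate_long_sentence(source_tokens, segment_sentence, translate_segment):
    # Split the input into phrases the model is expected to translate reliably.
    segments = segment_sentence(source_tokens)
    # Translate each segment independently with the NMT model.
    translations = [translate_segment(seg) for seg in segments]
    # Concatenate the per-segment translations into the final output.
    return " ".join(translations)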


Citations
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
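A minimal numpy sketch of the soft-search idea: for each decoder step, score every source position and take a weighted average of the encoder states instead of a single fixed-length vector. Shapes and parameter names (Wa, Ua, va) are illustrative assumptions, not the paper's implementation:

import numpy as np

def soft_search(decoder_state, encoder_states, Wa, Ua, va):
    # decoder_state: (d,); encoder_states: (T, d); Wa, Ua: (d, d); va: (d,)
    scores = np.tanh(decoder_state @ Wa.T + encoder_states @ Ua.T) @ va  # (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()         # softmax alignment weights over source positions
    return weights @ encoder_states  # context vector: weighted average of encoder states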
Posted Content

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: In this paper, the authors propose to use a soft-searching model to find the parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Posted Content

Sequence to Sequence Learning with Neural Networks

TL;DR: This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence which made the optimization problem easier.
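The reversal trick amounts to a one-line preprocessing step; a sketch under the assumption that training data is stored as (source, target) token-list pairs:

def reverse_sources(pairs):
    # pairs: list of (source_tokens, target_tokens); only the source side is reversed,
    # placing early source words close to the early target words they align with.
    return [(src[::-1], tgt) for src, tgt in pairs]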
Proceedings Article (DOI)

Six Challenges for Neural Machine Translation

TL;DR: The authors explore six challenges for NMT: domain mismatch, amount of training data, rare words, long sentences, word alignment, and beam search, showing where NMT falls short of, and where it improves on, the quality of phrase-based statistical machine translation.
Posted Content

A Hierarchical Neural Autoencoder for Paragraphs and Documents

TL;DR: This paper proposes a hierarchical LSTM autoencoder to encode and reconstruct multi-sentence paragraphs, evaluating the reconstructed paragraphs with standard metrics such as ROUGE and Entity Grid, and showing that neural models can encode text in a way that preserves syntactic, semantic, and discourse coherence.
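A structural sketch of the hierarchy described above: a word-level encoder turns each sentence into a vector, and a sentence-level encoder turns the sequence of sentence vectors into a paragraph vector. encode_words and encode_sentences are hypothetical encoders (e.g. LSTMs) standing in for the paper's components:

def encode_paragraph(sentences, encode_words, encode_sentences):
    # Word-level encoder: one vector per sentence.
    sentence_vectors = [encode_words(s) for s in sentences]
    # Sentence-level encoder: one vector for the whole paragraph.
    return encode_sentences(sentence_vectors)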
References
Proceedings Article (DOI)

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
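The training objective described above, sketched in Python; model_prob is a hypothetical stand-in for the jointly trained encoder-decoder's per-token distribution:

import math

def sequence_log_prob(source, target, model_prob):
    # model_prob(word, source, prefix) -> probability of the next target word.
    logp = 0.0
    for t, word in enumerate(target):
        # Log-probability of target word t given the source and the target prefix.
        logp += math.log(model_prob(word, source, target[:t]))
    return logp  # training maximizes this quantity summed over the corpus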
Posted Content

Sequence to Sequence Learning with Neural Networks

TL;DR: This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence which made the optimization problem easier.
Proceedings Article (DOI)

Speech Recognition with Deep Recurrent Neural Networks

TL;DR: This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long range context that empowers RNNs.
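A sketch of the stacking idea described above: each layer's hidden sequence becomes the input sequence of the next layer, combining depth (levels of representation) with recurrence (long-range context). The per-layer step functions are hypothetical placeholders:

def deep_rnn(inputs, layers, init_states):
    # layers: list of step functions h_next = layer(x, h); init_states: one initial h per layer.
    sequence = list(inputs)
    for layer, h in zip(layers, init_states):
        outputs = []
        for x in sequence:
            h = layer(x, h)      # recurrence carries context along the sequence
            outputs.append(h)
        sequence = outputs       # hidden states feed the next, deeper layer
    return sequence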
Posted Content

Speech Recognition with Deep Recurrent Neural Networks

TL;DR: In this paper, deep recurrent neural networks (RNNs) are used to combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long range context that empowers RNNs.
Proceedings Article (DOI)

On the Properties of Neural Machine Translation: Encoder-Decoder Approaches

TL;DR: In this paper, a gated recursive convolutional neural network (GRNN) is proposed to learn the grammatical structure of a sentence automatically; it performs well on short sentences without unknown words, but its performance degrades rapidly as sentence length and the number of unknown words increase.
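A rough sketch of the gated recursive convolution idea: repeatedly merge adjacent pairs of node vectors, with a soft gate choosing between the left child, the right child, and a newly combined representation. combine and gate are hypothetical stand-ins for the learned functions, assuming nodes are numpy-like vectors:

def grconv_layer(nodes, combine, gate):
    # gate(left, right) -> (w_new, w_left, w_right), non-negative and summing to 1.
    merged = []
    for left, right in zip(nodes[:-1], nodes[1:]):
        w_new, w_left, w_right = gate(left, right)
        merged.append(w_new * combine(left, right) + w_left * left + w_right * right)
    # Applying this layer T-1 times reduces a length-T sentence to a single vector.
    return merged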