Joint Language and Translation Modeling with Recurrent Neural Networks

Open Access Proceedings Article

Michael Auli, +3 more

- pp 1044-1054

TLDR

This work presents a joint language and translation model based on a recurrent neural network that predicts target words from an unbounded history of both source and target words. The model shows competitive accuracy compared to the traditional channel model features.

Abstract:

We present a joint language and translation model based on a recurrent neural network which predicts target words based on an unbounded history of both source and target words. The weaker independence assumptions of this model result in a vastly larger search space compared to related feedforward-based language or translation models. We tackle this issue with a new lattice rescoring algorithm and demonstrate its effectiveness empirically. Our joint model builds on a well-known recurrent neural network language model (Mikolov, 2012) augmented by a layer of additional inputs from the source language. We show competitive accuracy compared to the traditional channel model features. Our best results improve the output of a system trained on WMT 2012 French-English data by up to 1.5 BLEU, and by 1.1 BLEU on average across several test sets.
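
The idea of a recurrent language model with an added layer of source-language inputs can be illustrated with a minimal sketch. The wiring below is an assumption made for illustration, not the authors' implementation: the class name JointRNNSketch, the weight names U, W, F and V, the sigmoid hidden layer, and the choice of one source feature vector per target position are all hypothetical. The sketch only shows how a hidden state can carry the whole target history while a separate input injects source-side information at every step.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class JointRNNSketch:
    """Hypothetical recurrent LM with an auxiliary source-side input layer."""

    def __init__(self, vocab_size, hidden_size, source_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.U = random_matrix(hidden_size, vocab_size, rng)   # previous target word -> hidden
        self.W = random_matrix(hidden_size, hidden_size, rng)  # previous hidden -> hidden
        self.F = random_matrix(hidden_size, source_size, rng)  # source features -> hidden (assumed wiring)
        self.V = random_matrix(vocab_size, hidden_size, rng)   # hidden -> next-word scores

    def step(self, prev_word, hidden, source_vec):
        """Advance one target position; return (new_hidden, next-word distribution)."""
        # A one-hot input word selects a single column of U.
        from_word = [row[prev_word] for row in self.U]
        from_hidden = matvec(self.W, hidden)
        from_source = matvec(self.F, source_vec)
        new_hidden = [
            sigmoid(a + b + c)
            for a, b, c in zip(from_word, from_hidden, from_source)
        ]
        return new_hidden, softmax(matvec(self.V, new_hidden))

    def sentence_log_prob(self, target_ids, source_vecs, start_id=0):
        """Sum of log P(target word | all earlier target words, source inputs so far)."""
        hidden = [0.0] * self.hidden_size
        prev, total = start_id, 0.0
        for word, source_vec in zip(target_ids, source_vecs):
            hidden, probs = self.step(prev, hidden, source_vec)
            total += math.log(probs[word])
            prev = word
        return total
```

Because the hidden state is threaded through every step, the score of each word depends on the full target prefix rather than a fixed window. This is the property that enlarges the search space relative to feedforward models and motivates a dedicated rescoring procedure.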

Citations

Proceedings Article

Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation

Kyunghyun Cho, +8 more

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.

Proceedings Article

Sequence to Sequence Learning with Neural Networks

Ilya Sutskever, +2 more

TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.

Posted Content

Sequence to Sequence Learning with Neural Networks

Ilya Sutskever, +2 more

- 10 Sep 2014 -

arXiv: Computation and Language

TL;DR: This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.

Posted Content

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

Kyunghyun Cho, +8 more

- 03 Jun 2014 -

arXiv: Computation and Language

TL;DR: Qualitatively, the proposed RNN Encoder‐Decoder model learns a semantically and syntactically meaningful representation of linguistic phrases.

Journal Article

Recent Trends in Deep Learning Based Natural Language Processing [Review Article]

Tom Young, +3 more

- 20 Jul 2018 -

IEEE Computational Intelligence Magazine

TL;DR: This paper reviews significant deep learning related models and methods that have been employed for numerous NLP tasks and provides a walk-through of their evolution.

References

Book Chapter

Learning internal representations by error propagation

David E. Rumelhart, +2 more

TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.

Book

Learning internal representations by error propagation

David E. Rumelhart, +2 more

TL;DR: This work states the learning problem, introduces the generalized delta rule, and reports simulation results obtained with it.

Journal Article

Indexing by Latent Semantic Analysis

Scott Deerwester, +4 more

- 01 Sep 1990 -

Journal of the Association for Informati...

TL;DR: A new method for automatic indexing and retrieval that takes advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) to improve the detection of relevant documents on the basis of terms found in queries.

Proceedings Article

Recurrent neural network based language model

Tomas Mikolov, +4 more

TL;DR: Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.

Related Papers (5)

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997 -

Neural Computation

A neural probabilistic language model

Yoshua Bengio, +3 more

- 01 Mar 2003 -

Journal of Machine Learning Research
