Open Access Proceedings ArticleDOI

Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search

Chris Hokamp, +1 more
Vol. 1, pp. 1535-1546
TLDR
The authors extend beam search to allow the inclusion of pre-specified lexical constraints, such as phrases or words that must be present in the output sequence, which can be used to incorporate auxiliary knowledge into a model's output without requiring any modification of the parameters or training data.
Abstract
We present Grid Beam Search (GBS), an algorithm which extends beam search to allow the inclusion of pre-specified lexical constraints. The algorithm can be used with any model which generates sequences token by token. Lexical constraints take the form of phrases or words that must be present in the output sequence. This is a very general way to incorporate auxiliary knowledge into a model’s output without requiring any modification of the parameters or training data. We demonstrate the feasibility and flexibility of Lexically Constrained Decoding by conducting experiments on Neural Interactive-Predictive Translation, as well as Domain Adaptation for Neural Machine Translation. Experiments show that GBS can provide large improvements in translation quality in interactive scenarios, and that, even without any user input, GBS can be used to achieve significant gains in performance in domain adaptation scenarios.
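To make the grid intuition concrete, the following is a minimal Python sketch of beam search over a grid indexed by (timestep, number of constraints covered). It assumes single-token constraints (the paper also handles multi-token phrases) and a hypothetical `model_step(tokens)` function returning (token, log-probability) pairs; it illustrates the idea rather than reproducing the paper's implementation.

```python
import math
from heapq import nlargest

def grid_beam_search(model_step, start_tok, end_tok, constraints,
                     beam_size=4, max_len=20):
    # Grid cell (t, c) holds hypotheses of length t that cover c constraints.
    # Each hypothesis is (log_prob, tokens, covered_constraint_indices).
    grid = {0: {0: [(0.0, [start_tok], frozenset())]}}
    finished = []
    for t in range(1, max_len + 1):
        grid[t] = {}
        for c in range(min(t, len(constraints)) + 1):
            candidates = []
            # "Open" step: extend beams at coverage c with model proposals.
            for lp, toks, cov in grid[t - 1].get(c, []):
                for tok, tok_lp in model_step(toks):
                    candidates.append((lp + tok_lp, toks + [tok], cov))
            # "Constrained" step: force an unused constraint token,
            # lifting a beam from coverage c-1 to coverage c.
            for lp, toks, cov in grid[t - 1].get(c - 1, []):
                scores = dict(model_step(toks))
                for i, ctok in enumerate(constraints):
                    if i not in cov:
                        # A real system would read the full distribution here;
                        # the floor value only keeps the toy example total.
                        tok_lp = scores.get(ctok, math.log(1e-9))
                        candidates.append((lp + tok_lp, toks + [ctok], cov | {i}))
            grid[t][c] = nlargest(beam_size, candidates, key=lambda h: h[0])
            finished += [h for h in grid[t][c]
                         if h[1][-1] == end_tok and len(h[2]) == len(constraints)]
    return max(finished, key=lambda h: h[0]) if finished else None
```

Each grid cell keeps at most `beam_size` hypotheses, and only hypotheses that end with `end_tok` and cover every constraint are eligible as final outputs.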



Citations
Journal ArticleDOI

Survey of Hallucination in Natural Language Generation

TL;DR: This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG by providing a broad overview of the research progress and challenges of the hallucination problem in NLG.
Posted Content

Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation

TL;DR: This work presents an algorithm for lexically constrained decoding with a complexity of O(1) in the number of constraints and demonstrates the algorithm’s remarkable ability to properly place constraints, and uses it to explore the shaky relationship between model and BLEU scores.
Proceedings ArticleDOI

Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation

TL;DR: This paper presents an algorithm for lexically constrained decoding with a complexity of O(1) in the number of constraints, and uses it to explore the shaky relationship between model and BLEU scores.
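A rough sketch of the beam-bank idea behind dynamic beam allocation: a single fixed-size beam is partitioned into banks keyed by how many constraint tokens a hypothesis has satisfied, which is what keeps cost constant in the number of constraints (whereas the GBS grid grows with them). Names and the allocation policy below are illustrative assumptions, not the authors' code.

```python
def allocate_banks(candidates, beam_size, num_constraint_tokens):
    """candidates: list of (score, hypothesis, n_constraints_met) tuples."""
    # Group candidates into banks by constraint-coverage count,
    # keeping each bank sorted by score.
    banks = {k: [] for k in range(num_constraint_tokens + 1)}
    for cand in sorted(candidates, key=lambda c: c[0], reverse=True):
        banks[cand[2]].append(cand)

    # Give each non-empty bank an equal share of the beam, then hand any
    # unused slots to the highest-scoring leftover candidates.
    nonempty = [k for k in banks if banks[k]]
    share = max(1, beam_size // max(1, len(nonempty)))
    beam, leftovers = [], []
    for k in nonempty:
        beam.extend(banks[k][:share])
        leftovers.extend(banks[k][share:])
    remaining = beam_size - len(beam)
    if remaining > 0:
        beam.extend(sorted(leftovers, key=lambda c: c[0], reverse=True)[:remaining])
    return beam[:beam_size]
```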
Posted Content

CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning

TL;DR: This work presents CommonGen, a constrained text generation task with an associated benchmark dataset, to explicitly test machines for the ability of generative commonsense reasoning, and demonstrates that the learned generative commonsense reasoning capability can be transferred to improve downstream tasks such as CommonsenseQA by generating additional context.
Proceedings ArticleDOI

Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting

TL;DR: The authors describe vectorized dynamic beam allocation, which extends work in lexically-constrained decoding to work with batching, leading to a five-fold improvement in throughput when working with positive constraints.
References
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
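For reference, the soft-search computes, at each decoding step i, alignment scores over all encoder annotations and a context vector as their weighted average; here s_{i-1} is the previous decoder state and h_j the annotation of source position j:

```latex
e_{ij} = v_a^{\top} \tanh\left(W_a s_{i-1} + U_a h_j\right), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k} \exp(e_{ik})}, \qquad
c_i = \sum_{j} \alpha_{ij}\, h_j .
```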
Proceedings ArticleDOI

Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
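In symbols, the joint training criterion is simply the conditional log-likelihood of target sequences given source sequences over the N training pairs:

```latex
\max_{\theta} \; \frac{1}{N} \sum_{n=1}^{N} \log p_{\theta}\left(\mathbf{y}_n \mid \mathbf{x}_n\right)
```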
Proceedings Article

Sequence to Sequence Learning with Neural Networks

TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.
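The two-LSTM setup factorizes the output distribution token by token, conditioning every step on the fixed-dimensional encoding v of the input sequence:

```latex
p\left(y_1, \ldots, y_{T'} \mid x_1, \ldots, x_T\right) = \prod_{t=1}^{T'} p\left(y_t \mid v,\, y_1, \ldots, y_{t-1}\right)
```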
Proceedings ArticleDOI

Neural Machine Translation of Rare Words with Subword Units

TL;DR: This paper introduces a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units, and empirically shows that subword models improve over a back-off dictionary baseline for the WMT 15 translation tasks English-German and English-Russian by 1.3 BLEU.
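A toy Python sketch of the subword-unit idea: byte-pair encoding repeatedly merges the most frequent pair of adjacent symbols in the training vocabulary. The corpus and the number of merges below are made up purely for illustration.

```python
from collections import Counter

def get_pair_stats(vocab):
    """Count adjacent-symbol-pair frequencies over a {word-as-tuple: freq} vocab."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with its concatenation."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1]); i += 2
            else:
                out.append(symbols[i]); i += 1
        merged[tuple(out)] = freq
    return merged

# Learn a handful of merges from a toy corpus; '</w>' marks word ends.
vocab = {('l','o','w','</w>'): 5, ('l','o','w','e','r','</w>'): 2,
         ('n','e','w','e','s','t','</w>'): 6, ('w','i','d','e','s','t','</w>'): 3}
for _ in range(5):
    best = max(get_pair_stats(vocab), key=get_pair_stats(vocab).get)
    vocab = merge_pair(best, vocab)
    print('merged', best)
```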
Proceedings Article

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

TL;DR: An attention-based model that automatically learns to describe the content of images is introduced; it can be trained in a deterministic manner using standard backpropagation techniques or stochastically by maximizing a variational lower bound.
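The stochastic ("hard") attention variant mentioned here is trained by maximizing a variational lower bound on the marginal log-likelihood of the caption y given image features a, obtained from Jensen's inequality over the latent attention locations s:

```latex
\log p(\mathbf{y} \mid \mathbf{a})
= \log \sum_{s} p(s \mid \mathbf{a})\, p(\mathbf{y} \mid s, \mathbf{a})
\;\ge\; \sum_{s} p(s \mid \mathbf{a}) \log p(\mathbf{y} \mid s, \mathbf{a}).
```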