Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search
Chris Hokamp, Qun Liu
Vol. 1, pp. 1535–1546
TL;DR: The authors extend beam search to allow the inclusion of pre-specified lexical constraints, such as phrases or words that must be present in the output sequence, which can be used to incorporate auxiliary knowledge into a model's output without requiring any modification of the parameters or training data.
Abstract: We present Grid Beam Search (GBS), an algorithm which extends beam search to allow the inclusion of pre-specified lexical constraints. The algorithm can be used with any model which generates sequences token by token. Lexical constraints take the form of phrases or words that must be present in the output sequence. This is a very general way to incorporate auxiliary knowledge into a model's output without requiring any modification of the parameters or training data. We demonstrate the feasibility and flexibility of Lexically Constrained Decoding by conducting experiments on Neural Interactive-Predictive Translation, as well as Domain Adaptation for Neural Machine Translation. Experiments show that GBS can provide large improvements in translation quality in interactive scenarios, and that, even without any user input, GBS can be used to achieve significant gains in performance in domain adaptation scenarios.
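The grid idea in the abstract can be illustrated with a minimal Python sketch (a hypothetical simplification, not the authors' released implementation): beam cells are indexed by (timestep, number of constraint tokens placed), and every hypothesis either generates freely or is forced to place an unmet constraint token. The real GBS also handles multi-token phrase constraints via start/continue hypotheses, which this single-token toy omits, and the `logp` scoring function stands in for any token-by-token sequence model.

```python
import math

def grid_beam_search(logp, vocab, constraints, eos, max_len=8, beam=4):
    """Toy Grid Beam Search with single-token constraints.

    logp(prefix, token) -> log-probability of token given the prefix.
    Grid cell (t, c) holds hypotheses of length t that have already
    placed c of the required constraint tokens.
    """
    n = len(constraints)
    # hypothesis = (score, tokens, set of constraint indices already placed)
    grid = {(0, 0): [(0.0, (), frozenset())]}
    finished = []
    for t in range(max_len):
        for c in range(min(t, n) + 1):
            for score, toks, cov in grid.get((t, c), []):
                # 1) "generate": extend freely with any vocabulary token
                for tok in vocab:
                    hyp = (score + logp(toks, tok), toks + (tok,), cov)
                    if tok == eos:
                        # complete only once every constraint is placed
                        if len(cov) == n:
                            finished.append(hyp)
                    else:
                        grid.setdefault((t + 1, c), []).append(hyp)
                # 2) "start": force an unplaced constraint token into the output
                for i, ctok in enumerate(constraints):
                    if i not in cov:
                        grid.setdefault((t + 1, c + 1), []).append(
                            (score + logp(toks, ctok), toks + (ctok,), cov | {i}))
        # prune every cell of the next grid column to the beam width
        for c in range(n + 1):
            if (t + 1, c) in grid:
                grid[(t + 1, c)] = sorted(
                    grid[(t + 1, c)], key=lambda h: -h[0])[:beam]
    return max(finished, key=lambda h: h[0]) if finished else None
```

With a uniform toy model, the best complete hypothesis is the shortest sequence that places every constraint and then emits the end-of-sequence token, which is exactly the behavior the constraint check enforces.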
Citations
Journal Article
Survey of Hallucination in Natural Language Generation
Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, D. Su, Yan Xu, Etsuko Ishii, Yejin Bang, Wenliang Dai, Andrea Madotto, Pascale Fung, et al.
TL;DR: This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG by providing a broad overview of the research progress and challenges in the hallucination problem in NLG.
Posted Content
Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation
Matt Post, David Vilar
TL;DR: This work presents an algorithm for lexically constrained decoding with a complexity of O(1) in the number of constraints and demonstrates the algorithm's remarkable ability to properly place constraints, and uses it to explore the shaky relationship between model and BLEU scores.
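The banked-beam idea behind that O(1) complexity can be sketched as follows (a hypothetical simplification in the spirit of dynamic beam allocation, not Post and Vilar's actual implementation): one beam of fixed size is divided into "banks" grouped by how many constraints each hypothesis has met, so per-step work stays constant as the number of constraints grows.

```python
def allocate_banks(candidates, beam, num_constraints):
    """Toy dynamic beam allocation.

    candidates: list of (score, tokens, constraints_met) hypotheses.
    Splits one fixed-size beam evenly across coverage banks, then gives
    any unused slots to the best remaining candidates from any bank.
    """
    banks = {c: [] for c in range(num_constraints + 1)}
    for hyp in candidates:
        banks[hyp[2]].append(hyp)
    per_bank = beam // (num_constraints + 1)
    beam_out, leftovers = [], []
    for c in banks:
        ranked = sorted(banks[c], key=lambda h: -h[0])
        beam_out.extend(ranked[:per_bank])     # reserved slots for this bank
        leftovers.extend(ranked[per_bank:])    # overflow competes globally
    leftovers.sort(key=lambda h: -h[0])
    beam_out.extend(leftovers[:beam - len(beam_out)])
    return sorted(beam_out, key=lambda h: -h[0])
```

Reserving slots per bank keeps partially constrained hypotheses alive even when unconstrained ones score higher, which is what lets a single beam handle many constraints without growing.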
Posted Content
CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning
Bill Yuchen Lin, Wangchunshu Zhou, Ming Shen, Pei Zhou, Chandra Bhagavatula, Yejin Choi, Xiang Ren
TL;DR: A constrained text generation task, CommonGen, associated with a benchmark dataset, to explicitly test machines for the ability of generative commonsense reasoning, and demonstrates that the learned generative commonsense reasoning capability can be transferred to improve downstream tasks such as CommonsenseQA by generating additional context.
Proceedings Article
Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting
J. Edward Hu, Huda Khayrallah, Ryan Culkin, Patrick Xia, Tongfei Chen, Matt Post, Benjamin Van Durme
TL;DR: The authors describe vectorized dynamic beam allocation, which extends work in lexically-constrained decoding to work with batching, leading to a five-fold improvement in throughput when working with positive constraints.
References
Proceedings Article
Neural Machine Translation by Jointly Learning to Align and Translate
TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Proceedings Article
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio
TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Proceedings Article
Sequence to Sequence Learning with Neural Networks
TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.
Proceedings Article
Neural Machine Translation of Rare Words with Subword Units
TL;DR: This paper introduces a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units, and empirically shows that subword models improve over a back-off dictionary baseline for the WMT 15 translation tasks English-German and English-Russian by 1.3 BLEU.
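The subword-unit idea summarized above can be illustrated with a toy byte-pair-encoding learner (a sketch in the spirit of the paper, not its released `subword-nmt` implementation): repeatedly merge the most frequent adjacent symbol pair in the corpus, so frequent words become single units while rare words decompose into smaller, known pieces.

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Toy BPE: learn merge operations from a word list.

    Each word starts as a tuple of characters plus an end-of-word marker;
    each iteration merges the single most frequent adjacent symbol pair.
    Returns the learned merges and the resulting segmented vocabulary.
    """
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for pair in zip(word, word[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # rewrite every word, greedily merging occurrences of the best pair
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges, vocab
```

At decode time the same merges are applied to any input, so an unseen rare word still maps to a sequence of in-vocabulary subword units rather than a single unknown token.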
Proceedings Article
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Rich Zemel, Yoshua Bengio
TL;DR: An attention based model that automatically learns to describe the content of images is introduced that can be trained in a deterministic manner using standard backpropagation techniques and stochastically by maximizing a variational lower bound.