Selective Encoding for Abstractive Sentence Summarization
Qingyu Zhou, Nan Yang, Furu Wei, Ming Zhou
Vol. 1, pp. 1095–1104
TL;DR
The experimental results show that the proposed selective encoding model outperforms the state-of-the-art baseline models.
Abstract
We propose a selective encoding model to extend the sequence-to-sequence framework for abstractive sentence summarization. It consists of a sentence encoder, a selective gate network, and an attention-equipped decoder. The sentence encoder and decoder are built with recurrent neural networks. The selective gate network constructs a second-level sentence representation by controlling the information flow from encoder to decoder. This second-level representation is tailored to the sentence summarization task, which leads to better performance. We evaluate our model on the English Gigaword, DUC 2004, and MSR abstractive sentence summarization datasets. The experimental results show that the proposed selective encoding model outperforms the state-of-the-art baseline models.
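The core idea of the abstract — gating each encoder hidden state by a sigmoid conditioned on both the local state and a whole-sentence vector — can be sketched in a few lines. This is a minimal NumPy illustration under assumed shapes and parameter names (`W`, `U`, `b` are hypothetical stand-ins for the learned gate parameters; the paper uses BiGRU encoder states, here replaced by random vectors):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
d, T = 4, 3                      # toy hidden size and source length

# Encoder hidden states h_1..h_T and a whole-sentence vector s
# (in the paper, s is built from the encoder's final states).
H = rng.standard_normal((T, d))
s = rng.standard_normal(d)

# Hypothetical gate parameters; learned in practice.
W = rng.standard_normal((d, d))
U = rng.standard_normal((d, d))
b = np.zeros(d)

# Selective gate: one sigmoid gate per hidden state, conditioned on
# both the local state h_i and the global sentence vector s.
gate = sigmoid(H @ W.T + s @ U.T + b)   # (T, d), values in (0, 1)
H_tailored = H * gate                   # second-level representation
```

The decoder then attends over `H_tailored` instead of `H`, so information irrelevant to the summary can be attenuated before decoding begins.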
Citations
Proceedings Article
Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting
Yen-Chun Chen, Mohit Bansal
TL;DR: The authors proposed a sentence-level policy gradient method to bridge the non-differentiable computation between an extractive sentence selector and an abstractive rewriter in a hierarchical fashion, achieving state-of-the-art performance on the CNN/Daily Mail dataset.
Posted Content
Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting
Yen-Chun Chen, Mohit Bansal
TL;DR: An accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively to generate a concise overall summary is proposed, which achieves the new state-of-the-art on all metrics on the CNN/Daily Mail dataset, as well as significantly higher abstractiveness scores.
Proceedings Article
Classical Structured Prediction Losses for Sequence to Sequence Learning
TL;DR: The authors survey a range of classical objective functions that have been widely used to train linear models for structured prediction and apply them to neural sequence-to-sequence models, showing that these losses can perform surprisingly well, slightly outperforming beam search optimization in a like-for-like setup.
Proceedings Article
Global Encoding for Abstractive Summarization
TL;DR: This paper proposed a global encoding framework, which controls the information flow from the encoder to the decoder based on the global information of the source context to improve the representations of source-side information.
Proceedings Article
Adapting the Neural Encoder-Decoder Framework from Single to Multi-Document Summarization
TL;DR: The authors exploited the maximal marginal relevance method to select representative sentences from the multi-document input and leveraged an abstractive encoder-decoder model to fuse the disparate sentences into an abstractive summary.
References
Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
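The "adaptive estimates of lower-order moments" in this summary refer to exponential moving averages of the gradient and its square. Below is a minimal NumPy sketch of the Adam update with bias correction, using the paper's default hyperparameters; it is an illustration, not the authors' reference implementation:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update with bias-corrected moment estimates."""
    m = b1 * m + (1 - b1) * grad        # first moment: EMA of gradients
    v = b2 * v + (1 - b2) * grad ** 2   # second moment: EMA of squared gradients
    m_hat = m / (1 - b1 ** t)           # bias correction (t starts at 1)
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(x) = x^2 starting from x = 5.0
x, m, v = 5.0, 0.0, 0.0
for t in range(1, 5001):
    x, m, v = adam_step(x, 2.0 * x, m, v, t, lr=0.05)
```

The per-parameter scaling by `sqrt(v_hat)` is what makes the effective step size roughly invariant to the gradient's magnitude.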
Proceedings Article
Neural Machine Translation by Jointly Learning to Align and Translate
TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and the authors propose to extend it by allowing the model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
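The "(soft-)search" described above is additive attention: each encoder state is scored against the current decoder state, the scores are normalized with a softmax, and the weighted sum becomes the context vector. A minimal NumPy sketch, with `W_a`, `U_a`, and `v` as hypothetical names for the learned parameters:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(1)
d, T = 4, 5
H = rng.standard_normal((T, d))   # encoder annotations h_1..h_T
s = rng.standard_normal(d)        # current decoder state

# Additive ("Bahdanau") scoring: e_i = v^T tanh(W_a s + U_a h_i)
W_a = rng.standard_normal((d, d))
U_a = rng.standard_normal((d, d))
v = rng.standard_normal(d)

scores = np.tanh(s @ W_a.T + H @ U_a.T) @ v   # (T,) alignment scores
alpha = softmax(scores)                       # attention weights, sum to 1
context = alpha @ H                           # soft-searched context vector
```

Because `alpha` is differentiable, the alignment is learned jointly with translation, avoiding the fixed-length-vector bottleneck.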
Proceedings ArticleDOI
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio
TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Proceedings Article
Sequence to Sequence Learning with Neural Networks
TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.
Proceedings Article
Understanding the difficulty of training deep feedforward neural networks
Xavier Glorot, Yoshua Bengio
TL;DR: The objective is to better understand why standard gradient descent from random initialization performs so poorly with deep neural networks, to better explain recent relative successes, and to help design better algorithms in the future.
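One practical outcome of this analysis is the "Xavier" (Glorot) initialization, which scales weight variance so signal magnitudes stay roughly constant across layers. A small NumPy sketch of the uniform variant, assuming a weight matrix of shape `(fan_out, fan_in)`:

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng):
    """Glorot/Xavier uniform init: Var(W) = 2 / (fan_in + fan_out),
    keeping activation and gradient variance roughly constant across layers."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

rng = np.random.default_rng(0)
W = xavier_uniform(256, 128, rng)
# Empirical variance should be close to 2 / (256 + 128) ≈ 0.0052
```

A uniform distribution on `[-a, a]` has variance `a^2 / 3`, so `a = sqrt(6 / (fan_in + fan_out))` yields exactly the target variance `2 / (fan_in + fan_out)`.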