Open Access Proceedings Article (DOI)

Selective Encoding for Abstractive Sentence Summarization

TLDR
The experimental results show that the proposed selective encoding model outperforms the state-of-the-art baseline models.
Abstract
We propose a selective encoding model to extend the sequence-to-sequence framework for abstractive sentence summarization. It consists of a sentence encoder, a selective gate network, and an attention-equipped decoder. The sentence encoder and decoder are built with recurrent neural networks. The selective gate network constructs a second-level sentence representation by controlling the information flow from encoder to decoder. The second-level representation is tailored to the sentence summarization task, which leads to better performance. We evaluate our model on the English Gigaword, DUC 2004, and MSR abstractive sentence summarization datasets. The experimental results show that the proposed selective encoding model outperforms the state-of-the-art baseline models.
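
A minimal sketch of the selective gate idea described above, assuming a bidirectional recurrent encoder whose word-level outputs are gated element-wise by a whole-sentence vector; the class name, layer sizes, and exact parameterization are illustrative assumptions, not the authors' released code:

    import torch
    import torch.nn as nn

    class SelectiveGate(nn.Module):
        def __init__(self, hidden_size):
            super().__init__()
            # one gate vector per source word: gate_i = sigmoid(W h_i + U s + b)
            self.linear_h = nn.Linear(2 * hidden_size, 2 * hidden_size)
            self.linear_s = nn.Linear(2 * hidden_size, 2 * hidden_size)

        def forward(self, enc_outputs, sent_repr):
            # enc_outputs: (batch, src_len, 2*hidden)  first-level word representations
            # sent_repr:   (batch, 2*hidden)           whole-sentence vector from the encoder
            gate = torch.sigmoid(self.linear_h(enc_outputs)
                                 + self.linear_s(sent_repr).unsqueeze(1))
            # element-wise gating yields the second-level representation
            # that the attention-equipped decoder reads
            return enc_outputs * gate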

Citations
Proceedings Article (DOI)

Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting

TL;DR: The authors proposed a sentence-level policy gradient method to bridge the non-differentiable computation between the two neural networks (a sentence extractor and an abstractive rewriter) in a hierarchical way, achieving state-of-the-art performance on the CNN/Daily Mail dataset.
Posted Content

Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting

TL;DR: The authors propose an accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively to generate a concise overall summary, achieving a new state of the art on all metrics on the CNN/Daily Mail dataset as well as significantly higher abstractiveness scores.
Proceedings Article (DOI)

Classical Structured Prediction Losses for Sequence to Sequence Learning.

TL;DR: The authors survey a range of classical objective functions that have been widely used to train linear models for structured prediction and apply them to neural sequence-to-sequence models, showing that these losses can perform surprisingly well, slightly outperforming beam search optimization in a like-for-like setup.
Proceedings Article (DOI)

Global Encoding for Abstractive Summarization

TL;DR: This paper proposed a global encoding framework that controls the information flow from the encoder to the decoder based on the global information of the source context, thereby improving the representation of source-side information.
Proceedings Article (DOI)

Adapting the Neural Encoder-Decoder Framework from Single to Multi-Document Summarization

TL;DR: The authors exploited the maximal marginal relevance method to select representative sentences from the multi-document input, and leveraged an abstractive encoder-decoder model to fuse disparate sentences into an abstractive summary.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
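
For reference, a single Adam update step in NumPy; this is the standard formulation of the algorithm with conventional default hyperparameters, not settings specific to this paper:

    import numpy as np

    def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        m = beta1 * m + (1 - beta1) * grad          # first-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment (uncentered variance) estimate
        m_hat = m / (1 - beta1 ** t)                # bias correction for step t >= 1
        v_hat = v / (1 - beta2 ** t)
        param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
        return param, m, v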
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend it by allowing the model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
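
A compact sketch of additive (Bahdanau-style) attention, in which the decoder state scores every encoder state instead of relying on a single fixed-length vector; the dimensions and module names are illustrative assumptions:

    import torch
    import torch.nn as nn

    class AdditiveAttention(nn.Module):
        def __init__(self, enc_dim, dec_dim, attn_dim):
            super().__init__()
            self.W = nn.Linear(dec_dim, attn_dim, bias=False)   # projects the decoder state
            self.U = nn.Linear(enc_dim, attn_dim, bias=False)   # projects the encoder states
            self.v = nn.Linear(attn_dim, 1, bias=False)         # scores each source position

        def forward(self, dec_state, enc_outputs):
            # dec_state:   (batch, dec_dim)
            # enc_outputs: (batch, src_len, enc_dim)
            scores = self.v(torch.tanh(self.W(dec_state).unsqueeze(1) + self.U(enc_outputs)))
            weights = torch.softmax(scores, dim=1)           # soft (search) weights over source positions
            context = (weights * enc_outputs).sum(dim=1)     # context vector for the current target word
            return context, weights.squeeze(-1)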
Proceedings Article (DOI)

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Proceedings Article

Sequence to Sequence Learning with Neural Networks

TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.
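
A compact sketch of that encoder-decoder setup: one LSTM compresses the source into its final state, and a second LSTM generates the target conditioned on it. The embedding and hidden sizes are illustrative, not the paper's configuration:

    import torch.nn as nn

    class Seq2Seq(nn.Module):
        def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hidden=512, layers=2):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, emb_dim)
            self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
            self.encoder = nn.LSTM(emb_dim, hidden, num_layers=layers, batch_first=True)
            self.decoder = nn.LSTM(emb_dim, hidden, num_layers=layers, batch_first=True)
            self.out = nn.Linear(hidden, tgt_vocab)

        def forward(self, src, tgt_in):
            _, state = self.encoder(self.src_emb(src))        # source compressed to a fixed-size (h, c)
            dec_out, _ = self.decoder(self.tgt_emb(tgt_in), state)
            return self.out(dec_out)                          # logits for each target position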
Proceedings Article

Understanding the difficulty of training deep feedforward neural networks

TL;DR: The objective is to understand why standard gradient descent from random initialization performs so poorly with deep neural networks, in order to better explain recent relative successes and help design better algorithms in the future.
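
The practical recipe that came out of this analysis is the Glorot ("Xavier") initialization, which keeps activation and gradient variance roughly constant across layers. A minimal NumPy version of the uniform variant, for illustration only:

    import numpy as np

    def xavier_uniform(fan_in, fan_out):
        # sample W ~ U[-limit, limit] with limit = sqrt(6 / (fan_in + fan_out))
        limit = np.sqrt(6.0 / (fan_in + fan_out))
        return np.random.uniform(-limit, limit, size=(fan_in, fan_out))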