Open Access · Proceedings Article

A Neural Attention Model for Abstractive Sentence Summarization

TL;DR: The authors propose a local attention-based model that generates each word of the summary conditioned on the input sentence, which shows significant performance gains on the DUC-2004 shared task compared with several strong baselines.
Abstract
Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build. In this work, we propose a fully data-driven approach to abstractive sentence summarization. Our method utilizes a local attention-based model that generates each word of the summary conditioned on the input sentence. While the model is structurally simple, it can easily be trained end-to-end and scales to a large amount of training data. The model shows significant performance gains on the DUC-2004 shared task compared with several strong baselines.
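The abstract only sketches the model at a high level. Below is a minimal, illustrative PyTorch sketch of the general idea, generating each summary word from an attention-weighted view of the input sentence; the layer choices (GRU encoder and decoder, sizes) are assumptions for illustration and not the authors' exact feed-forward architecture.

```python
# Illustrative sketch: generate each summary word conditioned on an
# attention-weighted view of the input sentence (not the paper's exact model).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionSummarizer(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.decoder = nn.GRUCell(emb_dim + 2 * hid_dim, hid_dim)
        self.attn = nn.Linear(hid_dim + 2 * hid_dim, 1)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src_ids, tgt_ids):
        enc_out, _ = self.encoder(self.embed(src_ids))            # (B, S, 2H)
        state = enc_out.new_zeros(src_ids.size(0), self.decoder.hidden_size)
        logits = []
        for t in range(tgt_ids.size(1)):
            # Attention: score every source position against the decoder state.
            query = state.unsqueeze(1).expand(-1, enc_out.size(1), -1)
            scores = self.attn(torch.cat([query, enc_out], dim=-1)).squeeze(-1)
            alpha = F.softmax(scores, dim=-1)                      # (B, S)
            context = torch.bmm(alpha.unsqueeze(1), enc_out).squeeze(1)
            # The next summary word is conditioned on the attended input.
            inp = torch.cat([self.embed(tgt_ids[:, t]), context], dim=-1)
            state = self.decoder(inp, state)
            logits.append(self.out(state))
        return torch.stack(logits, dim=1)                          # (B, T, V)
```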



Citations
Proceedings Article

Get To The Point: Summarization with Pointer-Generator Networks

TL;DR: A novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways: a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information while retaining the ability to produce novel words through the generator, and a coverage mechanism that tracks what has been summarized to discourage repetition.
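A toy sketch of the copy/generate mixture that the pointer-generator idea describes; the function name and shapes are illustrative, and this simplification ignores the extended-vocabulary handling of out-of-vocabulary words used in the actual paper.

```python
# Toy pointer-generator mixture: blend a vocabulary (generation) distribution
# with a copy distribution induced by attention over the source tokens.
import numpy as np

def final_distribution(p_gen, vocab_dist, attention, src_token_ids, vocab_size):
    """p_gen: scalar in [0, 1]; vocab_dist: (V,); attention: (S,) summing to 1;
    src_token_ids: (S,) vocabulary ids of the source tokens."""
    copy_dist = np.zeros(vocab_size)
    np.add.at(copy_dist, src_token_ids, attention)   # scatter attention onto word ids
    return p_gen * vocab_dist + (1.0 - p_gen) * copy_dist

# Example: a source word (id 7) receives probability mass via copying even if
# the generator assigns it almost none.
vocab_dist = np.full(10, 0.1)
attention = np.array([0.7, 0.2, 0.1])
print(final_distribution(0.4, vocab_dist, attention, np.array([7, 2, 5]), 10))
```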
Journal Article

Recent Trends in Deep Learning Based Natural Language Processing [Review Article]

TL;DR: This paper reviews significant deep learning related models and methods that have been employed for numerous NLP tasks and provides a walk-through of their evolution.
Proceedings Article

Attention-based LSTM for Aspect-level Sentiment Classification

TL;DR: This paper argues that the sentiment polarity of a sentence is determined not only by its content but also by the aspect under consideration, and proposes an Attention-based Long Short-Term Memory Network for aspect-level sentiment classification.
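A hedged sketch of how attention over LSTM states can be conditioned on an aspect embedding, in the spirit of the model summarized above; the layer sizes, scoring layer, and classification head are assumptions, not the paper's exact architecture.

```python
# Aspect-conditioned attention over LSTM states (illustrative sizes and layers).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AspectAttentionLSTM(nn.Module):
    def __init__(self, vocab_size, n_aspects, n_classes=3, emb=100, hid=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb)
        self.aspect_emb = nn.Embedding(n_aspects, emb)
        self.lstm = nn.LSTM(emb, hid, batch_first=True)
        self.score = nn.Linear(hid + emb, 1)
        self.cls = nn.Linear(hid, n_classes)

    def forward(self, tokens, aspect):
        h, _ = self.lstm(self.word_emb(tokens))                    # (B, T, H)
        a = self.aspect_emb(aspect).unsqueeze(1).expand(-1, h.size(1), -1)
        alpha = F.softmax(self.score(torch.cat([h, a], -1)).squeeze(-1), -1)
        sent = torch.bmm(alpha.unsqueeze(1), h).squeeze(1)         # attended summary
        return self.cls(sent)                                      # sentiment logits
```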
Proceedings Article

SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing

TL;DR: SentencePiece is a language-independent subword tokenizer and detokenizer designed for neural text processing; the authors find that subword models trained directly from raw sentences achieve accuracy comparable to pipelines that rely on language-specific pre-tokenization.
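A short usage sketch of the SentencePiece Python package; the corpus file, model prefix, and vocabulary size below are placeholders.

```python
# Train a subword model directly from raw text and round-trip a sentence.
import sentencepiece as spm

# Training needs no pre-tokenization; file name and settings are placeholders.
spm.SentencePieceTrainer.train(
    input="corpus.txt", model_prefix="m", vocab_size=8000, model_type="unigram"
)

sp = spm.SentencePieceProcessor()
sp.load("m.model")

pieces = sp.encode_as_pieces("This is a language independent subword tokenizer.")
print(pieces)                      # subword pieces, e.g. ['▁This', '▁is', ...]
print(sp.decode_pieces(pieces))    # lossless detokenization back to the raw text
```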
References
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
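A small worked numpy example of the (soft-)search idea: each source annotation is scored against the previous decoder state with an additive scoring function, and the context vector is the attention-weighted sum. The dimensions and matrices below are arbitrary illustrations.

```python
# Additive (soft-search) attention: e_tj = v^T tanh(W s_{t-1} + U h_j).
import numpy as np

rng = np.random.default_rng(0)
S, H, A = 6, 8, 5                        # source length, hidden size, attention size
h = rng.normal(size=(S, H))              # encoder annotations h_1..h_S
s_prev = rng.normal(size=H)              # previous decoder state s_{t-1}
W, U, v = rng.normal(size=(A, H)), rng.normal(size=(A, H)), rng.normal(size=A)

e = np.tanh(W @ s_prev + (U @ h.T).T) @ v     # one score per source position
alpha = np.exp(e) / np.exp(e).sum()           # soft weights over source positions
context = alpha @ h                           # c_t = sum_j alpha_tj h_j
print(alpha.round(3), context.shape)
```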
Proceedings Article

Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Posted Content

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: In this paper, the authors propose to use a soft-searching model to find the parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Proceedings Article

Sequence to Sequence Learning with Neural Networks

TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.
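A hedged PyTorch sketch of that encoder-decoder setup: one LSTM compresses the source into a fixed-size state from which a second LSTM decodes the target; embedding sizes, hidden sizes, and layer counts are illustrative assumptions.

```python
# Sequence-to-sequence learning: encode to a fixed-size state, then decode.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb=256, hid=512, layers=2):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.LSTM(emb, hid, layers, batch_first=True)
        self.decoder = nn.LSTM(emb, hid, layers, batch_first=True)
        self.out = nn.Linear(hid, tgt_vocab)

    def forward(self, src, tgt):
        _, state = self.encoder(self.src_emb(src))    # fixed-size (h, c) summary
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)                      # next-word logits per step
```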
Posted Content

Sequence to Sequence Learning with Neural Networks

TL;DR: This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence, which made the optimization problem easier.
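A toy illustration of the source-reversal trick: reversing the source sequence before encoding places the earliest source words closest to the first decoding steps, creating the short-term dependencies mentioned above. The example sentence pair is made up.

```python
# Source reversal: feed the encoder the source words in reverse order so that
# the beginning of the source sits right next to the beginning of the target.
src = ["the", "cat", "sat", "on", "the", "mat"]
tgt = ["le", "chat", "s'est", "assis", "sur", "le", "tapis"]

reversed_src = list(reversed(src))   # used in place of src when encoding
print(reversed_src)
# "the"/"cat" now appear last in the encoder input, minimizing the distance to
# "le"/"chat" at the start of decoding, which eases optimization early on.
```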