Paragraph-level Neural Question Generation with Maxout Pointer and Gated Self-attention Networks
Yao Zhao, Xiaochuan Ni, Yuanyuan Ding, Qifa Ke
pp. 3901–3910
TLDR
A maxout pointer mechanism with a gated self-attention encoder to address the challenges of processing long text inputs for question generation, which outperforms previous approaches with either sentence-level or paragraph-level inputs.
Abstract
Question generation, the task of automatically creating questions that can be answered by a certain span of text within a given passage, is important for question-answering and conversational systems in digital assistants such as Alexa, Cortana, Google Assistant and Siri. Recent sequence-to-sequence neural models have outperformed previous rule-based systems. Existing models mainly focused on using one or two sentences as the input. Long text has posed challenges for sequence-to-sequence neural models in question generation: worse performance was reported when using the whole paragraph (with multiple sentences) as the input. In reality, however, generating high-quality questions often requires the whole paragraph as context. In this paper, we propose a maxout pointer mechanism with a gated self-attention encoder to address the challenges of processing long text inputs for question generation. With sentence-level inputs, our model outperforms previous approaches with either sentence-level or paragraph-level inputs. Furthermore, our model can effectively utilize paragraphs as inputs, pushing the state-of-the-art result from 13.9 to 16.3 (BLEU-4).
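The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. This is an assumption-laden paraphrase, not the authors' implementation: the gated self-attention block matches each passage encoding against all others and gates the result back into the original encoding, and the maxout pointer collapses the copy scores of repeated source tokens with a max instead of a sum. All weight names and shapes here are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_self_attention(U, Ws, Wf, Wg):
    """U: (n, d) encoder outputs for n passage tokens.
    Returns self-attention-refined encodings of the same shape."""
    scores = U @ Ws @ U.T                      # (n, n) alignment of every token pair
    A = softmax(scores, axis=-1)
    S = A @ U                                  # (n, d) self-matched context per token
    concat = np.concatenate([U, S], axis=-1)   # (n, 2d)
    F = np.tanh(concat @ Wf)                   # candidate refined encoding
    G = 1.0 / (1.0 + np.exp(-(concat @ Wg)))   # gate in (0, 1)
    return G * F + (1 - G) * U                 # gated mix of refined and original

def maxout_pointer(copy_scores, src_ids, vocab_size):
    """Collapse copy scores of repeated source tokens by max instead of sum,
    which avoids inflating the copy probability of frequent words."""
    out = np.full(vocab_size, -np.inf)
    for score, tok in zip(copy_scores, src_ids):
        out[tok] = max(out[tok], score)
    return out
```

For example, with source token ids `[5, 5, 7]` and scores `[1.0, 3.0, 2.0]`, a sum-based pointer would give token 5 a combined score of 4.0, while the maxout pointer keeps only 3.0.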
Citations
Posted Content
Unified Language Model Pre-training for Natural Language Understanding and Generation
Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon
TL;DR: A new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks that compares favorably with BERT on the GLUE benchmark, and the SQuAD 2.0 and CoQA question answering tasks.
Proceedings Article
Pseudo-Masked Language Models for Unified Language Model Pre-Training
Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Jianfeng Gao, Songhao Piao, Ming Zhou, Hsiao-Wuen Hon
TL;DR: The experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.
Posted Content
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
TL;DR: This work presents a simple and effective approach to compress large Transformer (Vaswani et al., 2017) based pre-trained models, termed as deep self-attention distillation, and demonstrates that the monolingual model outperforms state-of-the-art baselines in different parameter size of student models.
Journal ArticleDOI
Cross-Lingual Natural Language Generation via Pre-Training
TL;DR: Experimental results on question generation and abstractive summarization show that the model outperforms the machine-translation-based pipeline methods for zero-shot cross-lingual generation and improves NLG performance of low-resource languages by leveraging rich-resource language data.
Proceedings ArticleDOI
A Recurrent BERT-based Model for Question Generation
Ying-Hong Chan, Yao-Chung Fan
TL;DR: This study investigates the employment of the pre-trained BERT language model to tackle question generation tasks and proposes another two models by restructuring the authors' BERT employment into a sequential manner for taking information from previous decoded results.
References
Journal ArticleDOI
Long short-term memory
TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
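The "constant error carousel" in the TL;DR refers to the additive cell-state update, where gates decide what the cell keeps, adds, and exposes. A minimal single-step NumPy sketch (weight layout and names are illustrative assumptions, with the four gates stacked in one matrix):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b, hidden):
    """One LSTM step. W: (4*hidden, input_dim), U: (4*hidden, hidden),
    b: (4*hidden,). Gates are stacked as [input, forget, output, candidate]."""
    z = W @ x + U @ h + b
    i = sigmoid(z[:hidden])               # input gate: how much new content enters
    f = sigmoid(z[hidden:2*hidden])       # forget gate: how much old state survives
    o = sigmoid(z[2*hidden:3*hidden])     # output gate: how much state is exposed
    g = np.tanh(z[3*hidden:])             # candidate cell update
    c_new = f * c + i * g                 # additive update: the "error carousel"
    h_new = o * np.tanh(c_new)
    return h_new, c_new
```

Because the cell update is additive rather than repeatedly squashed, gradients can flow across many steps without vanishing, which is what lets the network bridge long time lags.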
Proceedings ArticleDOI
Glove: Global Vectors for Word Representation
TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
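The log-bilinear objective behind GloVe is a weighted least-squares fit of word-vector dot products to log co-occurrence counts. A short sketch of the loss (a reading of the published objective, not the reference implementation; variable names are illustrative):

```python
import numpy as np

def glove_loss(W, W_tilde, b, b_tilde, X, x_max=100.0, alpha=0.75):
    """Sum over nonzero co-occurrences (i, j) of
    f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2,
    where the weighting f caps the influence of very frequent pairs."""
    loss = 0.0
    rows, cols = np.nonzero(X)
    for i, j in zip(rows, cols):
        f = min((X[i, j] / x_max) ** alpha, 1.0)
        diff = W[i] @ W_tilde[j] + b[i] + b_tilde[j] - np.log(X[i, j])
        loss += f * diff ** 2
    return loss
```

Minimizing this ties the global co-occurrence statistics (the matrix-factorization side) to locally collected context windows (where `X` comes from), which is the combination the TL;DR describes.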
Proceedings ArticleDOI
Bleu: a Method for Automatic Evaluation of Machine Translation
TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
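BLEU scores a candidate by clipped n-gram precision against a reference, combined as a geometric mean with a brevity penalty. A simplified sentence-level sketch (real BLEU is corpus-level with multiple references; the tiny floor used for zero precisions is a smoothing assumption of this sketch):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Clipped n-gram precision, geometric mean over n = 1..max_n,
    times a brevity penalty for short candidates."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        if total == 0:
            return 0.0  # candidate too short to have any n-grams
        clipped = sum(min(c, ref[g]) for g, c in cand.items())  # clip at ref counts
        precisions.append(clipped / total if clipped else 1e-9)
    bp = 1.0 if len(candidate) > len(reference) else \
        math.exp(1 - len(reference) / max(len(candidate), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

The clipping step is what prevents gaming the metric by repeating a correct word, and the brevity penalty is what prevents gaming it with very short outputs.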
Proceedings Article
Neural Machine Translation by Jointly Learning to Align and Translate
TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder–decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
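The soft search described above is additive attention: each source annotation is scored against the previous decoder state, and the context vector is the resulting expectation over annotations. A minimal NumPy sketch (weight names are illustrative assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(s_prev, H, Wa, Ua, va):
    """s_prev: (d_s,) previous decoder state; H: (n, d_h) source annotations.
    Scores each annotation with a small feed-forward net, then returns the
    attention-weighted context vector and the weights themselves."""
    scores = np.array([va @ np.tanh(Wa @ s_prev + Ua @ h) for h in H])
    alpha = softmax(scores)   # soft alignment over source positions, sums to 1
    context = alpha @ H       # expected annotation under the alignment
    return context, alpha
```

Because `alpha` is a distribution rather than a hard choice, the whole search is differentiable and can be trained jointly with the translation model, which is the paper's key departure from a fixed-length sentence vector.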