Open Access Proceedings Article (DOI)

Paragraph-level Neural Question Generation with Maxout Pointer and Gated Self-attention Networks

TLDR
A maxout pointer mechanism with a gated self-attention encoder is proposed to address the challenges of processing long text inputs for question generation; it outperforms previous approaches with either sentence-level or paragraph-level inputs.
Abstract
Question generation, the task of automatically creating questions that can be answered by a certain span of text within a given passage, is important for question-answering and conversational systems in digital assistants such as Alexa, Cortana, Google Assistant and Siri. Recent sequence-to-sequence neural models have outperformed previous rule-based systems. Existing models mainly focused on using one or two sentences as the input. Long text has posed challenges for sequence-to-sequence neural models in question generation: worse performance was reported when using the whole paragraph (with multiple sentences) as the input. In reality, however, the whole paragraph is often required as context in order to generate high-quality questions. In this paper, we propose a maxout pointer mechanism with a gated self-attention encoder to address the challenges of processing long text inputs for question generation. With sentence-level inputs, our model outperforms previous approaches with either sentence-level or paragraph-level inputs. Furthermore, our model can effectively utilize paragraphs as inputs, pushing the state-of-the-art result from 13.9 to 16.3 (BLEU_4).
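The two components named in the abstract can be made concrete. Below is a minimal numpy sketch based only on the high-level description above: a gated self-attention step that fuses each encoder state with a passage-wide attention summary through a learned gate, and a maxout-pointer copy score that keeps only the maximum attention score over repeated occurrences of a source word instead of summing them. All array names, shapes, and parameter matrices are illustrative assumptions rather than the authors' exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_self_attention(U, Ws, Wf, Wg):
    """U: (T, d) encoder states of one passage; Ws: (d, d); Wf, Wg: (d, 2d)."""
    A = softmax(U @ Ws @ U.T, axis=-1)      # each state attends over the whole passage
    S = A @ U                               # (T, d) self-matched summaries
    US = np.concatenate([U, S], axis=-1)    # (T, 2d)
    F = np.tanh(US @ Wf.T)                  # fused representation
    G = 1.0 / (1.0 + np.exp(-(US @ Wg.T)))  # gate in (0, 1)
    return G * F + (1.0 - G) * U            # gated blend of fused and original states

def maxout_pointer_scores(attn_scores, src_ids, vocab_size):
    """Copy score per vocabulary id: keep the MAX attention score over all
    positions where a word repeats, rather than summing them."""
    copy = np.full(vocab_size, -np.inf)
    for pos, wid in enumerate(src_ids):
        copy[wid] = max(copy[wid], attn_scores[pos])
    return copy
```

In the full model these copy scores would be merged with the decoder's generative vocabulary scores before normalization; the sketch only shows the pointer side.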


Citations
Posted Content

Unified Language Model Pre-training for Natural Language Understanding and Generation

TL;DR: A new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks is presented; it compares favorably with BERT on the GLUE benchmark and on the SQuAD 2.0 and CoQA question answering tasks.
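For context, UniLM's central mechanism (per that paper, not spelled out in the blurb above) is to realize the different language-modeling objectives with different self-attention masks over one shared Transformer. The following is a hypothetical sketch of such masks; the [source | target] layout and the boolean convention are assumptions.

```python
import numpy as np

def unilm_style_masks(src_len, tgt_len):
    """Boolean self-attention masks (True = the query may attend to that key).
    Sequence layout assumed to be [source tokens | target tokens]."""
    n = src_len + tgt_len
    bidirectional = np.ones((n, n), dtype=bool)            # every token sees every token
    left_to_right = np.tril(np.ones((n, n), dtype=bool))   # causal LM: only the past
    seq2seq = np.zeros((n, n), dtype=bool)
    seq2seq[:, :src_len] = True                            # all tokens see the full source
    seq2seq[src_len:, src_len:] = np.tril(                 # target side is causal
        np.ones((tgt_len, tgt_len), dtype=bool))
    return bidirectional, left_to_right, seq2seq
```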
Proceedings Article

Pseudo-Masked Language Models for Unified Language Model Pre-Training

TL;DR: The experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.
Posted Content

MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers

TL;DR: This work presents a simple and effective approach to compress large Transformer (Vaswani et al., 2017) based pre-trained models, termed deep self-attention distillation, and demonstrates that the monolingual model outperforms state-of-the-art baselines across different student model sizes.
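The "deep self-attention distillation" named here amounts, roughly, to matching the student's last-layer self-attention distributions to the teacher's. A rough numpy sketch under that reading; MiniLM's additional value-relation term is omitted, and all shapes are assumptions.

```python
import numpy as np

def attention_transfer_loss(teacher_attn, student_attn, eps=1e-9):
    """teacher_attn, student_attn: (heads, T, T) attention distributions
    (each row sums to 1) from the last self-attention layer of each model.
    Returns the mean KL(teacher || student) over heads and query positions."""
    t = np.clip(teacher_attn, eps, 1.0)
    s = np.clip(student_attn, eps, 1.0)
    kl = (t * (np.log(t) - np.log(s))).sum(axis=-1)
    return kl.mean()
```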
Journal Article (DOI)

Cross-Lingual Natural Language Generation via Pre-Training

TL;DR: Experimental results on question generation and abstractive summarization show that the model outperforms the machine-translation-based pipeline methods for zero-shot cross-lingual generation and improves NLG performance of low-resource languages by leveraging rich-resource language data.
Proceedings Article (DOI)

A Recurrent BERT-based Model for Question Generation

TL;DR: This study investigates employing the pre-trained BERT language model for question generation and proposes two further models that restructure BERT into a sequential decoding scheme so that information from previously decoded results can be used.
References
Journal Article (DOI)

Long short-term memory

TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
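As a concrete picture of the "special units" mentioned above, here is a minimal forward step of an LSTM cell in numpy. It includes the forget gate that later became standard (the original 1997 formulation had none), and the parameter shapes are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """x: (d_in,) input; h_prev, c_prev: (d_h,) previous hidden/cell state;
    W: (4*d_h, d_in + d_h) stacked gate weights; b: (4*d_h,) bias."""
    z = W @ np.concatenate([x, h_prev]) + b
    i, f, g, o = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)   # input, forget, output gates
    c = f * c_prev + i * np.tanh(g)                # additive cell update: the constant
                                                   # error carousel that preserves gradient flow
    h = o * np.tanh(c)
    return h, c
```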
Proceedings Article (DOI)

Glove: Global Vectors for Word Representation

TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature (global matrix factorization and local context window methods) and produces a vector space with meaningful substructure.
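For reference, the "global log-bilinear regression" summarized above is a weighted least-squares fit of word-vector dot products to log co-occurrence counts. A small numpy sketch of that loss; the x_max and alpha weighting values follow the GloVe paper, while batching and gradient updates are omitted.

```python
import numpy as np

def glove_loss(W, W_tilde, b, b_tilde, X, x_max=100.0, alpha=0.75):
    """W, W_tilde: (V, d) word and context vectors; b, b_tilde: (V,) biases;
    X: (V, V) word-word co-occurrence counts."""
    i, j = np.nonzero(X)                                    # only observed co-occurrences
    x = X[i, j]
    weight = np.where(x < x_max, (x / x_max) ** alpha, 1.0) # down-weight rare pairs, cap frequent ones
    pred = (W[i] * W_tilde[j]).sum(axis=1) + b[i] + b_tilde[j]
    return np.sum(weight * (pred - np.log(x)) ** 2)
```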
Proceedings Article (DOI)

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposes a method for automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
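Since BLEU_4 is the metric the abstract reports, a compact single-reference sketch of the computation may help: clipped ("modified") n-gram precisions combined as a geometric mean, multiplied by a brevity penalty. Real evaluations are corpus-level with multiple references; the smoothing constant here is an illustrative assumption.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """candidate, reference: token lists. Returns a BLEU-style score in [0, 1]."""
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        clipped = sum(min(c, ref[g]) for g, c in cand.items())   # clipped n-gram matches
        total = max(1, sum(cand.values()))
        log_prec += math.log(max(clipped, 1e-9) / total) / max_n # geometric mean of precisions
    bp = 1.0 if len(candidate) > len(reference) else \
        math.exp(1.0 - len(reference) / max(1, len(candidate)))  # brevity penalty
    return bp * math.exp(log_prec)
```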
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and it is proposed to extend it by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
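Concretely, the (soft-)search described in this blurb is additive attention: an alignment score between the previous decoder state and every encoder annotation, normalized with a softmax and used for a weighted sum. A numpy sketch with assumed parameter shapes:

```python
import numpy as np

def additive_attention(s_prev, H, Wa, Ua, va):
    """s_prev: (d_dec,) previous decoder state; H: (T, d_enc) encoder annotations;
    Wa: (d_a, d_dec); Ua: (d_a, d_enc); va: (d_a,)."""
    e = np.tanh(Wa @ s_prev + H @ Ua.T) @ va   # (T,) alignment scores
    a = np.exp(e - e.max())
    a /= a.sum()                               # softmax over source positions
    context = a @ H                            # (d_enc,) expected annotation (context vector)
    return context, a
```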