Paragraph-level Neural Question Generation with Maxout Pointer and Gated Self-attention Networks
Yao Zhao, Xiaochuan Ni, Yuanyuan Ding, Qifa Ke
pp. 3901–3910
TLDR
A maxout pointer mechanism with a gated self-attention encoder to address the challenges of processing long text inputs for question generation, which outperforms previous approaches with either sentence-level or paragraph-level inputs.
Abstract
Question generation, the task of automatically creating questions that can be answered by a certain span of text within a given passage, is important for question-answering and conversational systems in digital assistants such as Alexa, Cortana, Google Assistant and Siri. Recent sequence-to-sequence neural models have outperformed previous rule-based systems. Existing models mainly focused on using one or two sentences as the input. Long text has posed challenges for sequence-to-sequence neural models in question generation: worse performance was reported when using the whole paragraph (with multiple sentences) as the input. In reality, however, generating high-quality questions often requires the whole paragraph as context. In this paper, we propose a maxout pointer mechanism with a gated self-attention encoder to address the challenges of processing long text inputs for question generation. With sentence-level inputs, our model outperforms previous approaches with either sentence-level or paragraph-level inputs. Furthermore, our model can effectively utilize paragraphs as inputs, pushing the state-of-the-art result from 13.9 to 16.3 (BLEU-4).
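The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. This is an assumption-laden paraphrase, not the authors' implementation: the gated self-attention block matches each passage encoding against all others and gates the result back into the original encoding, and the maxout pointer collapses the copy scores of repeated source tokens with a max instead of a sum. All weight names and shapes here are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_self_attention(U, Ws, Wf, Wg):
    """U: (n, d) encoder outputs for n passage tokens.
    Returns self-attention-refined encodings of the same shape."""
    scores = U @ Ws @ U.T                      # (n, n) alignment of every token pair
    A = softmax(scores, axis=-1)
    S = A @ U                                  # (n, d) self-matched context per token
    concat = np.concatenate([U, S], axis=-1)   # (n, 2d)
    F = np.tanh(concat @ Wf)                   # candidate refined encoding
    G = 1.0 / (1.0 + np.exp(-(concat @ Wg)))   # gate in (0, 1)
    return G * F + (1 - G) * U                 # gated mix of refined and original

def maxout_pointer(copy_scores, src_ids, vocab_size):
    """Collapse copy scores of repeated source tokens by max instead of sum,
    which avoids inflating the copy probability of frequent words."""
    out = np.full(vocab_size, -np.inf)
    for score, tok in zip(copy_scores, src_ids):
        out[tok] = max(out[tok], score)
    return out
```

For example, with source token ids `[5, 5, 7]` and scores `[1.0, 3.0, 2.0]`, a sum-based pointer would give token 5 a combined score of 4.0, while the maxout pointer keeps only 3.0.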
Citations
Posted Content
Unified Language Model Pre-training for Natural Language Understanding and Generation
Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon
TL;DR: A new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks that compares favorably with BERT on the GLUE benchmark, and the SQuAD 2.0 and CoQA question answering tasks.
Proceedings Article
Pseudo-Masked Language Models for Unified Language Model Pre-Training
Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Jianfeng Gao, Songhao Piao, Ming Zhou, Hsiao-Wuen Hon
TL;DR: The experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.
Posted Content
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
TL;DR: This work presents a simple and effective approach to compress large Transformer (Vaswani et al., 2017) based pre-trained models, termed as deep self-attention distillation, and demonstrates that the monolingual model outperforms state-of-the-art baselines in different parameter size of student models.
Journal ArticleDOI
Cross-Lingual Natural Language Generation via Pre-Training
TL;DR: Experimental results on question generation and abstractive summarization show that the model outperforms the machine-translation-based pipeline methods for zero-shot cross-lingual generation and improves NLG performance of low-resource languages by leveraging rich-resource language data.
Proceedings ArticleDOI
A Recurrent BERT-based Model for Question Generation
Ying-Hong Chan, Yao-Chung Fan
TL;DR: This study investigates the employment of the pre-trained BERT language model to tackle question generation tasks and proposes another two models by restructuring the authors' BERT employment into a sequential manner for taking information from previous decoded results.
References
Journal ArticleDOI
Long short-term memory
TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
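The "constant error carousel" in the TL;DR refers to the additive cell-state update, where gates decide what the cell keeps, adds, and exposes. A minimal single-step NumPy sketch (weight layout and names are illustrative assumptions, with the four gates stacked in one matrix):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b, hidden):
    """One LSTM step. W: (4*hidden, input_dim), U: (4*hidden, hidden),
    b: (4*hidden,). Gates are stacked as [input, forget, output, candidate]."""
    z = W @ x + U @ h + b
    i = sigmoid(z[:hidden])               # input gate: how much new content enters
    f = sigmoid(z[hidden:2*hidden])       # forget gate: how much old state survives
    o = sigmoid(z[2*hidden:3*hidden])     # output gate: how much state is exposed
    g = np.tanh(z[3*hidden:])             # candidate cell update
    c_new = f * c + i * g                 # additive update: the "error carousel"
    h_new = o * np.tanh(c_new)
    return h_new, c_new
```

Because the cell update is additive rather than repeatedly squashed, gradients can flow across many steps without vanishing, which is what lets the network bridge long time lags.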
Proceedings ArticleDOI
Glove: Global Vectors for Word Representation
TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
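The log-bilinear objective behind GloVe is a weighted least-squares fit of word-vector dot products to log co-occurrence counts. A short sketch of the loss (a reading of the published objective, not the reference implementation; variable names are illustrative):

```python
import numpy as np

def glove_loss(W, W_tilde, b, b_tilde, X, x_max=100.0, alpha=0.75):
    """Sum over nonzero co-occurrences (i, j) of
    f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2,
    where the weighting f caps the influence of very frequent pairs."""
    loss = 0.0
    rows, cols = np.nonzero(X)
    for i, j in zip(rows, cols):
        f = min((X[i, j] / x_max) ** alpha, 1.0)
        diff = W[i] @ W_tilde[j] + b[i] + b_tilde[j] - np.log(X[i, j])
        loss += f * diff ** 2
    return loss
```

Minimizing this ties the global co-occurrence statistics (the matrix-factorization side) to locally collected context windows (where `X` comes from), which is the combination the TL;DR describes.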
Proceedings ArticleDOI
Bleu: a Method for Automatic Evaluation of Machine Translation
TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
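BLEU scores a candidate by clipped n-gram precision against a reference, combined as a geometric mean with a brevity penalty. A simplified sentence-level sketch (real BLEU is corpus-level with multiple references; the tiny floor used for zero precisions is a smoothing assumption of this sketch):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Clipped n-gram precision, geometric mean over n = 1..max_n,
    times a brevity penalty for short candidates."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        if total == 0:
            return 0.0  # candidate too short to have any n-grams
        clipped = sum(min(c, ref[g]) for g, c in cand.items())  # clip at ref counts
        precisions.append(clipped / total if clipped else 1e-9)
    bp = 1.0 if len(candidate) > len(reference) else \
        math.exp(1 - len(reference) / max(len(candidate), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

The clipping step is what prevents gaming the metric by repeating a correct word, and the brevity penalty is what prevents gaming it with very short outputs.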
Proceedings Article
Neural Machine Translation by Jointly Learning to Align and Translate
TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder–decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
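The soft search described above is additive attention: each source annotation is scored against the previous decoder state, and the context vector is the resulting expectation over annotations. A minimal NumPy sketch (weight names are illustrative assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(s_prev, H, Wa, Ua, va):
    """s_prev: (d_s,) previous decoder state; H: (n, d_h) source annotations.
    Scores each annotation with a small feed-forward net, then returns the
    attention-weighted context vector and the weights themselves."""
    scores = np.array([va @ np.tanh(Wa @ s_prev + Ua @ h) for h in H])
    alpha = softmax(scores)   # soft alignment over source positions, sums to 1
    context = alpha @ H       # expected annotation under the alignment
    return context, alpha
```

Because `alpha` is a distribution rather than a hard choice, the whole search is differentiable and can be trained jointly with the translation model, which is the paper's key departure from a fixed-length sentence vector.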