Journal ArticleDOI (Open Access)

Break It Down: A Question Understanding Benchmark

TL;DR: The authors introduce a Question Decomposition Meaning Representation (QDMR) for questions, which constitutes the ordered list of steps, expressed through natural language, that are necessary for answering a question.
Abstract
Understanding natural language questions entails the ability to break down a question into the requisite steps for computing its answer. In this work, we introduce a Question Decomposition Meaning Representation (QDMR) for questions. QDMR constitutes the ordered list of steps, expressed through natural language, that are necessary for answering a question. We develop a crowdsourcing pipeline, showing that quality QDMRs can be annotated at scale, and release the Break dataset, containing over 83K pairs of questions and their QDMRs. We demonstrate the utility of QDMR by showing that (a) it can be used to improve open-domain question answering on the HotpotQA dataset, and (b) it can be deterministically converted to a pseudo-SQL formal language, which can alleviate the annotation burden in semantic parsing applications. Finally, we use Break to train a sequence-to-sequence model with copying that parses questions into QDMR structures, and show that it substantially outperforms several natural baselines.
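To make the representation concrete, here is a minimal Python sketch of how a QDMR might be stored and read: an ordered list of natural-language steps in which "#k" refers back to the result of step k. The example question, its steps, and the resolve_references helper are illustrative assumptions, not drawn from the Break dataset or the authors' code.

```python
# Minimal sketch of a QDMR: an ordered list of natural-language steps,
# where "#k" refers to the result of step k. Illustrative only; not an
# example taken from the Break dataset.

question = "How many field goals longer than 40 yards were kicked?"

qdmr_steps = [
    "return field goals",              # step 1: select a set of entities
    "return #1 longer than 40 yards",  # step 2: filter the result of step 1
    "return number of #2",             # step 3: aggregate over step 2
]

def resolve_references(steps):
    """Inline each '#k' reference with the text of step k, for readability."""
    resolved = []
    for step in steps:
        for k, earlier in enumerate(resolved, start=1):
            step = step.replace(f"#{k}", f"({earlier})")
        resolved.append(step)
    return resolved

print("Q:", question)
for i, step in enumerate(resolve_references(qdmr_steps), start=1):
    print(f"{i}. {step}")
```

The final step's resolved form reads as a full paraphrase of the question, which is what makes such decompositions amenable to deterministic conversion into formal languages like the paper's pseudo-SQL.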


Citations
Proceedings ArticleDOI

Dense Passage Retrieval for Open-Domain Question Answering

TL;DR: This paper shows that retrieval can be practically implemented using dense representations alone, with embeddings learned from a small number of questions and passages by a simple dual-encoder framework that greatly outperforms a strong Lucene-BM25 system.
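As a rough illustration of the dual-encoder idea, the sketch below scores every passage in an index by inner product with a question embedding and returns the top results. The random vectors are stand-ins for the learned BERT-based encoders; this is not the paper's implementation.

```python
import numpy as np

# Toy dual-encoder retrieval: questions and passages are embedded
# independently, and relevance is an inner product. Random vectors stand
# in for learned embeddings.

rng = np.random.default_rng(0)
dim = 768                                    # typical BERT embedding size

passage_embs = rng.normal(size=(1000, dim))  # precomputed passage index
question_emb = rng.normal(size=(dim,))       # embedding of one question

scores = passage_embs @ question_emb         # inner-product relevance scores
top_k = np.argsort(-scores)[:5]              # indices of the 5 best passages
print("top-5 passage ids:", top_k)
```

Because passage embeddings can be precomputed offline, only the question side is encoded at query time, which is what makes this approach practical at corpus scale.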
Proceedings ArticleDOI

KILT: a Benchmark for Knowledge Intensive Language Tasks

TL;DR: It is found that a shared dense vector index coupled with a seq2seq model is a strong baseline, outperforming more tailor-made approaches for fact checking, open-domain question answering, and dialogue, and, by generating disambiguated text, yielding competitive results on entity linking and slot filling.
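The following is a hedged sketch of the baseline architecture this summary describes: one shared dense index retrieves context, and the retrieved text is prepended to a generator's input. The generate stub and the random embeddings are assumptions for illustration, not KILT's actual API.

```python
import numpy as np

# Sketch of "shared dense index + seq2seq": a single passage index serves
# every task, and retrieved text conditions a generator. Stand-ins only.

rng = np.random.default_rng(1)
dim, n_passages = 128, 500
index = rng.normal(size=(n_passages, dim))   # shared dense vector index
passages = [f"passage {i} ..." for i in range(n_passages)]
query_emb = rng.normal(size=(dim,))          # stand-in query embedding

top = np.argsort(-(index @ query_emb))[:3]   # retrieve 3 passages
context = " ".join(passages[i] for i in top)

def generate(prompt: str) -> str:
    """Stand-in for a seq2seq model that would emit disambiguated text."""
    return f"<answer conditioned on a {len(prompt)}-char prompt>"

print(generate("question: ... context: " + context))
```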
Journal ArticleDOI

Measuring and Narrowing the Compositionality Gap in Language Models

TL;DR: In the GPT-3 family of models, single-hop question answering performance is shown to improve faster with model size than multi-hop performance does; the compositionality gap therefore does not decrease as models grow.
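The gap itself reduces to a small calculation. Assuming, as the cited work describes, that it is the fraction of compositional questions a model fails despite answering both sub-questions correctly, a sketch follows; the per-question outcomes below are fabricated for illustration.

```python
# Compositionality gap: among questions whose sub-questions the model
# answers correctly, the fraction where the composed question still fails.
# The booleans below are fabricated example outcomes.

records = [
    # (both sub-questions correct, composed question correct)
    (True, True),
    (True, False),
    (True, False),
    (False, False),
    (True, True),
]

eligible = [comp_ok for subs_ok, comp_ok in records if subs_ok]
gap = eligible.count(False) / len(eligible)
print(f"compositionality gap: {gap:.2f}")  # 2 of 4 eligible fail -> 0.50
```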
Posted Content

Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval

TL;DR: This work proposes a simple and efficient multi-hop dense retrieval approach for answering complex open-domain questions, which achieves state-of-the-art performance on two multi-hop datasets, HotpotQA and multi-evidence FEVER, and can be applied to any unstructured text corpus.
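A minimal sketch of the iterative loop such a multi-hop retriever implies: each hop retrieves a passage and folds it into the query used for the next hop. Vector addition is a crude stand-in for re-encoding the question together with the retrieved text, and the random index stands in for trained encoders; none of this is the authors' code.

```python
import numpy as np

# Two-hop dense retrieval sketch: hop 2's query is conditioned on the
# evidence retrieved at hop 1. Random vectors stand in for trained encoders.

rng = np.random.default_rng(2)
dim, n = 64, 200
index = rng.normal(size=(n, dim))        # dense index over an unstructured corpus

query_vec = rng.normal(size=(dim,))
retrieved = []
for hop in range(2):                     # two hops, as in HotpotQA-style questions
    scores = index @ query_vec
    scores[retrieved] = -np.inf          # don't re-retrieve earlier hops
    best = int(np.argmax(scores))
    retrieved.append(best)
    query_vec = query_vec + index[best]  # fold this hop's evidence into the query

print("retrieved passage ids per hop:", retrieved)
```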