Open Access Proceedings Article

Elementary school science and math tests as a driver for AI: take the aristo challenge!

TLDR
This work proposes, as a challenge problem for AI, having a computer pass elementary school science and math exams; the most difficult questions require significant progress in AI.
Abstract
While there has been an explosion of impressive, data-driven AI applications in recent years, machines still largely lack a deeper understanding of the world to answer questions that go beyond information explicitly stated in text, and to explain and discuss those answers. To reach this next generation of AI applications, it is imperative to make faster progress in areas of knowledge, modeling, reasoning, and language. Standardized tests have often been proposed as a driver for such progress, with good reason: Many of the questions require sophisticated understanding of both language and the world, pushing the boundaries of AI, while other questions are easier, supporting incremental progress. In Project Aristo at the Allen Institute for AI, we are working on a specific version of this challenge, namely having the computer pass Elementary School Science and Math exams. Even at this level there is a rich variety of problems and question types, the most difficult requiring significant progress in AI. Here we propose this task as a challenge problem for the community, and are providing supporting datasets. Solutions to many of these problems would have a major impact on the field so we encourage you: Take the Aristo Challenge!


Citations
Journal Article

CoQA: A Conversational Question Answering Challenge

TL;DR: The CoQA dataset contains 127k questions with answers, obtained from 8k conversations about text passages from seven diverse domains; the answers are free-form text, with the corresponding evidence highlighted in the passage.
Proceedings Article

Solving General Arithmetic Word Problems

TL;DR: This is the first algorithmic approach that can handle arithmetic problems with multiple steps and operations, without depending on additional annotations or predefined templates, and it outperforms existing systems, achieving state of the art performance on benchmark datasets of arithmetic word problems.
Proceedings Article

Combining retrieval, statistics, and inference to answer elementary science questions

TL;DR: This paper evaluates the methods on six years of unseen, unedited exam questions from the NY Regents Science Exam, and shows that the overall system's score is 71.3%, an improvement of 23.8% (absolute) over the MLN-based method described in previous work.
Journal Article

Reasoning about Quantities in Natural Language

TL;DR: A computational approach is developed which is shown to successfully recognize and normalize textual expressions of quantities and is used to further develop algorithms to assist reasoning in the context of the aforementioned tasks.
Proceedings Article

Learning to use formulas to solve simple arithmetic problems

TL;DR: A novel method that learns to use formulas to solve simple arithmetic word problems; it beats the state of the art, solving 86.07% of the problems in a corpus of standard primary school test questions.