Open Access · Posted Content
Towards Neural Network-based Reasoning
TL;DR
The empirical studies show that Neural Reasoner outperforms existing neural reasoning systems by remarkable margins on two difficult artificial tasks (Positional Reasoning and Path Finding) proposed in [8].
Abstract
We propose Neural Reasoner, a framework for neural network-based reasoning over natural language sentences. Given a question, Neural Reasoner can infer over multiple supporting facts and find an answer in a specific form. Neural Reasoner has 1) a specific interaction-pooling mechanism, allowing it to examine multiple facts, and 2) a deep architecture, allowing it to model the complicated logical relations in reasoning tasks. Assuming no particular structure in the question or the facts, Neural Reasoner can accommodate different types of reasoning and different forms of language expression. Despite its complexity, Neural Reasoner can still be trained effectively in an end-to-end manner. Our empirical studies show that Neural Reasoner outperforms existing neural reasoning systems by remarkable margins on two difficult artificial tasks (Positional Reasoning and Path Finding) proposed in [8]. For example, it improves the accuracy on Path Finding (10K) from 33.4% [6] to over 98%.
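To make the interaction-pooling idea concrete, here is a minimal sketch of one reasoning layer. This is not the authors' code: the encoders are stubbed out as fixed vectors, and the dimensions, weight names, and choice of max-pooling are illustrative assumptions (in the paper, question and fact encodings come from GRU encoders and the facts are also updated layer by layer).

```python
# Sketch of a Neural Reasoner-style reasoning layer (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
d, n_facts, n_answers = 64, 5, 10
W = rng.normal(scale=0.1, size=(d, 2 * d))      # interaction DNN weights
V = rng.normal(scale=0.1, size=(n_answers, d))  # answer classifier

def reasoning_layer(q, facts):
    """The question interacts with every fact through a small network,
    then the per-fact results are pooled into one updated question."""
    updates = [np.tanh(W @ np.concatenate([q, f])) for f in facts]
    return np.max(np.stack(updates), axis=0)    # interaction-pooling

q = rng.normal(size=d)                          # encoded question (stub)
facts = [rng.normal(size=d) for _ in range(n_facts)]
for _ in range(3):                              # deep: stacked layers
    q = reasoning_layer(q, facts)
answer_logits = V @ q                           # softmax over these answers
```

Because every step is differentiable, the whole stack can be trained end-to-end from (facts, question, answer) triples, which is the training regime the abstract describes.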
Citations
Proceedings Article
End-To-End Memory Networks
TL;DR: Proposes a neural network with a recurrent attention model over a possibly large external memory, trained end-to-end and hence requiring significantly less supervision during training; the model can be seen as an extension of RNNsearch to the case where multiple computational steps (hops) are performed per output symbol.
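As a rough illustration of the "multiple hops over an external memory" idea, the sketch below runs a dot-product attention read several times. The embeddings, sizes, and residual query update are assumptions for illustration, not the paper's exact parameterization.

```python
# Sketch of multi-hop attention over an external memory (illustrative).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_hop(q, mem_in, mem_out):
    """One attention hop: match the query against input memories,
    read a weighted sum of output memories, and update the query."""
    p = softmax(mem_in @ q)          # attention over memory slots
    o = p @ mem_out                  # read vector
    return q + o                     # next-hop query (residual update)

rng = np.random.default_rng(0)
d, n_slots, hops = 32, 8, 3
q = rng.normal(size=d)                   # embedded question
mem_in = rng.normal(size=(n_slots, d))   # input memory embeddings
mem_out = rng.normal(size=(n_slots, d))  # output memory embeddings
for _ in range(hops):                # multiple computational steps per output
    q = memory_hop(q, mem_in, mem_out)
```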
Proceedings Article
Learning multiagent communication with backpropagation
TL;DR: Explores CommNet, a simple neural model that uses continuous communication for fully cooperative tasks, and demonstrates that agents can learn to communicate amongst themselves, yielding improved performance over non-communicative agents and baselines.
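The core of the continuous-communication idea can be sketched in a few lines: each agent's next hidden state depends on its own state plus the mean of the other agents' states, so the communication channel is differentiable and trainable by backpropagation. The weight shapes and tanh nonlinearity here are illustrative assumptions.

```python
# Sketch of one CommNet-style communication step (illustrative).
import numpy as np

rng = np.random.default_rng(0)
n_agents, d = 4, 16
H = rng.normal(size=(n_agents, d))        # per-agent hidden states
Wh = rng.normal(scale=0.1, size=(d, d))
Wc = rng.normal(scale=0.1, size=(d, d))

def commnet_step(H):
    """Each agent receives the mean of the *other* agents' hidden states
    as a continuous communication vector, then updates jointly."""
    total = H.sum(axis=0)
    C = (total - H) / (n_agents - 1)      # mean over the other agents
    return np.tanh(H @ Wh.T + C @ Wc.T)   # fully differentiable update

H = commnet_step(H)
```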
Proceedings Article
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin, Tomas Mikolov
TL;DR: This paper proposed a set of proxy tasks that evaluate reading comprehension via question answering, such as chaining facts, simple induction, deduction and many more, which are designed to be prerequisites for any system that aims to be capable of conversing with a human.
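These proxy tasks were released in a simple line-numbered text format: statement lines are "<id> <sentence>", and question lines carry a tab-separated answer and supporting-fact ids. A small parser sketch, assuming that released format:

```python
def parse_babi(lines):
    """Parse bAbI-format lines into (story, question, answer,
    supporting-fact ids) tuples."""
    examples, story = [], []
    for line in lines:
        idx, text = line.split(" ", 1)
        if int(idx) == 1:
            story = []                    # a new story restarts numbering
        if "\t" in text:                  # question line
            question, answer, support = text.split("\t")
            examples.append((list(story), question, answer,
                             [int(s) for s in support.split()]))
        else:
            story.append(text.strip())
    return examples

sample = [
    "1 Mary moved to the bathroom.",
    "2 John went to the hallway.",
    "3 Where is Mary?\tbathroom\t1",
]
print(parse_babi(sample))
```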
Proceedings Article
Dynamic memory networks for visual and textual question answering
TL;DR: The new DMN+ model improves the state of the art on both the Visual Question Answering dataset and the bAbI-10k text question-answering dataset without supporting-fact supervision.
References
Proceedings Article
Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation
Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio
TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
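A minimal PyTorch sketch of the encoder-decoder scheme, assuming made-up vocabulary sizes and dimensions: the encoder compresses the source into a fixed-length state, the decoder is conditioned on it, and maximizing the conditional probability of the target corresponds to minimizing per-step cross-entropy.

```python
# Minimal GRU encoder-decoder (a sketch of the idea, not the paper's model).
import torch
import torch.nn as nn

V_src, V_tgt, d = 1000, 1000, 128     # assumed sizes

class EncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(V_src, d)
        self.tgt_emb = nn.Embedding(V_tgt, d)
        self.encoder = nn.GRU(d, d, batch_first=True)
        self.decoder = nn.GRU(d, d, batch_first=True)
        self.out = nn.Linear(d, V_tgt)

    def forward(self, src, tgt):
        _, h = self.encoder(self.src_emb(src))    # fixed-length summary
        y, _ = self.decoder(self.tgt_emb(tgt), h) # decoder conditioned on it
        return self.out(y)                        # per-step target logits

model = EncoderDecoder()
src = torch.randint(0, V_src, (2, 7))             # toy batch
tgt = torch.randint(0, V_tgt, (2, 5))
logits = model(src, tgt)                          # teacher forcing; target
# shifting omitted for brevity. Maximizing conditional probability equals
# minimizing cross-entropy:
loss = nn.functional.cross_entropy(logits.reshape(-1, V_tgt), tgt.reshape(-1))
```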
Proceedings Article
Sequence to Sequence Learning with Neural Networks
TL;DR: Presents a general end-to-end approach to sequence learning that makes minimal assumptions about sequence structure: a multilayered Long Short-Term Memory (LSTM) maps the input sequence to a vector of fixed dimensionality and another deep LSTM decodes the target sequence from that vector; reversing the order of the words in all source sentences markedly improved performance by introducing many short-term dependencies between source and target, easing optimization.
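The source-reversal trick mentioned above is purely a preprocessing step; a toy illustration (the token strings are arbitrary):

```python
# Illustration of the source-reversal preprocessing (tokens are arbitrary).
src = ["the", "cat", "sat"]           # source sentence
tgt = ["le", "chat", "assis"]         # target sentence
src_reversed = src[::-1]              # ["sat", "cat", "the"]
# With the reversed source, "the" sits right before decoding begins, so
# early source words and their early target counterparts are close
# together, creating the short-term dependencies that ease optimization.
```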
Posted Content
Empirical evaluation of gated recurrent neural networks on sequence modeling
TL;DR: Advanced recurrent units that implement a gating mechanism, the long short-term memory (LSTM) unit and the more recently proposed gated recurrent unit (GRU), are evaluated on sequence modeling; the GRU is found to be comparable to the LSTM.
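For reference, a plain-NumPy sketch of the GRU update being evaluated. This follows the common formulation (biases omitted for brevity); the weight names and sizes are illustrative.

```python
# GRU cell sketch: z gates how much of the old state is kept, r gates how
# much of the old state feeds the candidate.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, p):
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)           # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)           # reset gate
    h_tilde = np.tanh(p["W"] @ x + p["U"] @ (r * h)) # candidate state
    return (1 - z) * h + z * h_tilde                 # interpolated new state

rng = np.random.default_rng(0)
dx, dh = 8, 16
p = {k: rng.normal(scale=0.1, size=(dh, dx if k in ("Wz", "Wr", "W") else dh))
     for k in ("Wz", "Wr", "W", "Uz", "Ur", "U")}
h = np.zeros(dh)
for x in rng.normal(size=(5, dx)):    # run over a length-5 input sequence
    h = gru_cell(x, h, p)
```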
Posted Content
ADADELTA: An Adaptive Learning Rate Method
TL;DR: Presents ADADELTA, a novel per-dimension learning rate method for gradient descent that dynamically adapts over time using only first-order information, with minimal computational overhead beyond vanilla stochastic gradient descent.
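The method fits in a few lines: it keeps decaying averages of squared gradients and squared updates, and needs no learning rate to tune. A sketch of the update rule, with a toy quadratic to show usage (the variable names are mine):

```python
# ADADELTA update rule (Zeiler 2012), sketched in NumPy.
import numpy as np

def adadelta_step(grad, state, rho=0.95, eps=1e-6):
    """Return the parameter update for one step, mutating the running
    averages of squared gradients (Eg2) and squared updates (Edx2)."""
    state["Eg2"] = rho * state["Eg2"] + (1 - rho) * grad**2
    delta = -np.sqrt(state["Edx2"] + eps) / np.sqrt(state["Eg2"] + eps) * grad
    state["Edx2"] = rho * state["Edx2"] + (1 - rho) * delta**2
    return delta

# Toy usage: minimize f(x) = x^2, whose gradient is 2x.
x = np.array([5.0])
state = {"Eg2": np.zeros_like(x), "Edx2": np.zeros_like(x)}
for _ in range(5000):
    x += adadelta_step(2 * x, state)
print(x)  # approaches 0 (ADADELTA ramps up from a cold start, then homes in)
```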