
Showing papers by "Matthew Turk published in 2019"


Proceedings ArticleDOI
01 Jul 2019
TL;DR: This work proposes an end-to-end goal-oriented visual dialogue system that combines reinforcement learning with regularized information gain, motivated by the Rational Speech Act framework, which models the process of human inquiry toward a goal.
Abstract: The ability to engage in goal-oriented conversations has allowed humans to gain knowledge, reduce uncertainty, and perform tasks more efficiently. Artificial agents, however, are still far behind humans in holding goal-driven conversations. In this work, we focus on the task of goal-oriented visual dialogue, aiming to automatically generate a series of questions about an image with a single objective. This task is challenging since the questions must not only follow a strategy for achieving the goal but also account for the contextual information in the image. We propose an end-to-end goal-oriented visual dialogue system that combines reinforcement learning with regularized information gain. Unlike previous approaches to the task, our work is motivated by the Rational Speech Act framework, which models the process of human inquiry toward a goal. We test two versions of our model on the GuessWhat?! dataset and obtain results that significantly outperform the current state-of-the-art models at generating questions to find an undisclosed object in an image.
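To make the question-scoring idea concrete, here is a minimal sketch of ranking a candidate question by the information it is expected to yield about which object the answerer has in mind, minus a regularization penalty. This illustrates only the general regularized-information-gain idea: the function names, array shapes, and the form of the regularizer are assumptions, and the paper's actual objective and reinforcement-learning training loop are not reproduced here.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution (0 log 0 treated as 0)."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def regularized_information_gain(prior, likelihoods, reg_lambda=0.1):
    """Score a candidate question for a GuessWhat?!-style guessing game.

    prior       -- current belief P(object) over candidates, shape (n_objects,)
    likelihoods -- answer model P(answer | object, question),
                   shape (n_answers, n_objects), columns summing to 1
    reg_lambda  -- hypothetical regularization weight (not from the paper)
    """
    # Marginal probability of each answer under the current belief.
    answer_probs = likelihoods @ prior

    # Expected entropy of the posterior belief after hearing the answer.
    expected_posterior_entropy = 0.0
    for a, p_a in enumerate(answer_probs):
        if p_a == 0:
            continue
        posterior = likelihoods[a] * prior / p_a  # Bayes update
        expected_posterior_entropy += p_a * entropy(posterior)

    info_gain = entropy(prior) - expected_posterior_entropy
    # Illustrative stand-in for the paper's regularizer.
    penalty = reg_lambda * entropy(answer_probs)
    return info_gain - penalty

# Toy usage: three candidate objects and a yes/no question that splits them.
prior = np.array([0.5, 0.3, 0.2])
likelihoods = np.array([[0.9, 0.1, 0.5],   # P(yes | object)
                        [0.1, 0.9, 0.5]])  # P(no  | object)
print(regularized_information_gain(prior, likelihoods))
```

In a reinforcement-learning questioner, a score of this kind could serve as part of the intermediate reward for each generated question, which is the role regularized information gain plays in the system described above.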

33 citations




Proceedings ArticleDOI
14 Oct 2019
TL;DR: The results suggest that incorporating additional modalities related to eye movements and muscle activity may improve the efficacy of mobile EEG-based BCI systems, creating the potential for ubiquitous BCI.
Abstract: Brain-computer interfaces (BCIs) typically utilize electroencephalography (EEG) to enable control of a computer through brain signals. However, EEG is susceptible to a large amount of noise, especially from muscle activity, making it difficult to use in ubiquitous computing environments where mobility and physicality are important features. In this work, we present a novel multimodal approach for classifying the P300 event-related potential (ERP) component by coupling EEG signals with nonscalp electrodes (NSE) that measure ocular and muscle artifacts. We demonstrate the effectiveness of our approach on a new dataset in which the P300 signal was evoked while participants rode a stationary bike under three conditions of physical activity: rest, low-intensity exercise, and high-intensity exercise. We show that the intensity of physical activity impacts the performance of both our proposed model and existing state-of-the-art models. After incorporating signals from the nonscalp electrodes, our proposed model performs significantly better in the physical-activity conditions. Our results suggest that incorporating additional modalities related to eye movements and muscle activity may improve the efficacy of mobile EEG-based BCI systems, creating the potential for ubiquitous BCI.
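As a rough illustration of the multimodal idea, the sketch below concatenates flattened EEG epochs with epochs from the non-scalp (ocular/muscle) channels and trains a shrinkage-LDA classifier, a standard P300 baseline. This is not the paper's model: the channel counts, epoch lengths, and fusion-by-concatenation scheme are all assumptions made for the example.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def epoch_features(epochs):
    """Flatten (n_epochs, n_channels, n_samples) ERP epochs into
    one feature vector per epoch."""
    return epochs.reshape(epochs.shape[0], -1)

# Hypothetical data: 200 epochs, 8 EEG channels plus 4 non-scalp
# channels, 128 samples per epoch (e.g. 0-500 ms after each stimulus).
eeg = np.random.randn(200, 8, 128)
nse = np.random.randn(200, 4, 128)       # ocular/muscle channels
labels = np.random.randint(0, 2, 200)    # 1 = target (P300), 0 = non-target

# Early fusion: concatenate the two modalities' features per epoch.
X = np.hstack([epoch_features(eeg), epoch_features(nse)])

clf = make_pipeline(
    StandardScaler(),
    LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto"),
)
clf.fit(X, labels)
print(clf.score(X, labels))
```

A real evaluation would of course use recorded ERP data and cross-validation across the rest, low-intensity, and high-intensity conditions rather than the random arrays used here.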

8 citations