
Showing papers by "James L. McClelland" published in 2017



Posted Content
TL;DR: This work highlights a simple technique by which deep recurrent networks can similarly exploit their prior knowledge to learn a useful representation for a new word from little data, which could make natural language processing systems much more flexible, by allowing them to learn continually from the new words they encounter.
Abstract: Standard deep learning systems require thousands or millions of examples to learn a concept, and cannot integrate new concepts easily. By contrast, humans have an incredible ability to do one-shot or few-shot learning. For instance, from just hearing a word used in a sentence, humans can infer a great deal about it, by leveraging what the syntax and semantics of the surrounding words tells us. Here, we draw inspiration from this to highlight a simple technique by which deep recurrent networks can similarly exploit their prior knowledge to learn a useful representation for a new word from little data. This could make natural language processing systems much more flexible, by allowing them to learn continually from the new words they encounter.
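The technique is only described at a high level here; as an illustration of the general idea (a minimal sketch, not the paper's actual setup), the snippet below freezes a small LSTM language model and optimizes only the embedding row for a newly added word from a handful of example sentences. The model class, layer sizes, and token ids are assumptions made for the illustration.

```python
# Sketch: learn an embedding for ONE new word by freezing all pretrained-style
# weights and optimizing only that word's row in the embedding table.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, tokens):
        h, _ = self.lstm(self.embed(tokens))
        return self.out(h)

vocab_size, new_word_id = 1000, 1000      # id 1000 is the newly added word
model = TinyLM(vocab_size + 1)            # pretend rows 0..999 are pretrained

# Freeze everything, then allow gradients only into the embedding table;
# a gradient hook zeroes updates to every row except the new word's.
for p in model.parameters():
    p.requires_grad = False
model.embed.weight.requires_grad = True

def keep_only_new_row(grad):
    mask = torch.zeros_like(grad)
    mask[new_word_id] = 1.0
    return grad * mask

model.embed.weight.register_hook(keep_only_new_row)

# A few toy "sentences" that contain the new word.
sentences = torch.randint(0, vocab_size, (4, 10))
sentences[:, 3] = new_word_id

optimizer = torch.optim.Adam([model.embed.weight], lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for _ in range(50):
    logits = model(sentences[:, :-1])
    loss = loss_fn(logits.reshape(-1, vocab_size + 1),
                   sentences[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because every other parameter stays fixed, the few available sentences shape only the new word's representation, which is the flavor of few-shot word learning the abstract describes.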

15 citations


Journal ArticleDOI
TL;DR: Theoretically, a new Averaging Diffusion Model is developed in which the decision variable is the mean rather than the sum of evidence samples; it is used as a base for comparing three alternative models of multimodal integration, allowing the optimality of this integration to be assessed.
Abstract: We combine extant theories of evidence accumulation and multi-modal integration to develop an integrated framework for modeling multimodal integration as a process that unfolds in real time. Many studies have formulated sensory processing as a dynamic process where noisy samples of evidence are accumulated until a decision is made. However, these studies are often limited to a single sensory modality. Studies of multimodal stimulus integration have focused on how best to combine different sources of information to elicit a judgment. These studies are often limited to a single time point, typically after the integration process has occurred. We address these limitations by combining the two approaches. Experimentally, we present data that allow us to study the time course of evidence accumulation within each of the visual and auditory domains as well as in a bimodal condition. Theoretically, we develop a new Averaging Diffusion Model in which the decision variable is the mean rather than the sum of evidence samples and use it as a base for comparing three alternative models of multimodal integration, allowing us to assess the optimality of this integration. The outcome reveals rich individual differences in multimodal integration: while some subjects’ data are consistent with adaptive optimal integration, reweighting sources of evidence as their relative reliability changes during evidence integration, others exhibit patterns inconsistent with optimality.
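As a quick illustration of the key contrast (a toy simulation, not the authors' fitted model), the snippet below draws noisy evidence samples and tracks the decision variable both as a running sum, as in a standard diffusion model, and as a running mean, as in the Averaging Diffusion Model; the drift, noise level, and trial counts are arbitrary assumptions.

```python
# Standard diffusion: decision variable = running SUM of evidence samples.
# Averaging Diffusion Model: decision variable = running MEAN, so its
# trial-to-trial spread shrinks as more samples arrive.
import numpy as np

rng = np.random.default_rng(0)
drift, noise_sd, n_steps, n_trials = 0.1, 1.0, 200, 1000

samples = rng.normal(drift, noise_sd, size=(n_trials, n_steps))
running_sum = np.cumsum(samples, axis=1)                 # standard diffusion variable
running_mean = running_sum / np.arange(1, n_steps + 1)   # averaging model variable

# The sum's spread keeps growing; the mean's spread collapses toward the drift.
print("sd of sum  at t=10, t=200:", running_sum[:, 9].std(), running_sum[:, -1].std())
print("sd of mean at t=10, t=200:", running_mean[:, 9].std(), running_mean[:, -1].std())
```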

12 citations


BookDOI
01 Oct 2017

9 citations


Posted ContentDOI
16 May 2017-bioRxiv
TL;DR: This work simulates N400 amplitudes as the change induced by an incoming stimulus in an implicit and probabilistic representation of meaning captured by the hidden unit activation pattern in a neural network model of sentence comprehension, and proposes that the process underlying the N400 also drives implicit learning in the network.
Abstract: The N400 component of the event-related brain potential has aroused much interest because it is thought to provide an online measure of meaning processing in the brain. This component, however, has been hard to capture within traditional approaches to language processing. Here, we show that a neural network that eschews these traditions can capture a wide range of findings on the factors that affect the amplitude of the N400. The model simulates the N400 as the change induced by an incoming word in an initial, implicit probabilistic representation of the situation or event described by the linguistic input, captured by the hidden unit activation pattern in a neural network. We further propose a new learning rule in which the process underlying the N400 drives implicit learning in the network. The model provides a unified account of a large body of findings and connects human language processing with successful deep learning approaches to language processing.
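The paper's simulated N400 is the change an incoming word induces in the model's implicit representation of the described event. The sketch below shows only that bookkeeping, with a generic untrained GRU standing in for the trained sentence-comprehension network; the vocabulary size, hidden dimension, and word ids are assumptions.

```python
# Sketch of the simulated N400 measure: process a sentence word by word and
# record, for each incoming word, the size of the change it induces in the
# hidden "meaning" representation.
import torch
import torch.nn as nn

vocab_size, dim = 50, 32                  # toy sizes (assumptions)
embed = nn.Embedding(vocab_size, dim)
rnn = nn.GRUCell(dim, dim)

sentence = torch.tensor([4, 17, 8, 23])   # toy word ids
hidden = torch.zeros(1, dim)
n400_proxy = []
for word in sentence:
    new_hidden = rnn(embed(word.unsqueeze(0)), hidden)
    # semantic update: summed magnitude of the change induced by this word
    n400_proxy.append((new_hidden - hidden).abs().sum().item())
    hidden = new_hidden

print(n400_proxy)   # larger values ~ larger simulated N400 for that word
```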

8 citations


Journal ArticleDOI
TL;DR: It is proposed that people rely on “start-up software,” “causal models,” and “intuitive theories” built using compositional representations to learn new tasks more efficiently than some deep neural network models.
Abstract: Lake et al. propose that people rely on "start-up software," "causal models," and "intuitive theories" built using compositional representations to learn new tasks more efficiently than some deep neural network models. We highlight the many drawbacks of a commitment to compositional representations and describe our continuing effort to explore how the ability to build on prior knowledge and to learn new tasks efficiently could arise through learning in deep neural networks.

7 citations



Journal Article
TL;DR: This issue is explored by generalizing linear analysis techniques to two sets of analogous tasks, showing that analogical structure is commonly extracted, and addressing some potential implications.

6 citations


Journal Article
TL;DR: This work trains a Dueling Deep Q-Network on a shape sorting task requiring implicit knowledge of geometric properties, then queries this network with classification and preference selection tasks, demonstrating that scalar reinforcement provides sufficient signal to learn representations of shape categories.
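For readers unfamiliar with the dueling architecture named in the summary, the sketch below shows the standard value/advantage decomposition Q(s, a) = V(s) + A(s, a) − mean_a A(s, a); the layer sizes and flat state input are assumptions, not the paper's shape-sorting setup.

```python
# Minimal dueling Q-network: separate value and advantage streams recombined
# as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    def __init__(self, n_inputs, n_actions, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(n_inputs, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # A(s, a)

    def forward(self, state):
        h = self.trunk(state)
        v = self.value(h)
        a = self.advantage(h)
        return v + a - a.mean(dim=1, keepdim=True)

net = DuelingQNet(n_inputs=64, n_actions=4)
q_values = net(torch.randn(2, 64))      # shape (batch, n_actions)
print(q_values.shape)
```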

1 citation


Book ChapterDOI
01 Jan 2017
TL;DR: It is argued that a particular visuospatial model called the unit circle acts as an integrated conceptual structure that supports solving problems encountered during learning and transfers to a broader range of problems in the same domain.
Abstract: Learning trigonometry poses a challenge to many high school students, impeding their access to careers in science, technology, engineering, and mathematics. We argue that a particular visuospatial model called the unit circle acts as an integrated conceptual structure that supports solving problems encountered during learning and transfers to a broader range of problems in the same domain. We have found that individuals who reported visualizing trigonometric expressions on the unit circle framework performed better than those who did not report using this visualization. Further, a brief lesson in the use of the unit circle produced postlesson benefits relative to no lesson or a rule-based lesson, but only for participants who exhibited some partial understanding of the relevant concepts in a pretest. The difficulties encountered by students without sufficient prior knowledge of the unit circle underscore the challenge we face in helping them build grounded conceptual structures that support acquiring mathematical abilities.
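As an illustration of what visualizing a trigonometric expression on the unit circle involves (an example chosen here, not taken from the chapter), locating the 150° point in the second quadrant gives its coordinates, and hence the sine and cosine values, directly:

```latex
% A 150 degree angle places the point in the second quadrant, mirrored from 30 degrees:
\[
  (\cos 150^\circ,\ \sin 150^\circ)
  = \left(-\cos 30^\circ,\ \sin 30^\circ\right)
  = \left(-\tfrac{\sqrt{3}}{2},\ \tfrac{1}{2}\right),
\]
\[
  \text{so } \sin 150^\circ = \tfrac{1}{2}
  \quad\text{and}\quad
  \cos 150^\circ = -\tfrac{\sqrt{3}}{2}.
\]
```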

1 citation