Josh H. McDermott

Other affiliations: McGovern Institute for Brain Research, Harvard University, New York University ...

Bio: Josh H. McDermott is an academic researcher at the Massachusetts Institute of Technology. He has contributed to research on topics including natural sounds and the auditory cortex. He has an h-index of 38 and has co-authored 116 publications that have received 12,527 citations. His previous affiliations include the McGovern Institute for Brain Research and Harvard University.

Papers published on a yearly basis

[Chart omitted; years shown: 1996-1997, 2001-2013, 2015-2023]

Papers

Journal Article

The Fusiform Face Area: A Module in Human Extrastriate Cortex Specialized for Face Perception

Nancy Kanwisher¹, Josh H. McDermott¹, Marvin M. Chun¹,²

Harvard University¹, Yale University²

01 Jun 1997-The Journal of Neuroscience

TL;DR: The data allow us to reject alternative accounts of the function of the fusiform face area (area “FF”) that appeal to visual attention, subordinate-level classification, or general processing of any animate or human forms, demonstrating that this region is selectively involved in the perception of faces.

Abstract: Using functional magnetic resonance imaging (fMRI), we found an area in the fusiform gyrus in 12 of the 15 subjects tested that was significantly more active when the subjects viewed faces than when they viewed assorted common objects. This face activation was used to define a specific region of interest individually for each subject, within which several new tests of face specificity were run. In each of five subjects tested, the predefined candidate “face area” also responded significantly more strongly to passive viewing of (1) intact than scrambled two-tone faces, (2) full front-view face photos than front-view photos of houses, and (in a different set of five subjects) (3) three-quarter-view face photos (with hair concealed) than photos of human hands; it also responded more strongly during (4) a consecutive matching task performed on three-quarter-view faces versus hands. Our technique of running multiple tests applied to the same region defined functionally within individual subjects provides a solution to two common problems in functional imaging: (1) the requirement to correct for multiple statistical comparisons and (2) the inevitable ambiguity in the interpretation of any study in which only two or three conditions are compared. Our data allow us to reject alternative accounts of the function of the fusiform face area (area “FF”) that appeal to visual attention, subordinate-level classification, or general processing of any animate or human forms, demonstrating that this region is selectively involved in the perception of faces.

7,059 citations

Book Chapter

Ambient Sound Provides Supervision for Visual Learning

Andrew Owens¹, Jiajun Wu¹, Josh H. McDermott¹, William T. Freeman¹,², Antonio Torralba¹

Massachusetts Institute of Technology¹, Google²

08 Oct 2016

TL;DR: This work trains a convolutional neural network to predict a statistical summary of the sound associated with a video frame, and shows that this representation is comparable to that of other state-of-the-art unsupervised learning methods.

Abstract: The sound of crashing waves, the roar of fast-moving cars – sound conveys important information about the objects in our surroundings. In this work, we show that ambient sounds can be used as a supervisory signal for learning visual models. To demonstrate this, we train a convolutional neural network to predict a statistical summary of the sound associated with a video frame. We show that, through this process, the network learns a representation that conveys information about objects and scenes. We evaluate this representation on several recognition tasks, finding that its performance is comparable to that of other state-of-the-art unsupervised learning methods. Finally, we show through visualizations that the network learns units that are selective to objects that are often associated with characteristic sounds.

483 citations

Journal Article

The cocktail party problem.

Josh H. McDermott¹

Center for Neural Science¹

01 Dec 2009-Current Biology

TL;DR: The cocktail party problem is the task of hearing a sound of interest, often a speech signal, amid competing sounds in a complex auditory setting. There has been longstanding interest in how humans manage to solve it.

464 citations

Journal Article

A Task-Optimized Neural Network Replicates Human Auditory Behavior, Predicts Brain Responses, and Reveals a Cortical Processing Hierarchy

Alexander J. E. Kell¹, Daniel L. K. Yamins², Erica N. Shook¹, Sam V. Norman-Haignere¹, Josh H. McDermott¹,³

Massachusetts Institute of Technology¹, Stanford University², Harvard University³

02 May 2018-Neuron

TL;DR: Toward the core goal of auditory neuroscience, building quantitative models that predict cortical responses to natural sounds, hierarchical neural networks were optimized to solve ecologically relevant speech and music recognition tasks.

403 citations

Journal Article

Sound Texture Perception via Statistics of the Auditory Periphery: Evidence from Sound Synthesis

Josh H. McDermott¹, Eero P. Simoncelli¹,²,³

Howard Hughes Medical Institute¹, Center for Neural Science², Courant Institute of Mathematical Sciences³

08 Sep 2011-Neuron

TL;DR: The results suggest that sound texture perception is mediated by relatively simple statistics of early auditory representations, presumably computed by downstream neural populations, and the synthesis methodology offers a powerful tool for their further investigation.

342 citations

Cited by

Proceedings Article

Image-to-Image Translation with Conditional Adversarial Networks

Phillip Isola¹, Jun-Yan Zhu¹, Tinghui Zhou¹, Alexei A. Efros¹

University of California, Berkeley¹

21 Jul 2017

TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.

Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,958 citations

Posted Content

Image-to-Image Translation with Conditional Adversarial Networks

Phillip Isola¹, Jun-Yan Zhu¹, Tinghui Zhou¹, Alexei A. Efros¹

University of California, Berkeley¹

21 Nov 2016-arXiv: Computer Vision and Pattern Recognition

TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems; they can be used to synthesize photos from label maps, reconstruct objects from edge maps, and colorize images, among other tasks.

Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,127 citations

Journal Article

The Fusiform Face Area: A Module in Human Extrastriate Cortex Specialized for Face Perception

Nancy Kanwisher¹, Josh H. McDermott¹, Marvin M. Chun¹,²

Harvard University¹, Yale University²

01 Jun 1997-The Journal of Neuroscience

7,059 citations

Remembering: A Study in Experimental and Social Psychology. Cambridge University Press, 1964.

F. C. Bartlett

01 Jan 1964

TL;DR: The book reports experimental studies of perceiving, imaging, and remembering, develops a theory of remembering, and then treats remembering as a topic in social psychology, including conventionalism and the notion of a collective unconscious.

Abstract: Part I. Experimental Studies: 2. Experiment in psychology; 3. Experiments on perceiving; III Experiments on imaging; 4-8. Experiments on remembering: (a) The method of description, (b) The method of repeated reproduction, (c) The method of picture writing, (d) The method of serial reproduction, (e) The method of serial reproduction picture material; 9. Perceiving, recognizing, remembering; 10. A theory of remembering; 11. Images and their functions; 12. Meaning. Part II. Remembering as a Study in Social Psychology: 13. Social psychology; 14. Social psychology and the matter of recall; 15. Social psychology and the manner of recall; 16. Conventionalism; 17. The notion of a collective unconscious; 18. The basis of social recall; 19. A summary and some conclusions.

5,690 citations

Journal Article

The distributed human neural system for face perception.

James V. Haxby¹, Elizabeth Hoffman¹, M. Ida Gobbini¹

National Institutes of Health¹

01 Jun 2000-Trends in Cognitive Sciences

TL;DR: A model for the organization of this system that emphasizes a distinction between the representation of invariant and changeable aspects of faces is proposed and is hierarchical insofar as it is divided into a core system and an extended system.

4,430 citations
