
Faces and objects in macaque cerebral cortex.

Doris Y Tsao1,2, Winrich A Freiwald3–5, Tamara A Knutsen1, Joseph B Mandeville1 & Roger B H Tootell1,6
1 Athinoula A. Martinos Center, Charlestown, Massachusetts 02129, USA.
2 Department of Neurobiology, Harvard Medical School, Boston, Massachusetts 02115, USA.
3 Center for Advanced Imaging, University of Bremen, Bremen, Germany.
4 Department of Brain and Cognitive Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
5 Hanse Institute for Advanced Study, Delmenhorst, Germany.
6 Department of Radiology, Harvard Medical School, Boston, Massachusetts 02115, USA.
Correspondence should be addressed to D.Y.T. (doris@nmr.mgh.harvard.edu).
Abstract
How are different object categories organized by the visual system? Current evidence indicates that
monkeys and humans process object categories in fundamentally different ways. Functional magnetic
resonance imaging (fMRI) studies suggest that humans have a ventral temporal face area, but such
evidence is lacking in macaques. Instead, face-responsive neurons in macaques seem to be scattered
throughout temporal cortex, with some relative concentration in the superior temporal sulcus (STS).
Here, using fMRI in alert fixating macaque monkeys and humans, we found that macaques do have
discrete face-selective patches, similar in relative size and number to face patches in humans. The face
patches were embedded within a large swath of object-selective cortex extending from V4 to rostral TE.
This large region responded better to pictures of intact objects compared to scrambled objects, with
different object categories eliciting different patterns of activity, as in the human. Overall, our results
suggest that humans and macaques share a similar brain architecture for visual object processing.
Main text
The ability to identify and categorize objects is crucial to an animal’s survival. In primates, object
recognition is thought to be accomplished primarily in the ventral visual pathway, a chain of
interconnected areas including areas TEO and TE of the inferior temporal lobe1,2. A central question
regarding the mechanism of object recognition is whether the representation of different objects is
distributed throughout the entire ventral stream or localized to distinct areas.
Much of the original data addressing this issue was based on lesions and single-unit recordings from
macaque monkeys3–7, but new data has been obtained from fMRI experiments in humans8–10.
Unfortunately, it is difficult to directly compare the data from these two realms because differences
between species (humans versus macaques) are confounded with differences between techniques (fMRI
versus single-unit recording).
Functional imaging results in humans indicate that object recognition is mediated by both distributed
and localized representations. For example, objects such as scissors and chairs can be distinguished
based on the distributed and overlapping brain activity they elicit, even though there is no ‘scissor area’
or ‘chair area’ in cortex9,10. There are, however, specialized regions of human cortex dedicated to
processing categories of high biological relevance such as faces8, places11 and bodies12.
Single-unit and optical imaging experiments in the monkey provide evidence predominantly for
distributed mechanisms7,13–15. Although face-selective cells have been reported throughout the
macaque temporal lobe3–6,14, with some relative concentration in the STS6, there has never been a
description of a face-selective area analogous to the human fusiform face area (FFA). However,
single-unit and optical imaging techniques are not optimal for revealing large-scale/global functional
architectures, especially within sulci. The advent of fMRI in macaque monkeys provides a solution to this
technical problem16–20.
To image the global organization of visual object processing in macaques and compare it to that in
humans, we used fMRI in awake behaving macaques and humans viewing the same stimuli. We found
face-selective cortical patches within area TE of the macaque. These patches were embedded within a
large region of object-selective cortex extending from V4 to rostral TE. Although faces and bodies were
the only categories that activated specialized patches, other object categories elicited different
distributed activity patterns across the temporal lobe, as in the human9,10. In addition, faces and bodies
elicited unique distributed response patterns outside the specialized patches.
Results
Our first goal was to locate cortical regions important for object recognition in the macaque and
compare these to analogous regions in the human. We obtained functional images of the brain at 1.25
mm isotropic resolution from three monkeys while they fixated grayscale images of objects and grid-
scrambled counterparts in separate blocks. Within each block, the objects consisted of hands, bodies,
fruits and technological objects (see Supplementary Fig. 1 online for examples of the stimuli). These
provided four different categories of stimuli for the monkeys (three familiar biological forms, plus man-
made objects such as clocks and cameras).
Consistent with data from other techniques1,2, the largest swath of activation to intact compared to
scrambled objects occurred in the ventral stream of visual cortex, encompassing areas TEO and TE, as
well as foveal, ventral V4. Figure 1a shows activation from the right hemisphere of one monkey,
displayed on inflated and flattened views of the cortex. BOLD-based activation patterns were similar in
three other hemispheres, as were activation patterns obtained with MION contrast agent18,20
(Supplementary Fig. 2 online). Areal boundaries were determined by registering a surface-based atlas21
onto the individual hemisphere. Importantly, the response in early visual areas such as V1 was even
stronger to scrambled than to intact objects (Fig. 1b). This implies that the observed object-based
activation was not due to an increase in the effectiveness of low-level features during object epochs.
Smaller but significant foci of activation appeared in areas LIP and AIP (lateral and anterior intraparietal
areas, respectively), as well as in prefrontal cortex between the principal sulcus and the inferior arcuate
sulcus, near the boundary between areas 45 and 8A22 (Fig. 1a, inflated views). Parietal cortex is
considered to be important for planning actions and encoding eye and hand movements23, whereas
prefrontal cortex is thought to be involved in the maintenance of short-term memories24. Our results
indicate that these same areas of the macaque cortex can also be activated by the mere percept of
objects, without the performance of any explicit saccade, grasping or memory task.

Only time points in which the monkey maintained fixation within a 2° window for the entire repetition
time (TR = 2 s) were used to generate Fig. 1a. To further ensure that the activations in Fig. 1a were not
due to increased eye movements during object epochs, we compared the variance in eye position
during scrambled versus intact object epochs. We found that the variance was actually slightly smaller
during the latter (F-test, horizontal eye position: standard deviation (s.d.) s.d.$_{\mathrm{scrambled}}$ = 0.15°,
s.d.$_{\mathrm{intact}}$ = 0.12°, $F_{960,960} = 1.57$, $P < 1.0 \times 10^{-13}$; vertical eye position:
s.d.$_{\mathrm{scrambled}}$ = 0.16°, s.d.$_{\mathrm{intact}}$ = 0.13°, $F_{960,960} = 1.51$, $P < 1.3 \times 10^{-12}$).
Each epoch was 16 s long, and the eye position was sampled at 60 Hz (60 times/s), thus there were a total
of 960 samples during each of the scrambled and the intact epochs.
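The arithmetic behind these statistics can be checked with a short sketch. It assumes the F statistic is the ratio of the two sample variances (the square of the ratio of the reported standard deviations); the function names are illustrative and not taken from the paper. Because the reported standard deviations are rounded to two decimals, the recomputed ratios only approximate the published F values.

```python
def samples_per_epoch(epoch_seconds: float, sampling_hz: float) -> int:
    """Number of eye-position samples collected in one epoch."""
    return round(epoch_seconds * sampling_hz)


def variance_ratio(sd_larger: float, sd_smaller: float) -> float:
    """F statistic for comparing two variances: ratio of squared s.d. values."""
    return (sd_larger / sd_smaller) ** 2


# Values reported in the text (s.d. in degrees of visual angle).
n_samples = samples_per_epoch(16, 60)
f_horizontal = variance_ratio(0.15, 0.12)  # scrambled vs intact, horizontal
f_vertical = variance_ratio(0.16, 0.13)    # scrambled vs intact, vertical

print(n_samples, round(f_horizontal, 2), round(f_vertical, 2))
```

The small gap between the recomputed horizontal ratio and the reported 1.57 is presumably due to rounding of the published standard deviations.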
The same stimulus comparison in a human subject (Fig. 1c) showed activation of the lateral occipital
complex (LOC), a large region of non-retinotopic human ventral cortex that is significantly more
responsive to intact than to scrambled objects25. It thus appears that human LOC includes homologs of
macaque areas TEO and TE (compare Fig. 1a and c). Unlike in the macaque, human object-related
activation extended prominently into area V3A. As in the macaque, there were several foci of object-
selective activity in parietal cortex.
Within the macaque cortical territory that is object-selective (Fig. 1a), do different object categories
activate unique, segregated regions of cortex (localized representation), or do they elicit different
activity patterns within a common cortical region (distributed representation)? We initially tested this
with the object class of faces, since faces are a behaviorally important natural category for monkeys26,
and the strongest evidence for localized object representations comes from the response specificity of
the human fusiform face area (FFA)8.
When tested with pictures of faces versus pictures of non-face objects (same objects as those used in
Fig. 1), a human subject showed two face-specific patches in the left hemisphere (in the left posterior
inferior temporal gyrus and in the fusiform gyrus) and two face-specific patches in the right hemisphere
(in the anterior superior temporal sulcus and in the fusiform gyrus; Fig. 2a).
To test whether macaques have analogous face-selective region(s), we presented the identical stimuli to
three monkeys. In one monkey, we found three face-specific patches located in the fundus and lower
bank of the STS in caudal TE (Fig. 2b). In addition, we found two smaller face-specific patches located
bilaterally in the upper bank of the anterior middle temporal sulcus (AMTS), in rostral TE. These five face
patches were statistically robust and reliably imaged across ten experimental sessions spanning almost
nine months (Supplementary Fig. 3 online). Indeed, these five face-selective patches were activated
even when line drawings of faces and objects were presented instead of grayscale images (Fig. 2c). This
strongly suggests that the macaque face patches are detecting a high-level gestalt of facial form,
independent of low-level features. The overall topography of face-selective patches was consistent
across all three monkeys (Supplementary Fig. 4 online).
Time courses from the face patches in the macaque and human showed clear face-selective responses
(Fig. 2d). Again, exactly the same stimuli were used in the two species. The stimulus sequence alternated
between faces and objects, interdigitated with Fourier-phase scrambled counterparts of each. The
relative response to non-face objects was even smaller in the macaque than in the human, suggesting
greater face selectivity.
In Figs. 2a–d, the face stimuli consisted of pictures of human faces. We also tested pictures of macaque
faces (for example stimuli, see Supplementary Fig. 1 online), using the same object stimuli for
comparison. In the human, patches selective for macaque faces were identical to those for human faces.
In the macaque, however, patches selective for macaque faces were larger than those for human faces,
and spread posteriorly into area TEO (Supplementary Fig. 3k online). Whereas the human face patches
responded similarly to the presentation of both human and macaque faces, the macaque face patches
responded more than twice as strongly to macaque faces compared to human faces (Fig. 2e).
Modular functional organization is usually described in terms of stereotypically spaced columns or
larger, functionally distinct areas. However, the face-selective patches shown in Fig. 2b and c and
Supplementary Fig. 4 online seem to represent a level of functional organization intermediate between
these two extremes. Although the macaque face patches were larger and less numerous than classical
columns (e.g., ocular dominance columns), they were smaller and more numerous than typical visual
areas. This may be the reason why their existence has so far eluded single-unit physiologists.
Face selectivity in humans is most consistently found in the FFA8. As we found the strongest face
selectivity in the macaque in the lower bank of the STS in caudal TE, it is tempting to suggest a homology
between this region and the human FFA. When the macaque face patches are computationally
deformed onto a human flat map (using CARET27), they lie quite close to the human FFA
(Supplementary Fig. 5 online). However, additional face patches also exist in ventral cortex of both
species1,28 (Fig. 2a–c), thus further functional analysis will be necessary to fully elucidate the
homologies. Furthermore, the finding that the human FFA does not differentiate between human and
monkey faces, whereas the macaque face patches respond twice as strongly to monkey faces (Fig. 2e),
raises the possibility that the macaque face patches may be more important for processing social/
emotional signals than the human FFA.
Are faces ‘special,’ or do other object categories activate unique patches of cortex as well? To address
this question, we presented a stimulus sequence in which five categories (faces, hands, bodies, fruits
and technological objects) and scrambled objects were presented with equal frequency during each
scan, with each category in a different block. In addition to face-selective patches, we found a
specialized patch for bodies, which intriguingly was located adjacent to a face patch (Fig. 3a,b).
The lack of specialized patches for categories other than faces and bodies raises the possibility that the
macaque brain does not use specialized patches of cortex to represent the majority of categories (such
as fruits and technological objects). In agreement with this, we found extensive regions of ‘relative
selectivity’ to each category, meaning that the response of the region to the category was significantly
greater than to scrambled counterparts, but not significantly greater than to every other intact object
category (Fig. 3c–f).
Are the overlapping response patterns within these regions nevertheless consistent and distinctive
enough to allow accurate category discrimination? Figure 4a shows the distributed patterns of activity,
averaged over even and odd scans independently, to two different object categories, in an exemplary
slice from each monkey. The unique and repeatable patterns shown here suggest that distributed
response patterns in the macaque can indeed subserve accurate category discrimination.
To address this issue more quantitatively, we used a previously described method7 to compute a matrix
of correlation values between the activity patterns elicited by the different object categories during
even and odd scans (Fig. 4b,c). A set of consistent and unique responses should result in high correlation
values for same-category patterns, and low correlation values for different-category patterns, and such
activity patterns could subserve category discrimination. Of course, whether the brain actually uses
these patterns to perform category discrimination is a further question not addressed here.
If a diagonal entry in the correlation matrix (Fig. 4b) is the highest (red) in both its row and column, this
indicates that the corresponding category can be perfectly discriminated. The correlation matrix
revealed that in both monkeys tested, the distributed pattern of face responses allowed perfect
discrimination of faces from other object categories. In fact, the correlation value for faces was the
greatest of all five categories, demonstrating that faces are indeed ‘special’. Information about other
categories was also present. For example, the distributed responses to technological objects and fruits
were distinct from those to the four other stimulus categories (but similar to each other).
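The split-half correlation analysis and the diagonal criterion described above can be sketched as follows. This is a minimal illustration and not the authors' code: the voxel patterns are invented toy values, the category set is reduced to three, and the helper names (`pearson`, `correlation_matrix`, `perfectly_discriminated`) are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length activity patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def correlation_matrix(even, odd):
    """matrix[r][c] = correlation of category r (even scans) with category c (odd scans)."""
    cats = list(even)
    return {r: {c: pearson(even[r], odd[c]) for c in cats} for r in cats}


def perfectly_discriminated(matrix, cat):
    """True if the diagonal entry is the highest value in both its row and its column."""
    diag = matrix[cat][cat]
    row_ok = all(diag > matrix[cat][c] for c in matrix if c != cat)
    col_ok = all(diag > matrix[r][cat] for r in matrix if r != cat)
    return row_ok and col_ok


# Toy patterns over six hypothetical voxels (not data from the paper).
even = {
    "faces": [2.0, 0.1, 0.3, 1.8, 0.2, 0.1],
    "fruits": [0.2, 1.1, 0.9, 0.3, 1.0, 0.8],
    "tech": [0.3, 1.0, 1.0, 0.2, 0.9, 0.9],
}
odd = {
    "faces": [1.9, 0.2, 0.2, 1.7, 0.3, 0.2],
    "fruits": [0.3, 1.0, 1.0, 0.2, 0.9, 0.9],
    "tech": [0.2, 1.1, 0.9, 0.3, 1.0, 0.8],
}

matrix = correlation_matrix(even, odd)
for cat in matrix:
    print(cat, perfectly_discriminated(matrix, cat))
```

In this toy example the face pattern is distinct from both object patterns, whereas the fruit and technological-object patterns were deliberately made interchangeable across scan halves, mimicking the "similar to each other" case noted in the text.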
The correlation matrix in Fig. 4b was obtained from visually activated voxels in the temporal lobe. We
compared the performance on different discrimination types in different visual areas, and Fig. 4c shows
mean percentage pairwise correct discrimination (see Methods) for different discrimination types in the
prefrontal lobe, parietal lobe, temporal lobe, whole brain excluding face-selective voxels, and face-
selective voxels only. Temporal lobe performance was above chance for each of the three discrimination
types, whereas prefrontal performance was at chance. Parietal performance was above chance for
discriminating scrambled objects and faces from other categories, but at chance for object-versus-object
discriminations. These results support the idea that object recognition is accomplished primarily in
temporal cortex1 (but see refs. 29,30).
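One way a "percentage pairwise correct" score could be derived from a split-half correlation matrix is sketched below. The Methods are not reproduced in this section, so the scoring rule here is an assumption: a comparison counts as correct when the within-category correlation exceeds the between-category correlation, which puts chance at 50%. The matrix values and the function name `pairwise_accuracy` are hypothetical.

```python
def pairwise_accuracy(matrix, targets, others):
    """Percentage of pairwise comparisons in which a target category's
    within-category correlation exceeds its correlation with another category."""
    correct = 0
    total = 0
    for a in targets:
        for b in others:
            if a == b:
                continue
            # Two comparisons per pair: even-a vs odd-b, and even-b vs odd-a.
            for between in (matrix[a][b], matrix[b][a]):
                total += 1
                if matrix[a][a] > between:
                    correct += 1
    return 100.0 * correct / total


# Hypothetical correlation values (rows: even scans, columns: odd scans).
matrix = {
    "faces": {"faces": 0.9, "fruits": -0.4, "tech": -0.5},
    "fruits": {"faces": -0.4, "fruits": 0.6, "tech": 0.7},
    "tech": {"faces": -0.5, "fruits": 0.5, "tech": 0.6},
}

face_vs_object = pairwise_accuracy(matrix, ["faces"], ["fruits", "tech"])
object_vs_object = pairwise_accuracy(matrix, ["fruits", "tech"], ["fruits", "tech"])
print(face_vs_object, object_vs_object)
```

With these invented values, faces are separated from objects in every comparison, while the two object categories are separated only at the assumed chance level.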
The near-negligible response to non-face objects within the face patches (Fig. 2d) suggests that cells in
these patches are truly specialized for discriminating faces, and carry little information about other
object categories. Confirming this prediction, correlation analysis restricted to face-selective (P < 0.01)
voxels revealed only chance performance for discriminating non-face objects from one another (Fig.
4c, right cluster, striped bar). In humans, the coding specificity of face-selective patches is currently a
topic of debate8–10,28.
The activity patterns analyzed in Fig. 4c were smoothed with a Gaussian kernel of 2 mm full-width-at-
half-maximum. One might expect spatial smoothing to diminish discrimination performance. However,
when we redid the analysis without any prior spatial smoothing (top graph in Supplementary Fig. 6
online), we found essentially the same pattern of discrimination indices. In particular, the ability of face-
selective voxels to discriminate among non-face object categories remained at chance.
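For a sense of scale, the 2 mm full-width-at-half-maximum kernel can be converted to a Gaussian standard deviation with the standard relation FWHM = 2·sqrt(2·ln 2)·σ. The sketch below also expresses σ in voxel units, assuming the 1.25 mm isotropic resolution reported in the Results applies to the smoothed data; the function name is illustrative.

```python
from math import log, sqrt


def fwhm_to_sigma(fwhm: float) -> float:
    """Convert a Gaussian full-width-at-half-maximum to its standard deviation."""
    return fwhm / (2.0 * sqrt(2.0 * log(2.0)))


sigma_mm = fwhm_to_sigma(2.0)   # kernel width stated in the text
sigma_voxels = sigma_mm / 1.25  # assumed 1.25 mm isotropic voxels

print(round(sigma_mm, 3), round(sigma_voxels, 3))
```

Under these assumptions the kernel's standard deviation is smaller than one voxel, which is consistent with smoothing having little effect on the discrimination indices.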
We also addressed the converse question: whether the response pattern to faces in voxels not
maximally responsive to faces was nevertheless sufficient to distinguish faces. Even without face-
selective voxels, the distributed response pattern to faces was sufficient to yield 100% accuracy for
discriminating faces from other objects in both monkeys (Fig. 4c, middle cluster, stippled bar). Similar
conclusions were reached in human fMRI studies9,10. It has been proposed that the distributed
response patterns elicited by faces are used to distinguish faces from other categories, whereas face-
specific patches are used to recognize the identity of specific faces10.
A simple explanation of the superior discrimination ability of the temporal lobe could be that there were
many more visually-activated voxels in the temporal lobe compared to other regions. When the analysis
in Fig. 4c was restricted to the 30 most visually active voxels in each region (Supplementary Fig. 6b),
discrimination performance in prefrontal cortex was substantially improved. However, discrimination

References

Eigenfaces for recognition
TL;DR: A near-real-time computer system that can locate and track a subject's head, and then recognize the person by comparing characteristics of the face to those of known individuals, and that is easy to implement using a neural network architecture.

The Fusiform Face Area: A Module in Human Extrastriate Cortex Specialized for Face Perception
TL;DR: The data allow us to reject alternative accounts of the function of the fusiform face area (area “FF”) that appeal to visual attention, subordinate-level classification, or general processing of any animate or human forms, demonstrating that this region is selectively involved in the perception of faces.

Neurophysiological investigation of the basis of the fMRI signal
TL;DR: These findings suggest that the BOLD contrast mechanism reflects the input and intracortical processing of a given area rather than its spiking output, and that LFPs yield a better estimate of BOLD responses than the multi-unit responses.

Distributed and Overlapping Representations of Faces and Objects in Ventral Temporal Cortex
TL;DR: The functional architecture of the object vision pathway in the human brain was investigated using functional magnetic resonance imaging to measure patterns of response in ventral temporal cortex while subjects viewed faces, cats, five categories of man-made objects, and nonsense pictures, and a distinct pattern of response was found for each stimulus category.
Frequently Asked Questions (8)
Q1. What contributions have the authors mentioned in the paper "Faces and objects in macaque cerebral cortex" ?

This paper found that macaques do have discrete face-selective patches, similar in relative size and number to face patches in humans. 

Best efforts were made to align the anterior-posterior position of the slices from different days, but since the slice separation was 1.25 mm, there may be a ±0.63 mm offset between slices in the same column. 

In total, the authors obtained 251,464 functional volumes (6,115,240 slices) during 53 scan sessions in one monkey, 164,560 functional volumes (4,183,360 slices) during 40 scan sessions in a second monkey, 4,760 functional volumes (142,800 slices) during 2 scan sessions in a third monkey, and 4,624 functional volumes (217,600 slices) during 6 scan sessions in six human subjects. 

7. Therefore, assuming a linear relationship between BOLD signal and spike rate, the authors predict that the summed spike output of neurons within the face-selective patches to the face stimuli should be at least seven times as strong as the summed spike output to the non-face stimuli. 

The authors found that the BOLD response to faces within the macaque face-selective patches was seven times as strong as that to non-face objects: (Responsefaces – Responsebaseline)/ (Responseobjects – Responsebaseline) = 

One possibility is that the patches are innately wired to represent faces; another possibility is that they are adapted, through learning, to represent any set of overtrained stimuli, including but not limited to faces28. 
