
Patterns of Individual Differences in the Perception of Missing-Fundamental Tones

TL;DR: The authors conclude that there are genuine, stable individual differences underlying the diverse findings, but also that there are more than two general types of listeners, and that stimulus variables strongly affect some listeners' responses.
Abstract: Recent experimental findings suggest stable individual differences in the perception of auditory stimuli with missing fundamental frequency (F0). Specifically, some individuals readily identify the pitch of such tones with the missing F0 ('F0 listeners'), and some base their judgement on the frequency of the partials that make up the tones ('spectral listeners'). However, the diversity of goals and methods in recent research makes it difficult to draw clear conclusions about individual differences. The first purpose of this paper is to discuss the influence of methodological choices on listeners' responses. The second goal is to report findings on individual differences in our own studies of the missing-fundamental phenomenon. We conclude that there are genuine, stable individual differences underlying the diverse findings, but also that there are more than two general types of listeners, and that stimulus variables strongly affect some listeners' responses. This suggests that it is generally misleading to classify individuals as 'F0 listeners' or 'spectral listeners'. It may be more accurate to speak of two modes of perception ('F0 listening' and 'spectral listening'), both of which are available to many listeners. The individual differences lie in what conditions the choice between the two modes.

Summary

Introduction

  • Specifically, some individuals readily identify the pitch of such tones with the missing F0 (“F0 listeners”), and some base their judgment on the frequency of the partials that make up the tones (“spectral listeners”).
  • The diversity of goals and methods in recent research makes it difficult to draw clear conclusions about individual differences.
  • A missing fundamental (MF) tone is an artificially constructed acoustic stimulus consisting of a number of component frequencies, chosen so that they could be the harmonics of some fundamental frequency (F0) that is itself not present in the stimulus.
  • The starting point for this article is the finding that many individuals seem to have stable biases in the way they perceive MF stimuli, preferentially hearing the pitch of the stimulus either on the basis of the MF or of the partials that are actually present.

Basic Design

  • The first systematic exploration of individual differences in responses to MF tones was carried out by Smoorenburg (1970), who seems to have stumbled on the existence of the individual differences while researching the basic physics of the phenomenon (p. 927).
  • This behavioral task is the experimental tool on which subsequent research has been based.
  • This ambiguity can be achieved even while keeping the highest partials at the same frequency in both tones; all that is needed is to treat that top frequency as the nth harmonic in Tone A and the (n + 1)th harmonic in Tone B.
  • In the various studies discussed here, actual stimulus pairs were of course presented in either order (AB or BA) or in both orders.

Individual Differences in MF Perception

  • Smoorenburg’s (1970) experiment suggested that most individuals fairly consistently perceive the pitch of the MF tones either in terms of the missing F0 or on the basis of the component frequencies.
  • One of the striking features of the two studies just summarized is that they make very different methodological choices in their procedures and in constructing their stimuli, yet both find evidence for Smoorenburg’s (1970) basic conclusion that listeners exhibit two essentially different types of behavior in processing MF stimuli.
  • Systematic manipulation of stimulus variables in subsequent work (e.g., Moore, Glasberg, & Peters, 1985; Houtsma & Fleuren, 1991) further established the role of stimulus properties in determining response patterns, independent of individual differences.
  • At the same time, they also report an effect of harmonic rank, such that partials lower in the harmonic series evoke more F0 responses; Seither-Preisler et al. (p. 745 ff.) mention a similar effect of harmonic rank in a variable they call “spectral profile.”

Classification of Listeners

  • Given the forced-choice approach of the experiments just discussed, labels such as “F0 listening” and “spectral listening” can certainly be applied to individual responses.
  • It is less clear that these labels can also be appropriately used to describe the overall behavior of listeners—that is, whether individuals clearly fall into two groups with distinct behavioral strategies.
  • In such a stimulus, a listener who truly perceived the MF as the pitch of Tone B would respond “down,” but a listener who perceived the second harmonic would respond “up.” Schneider et al. excluded such octave-shifted responses from their analysis altogether, calculating the SI only on the basis of responses that could be clearly classed as F0 or spectral.
  • There is, unfortunately, a discrepancy between the formula given on page 1242 of Schneider et al.’s paper and the published graphs in the same paper: Subsequent work by Schneider and his colleagues (e.g., Schneider and Wengenroth, 2009) has settled on the polarity shown in the graphs, and this is reflected in the formula the authors use here.

Our Studies

  • The authors' studies of this topic are ultimately motivated by an interest in individual perceptual and cognitive differences that are potentially relevant to language.
  • The focus of the present article is more basic.
  • The authors also report findings on test–retest reliability, and, in a limited way, they deal with the related issue of the effects of stimulus variables, discussed previously.
  • The specific issues addressed in the last four experiments have been (or will be) reported elsewhere, and the present article includes only the basic behavioral data from those experiments.
  • The same task is used and methodologies are broadly similar.

Method

  • Within a stimulus, the authors always held the top frequency of Tone A and Tone B constant, and always kept the harmonic rank of the two sets of partials close.
  • It also means that the F0 value of the tones was determined entirely by the top frequency and the harmonic rank of the partials, and that the range of F0 values was therefore, especially in their earlier experiments, quite large.
  • In Experiment 1, the authors had only 15 stimulus types, based on a two-dimensional stimulus matrix with five settings of the top frequency and three settings of the harmonic rank of the partials.
  • Participants whose responses are not consistently at one end or the other of the SI scale.

Summary of Experiments

  • Table 1 summarizes the details of all the experiments on which this report is based.
  • All are based on a systematic two-dimensional matrix of stimulus types like the one just exemplified for Experiment 1, with top frequency and spectral composition (here defined as the harmonic rank of the partials) as the two dimensions of the matrix.
  • Experiments whose identifiers share a number (e.g., 3a and 3b) have the same stimulus matrix but are not otherwise related.
  • As previously noted, most of the experiments were motivated by research questions beyond the basic goal of clarifying the nature of response patterns in the MF task.
  • These additional questions are summarized in the notes for Table 1.

Participants

  • In the experiments involving students, the participants were paid a small sum for participating; in Experiment 3a, a small donation was made to the choirs in which the participants sang.
  • Overall, the great majority of participants were native speakers of English, but there were also native speakers of quite a few other languages, in particular Dutch and Chinese.
  • Participants’ ages varied from about 15 to about 75, but in all the experiments that relied on students as participants (3b, 4a, 4b, and 4c), most were in their early 20s.

Stimulus Preparation

  • The authors created their stimuli using an application written for them by Simon Kirby, based on Max/MSP software, which allowed them to specify (a) the F0 and duration of the two tones; (b) the number, harmonic rank, and relative amplitude of the partials; and (c) the duration of the gap between the tones.
  • All their stimulus tones had flat spectra, again following Schneider et al. rather than Seither-Preisler et al.; the authors experimented informally with modifying the spectral slope and concluded that, except in cases of very steep slope, there was no readily perceptible difference, but they have not done a controlled comparison.
  • Experiment 4c used a newer version in which phase can be controlled, and the stimuli were created with the partials in phase.
  • This experiment was part of a study on individual differences in a task involving implicit learning of an artificial tone language (Caldwell-Harris, Biller, Ladd, Dediu, & Christiansen, 2012).

Procedure

  • Most of the experiments were run using an E-Prime script written for the authors by Eddie Dubourg, but Experiments 2 and 4a used a Presentation software (http://www.neurobs.com) script written by Dan Dediu.
  • Intensity was set at a comfortable level for each listener. Listening conditions varied somewhat.
  • In all cases, the stimuli were presented in a blocked random order, fixed between participants in Experiments 2 and 4a, and generated at run time in the others.
  • As noted in Table 1, this experiment was intended to explore hemispheric differences in pitch processing, and stimuli were presented to one ear with white noise in the other ear.
  • In Experiment 4a, assessing reliability was a specific goal of the larger study: Participants were retested on exactly the same material using exactly the same procedures after an interval of 1 to 2 weeks.

Data Reduction and Analysis

  • The patterns of responses are broadly similar in all seven of their experiments, and, except where specified, the analyses reported here are for all experiments pooled.
  • The retest data from Experiments 4a and 4c are not included in these pooled analyses.
  • On the other hand, the fact that some participants respond very consistently should not blind us to the fact that many others do not.
  • To better understand the patterning of the cell-level SI and to investigate the structure of the participants’ responses, the authors also carried out a principal components analysis (Jolliffe, 2002) for each experiment separately.

Individual Differences Between Participants

  • Figure 2 shows the distribution of the SI for all experiments pooled.
  • This suggests that one theoretically possible pattern of responses, namely, F0 responses to stimuli with lower overall frequency and spectral responses to those with higher overall frequency, occurs only very rarely, whereas the opposite pattern is quite common.
  • As explained in the Procedure section, the authors report two different assessments of the test–retest reliability, not only for the SI but also for the CI and the order effect.

Effect of Stimulus Variables

  • The distributions then become flatter, with considerable numbers of clear F0 listeners and spectral listeners in the midrange of frequencies, around 1000 Hz.
  • For comparison, the right-hand panel of Figure 5 presents a similar analysis, showing the effect of spectral composition on patterns of responses.
  • Individual responses of one “inhomogeneous” participant from Experiment 4c, representative of roughly 7.5% of participants who give mostly spectral responses to stimuli with low frequency level and mostly F0 responses to those with high frequency level.

Cluster Analysis

  • Given the apparent diversity of response patterns, the authors subjected the data to a k-means cluster analysis, locating every participant in a three-dimensional space defined by SI, CI, and the order effect.
  • Figure 6. Scatterplots of the 7 k-means clusters for their data.
  • Only 2 two-dimensional projections are shown here: Panel A plots the Schneider index (SI) against the consistency index, and Panel B shows the SI plotted against the order effect.
  • Cluster 6 (gray rectangles) comprises weak spectral listeners, and Clusters 7 (black circles) and 1 (gray inverted triangles) comprise weak F0 listeners, some of whom show clear effects of frequency level on their pattern of responses.

Effect of Participant Variables

  • Recall that both Schneider et al. and Seither-Preisler et al. were interested in the effects of musical training.
  • This is at least consistent with the idea that musical listeners are performing the task as intended, that is, hearing Tone A and Tone B separately and judging their relative pitch level, whereas nonmusical listeners may be treating the pair of tones as some sort of holistic unit.
  • Schneider and Wengenroth (2009) found no effect of age or gender on the SI.

Discussion and Conclusion

  • The authors' investigations have confirmed that there are robust individual differences in the perception of MF stimuli.
  • The authors' findings more obviously agree with those of Seither-Preisler et al., who excluded roughly a quarter of their participants from further analysis on the grounds that they were guessing.
  • Second, the authors have confirmed and extended others’ findings that certain stimulus variables have predictable effects on responses to MF stimuli.
  • The effect of overall frequency level is strong enough that 7.5% of individuals (in their analysis, those in Cluster 3) give consistently opposite responses in different areas of the stimulus space, responding as “spectral listeners” at low overall frequencies and as “F0 listeners” at high overall frequencies.


Patterns of Individual Differences in the Perception of
Missing-Fundamental Tones
D. Robert Ladd, Rory Turnbull,
and Charlotte Browne
University of Edinburgh
Catherine Caldwell-Harris
Boston University
Lesya Ganushchak
Max Planck Institute for Psycholinguistics, Nijmegen,
The Netherlands
Kate Swoboda and Verity Woodfield
University of Edinburgh
Dan Dediu
Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
Recent experimental findings suggest stable individual differences in the perception of auditory stimuli
lacking energy at the fundamental frequency (F0), here called missing fundamental (MF) tones.
Specifically, some individuals readily identify the pitch of such tones with the missing F0 (“F0
listeners”), and some base their judgment on the frequency of the partials that make up the tones
(“spectral listeners”). However, the diversity of goals and methods in recent research makes it difficult
to draw clear conclusions about individual differences. The first purpose of this article is to discuss the
influence of methodological choices on listeners’ responses. The second goal is to report findings on
individual differences in our own studies of the MF phenomenon. In several experiments, participants
judged the direction of pitch change in stimuli composed of two MF tones, constructed so as to reveal
whether the pitch percept was based on the MF or the partials. The reported difference between F0
listeners and spectral listeners was replicated, but other stable patterns of responses were also observed.
Test-retest reliability is high. We conclude that there are genuine, stable individual differences underlying
the diverse findings, but also that there are more than two general types of listeners, and that stimulus
variables strongly affect some listeners’ responses. This suggests that it is generally misleading to classify
individuals as “F0 listeners” or “spectral listeners.” It may be more accurate to speak of two modes of
perception (“F0 listening” and “spectral listening”), both of which are available to many listeners. The
individual differences lie in what conditions the choice between the two modes.
Keywords: missing fundamental, pitch perception, individual differences
Supplemental materials: http://dx.doi.org/10.1037/a0031261.supp
Journal of Experimental Psychology:
Human Perception and Performance
© 2013 American Psychological Association
2013, Vol. 39, No. 5, 1386–1397
0096-1523/13/$12.00 DOI: 10.1037/a0031261
This document is copyrighted by the American Psychological Association or one of its allied publishers.
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
This article was published Online First February 11, 2013.
D. Robert Ladd, Rory Turnbull, and Charlotte Browne, School of Philosophy,
Psychology and Language Sciences, University of Edinburgh, Edinburgh,
Scotland; Catherine Caldwell-Harris, Department of Psychology, Boston University;
Lesya Ganushchak, Max Planck Institute for Psycholinguistics, Nijmegen, The
Netherlands; Kate Swoboda, School of Philosophy, Psychology and Language
Sciences, University of Edinburgh; Verity Woodfield, School of Philosophy,
Psychology and Language Sciences, University of Edinburgh; Dan Dediu,
Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands.
Rory Turnbull is now at Department of Linguistics, Ohio State University,
and Kate Swoboda is now at the School of Life and Health Sciences,
Aston University, Birmingham, England.
Thanks to Eddie Dubourg and Simon Kirby for their technical contributions
and to Richard Shillcock, Morten Christiansen, Tim Bates,
and Antje Meyer for discussion. The pilot experiments, which determined
the general direction of subsequent work, were carried out jointly
by D. Robert Ladd, Rory Turnbull, and Dan Dediu. The other authors
were each involved in one of the four larger studies: Catherine
Caldwell-Harris in Exp. 3b, Lesya Ganushchak in Exp. 4a, Kate Swoboda
in Exp. 4b, and Charlotte Browne and Verity Woodfield in Exp.
4c. Exp. 4c was part of two student papers. D. Robert Ladd prepared the
stimuli for all experiments and had primary responsibility for writing
the paper, and Dan Dediu had primary responsibility for the statistical
analyses.
Correspondence concerning this article should be addressed to D.
Robert Ladd, School of Philosophy, Psychology and Language Sciences,
University of Edinburgh, 3 Charles Street, Edinburgh EH8 9AD,
Scotland. E-mail: bob.ladd@ed.ac.uk

A missing fundamental (MF) tone is an artificially constructed
acoustic stimulus consisting of a number of component frequencies,
chosen so that they could be the harmonics of some fundamental
frequency (F0) that is itself not present in the stimulus. For
example, consider a tone consisting of energy at 750 Hz, 1000 Hz,
and 1250 Hz. The highest common factor of these frequencies is
250 Hz and, in general, such a tone is often perceived as having a
pitch of 250 Hz; that is, the pitch percept may be based on a
frequency that is, in some sense, not physically present in the
stimulus. This frequency—also referred to in the literature as
“virtual pitch” (e.g., Terhardt, 1979), “periodicity pitch” (e.g.,
Licklider, 1951), and “residue pitch” (e.g., Schouten, 1940)—is
the missing fundamental. However, it is also possible to perceive
the MF tone just described as a chord consisting of the component
frequencies (the “partials”) that are actually present in the
stimulus—specifically, in musical terms, as an inverted major triad
(roughly a very flat G5 C6 E6 [g'' c''' e''']). The starting point for
this article is the finding that many individuals seem to have stable
biases in the way they perceive MF stimuli, preferentially hearing
the pitch of the stimulus either on the basis of the MF or of the
partials that are actually present.
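
The arithmetic behind the 750/1000/1250 Hz example can be checked with a short sketch. It assumes the partials are given as whole numbers of Hz, so that the missing fundamental can be taken as their greatest common divisor; the function name `missing_fundamental` is ours and purely illustrative.

```python
from functools import reduce
from math import gcd


def missing_fundamental(partials_hz):
    """Greatest common divisor of integer partial frequencies (in Hz)."""
    return reduce(gcd, partials_hz)


partials = [750, 1000, 1250]           # the example tone from the text
f0 = missing_fundamental(partials)     # candidate missing fundamental
ranks = [p // f0 for p in partials]    # harmonic rank of each partial

print(f0, ranks)  # 250 [3, 4, 5]
```

On this reading, the three partials are the 3rd, 4th, and 5th harmonics of a 250 Hz fundamental that is not itself present in the signal.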
The source of these individual differences is not known. Recent
interest in this topic has arisen within cognitive neuroscience,
especially among those interested in music perception and cogni-
tion. Some of this work seeks to correlate different patterns of
responses to MF stimuli with neuroanatomical (e.g., Schneider et
al., 2005) or neurophysiological (e.g., Patel & Balaban, 2001)
differences; other work emphasizes the influence of experience,
particularly musical training, on the patterns of perceptual re-
sponses (e.g., Seither-Preisler et al., 2007). However, it is also
known that there are purely physical effects that influence the
actual acoustic nature of signals consisting of a small number of
partials, and that these may affect the cochlear response to the
signals; probably the most important effect of this sort is the
existence of “combination tones” (see, e.g., Terhardt, 1974;
Moore, 2012). There is a separate line of recent research on MF
perception among hearing researchers that seeks to understand
these basic physical mechanisms (e.g., Bernstein & Oxenham,
2006; Gockel, Plack, & Carlyon, 2005; Gockel, Carlyon, & Plack,
2010; see Moore & Gockel, 2011, for a recent review). It is
entirely possible that some of the individual differences under
discussion here are based on different cochlear responses to dif-
ferences in the signal, rather than originating in the brain.
However, the present article is concerned not with the basis of
the behavioral differences but with a clearer definition of the
differences themselves. Recent work is extremely diverse meth-
odologically and has focused on testing hypotheses about the
effect of specific individual differences (e.g., differences of musi-
cal training) on the perception of MF stimuli. Moreover, it has
tended to proceed as if the behavioral differences are straightfor-
wardly binary, describing individuals as belonging to one of two
basic types of listeners. Our investigations have shown that this
approach oversimplifies the nature of the individual differences,
and we believe that this oversimplification directly affects our
ability to look for their underlying causes. Our aim in this article
is to present a more refined characterization of the behavioral
differences, which will be of use to subsequent research on any
aspect of the MF phenomenon.
The MF Task
Basic Design
The first systematic exploration of individual differences in
responses to MF tones was carried out by Smoorenburg (1970),
who seems to have stumbled on the existence of the individual
differences while researching the basic physics of the phenomenon
(p. 927). Smoorenburg developed an ostensibly simple way to
determine whether a listener is taking the missing F0 or one of the
partials as the pitch of an MF tone. By presenting MF tones in pairs,
he was able to construct stimuli that would appear to go either up
or down in pitch from the first member of the pair to the second,
depending on whether the pitch of the individual members of the
pair was being perceived on the basis of the MF or of the partials.
This behavioral task is the experimental tool on which subsequent
research has been based.
The basic design of stimuli in the MF task is diagrammed in
Figure 1. In this example, it can be seen that the MF “goes down”
(i.e., is lower in Tone B than in Tone A), but the lowest partial
actually present in the stimuli “goes up” (i.e., is lower in Tone A
than in Tone B). This ambiguity can be achieved even while
keeping the highest partials at the same frequency in both tones; all
that is needed is to treat that top frequency as the nth harmonic in
Tone A and the (n + 1)th harmonic in Tone B. To avoid misunderstanding,
it is worth mentioning that the terms “Tone A” and
“Tone B” are used only for clarity of reference and imply nothing
about order of presentation. In the various studies discussed here,
actual stimulus pairs were of course presented in either order (AB
or BA) or in both orders. No source reports any order effects, but
as we shall see, such effects do occur, which complicates the
interpretation of what listeners are actually doing in the MF task.
Figure 1. Basic design of missing fundamental (MF) task stimuli. Tone
A (on the left) consists of three partials that could be the 3rd, 4th, and 5th
harmonic of a fundamental frequency (the “first harmonic”) that is not
physically present in the signal. Tone B (on the right) also consists of three
partials, which could be the 4th, 5th, and 6th harmonics of a fundamental
frequency (also not physically present). Crucially, the MF in Tone B is
lower than that in Tone A, and the lowest frequency actually present in
Tone B is higher than the lowest frequency actually present in Tone A.
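
The design in Figure 1 can be sketched in a few lines of code. This is a minimal illustration, not the procedure of any study discussed here: the helper `mf_tone`, the 600 Hz top frequency, and the choice of three partials per tone are assumptions made for the example. The top frequency is treated as the 5th harmonic in Tone A and the 6th harmonic in Tone B, as in the figure.

```python
from math import log2


def mf_tone(top_hz, top_rank, n_partials=3):
    """Return (missing F0, partials) for a tone whose highest partial
    is harmonic number `top_rank` of the missing fundamental."""
    f0 = top_hz / top_rank
    ranks = range(top_rank - n_partials + 1, top_rank + 1)
    return f0, [f0 * r for r in ranks]


top_hz = 600.0  # illustrative value, shared by both tones

f0_a, partials_a = mf_tone(top_hz, top_rank=5)  # harmonics 3, 4, 5
f0_b, partials_b = mf_tone(top_hz, top_rank=6)  # harmonics 4, 5, 6

# Direction of change from Tone A to Tone B under each mode of listening
f0_direction = "down" if f0_b < f0_a else "up"
spectral_direction = "up" if partials_b[0] > partials_a[0] else "down"

# Size of the interval between the two missing fundamentals, in semitones
interval_semitones = 12 * log2(f0_a / f0_b)

print(partials_a, partials_b)
print(f0_direction, spectral_direction, round(interval_semitones, 2))
```

With these values the missing F0 falls from 120 Hz to 100 Hz while the lowest partial rises from 360 Hz to 400 Hz, so the two modes of listening predict opposite responses to the same pair.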

Individual Differences in MF Perception
Smoorenburg’s (1970) experiment suggested that most individ-
uals fairly consistently perceive the pitch of the MF tones either in
terms of the missing F0 or on the basis of the component frequen-
cies. His data also made it appear that there are roughly equal
numbers of the two types of listeners. However, his procedure
involved only two different stimuli, presented repeatedly (i.e., a
single pair of “Tone A” and “Tone B” in both orders of presen-
tation). If there is a genuine source of individual difference, it is
not surprising that this procedure would lead to a strong separation
of the two response patterns. More recent studies seem to show
that if listeners are presented with a range of different MF stimuli,
their behavior may be more variable, and that the properties of the
stimulus may have consistent influences on which way listeners
tend to hear it.¹ We focus here on a comparison of two large
studies—Schneider et al. (2005) and Seither-Preisler et al. (2007).
Schneider et al. (2005) was a large study of musicians and
nonmusicians, with the primary aim of relating differences in MF
perception to differences in neuroanatomy, specifically to differ-
ences in the volume of the pitch-detection areas in left and right
Heschl’s gyrus. A secondary aim was to explore the effect of
certain stimulus variables (e.g., number of partials present in the
stimulus tones) on the perception of MF tones. Schneider et al.
reduced listeners’ overall pattern of responses to a quotient whose
value ranges from −1 to +1, according to the proportion of
responses based on F0 and on the partials. They report a bimodal
(broadly “U”-shaped) distribution in the value of this quotient,
with a minimum in the middle of the range (around 0, where an
individual’s responses are mixed). On this basis, they divide the
range in half and classify listeners as “F0 listeners” or “spectral
listeners.” We will adopt this terminology here.² Schneider et al.
also report that, on average, spectral listeners have greater cortical
volume in Heschl’s gyrus in the right hemisphere than in the left,
whereas F0 listeners have greater volume in the left than in the
right. They found no consistent difference in responses or in
hemispheric asymmetry between musicians and nonmusicians, but
report overall larger Heschl’s gyrus volume in musicians. Related
work by Schneider and Wengenroth (2009) suggests that there
may be differences among musicians depending on their instru-
ment or the type of music they play; for example, jazz musicians
are more likely to be spectral listeners than classical musicians.
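
A quotient of the kind described above can be sketched as a normalized difference between the two response counts. The exact formula is not given in this passage, so the form below is an assumption; in particular, the polarity (spectral responses positive, F0 responses negative) is chosen for illustration only, and the function name `response_quotient` is hypothetical.

```python
def response_quotient(n_f0, n_spectral):
    """Normalized difference between spectral and F0 response counts.

    Assumed polarity: +1 means all responses were spectral,
    -1 means all were F0-based, 0 means an even mixture.
    """
    total = n_f0 + n_spectral
    if total == 0:
        raise ValueError("no classifiable responses")
    return (n_spectral - n_f0) / total


print(response_quotient(n_f0=30, n_spectral=10))  # -0.5
```

A listener whose responses are evenly mixed lands near 0, which is where the reported bimodal distribution has its minimum.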
Seither-Preisler et al. (2007) also studied musicians and nonmu-
sicians; they did not do any brain imaging, but their hypotheses are
implicitly driven by assumptions about brain plasticity, specifi-
cally the effect of musical training. Like Schneider et al., their
materials manipulated a number of different stimulus variables, but
the variables they explored differed quite considerably from those
studied by Schneider et al. They also used very different (and more
complex) statistical reductions of individuals’ behavioral response
patterns that ultimately abstracted away from the effect of stimulus
variables. Like Schneider et al., they found that many participants
responded as F0 listeners or spectral listeners, and indeed, they
report a sharper dichotomy between the two groups than was found
by Schneider et al. However, this sharper dichotomy is due in part
to their analysis procedures, which led them to exclude roughly a
quarter of their participants on the grounds that their responses
were not reliably distinguishable from guesswork. They also,
unlike Schneider et al., showed a clear effect of musical training,
with professional musicians responding far more often as F0
listeners. Note in this connection that Seither-Preisler et al.’s
repeated references to “guessing” may seem to suggest that there
is a right answer (viz., F0 response), an implication that we find
unjustified.
Stimulus Variables in the MF Task
One of the striking features of the two studies just summarized
is that they make very different methodological choices in their
procedures and in constructing their stimuli, yet both find evidence
for Smoorenburg’s (1970) basic conclusion that listeners exhibit
two essentially different types of behavior in processing MF stim-
uli. Other recent studies, based on still other methodological ap-
proaches, lead to the same conclusion. For example, Patel and
Balaban (2001), a study of neural activity in pitch perception with
a focus on the relation between time-domain and frequency-
domain processing, also finds clear evidence that individuals tend
to favor one of two different modes of behavior. The fact that these
differences show up in a wide variety of experimental situations
suggests that the underlying phenomenon is very robust.
At the same time, early psychoacoustic work into the nature of
MF perception in general (Plomp, 1967; Ritsma, 1962, 1963a,
1963b) has demonstrated that stimulus properties can have con-
sistent effects on listeners’ responses. Systematic manipulation of
stimulus variables in subsequent work (e.g., Moore, Glasberg, &
Peters, 1985; Houtsma & Fleuren, 1991) further established the
role of stimulus properties in determining response patterns, inde-
pendent of individual differences. These effects were not absent
from Schneider et al. and Seither-Preisler et al.’s results. Two such
findings emerge clearly from these two articles:
  • As the musical interval between the missing fundamentals in
Tone A and Tone B increases, listeners are more likely to give F0
responses. This effect was demonstrated clearly by Seither-Preisler
et al. (p. 746, Figure 3; cf. Meddis & Hewitt, 1991; Moore et al.,
1985).
  • As the number of partials in the tones increases, listeners are
also more likely to base their pitch judgment on the missing F0.
This effect was systematically shown by Schneider et al. (p. 1242,
Figure 1d; cf. Faulkner, 1985; Ritsma, 1962).
This means that, irrespective of an individual’s bias toward F0
or spectral listening, responses can be influenced by differences of
detail in the stimuli. It therefore seems important to consider
methodological choices in stimulus construction more closely.
Unfortunately, this is not as straightforward as it might sound,
because the stimulus variables are highly interdependent. We
cannot simply vary them orthogonally to explore their effects.
¹ Louis Pols (personal communication, September 2011) tells us that he
worked in the same lab as Smoorenburg at the time of the experiments on
which the 1970 paper was based, and says that Smoorenburg was well
aware that some MF tones would elicit F0 percepts from most listeners.
Stimuli had to be carefully chosen in order to draw out the difference
between individuals.
² “Synthetic” and “analytic” are two common terms used for F0 and
spectral listeners, respectively, and are widely used in the literature (e.g.,
Schneider & Wengenroth, 2009). Although this pair of terms has a long
history (Houtsma & Fleuren, 1991, attribute the terms to Hermann von
Helmholtz), we prefer the terms from Schneider et al. (2005), which are
more theoretically neutral.
This document is copyrighted by the American Psychological Association or one of its allied publishers.
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

This interdependency can be illustrated clearly by the relation
among what we might refer to as top frequency (the frequency of
the highest partial), harmonic rank (the position of the partials in
the harmonic series, e.g., 5th and 6th harmonics), and the interval
between the missing F0 of Tone A and Tone B. If top frequency
is held constant within a stimulus pair (as was done by Schneider
et al.), then interval is completely determined by the choice of
harmonic rank for the two stimulus tones (or vice versa); if interval
is systematically varied (as was done by Seither-Preisler et al.),
then the top frequency of the two stimulus tones is completely
determined by their harmonic rank (or vice versa). For example, if
the top frequency of a stimulus pair is kept constant at 600 Hz and
we specify the top partials in tones A and B as having harmonic
rank 5 and 6, respectively—which roughly corresponds to the
procedure of Schneider et al.—the interval will necessarily be a
minor third (three semitones), because the ratio of the virtual F0 of
the two stimulus tones will be 6:5 (120 Hz and 100 Hz). If the top
frequency is held constant at 600 Hz and we want to specify an
interval of a fifth (ratio 3:2), we would have to use harmonic rank
4 and 6 (or 6 and 9, or 8 and 12, etc.). Conversely, if we specify
an interval of a fifth and also specify the harmonic rank of tone A
and B—which corresponds roughly to the procedure of Seither-
Preisler et al.—then the top frequency of one stimulus tone will
necessarily be higher than the other. Similar interdependencies
affect other stimulus variables; fuller discussion is beyond the
scope of this report.
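The worked example above can be checked numerically. The following Python sketch is an illustration only, not code from any of the studies discussed; the function names `missing_f0` and `interval_semitones` are hypothetical, and the interval is expressed in equal-tempered semitones, computed as 12 times the base-2 logarithm of the F0 ratio.

```python
import math


def missing_f0(top_frequency_hz, top_rank):
    """Missing F0 implied by the top partial's frequency and harmonic rank."""
    return top_frequency_hz / top_rank


def interval_semitones(f0_a_hz, f0_b_hz):
    """Signed interval from Tone B's F0 to Tone A's F0, in semitones."""
    return 12 * math.log2(f0_a_hz / f0_b_hz)


# Example from the text: shared top frequency of 600 Hz, top ranks 5 and 6.
top_hz = 600.0
f0_a = missing_f0(top_hz, 5)  # 120 Hz
f0_b = missing_f0(top_hz, 6)  # 100 Hz
print(f0_a, f0_b, round(interval_semitones(f0_a, f0_b), 2))

# Ranks 4 and 6 at the same top frequency give a 3:2 ratio (a fifth).
print(round(interval_semitones(missing_f0(top_hz, 4), missing_f0(top_hz, 6)), 2))
```

With the top frequency fixed, the only free choice left is the pair of harmonic ranks, so the interval follows directly from that choice: about 3.16 semitones for ranks 5 and 6, and about 7.02 semitones for ranks 4 and 6.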
This interdependency makes it difficult to interpret some of the
findings reported in the articles under consideration or to investigate
apparent contradictions. The most obvious discrepancy here
involves overall frequency level and harmonic rank. Schneider et
al. report an effect of “average spectral frequency” (p. 1242, Figure
1c): As the average frequency of the stimulus tones increases, so,
too (albeit rather irregularly), does the number of F0 responses. At
the same time, they also report an effect of harmonic rank, such
that partials lower in the harmonic series evoke more F0 responses
(p. 1242, Figure 1d); Seither-Preisler et al. (p. 745 ff.) mention a
similar effect of harmonic rank in a variable they call “spectral
profile.” These findings make exactly opposite predictions about
the effect of manipulating the partials in an MF tone pair at a given
F0 level: Higher partials will raise the average spectral frequency
and therefore should lead to more F0 responses, yet higher partials
will also be higher in the harmonic series and therefore should lead
to more spectral responses. Furthermore, in a pair of MF tones
constructed according to Schneider et al.’s procedures, higher
partials will yield smaller intervals between the missing F0 of the
two tones, which (given Seither-Preisler et al.’s results) should
lead to more spectral responses as well. Because it is physically
impossible to vary harmonic rank, MF interval, and average spectral
frequency orthogonally while keeping F0 within a constrained
range, we cannot resolve these contradictory predictions in
conventional experimental ways.
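The confound described above can be made concrete with a small numerical sketch. The Python code below is an illustration under stated assumptions, not a reconstruction of any published stimulus set: it assumes that the two tones share a top frequency, that their top partials have adjacent harmonic ranks, and that the lower missing F0 (Tone B) is held fixed. The function name `design_row` is hypothetical.

```python
import math


def design_row(f0_b_hz, rank_a):
    """Return (top frequency, F0 of Tone A, interval in semitones).

    Assumes Tone B's top partial has rank rank_a + 1 and that both
    tones share the same top frequency.
    """
    rank_b = rank_a + 1
    top_hz = f0_b_hz * rank_b          # shared top frequency
    f0_a_hz = top_hz / rank_a          # missing F0 of Tone A
    interval = 12 * math.log2(f0_a_hz / f0_b_hz)
    return top_hz, f0_a_hz, interval


# Hold Tone B's missing F0 at 100 Hz and raise the harmonic rank.
for rank_a in (3, 5, 9):
    top_hz, f0_a_hz, interval = design_row(100.0, rank_a)
    print(rank_a, top_hz, round(f0_a_hz, 1), round(interval, 2))
```

Under these assumptions, raising the harmonic rank raises the top frequency and shrinks the interval between the missing F0 values at the same time, so none of the three variables can be changed on its own.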
Classification of Listeners
Given the forced-choice approach of the experiments just discussed,
labels such as “F0 listening” and “spectral listening” can
certainly be applied to individual responses. However, it is less
clear that these labels can also be appropriately used to describe
the overall behavior of listeners—that is, whether individuals
clearly fall into two groups with distinct behavioral strategies. It
seems likely that there really are distinct behavioral strategies, but
the matter is not simple and it depends, to some extent, on how we
quantify overall patterns of individual responses.
Schneider et al. add each participant’s responses together and
compute an individual “index” that expresses the proportion of F0
and spectral responses on a scale from −1 to 1. We refer to this
score in what follows as the Schneider index (SI). Their formula is
as follows:
\[ \mathrm{SI} = \frac{sp - f0}{sp + f0} \tag{1} \]
where f0 refers to the number of F0 responses and sp refers to the
number of spectral responses. Seither-Preisler et al. use a similar
score to describe individual performance on their Auditory Ambiguity
Test (AAT), which simply reports the overall proportion of
F0 responses on a scale from 0 to 1.0. These two measures are
completely equivalent, with an SI of −1 corresponding to 1.0 on
the AAT, an SI of 1 corresponding to 0, and an SI of 0
corresponding to 0.5 (see Footnote 3). As previously noted, both teams report
bimodal distributions of these quantitative measures, with many
listeners having scores near the ends of the range and fewer in the
middle.
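The equivalence between the two measures can be sketched in a few lines of Python. This is an illustration only: the function names `schneider_index` and `aat_from_si` are hypothetical, the SI uses the polarity adopted in this article (F0 responses negatively poled), and the conversion assumes that every counted response is either an F0 response or a spectral response. If p is the proportion of F0 responses, then SI = 1 − 2p, so p = (1 − SI) / 2.

```python
def schneider_index(f0, sp):
    """SI = (sp - f0) / (sp + f0); -1 means all F0, +1 means all spectral."""
    total = f0 + sp
    if total == 0:
        raise ValueError("at least one classified response is required")
    return (sp - f0) / total


def aat_from_si(si):
    """Proportion of F0 responses (AAT-style score) implied by an SI value."""
    return (1 - si) / 2


# A listener with 10 F0 responses and no spectral responses.
si = schneider_index(f0=10, sp=0)
print(si, aat_from_si(si))

# A listener with an even mix of responses.
si = schneider_index(f0=5, sp=5)
print(si, aat_from_si(si))
```

The mapping is linear and reverses direction, which is why the polarity of each score has to be stated explicitly whenever the two are compared.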
The most important problem with this approach to data reduction
is intraindividual consistency. Some participants give completely
consistent responses; that is, 100% of their responses are
either “F0” or “spectral.” In these cases, there is no issue about
describing individuals as “F0 listeners” or “spectral listeners.”
However, many participants give a mix of responses, which can
yield an SI near 0. It is not immediately obvious how to treat such
mixed behavior.
Schneider et al. hypothesized that some degree of inconsistency
might arise through what they called “octave shifting,” that is,
perceiving the second harmonic (one octave higher than the missing
F0) as the pitch of an MF tone. They attempted to allow for this
kind of inconsistency by including control stimuli in which Tone
A actually includes the F0 (in terms of the example shown in
Figure 1, Tone A would have included partials at 1200 and 600 Hz
in addition to the higher harmonics). In such a stimulus, a listener
who truly perceived the MF as the pitch of Tone B would respond
“down,” but a listener who perceived the second harmonic would
respond “up.” Schneider et al. excluded such octave-shifted
responses from their analysis altogether, calculating the SI only on
the basis of responses that could be clearly classed as F0 or
spectral. In keeping with the importance of stimulus variables
discussed in the preceding section, Schneider et al. note that
octave-shifted responses were given primarily to stimuli with
relatively high MF values.
3 There is, unfortunately, a discrepancy between the formula given on
page 1242 of Schneider et al.’s paper and the published graphs in the same
paper: In the formula, F0 responses are positively poled (i.e., 100% F0
responses yield an SI of 1), whereas in the graphs, F0 responses are
negatively poled (i.e., 100% F0 responses yield an SI of −1). Subsequent
work by Schneider and his colleagues (e.g., Schneider & Wengenroth,
2009) has settled on the polarity shown in the graphs, and this is reflected
in the formula we use here. Note, though, that this is, in some sense,
opposite to the polarity implicit in Seither-Preisler et al.’s AAT.
Ultimately, of course, the choice is arbitrary, and for exactly that reason, there
is considerable potential for confusion. Caveat lector.

Seither-Preisler et al. took a different approach to inconsistent
responses; as noted earlier, they simply excluded many participants
whose AAT scores fell in the middle of the range on the
grounds that such response patterns cannot be distinguished from
guesswork. At the same time, they suggest that such midrange
scores might arise for two distinct reasons: Either the participants
responded inconsistently (that is, giving opposite responses to
different presentations of the same stimulus) or they responded
inhomogeneously (that is, consistently giving F0 responses to some
stimuli and spectral responses to others). This is a valuable
distinction, especially in light of the clear findings, previously
summarized, that certain stimulus variables systematically influence
the overall proportion of F0 responses, and in light of Schneider et
al.’s finding that some stimulus types seem to yield more
octave-shifted percepts. If many individuals exhibit systematically
inhomogeneous behavior, then it is obviously an oversimplification to
describe everyone as either a spectral listener or an F0 listener.
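The difference between inconsistent and inhomogeneous responding can be illustrated with a toy example. The Python sketch below is not the analysis used by Seither-Preisler et al. or in the present article; the data layout, the function names `f0_proportions` and `describe_pattern`, and the 0.2 margin are all assumptions chosen for illustration. It shows only that two listeners with the same overall proportion of F0 responses can differ sharply once responses are broken down by stimulus type.

```python
def f0_proportions(responses_by_type):
    """Proportion of F0 responses for each stimulus type.

    Assumes every stimulus type has at least one response, coded
    as the string "f0" or "sp".
    """
    return {
        stim: sum(r == "f0" for r in responses) / len(responses)
        for stim, responses in responses_by_type.items()
    }


def describe_pattern(responses_by_type, margin=0.2):
    """Label a listener's pattern using an arbitrary, illustrative margin."""
    props = list(f0_proportions(responses_by_type).values())
    # Any stimulus type answered both ways: treat as inconsistent.
    if any(margin < p < 1 - margin for p in props):
        return "inconsistent"
    # All types answered uniformly: check whether they agree in direction.
    prefers_f0 = {p >= 1 - margin for p in props}
    return "homogeneous" if len(prefers_f0) == 1 else "inhomogeneous"


# Both hypothetical listeners give 50% F0 responses overall (SI of 0).
mixed_by_stimulus = {"type_1": ["f0"] * 10, "type_2": ["sp"] * 10}
mixed_within_stimulus = {"type_1": ["f0", "sp"] * 5, "type_2": ["sp", "f0"] * 5}

print(describe_pattern(mixed_by_stimulus))
print(describe_pattern(mixed_within_stimulus))
```

A breakdown of this kind is informative only when each stimulus type receives enough responses for the per-type proportions to be meaningful.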
However, Seither-Preisler et al. were limited in their ability to
detect inhomogeneity directly, because their participants heard
only a few presentations of each of many stimulus types.
Consequently, some of the participants with midrange AAT scores who
were excluded for inconsistency might more appropriately have
been treated as inhomogeneous. Furthermore, the very notions of
inconsistency and inhomogeneity are based on an easily overlooked
assumption underlying the MF task itself. Despite its apparent
simplicity, the task presupposes that listeners’ responses
reflect independent percepts of the pitch of the two tones in each
stimulus. That is, it assumes that listeners perceive the pitch of
Tone A and Tone B according to either the MF or the partials, and
report a pitch rise or fall across the stimulus on that basis. It does
not allow for the possibility that listeners who are asked to report
the direction of pitch across the stimulus do so on some more
holistic basis that does not simply reflect how they perceive static
pitch in a single tone (cf. the discussion of contour and interval in
Patel, 2008, Chapter 4); as we shall see, there is reason to think that
this possibility must be taken seriously. In any case, one of the
central goals of the work reported here is a better understanding of
response patterns that yield intermediate values of the SI.
Our Studies
Our studies of this topic are ultimately motivated by an interest
in individual perceptual and cognitive differences that are potentially
relevant to language. However, the focus of the present
article is more basic. In order to draw convincing connections
between specific individual behavioral differences and other cognitive
traits, we will need a well-understood and well-operationalized
measure of the behavior in question. As can be
seen from the foregoing review, this is precisely what we do not
have in the case of the MF task. What we report here is therefore
a set of experiments aimed primarily at clarifying what it is that the
MF task reveals. Our principal concern is with the distinction
between inconsistency and inhomogeneity, and with explanations
for midrange SI scores. We also report findings on test–retest
reliability, and, in a limited way, we deal with the related issue of
the effects of stimulus variables, discussed previously. In keeping
with our ultimate interest in individual differences, we also report
findings on the influence of three participant variables, namely,
age, gender, and musical background.
The data reported in Experiments 1, 2, and 3a come from strictly
exploratory experiments. The remaining data, in Experiments 3b,
4a, 4b, and 4c, are drawn from four studies that focused on the
relation between the MF task and other perceptual measures relevant
to language. The specific issues addressed in the last four
experiments have been (or will be) reported elsewhere, and the
present article includes only the basic behavioral data from those
experiments. Although the studies had different purposes, the same
task was used and the methodologies were broadly similar. Most importantly,
the minor methodological differences in our experiments
had no impact on the conclusion that there are two different ways
of responding to MF stimuli; indeed, as we have already discussed,
experiments in the literature have diverged radically in their
methodological choices yet have all converged on this conclusion. It
thus made sense to pool the data across the studies, given the
benefits of increasing generalizability and statistical power.
Detailed discussion of the comparability of the different experiments
is provided in the online supplemental appendix.
Method
Stimulus Variables
By and large, our approach to stimulus construction was closer
to that of Schneider et al. than to that of Seither-Preisler et al.
Within a stimulus, we always held the top frequency of Tone A
and Tone B constant, and always kept the harmonic rank of the two
sets of partials close (that is, our stimuli resemble the one illustrated
in Figure 1). This, in turn, means that the interval between
the two missing F0 values was always quite small, between two
and four semitones. It also means that the F0 value of the tones was
determined entirely by the top frequency and the harmonic rank of
the partials, and that the range of F0 values was therefore, especially
in our earlier experiments, quite large. In the later experiments
(Experiments 4a, 4b, and 4c), influenced by Seither-Preisler
et al., we narrowed the range of top frequencies and used lower
harmonic ranks, thereby narrowing the range of the missing F0. In
the earlier experiments, the tones consisted of three partials, but in
the Experiment 4 set, we used a mix of two-partial and three-
partial stimuli.
The most significant respect in which our work diverges methodologically
from that of Seither-Preisler et al., and especially
Schneider et al., is that our experiments involve fewer stimulus
types and more responses to each type. For example, in Experiment 1,
we had only 15 stimulus types, based on a two-dimensional
stimulus matrix with five settings of the top frequency
and three settings of the harmonic rank of the partials. In each of
the 15 cells of this stimulus matrix, every participant gave 10
judgments during the course of the experiment, five in each order
of presentation (AB or BA). By comparison, Seither-Preisler et al.
had 50 stimulus types, and participants gave only four judgments
per stimulus type, two in each order. Schneider et al. had 144
stimulus types and obtained only one response per type; the order
of Tone A and Tone B within each stimulus type was randomly
assigned. By contrast, all of our data (with minor exceptions due to
errors and missing responses and with the systematic exception of
Experiment 4b) are based on 10 responses per stimulus type. This
gives us a good basis for investigating Seither-Preisler et al.’s
distinction between inhomogeneity and inconsistency in partici-