Author

Jessica E. D. Alexander

Bio: Jessica E. D. Alexander is an academic researcher from Emory University. The author has contributed to research on the topics of perceptual learning and American English. The author has an h-index of 4 and has co-authored 4 publications receiving 222 citations.

Papers
Journal ArticleDOI
TL;DR: It is suggested that the speech perceptual system dynamically adjusts to the acoustic consequences of changes in talker's voice and accent.
Abstract: Spoken language is characterized by an enormous amount of variability in how linguistic segments are realized. In order to investigate how speech perceptual processes accommodate to multiple sources of variation, adult native speakers of American English were trained with English words or sentences produced by six Spanish-accented talkers. At test, listeners transcribed utterances produced by six familiar or unfamiliar Spanish-accented talkers. With only brief exposure, listeners perceptually adapted to accent-general regularities in spoken language, generalizing to novel accented words and sentences produced by unfamiliar accented speakers. Acoustic properties of vowel production and their relation to identification performance were assessed to determine if the English listeners were sensitive to systematic variation in the realization of accented vowels. Vowels that showed the most improvement after Spanish-accented training were distinct from nearby vowels in terms of their acoustic characteristics. These findings suggest that the speech perceptual system dynamically adjusts to the acoustic consequences of changes in talker’s voice and accent.

144 citations

Journal ArticleDOI
TL;DR: It is suggested that readers engage in a type of auditory imagery while reading that preserves the perceptual details of an author's voice.
Abstract: A series of experiments was conducted to determine if linguistic representations accessed during reading include auditory imagery for characteristics of a talker's voice. In 3 experiments, participants were familiarized with two talkers during a brief prerecorded conversation. One talker spoke at a fast speaking rate, and one spoke at a slow speaking rate. Each talker was identified by name. At test, participants were asked to either read aloud (Experiment 1) or silently (Experiments 1, 2, and 3) a passage that they were told was written by either the fast or the slow talker. Reading times, both silent and aloud, were significantly slower when participants thought they were reading a passage written by the slow talker than when reading a passage written by the fast talker. Reading times differed as a function of passage author more for difficult than for easy texts, and individual differences in general auditory imagery ability were related to reading times. These results suggest that readers engage in a type of auditory imagery while reading that preserves the perceptual details of an author's voice.

81 citations

Journal ArticleDOI
TL;DR: The results suggest that the structure of training exposure, specifically trial-to-trial variation on both speaker's voice and linguistic content, facilitates learning of the systematic properties of accented speech.
Abstract: Foreign-accented speech contains multiple sources of variation that listeners learn to accommodate. Extending previous findings showing that exposure to high-variation training facilitates perceptual learning of accented speech, the current study examines to what extent the structure of training materials affects learning. During training, native adult speakers of American English transcribed sentences spoken in English by native Spanish-speaking adults. In Experiment 1, training stimuli were blocked by speaker, sentence, or randomized with respect to speaker and sentence (Variable training). At test, listeners transcribed novel English sentences produced by unfamiliar Spanish-accented speakers. Listeners' transcription accuracy was highest in the Variable condition, suggesting that varying both speaker identity and sentence across training trials enabled listeners to generalize their learning to novel speakers and linguistic content. Experiment 2 assessed the extent to which ordering of training tokens by a single factor, speaker intelligibility, would facilitate speaker-independent accent learning, finding that listeners' test performance did not reliably differ from that in the no-training control condition. Overall, these results suggest that the structure of training exposure, specifically trial-to-trial variation on both speaker's voice and linguistic content, facilitates learning of the systematic properties of accented speech. The current findings suggest a crucial role of training structure in optimizing perceptual learning. Beyond characterizing the types of variation listeners encode in their representations of spoken utterances, theories of spoken language processing should incorporate the role of training structure in learning lawful variation in speech.

14 citations

Journal ArticleDOI
TL;DR: The authors examined the effects of talker-specific perceptual learning on the time course of spoken word processing and found that participants were faster to shadow words produced by familiar than unfamiliar talkers both when responding immediately to the word and when responding after a cue delay.
Abstract: The current study examined the effects of talker‐specific perceptual learning on the time course of spoken word processing to assess the point at which effects of talker familiarity emerge during spoken word recognition. Listeners learned to identify six talkers’ voices (three males, three females) over three days of training. At test, listeners either completed an immediate or a delayed word‐shadowing task. Items at test were novel words produced by the six familiar talkers heard during training and by a set of six unfamiliar talkers. A separate group of controls completed the test phase only. The results indicated that effects of talker familiarity were found in both the immediate and delayed shadowing tasks. Listeners were faster to shadow words produced by familiar than unfamiliar talkers both when responding immediately to the word and when responding after a cued delay. When tested with unfamiliar talkers, the trained listeners did not differ from untrained controls. These findings suggest that effe...

4 citations

Journal ArticleDOI
TL;DR: This article examined how different ways of eliciting vocal fry affect acoustic properties as well as listener perceptions of speech, and found that elicited fry may introduce other vocal characteristics that activate stereotypes.
Abstract: Vocal fry, used frequently by both men and women, is often associated with negative stereotypes of young women. This study examines how different ways of eliciting vocal fry affect acoustic properties as well as listener perceptions of speech. Stimuli in previous studies have been obtained from actors who were presented with obvious examples of fry from media sources and were then instructed to produce speech with vocal fry. This method of eliciting fry may introduce other vocal characteristics that activate stereotypes. We obtained natural stimuli by asking speakers to read passages with no instructions about vocal fry and created a corpus of acoustic and prosodic data about naturalistic fry. We then recorded actors who listened to naturally-produced examples of vocal fry. We compared the acoustic characteristics of low and high fry speech from our naturally-produced utterances, our actors, and stimuli collected by Anderson et al. (2014). Elicited fry (from both studies) differed when compared to naturally-produced fry. Additionally, elicited fry in our study differed from that of Anderson et al. on jitter and other measures. The three sets of stimuli were used to replicate and extend Anderson and colleagues’ study on listener perceptions of men and women using vocal fry.

Cited by
Journal ArticleDOI
TL;DR: A review of the effects of adverse conditions (ACs) on the perceptual, linguistic, cognitive, and neurophysiological mechanisms underlying speech recognition is presented in this paper, where the authors advocate an approach to speech recognition that includes rather than neutralises complex listening environments and individual differences.
Abstract: This article presents a review of the effects of adverse conditions (ACs) on the perceptual, linguistic, cognitive, and neurophysiological mechanisms underlying speech recognition. The review starts with a classification of ACs based on their origin: Degradation at the source (production of a noncanonical signal), degradation during signal transmission (interfering signal or medium-induced impoverishment of the target signal), receiver limitations (peripheral, linguistic, cognitive). This is followed by a parallel, yet orthogonal classification of ACs based on the locus of their effect: Perceptual processes, mental representations, attention, and memory functions. We then review the added value that ACs provide for theories of speech recognition, with a focus on fundamental themes in psycholinguistics: Content and format of lexical representations, time-course of lexical access, word segmentation, feed-back in speech perception and recognition, lexical-semantic integration, interface between the speech system and general cognition, neuroanatomical organisation of speech processing. We conclude by advocating an approach to speech recognition that includes rather than neutralises complex listening environments and individual differences.

555 citations

Journal ArticleDOI
TL;DR: The ideal adapter framework is formalized and can be understood as inference under uncertainty about the appropriate generative model for the current talker, thereby facilitating robust speech perception despite the lack of invariance.
Abstract: Successful speech perception requires that listeners map the acoustic signal to linguistic categories. These mappings are not only probabilistic, but change depending on the situation. For example, one talker's /p/ might be physically indistinguishable from another talker's /b/ (cf. lack of invariance). We characterize the computational problem posed by such a subjectively nonstationary world and propose that the speech perception system overcomes this challenge by (a) recognizing previously encountered situations, (b) generalizing to other situations based on previous similar experience, and (c) adapting to novel situations. We formalize this proposal in the ideal adapter framework: (a) to (c) can be understood as inference under uncertainty about the appropriate generative model for the current talker, thereby facilitating robust speech perception despite the lack of invariance. We focus on 2 critical aspects of the ideal adapter. First, in situations that clearly deviate from previous experience, listeners need to adapt. We develop a distributional (belief-updating) learning model of incremental adaptation. The model provides a good fit against known and novel phonetic adaptation data, including perceptual recalibration and selective adaptation. Second, robust speech recognition requires that listeners learn to represent the structured component of cross-situation variability in the speech signal. We discuss how these 2 aspects of the ideal adapter provide a unifying explanation for adaptation, talker-specificity, and generalization across talkers and groups of talkers (e.g., accents and dialects). The ideal adapter provides a guiding framework for future investigations into speech perception and adaptation, and more broadly language comprehension.

429 citations

Journal ArticleDOI
TL;DR: A multicomponent model of inner speech informed by developmental, cognitive, and psycholinguistic considerations is presented; inner speech appears to perform significant functions in human cognition, which in some cases reflect its developmental origins and its sharing of resources with other cognitive processes.
Abstract: Inner speech—also known as covert speech or verbal thinking—has been implicated in theories of cognitive development, speech monitoring, executive function, and psychopathology. Despite a growing body of knowledge on its phenomenology, development, and function, approaches to the scientific study of inner speech have remained diffuse and largely unintegrated. This review examines prominent theoretical approaches to inner speech and methodological challenges in its study, before reviewing current evidence on inner speech in children and adults from both typical and atypical populations. We conclude by considering prospects for an integrated cognitive science of inner speech, and present a multicomponent model of the phenomenon informed by developmental, cognitive, and psycholinguistic considerations. Despite its variability among individuals and across the life span, inner speech appears to perform significant functions in human cognition, which in some cases reflect its developmental origins and its sharing of resources with other cognitive processes.

410 citations

Journal ArticleDOI
TL;DR: The phenomenology of inner speech is described by examining five issues: common behavioural and cerebral correlates with overt speech, different types of inner speech (wilful verbal thought generation and verbal mind wandering), presence of inner speech in reading and in writing, inner signing and voice-hallucinations in deaf people, and agency in inner speech.

218 citations

Journal ArticleDOI
TL;DR: In this paper, the empirical literature on auditory imagery is reviewed, and it is concluded that auditory imagery preserves many structural and temporal properties of auditory stimuli, can facilitate auditory discrimination but interfere with auditory detection, involves many of the same brain areas as auditory perception, is often but not necessarily influenced by subvocalization, involves semantically interpreted information and expectancies, involves depictive components and descriptive components, can function as a mnemonic but is distinct from rehearsal, and is related to musical ability and experience (although the mechanisms of that relationship are not clear).
Abstract: The empirical literature on auditory imagery is reviewed. Data on (a) imagery for auditory features (pitch, timbre, loudness), (b) imagery for complex nonverbal auditory stimuli (musical contour, melody, harmony, tempo, notational audiation, environmental sounds), (c) imagery for verbal stimuli (speech, text, in dreams, interior monologue), (d) auditory imagery's relationship to perception and memory (detection, encoding, recall, mnemonic properties, phonological loop), and (e) individual differences in auditory imagery (in vividness, musical ability and experience, synesthesia, musical hallucinosis, schizophrenia, amusia) are considered. It is concluded that auditory imagery (a) preserves many structural and temporal properties of auditory stimuli, (b) can facilitate auditory discrimination but interfere with auditory detection, (c) involves many of the same brain areas as auditory perception, (d) is often but not necessarily influenced by subvocalization, (e) involves semantically interpreted information and expectancies, (f) involves depictive components and descriptive components, (g) can function as a mnemonic but is distinct from rehearsal, and (h) is related to musical ability and experience (although the mechanisms of that relationship are not clear).

210 citations