
Showing papers on "Voice" published in 2000


Journal ArticleDOI
TL;DR: It is shown here that self-uttered syllables transiently activate the speaker's auditory cortex around 100 ms after voice onset; speaking thus primes the human auditory cortex at a millisecond time scale, dampening and delaying reactions to self-produced "expected" sounds, more prominently in the speech-dominant hemisphere.
Abstract: The voice we most often hear is our own, and proper interaction between speaking and hearing is essential for both acquisition and performance of spoken language. Disturbed audiovocal interactions have been implicated in aphasia, stuttering, and schizophrenic voice hallucinations, but paradigms for a noninvasive assessment of auditory self-monitoring of speaking and its possible dysfunctions are rare. Using magnetoencephalography, we show here that self-uttered syllables transiently activate the speaker's auditory cortex around 100 ms after voice onset. These phasic responses were delayed by 11 ms in the speech-dominant left hemisphere relative to the right, whereas during listening to a replay of the same utterances the response latencies were symmetric. Moreover, the auditory cortices did not react to rare vowel changes interspersed randomly within a series of repetitively spoken vowels, in contrast to regular change-related responses evoked 100-200 ms after replayed rare vowels. Thus, speaking primes the human auditory cortex at a millisecond time scale, dampening and delaying reactions to self-produced "expected" sounds, more prominently in the speech-dominant hemisphere. Such motor-to-sensory priming of early auditory cortex responses during voicing constitutes one element of speech self-monitoring that could be compromised in central speech disorders.

307 citations


Journal ArticleDOI
TL;DR: This article found that the typical Hong Kong English speaker operates with a smaller set of vowel and consonant contrasts than in native varieties of English, and that there is no length/tenseness contrast in vowels, and no voicing contrast in fricatives.
Abstract: Whether or not Hong Kong English (HKE) has acquired the status of a ‘new variety of English’, there is no doubt that there exists an identifiable ‘HKE accent’, and therefore a HKE phonology. The paper embodies the author's findings in the first part of his research project on HKE phonology, which covered segmental phonology—in particular the underlying phonemic system of HKE speakers, and the phonetic realisations of its phonemes in different phonological environments. The subjects comprised 15 undergraduates at the Hong Kong Baptist University. The initial batch of data consisted of a number of key words designed to capture all potential vowel and consonant contrasts in a variety of phonological environments. With the help of spectrographic analysis, it was found that the typical HKE speaker operates with a considerably smaller set of vowel and consonant contrasts than in native varieties of English. In particular, there is no length/tenseness contrast in vowels, and no voicing contrast in fricatives. HKE also exhibits a number of interesting and possibly unique phonological properties. An underlying phonemic system is postulated for HKE, and a number of allophonic variations are described.

200 citations


Journal ArticleDOI
TL;DR: Oral airflow, intraoral pressure, and acoustic signals from normal English-speaking adults and children producing stop consonants and /h/ embedded in a short carrier utterance indicate that one cannot assume comparable laryngeal conditions across speaker groups, which implies that VOT acquisition in children cannot be interpreted purely in terms of developing interarticulator timing control.
Abstract: Voicing control in stop consonants has often been measured by means of voice onset time (VOT) and discussed in terms of interarticulator timing. However, control of voicing also involves details of...

101 citations


01 Jan 2000
TL;DR: In this article, the authors examined long distance voicing agreement between consonants (Cs) and showed that these agreement patterns come about through a correspondence relation that is established between Cs in the output.
Abstract: This paper examines long distance voicing agreement between consonants (Cs). Two related patterns are observed. In the first one [voice] agreement is restricted to pairs of oral stops that match in place of articulation, as seen in Ngbaka. In the second, observed in Kera, [voice] agreement occurs among all pairs of stops. I argue that these agreement patterns come about through a correspondence relation that is established between Cs in the output. The notion of intersegmental correspondence will be important in explaining two key properties of the phenomena: (i) the potential for interaction between Cs at a distance, and (ii) the preference for voicing agreement to occur between similar Cs. From a wider perspective, this analysis is supported by work on other consonantal agreement patterns that display similar characterizing properties (Walker 1999, Rose & Walker in prep.). In addition, I propose that the correspondence approach has the potential to extend to cases of voicing dissimilation. The analysis is couched within Optimality Theory (OT; Prince & Smolensky 1993). The paper is organized as follows. In §2 I present the data illustrating voicing agreement between Cs at a distance. §3 diagnoses the agreement as arising through the mechanism of segmental correspondence rather than feature spreading. In §4 I lay out a theoretical overview of the correspondence approach to long-distance agreement, and then develop the details of the analysis of Ngbaka and Kera. §5 discusses an extension to voicing dissimilation phenomena, and §6 gives the conclusion.

69 citations


Journal ArticleDOI
TL;DR: Voice-onset times for stops in the same voicing category but not in different categories correlated across speakers; Spanish articulatory VOT values seem to vary with dialect, possibly influencing Spanish VOT perceptual boundaries.

65 citations


Journal ArticleDOI
TL;DR: Results from a study of the acquisition of the voicing contrast in English word-final obstruents by native speakers of Catalan indicate a very high incidence of devoicing, which confirms the prevalence of final devoicing in second language acquisition and points to the joint effect of transfer and universal tendencies.
Abstract: This paper examines the interference of L1 neutralization rules in the acquisition of a marked L2 phonological feature. More specifically, it presents results from a study of the acquisition of the voicing contrast in English word-final obstruents by native speakers of Catalan. The voicing contrast in final position in Catalan is neutralized by voicing or devoicing rules, depending on the environment. The results of an experiment testing the production of target final obstruents in different environments indicate a very high incidence of devoicing, which confirms the prevalence of final devoicing in second language acquisition and points to the joint effect of transfer and universal tendencies. In contrast with devoicing, the results reveal a more limited effect of the L1 voicing rules. It is argued that this difference is due to an effect of word integrity in the interlanguage that restricts the domain of application of the transferred rules.

43 citations


Journal ArticleDOI
TL;DR: This paper showed that Progressive Voicing Assimilation is characteristically restricted to the interword environment (i.e. it occurs at the WORD level) and is the consequence of WORD-faithfulness.
Abstract: This paper proposes that differences in the direction of application of phonological rules can be attributed to the differences in the observed patterns of faithfulness at the WORD and ROOT levels. Using data from English and Dutch I show that Progressive Voicing Assimilation is characteristically restricted to the inter-word environment (i.e. it occurs at the WORD level) and is the consequence of WORD-faithfulness. I consider whether the same kind of faithfulness effect can account for asymmetrical patterns observed with other phonological processes such as vowel harmony, vowel elision and nasal place assimilation.

33 citations


01 Jan 2000
TL;DR: If and how infants that do not have a lexicon might undo phonological variation, i.e. deduce which phonological processes apply and infer unique underlying word forms that will constitute lexical entries, is examined.
Abstract: Mapping word forms onto their corresponding meanings is one of the most complex tasks that young infants acquiring their native language have to perform. This is due to the fact that an utterance can refer to many different aspects of a scene, a problem known as referential ambiguity (Quine 1960). An even more basic problem, though, is that it is not easy to find word forms to start with. In fact, the speech waveform is continuous, and word boundaries are not readily available. Moreover, words often surface with different phonetic forms due to the application of postlexical phonological processes; that is, surface word forms exhibit what we call phonological variation. Most models of lexical acquisition assume that infants can somehow extract unique word forms out of the speech stream before they acquire the meaning of words (e.g. Siskind 1996). Hence, they propose a solution to the problem of referential ambiguity, thereby assuming that the problems of finding word boundaries and undoing phonological variation have already been solved. There is evidence that infants can indeed find word boundaries before they have a lexicon (Jusczyk & Aslin 1995). By contrast, virtually nothing is known concerning the question of how prelexical infants deal with phonological variation. In this paper, we will examine if and how infants that do not have a lexicon might undo phonological variation, i.e. deduce which phonological processes apply and infer unique underlying word forms that will constitute lexical entries. The various intricacies of phonological variation for lexical acquisition can be illustrated within a single language, i.e. Korean. First, consider the allophonic rule of obstruent voicing. This rule voices plain obstruents that occur between two voiced segments, as illustrated in (1); voiced obstruents do not otherwise occur in Korean.

27 citations


Journal ArticleDOI
01 Sep 2000 - Lingua
TL;DR: The alternation in obstruent voicing in the dialect of Breton found on the Ile de Groix is particularly interesting, since three modes of neutralisation can be observed: (i) final devoicing, i.e., obstruents surface as voiceless in syllable- or word-final position; (ii) obstruent voicing, which is observed when a word-final obstruent is followed by a vowel- or sonorant-initial word; and (iii) regressive as well as progressive voicing assimilation.

23 citations



Dissertation
01 May 2000
TL;DR: In this paper, the authors used a pitch-scaled harmonic filter (PSHF) to separate the voiced and unvoiced parts of a speech signal, which was tested extensively on synthetic signals.
Abstract: This thesis is a study of the production of human speech sounds by acoustic modelling and signal analysis. It concentrates on sounds that are not produced by voicing (although that may be present), namely plosives, fricatives and aspiration, which all contain noise generated by flow turbulence. It combines the application of advanced speech analysis techniques with acoustic flow-duct modelling of the vocal tract, and draws on dynamic magnetic resonance image (dMRI) data of the pharyngeal and oral cavities, to relate the sounds to physical shapes. Having superimposed vocal-tract outlines on three sagittal dMRI slices of an adult male subject, a simple description of the vocal tract suitable for acoustic modelling was derived through a sequence of transformations. The vocal-tract acoustics program VOAC, which relaxes many of the assumptions of conventional plane-wave models, incorporates the effects of net flow into a one-dimensional model (viz., flow separation, increase of entropy, and changes to resonances), as well as wall vibration and cylindrical wavefronts. It was used for synthesis by computing transfer functions from sound sources specified within the tract to the far field. Being generated by a variety of aero-acoustic mechanisms, unvoiced sounds are somewhat varied in nature. Through analysis that was informed by acoustic modelling, resonance and anti-resonance frequencies of ensemble-averaged plosive spectra were examined for the same subject, and their trajectories observed during release. The anti-resonance frequencies were used to compute the place of occlusion. In vowels and voiced fricatives, voicing obscures the aspiration and frication components. So, a method was devised to separate the voiced and unvoiced parts of a speech signal, the pitch-scaled harmonic filter (PSHF), which was tested extensively on synthetic signals. Based on a harmonic model of voicing, it outputs harmonic and anharmonic signals appropriate for subsequent analysis as time series or as power spectra. By applying the PSHF to sustained voiced fricatives, we found that, not only does voicing modulate the production of frication noise, but that the timing of pulsation cannot be explained by acoustic propagation alone. In addition to classical investigation of voiceless speech sounds, VOAC and the PSHF demonstrated their practical value in helping further to characterise plosion, frication and aspiration noise. For the future, we discuss developing VOAC within an articulatory synthesiser, investigating the observed flow-acoustic mechanism in a dynamic physical model of voiced frication, and applying the PSHF more widely in the field of speech research.
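
The decomposition at the heart of the PSHF is described above only at a high level. A minimal sketch of the underlying idea, pitch-synchronous DFT filtering of a frame spanning an integer number of pitch periods so that voicing energy falls on known bins, is given below. It assumes NumPy and an externally supplied pitch estimate; the function name is ours, and the thesis's windowing, overlap-add and pitch-tracking refinements are omitted.

```python
import numpy as np

def pshf_frame(frame: np.ndarray, n_periods: int):
    """Split one pitch-scaled frame into harmonic and anharmonic parts.

    `frame` is assumed to contain exactly `n_periods` complete pitch
    periods, so the voiced harmonics fall on every `n_periods`-th DFT bin.
    """
    spectrum = np.fft.rfft(frame)
    harmonic_spec = np.zeros_like(spectrum)
    harmonic_spec[::n_periods] = spectrum[::n_periods]  # keep harmonic bins only
    harmonic = np.fft.irfft(harmonic_spec, n=len(frame))
    anharmonic = frame - harmonic  # residual: frication/aspiration noise
    return harmonic, anharmonic
```

Applied frame by frame with overlap-add, a scheme like this would yield harmonic and anharmonic time series analogous to the outputs the thesis analyses.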

Patent
TL;DR: In this paper, a method is provided for estimating the percentage of unvoiced and voiced energy for each harmonic within each of a plurality of bands of a speech signal spectrum; a synthetic speech spectrum is first generated based on the assumption that speech is purely voiced.
Abstract: A voicing probability determination method is provided for estimating a percentage of unvoiced and voiced energy for each harmonic within each of a plurality of bands of a speech signal spectrum. Initially, a synthetic speech spectrum is generated based on the assumption that speech is purely voiced. The original and synthetic speech spectra are then divided into a plurality of bands. The synthetic and original speech spectra are compared harmonic by harmonic, and a voicing determination is made based on this comparison. In one embodiment, each harmonic of the original speech spectrum is assigned a voicing decision as either completely voiced or unvoiced by comparing the difference with an adaptive threshold. If the difference for each harmonic is less than the adaptive threshold, the corresponding harmonic is declared as voiced; otherwise the harmonic is declared as unvoiced. The voicing probability for each band is then computed based on the amount of energy in the voiced harmonics in that decision band. Alternatively, the voicing probability for each band is determined based on a signal to noise ratio for each of the bands which is determined based on the collective differences between the original and synthetic speech spectra within the band.
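
As a rough illustration of the banded decision logic the patent describes, the following sketch assumes the original and synthetic spectra have already been sampled at the harmonic frequencies; the fixed mismatch threshold stands in for the patent's adaptive one, and all names are ours.

```python
import numpy as np

def band_voicing_probabilities(orig_harm, synth_harm, band_edges, threshold=0.2):
    """Per-band voicing probability from a harmonic-by-harmonic comparison.

    orig_harm / synth_harm: harmonic magnitudes of the original spectrum and
    of the synthetic, purely voiced spectrum. A harmonic whose normalized
    mismatch stays below `threshold` is declared voiced; each band's voicing
    probability is the voiced share of that band's harmonic energy.
    """
    mismatch = np.abs(orig_harm - synth_harm) / np.maximum(orig_harm, 1e-9)
    voiced = mismatch < threshold                 # per-harmonic binary decision
    probs = []
    for lo, hi in band_edges:                     # harmonic-index ranges per band
        energy = orig_harm[lo:hi] ** 2
        total = energy.sum()
        probs.append(float(energy[voiced[lo:hi]].sum() / total) if total > 0 else 0.0)
    return probs
```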

Journal ArticleDOI
TL;DR: Three learning systems are studied: single-layer perceptrons, support vector machines and Fisher linear discriminants, which highlight similarities and differences between these approaches.
Abstract: Speech perception relies on the human ability to decode continuous, analogue sound pressure waves into discrete, symbolic labels (‘phonemes’) with linguistic meaning. Aspects of this signal-to-symbol transformation have been intensively studied over many decades, using psychophysical procedures. The perception of (synthetic) syllable-initial stop consonants has been especially well studied, since these sounds display a marked categorization effect: they are typically dichotomised into ‘voiced’ and ‘unvoiced’ classes according to their voice onset time (VOT). In this case, the category boundary is found to have a systematic relation to the (simulated) place of articulation, but there is no currently-accepted explanation of this phenomenon. Categorization effects have now been demonstrated in a variety of animal species as well as humans, indicating that their origins lie in general auditory and/or learning mechanisms, rather than in some ‘phonetic module’ specialized to human speech processing. In recent work, we have demonstrated that appropriately-trained computational learning systems (‘neural networks’) also display the same systematic behaviour as human and animal listeners. Networks are trained on simulated patterns of auditory-nerve firings in response to synthetic ‘continua’ of stop-consonant/vowel syllables varying in place of articulation and VOT. Unlike real listeners, such a software model is amenable to analysis aimed at extracting the phonetic knowledge acquired in training, so providing a putative explanation of the categorization phenomenon. Here, we study three learning systems: single-layer perceptrons, support vector machines and Fisher linear discriminants. We highlight similarities and differences between these approaches. We find that support vector machines, a modern inductive inference technique suited to small sample sizes, give the most convincing results. Knowledge extracted from the trained machine indicated that the phonetic percept of voicing is easily and directly recoverable from auditory (but not acoustic) representations.
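
The study's networks are trained on simulated auditory-nerve firing patterns, which cannot be reproduced here; the following toy stand-in, a linear SVM over a synthetic VOT-by-place continuum (scikit-learn assumed), merely illustrates how a category boundary and its place dependence can be read back out of a trained machine. The boundary values are invented.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy stimuli: VOT in ms plus a coded place of articulation
# (0 = labial, 1 = alveolar, 2 = velar). Labels follow the classic
# finding that the voiced/voiceless boundary shifts upward for more
# posterior places; the exact numbers here are hypothetical.
vot = rng.uniform(0.0, 80.0, 600)
place = rng.integers(0, 3, 600)
y = (vot > 20 + 10 * place).astype(int)   # 1 = perceived as voiceless

clf = SVC(kernel="linear").fit(np.column_stack([vot, place]), y)

# Recover the implied VOT boundary per place from the trained machine.
w, b = clf.coef_[0], clf.intercept_[0]
for p in range(3):
    print(f"place {p}: boundary ~ {-(b + w[1] * p) / w[0]:.1f} ms")
```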

Patent
21 Dec 2000
TL;DR: In this paper, a voicing determination algorithm for classification of a speech signal segment as voiced or unvoiced is presented, which is based on a normalized autocorrelation where the length of the window is proportional to the pitch period.
Abstract: This invention presents a voicing determination algorithm for classification of a speech signal segment as voiced or unvoiced. The algorithm is based on a normalized autocorrelation where the length of the window is proportional to the pitch period. The speech segment to be classified is further divided into a number of sub-segments, and the normalized autocorrelation is calculated for each sub-segment. If a certain number of the normalized autocorrelation values is above a predetermined threshold, the speech segment is classified as voiced. To improve the performance of the voicing determination algorithm in unvoiced to voiced transients, the normalized autocorrelations of the last sub-segments are emphasized. The performance of the voicing decision algorithm can be enhanced by also utilizing possible lookahead information.
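
A compact sketch of the decision rule as the abstract describes it, with the pitch-lag autocorrelation evaluated per sub-segment and a vote count deciding the segment; the emphasis of the last sub-segments and the lookahead refinement are omitted, and the parameter values are illustrative only.

```python
import numpy as np

def is_voiced(segment: np.ndarray, pitch_period: int,
              n_sub: int = 4, threshold: float = 0.5, min_votes: int = 3) -> bool:
    """Voicing decision from normalized autocorrelation at the pitch lag.

    Each of `n_sub` sub-segments contributes one normalized autocorrelation
    value at lag `pitch_period` (so the effective window scales with the
    pitch period); the segment is voiced if enough sub-segments exceed
    the threshold.
    """
    votes = 0
    for sub in np.array_split(segment, n_sub):
        if len(sub) <= pitch_period:
            continue                      # sub-segment too short for this lag
        x, x_lag = sub[:-pitch_period], sub[pitch_period:]
        denom = np.sqrt(np.dot(x, x) * np.dot(x_lag, x_lag))
        if denom > 0 and np.dot(x, x_lag) / denom > threshold:
            votes += 1
    return votes >= min_votes
```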

Journal ArticleDOI
TL;DR: Wenck, Hayata, and Takayama, as discussed in this paper, suggest that the weakening process of intervocalic /p/, commonly summarized as *p > *Φ > w, should be recast as *p > *b > *β > w; this dissenting view is more consistent with various sound change phenomena from Middle to Modern Japanese.
Abstract: It is the standard view in Japanese historical linguistics that voicing of obstruents in Old through Early Middle Japanese (c. 700–1200) was contrastive although largely limited to intervocalic position. However, Wenck (1959), Hayata (1977), and Takayama (1993) question this view by raising the possibility that early Japanese had only non-contrastive voicing of intervocalic obstruents. On this account, "voiced" and "voiceless" obstruents in intervocalic position were distinguished purely on the basis of prenasalization rather than voicing in early Japanese; the well-known weakening process of intervocalic /p/, which is commonly summarized as *p > *Φ > w, is recast as *p > *b > *β > w. This dissenting view is in fact more consistent with various sound change phenomena from Middle to Modern Japanese. This paper presents a novel piece of evidence from the sound-symbolic stratum which supports the view of Wenck, Hayata, and Takayama.

Journal Article
TL;DR: A new analytic speech test is described that assesses the perception of significant phonological contrasts in the Colloquial Arabic variety used in Israel and shows a relationship between percent correct and presentation level that is in keeping with articulation curves obtained with Saudi Arabic and English monosyllabic words.
Abstract: The high incidence of hearing impairment in the Arabic-speaking population in Israel, as well as the use of advanced aural rehabilitation devices, motivated the development of Arabic speech assessment tests for this population. The purpose of this paper is twofold. The first goal is to describe features that are unique to the Arabic language and that need to be considered when developing such speech tests. These include Arabic diglossia (i.e., the sharp dichotomy between Literary and Colloquial Arabic), emphatization, and a simple vowel system. The second goal is to describe a new analytic speech test that assesses the perception of significant phonological contrasts in the Colloquial Arabic variety used in Israel. The perception of voicing, place, and manner of articulation, in both initial and final word positions, was tested at four sensation levels in 10 normally-hearing subjects using a binary forced-choice paradigm. Results show a relationship between percent correct and presentation level that is in keeping with articulation curves obtained with Saudi Arabic and English monosyllabic words. Furthermore, different contrasts yielded different articulation curves: emphatization was the easiest to perceive whereas place of articulation was the most difficult. The results can be explained by the specific acoustical features of Arabic.

Journal ArticleDOI
TL;DR: The results show that a right-ear advantage is significant only when the phonetic boundary is close to the release burst, i.e., when the identification of the two successive acoustical events needed to perceive a phoneme as voiced or voiceless requires rapid information processing.

Journal ArticleDOI
TL;DR: The wide-ranging and critical research, together with the incisiveness and generosity with which he has assessed the value of the work of others, makes Duncan Brown's book a welcome and cogent argument for that symbiotic interdisciplinarity that is increasingly becoming a sine qua non in academic life, as discussed by the authors.
Abstract: The wide-ranging and critical research, together with the incisiveness and generosity with which he has assessed the value of the work of others, makes Duncan Brown's book a welcome and cogent argument for that symbiotic interdisciplinarity that is increasingly becoming a sine qua non in academic life. I welcome the opportunity to review a book written by one who is working in a milieu in which debates about the value of Africa's contribution to world literature, arising from its treasure trove of linguistic diversity, are advanced and, in some cases, now anachronistic. The debate itself is not only about literature since it is inextricably linked to how communities living within the same borders and, indeed on the same planet, view each other. Brown's book is an invitation to us all, particularly in southern Africa, to contemplate the common humanity that binds us all together, in spite of ourselves, by opening our eyes to the richness of our cultural legacy, even through the

Journal ArticleDOI
TL;DR: It was found that the perception of voicing has a strong dependence on vowel context, with /Ca/ syllables being significantly better discriminated than /Ci/ and /Cu/ syllables.
Abstract: Previous research has shown that the VOT and first formant transition are primary perceptual cues for the voicing distinction for syllable-initial plosives (SIP) in quiet environments. This study seeks to determine which cues are important for the perception of voicing for SIP in the presence of noise. Stimuli for the perceptual experiments consisted of naturally spoken CV syllables (six plosives in three vowel contexts) in varying levels of additive white Gaussian noise. In each experiment, plosives which share the same place of articulation (e.g., /p, b/) were presented to subjects in identification tasks. For each voiced/voiceless pair, a threshold SNR value was calculated. It was found that the perception of voicing has a strong dependence on vowel context, with /Ca/ syllables being significantly better discriminated than /Ci/ and /Cu/ syllables. In addition, labials consistently had a higher threshold SNR (i.e., were more easily confused) than alveolars and velars. Threshold SNR values were then correlated ...
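
The stimulus-construction step, CV syllables mixed with additive white Gaussian noise at controlled SNRs, is simple to reproduce; a minimal sketch (names ours, RNG seeding left to the caller):

```python
import numpy as np

def add_noise_at_snr(speech: np.ndarray, snr_db: float,
                     rng: np.random.Generator) -> np.ndarray:
    """Return `speech` mixed with white Gaussian noise at `snr_db` dB SNR."""
    p_speech = np.mean(speech ** 2)
    p_noise = p_speech / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(p_noise), size=speech.shape)
    return speech + noise
```

Sweeping `snr_db` and recording identification accuracy at each level is then enough to locate a threshold SNR per voiced/voiceless pair.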

Journal ArticleDOI
TL;DR: In this article, data taken from the Atlas linguistique de la France (Gilliéron and Edmont 1902–10) are subjected to quantitative analysis to investigate how the variable patterns of voicing in obstruents to be found therein are conditioned.
Abstract: In the following paper, data taken from the Atlas linguistique de la France (Gilliéron and Edmont 1902–10) are subjected to quantitative analysis to investigate how the variable patterns of voicing in obstruents to be found therein are conditioned. The analyses show how the data (which are here restricted to underlyingly voiced stop consonants) are consistent to a remarkable degree with hypotheses formulated on the basis of the findings of modern experimental phonetic and sociolinguistic studies, which is suggestive not only for the wider study in question, but also in (re-)considering the potential of the Atlas as a valuable data source for comparisons to be made between the French spoken throughout France at the beginning and end of the twentieth century.

Journal ArticleDOI
TL;DR: This article found that stutterers realized voiced stops by voicing before release (prevoicing), whereas non-stutterers did not, and the importance of this phonetic strategy for understanding and treating stuttering is discussed.
Abstract: Differences between stutterers and nonstutterers in temporal organization of fluent speech may offer clues to the elemental basis of fully elaborated, perceptible stuttering events. Guided by this hypothesis, we investigated voice onset time—the interval between voice onset and upper articulatory stop release—in voiced stop consonants under varying constraints. Under variation of rate, lexical stress location, and location of key words beginning with voiced stops, the stutterers realized voiced stops by voicing before release (prevoicing), whereas the controls realized voiced stops by voicing following the release. The significance of this phonetic strategy difference for understanding and treating stuttering is discussed.

Book ChapterDOI
31 Dec 2000

Patent
Ari Heikkinen, Samuli Pietila
08 Dec 2000
TL;DR: In this article, a normalised autocorrelation where the length of the window is proportional to the pitch period is used to classify a speech signal segment as voiced or unvoiced.
Abstract (of EP1111586): This invention presents a voicing determination algorithm for classification of a speech signal segment as voiced or unvoiced. The algorithm is based on a normalised autocorrelation where the length of the window is proportional to the pitch period. The speech segment to be classified is further divided into a number of sub-segments, and the normalised autocorrelation is calculated for each sub-segment. If a certain number of the normalised autocorrelation values is above a predetermined threshold, the speech segment is classified as voiced. To improve the performance of the voicing determination algorithm in unvoiced to voiced transients, the normalised autocorrelations of the last sub-segments are emphasised. The performance of the voicing decision algorithm can be enhanced by also utilising possible lookahead information.

01 Jan 2000
TL;DR: A previous experiment on the Castilian Spanish perceptual voice-onset-time boundary indicated possible differences between Spanish dialects, as discussed by the authors.
Abstract: A previous experiment on the Castilian Spanish perceptual voice-onset-time (VOT) boundary indicated possible differences between Spanish dialects. We therefore measured VOTs for the six Castilian stops in initial position. Significant main effects of voicing and place on VOT reproduced previous findings on Latin American Spanish dialects. Voicing also interacted with place and postconsonantal vowel. Voice-onset times for stops in the same voicing category but not in different categories correlated across speakers. Some previously published Latin American VOTs differ significantly from our measurements, and the majority of the Latin American VOTs fall outside the 99% confidence limits of our means. Spanish articulatory VOT values seem to vary with dialect, possibly influencing Spanish VOT perceptual boundaries. © 2000 Academic Press
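
VOT measurement of this kind is conventionally done by hand from waveforms and spectrograms; a crude automatic approximation, assuming the burst location is already known and treating low-band energy dominance as a proxy for voicing onset (SciPy assumed; thresholds and names are ours), might look like this:

```python
import numpy as np
from scipy.signal import butter, sosfilt

def estimate_vot(x: np.ndarray, fs: int, burst_idx: int,
                 frame_ms: float = 5.0, ratio: float = 0.5) -> float:
    """Rough VOT in seconds, from a known burst index to voicing onset.

    Voicing onset is taken as the first post-burst frame whose low-band
    (< 500 Hz) energy exceeds `ratio` of the frame's total energy, a crude
    periodicity proxy rather than the study's measurement procedure.
    Prevoicing (negative VOT) is not handled by this sketch.
    """
    low = sosfilt(butter(4, 500, btype="low", fs=fs, output="sos"), x)
    hop = int(fs * frame_ms / 1000)
    for start in range(burst_idx, len(x) - hop, hop):
        e_low = np.mean(low[start:start + hop] ** 2)
        e_all = np.mean(x[start:start + hop] ** 2)
        if e_all > 0 and e_low / e_all > ratio:
            return (start - burst_idx) / fs
    return float("nan")
```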

Proceedings Article
01 Jan 2000
TL;DR: Voiced consonants were better identified than voiceless in both syllabic contexts, but especially for velars, which suggests that some voicing distinction is possible on the basis of visual cues.
Abstract: This study examined whether visually presented bilabial consonants are better identified than velars in a CV (C = consonant; V = vowel) or VCV context. We also investigated whether voiced and voiceless consonants sharing the same place and manner of articulation could be differentiated from each other with visual cues only. Although it is generally assumed that voicing is mainly mediated by the auditory modality, one cannot discard the possibility that the production of a voiced stop consonant produces a pattern of facial cues that could be detectable visually. Two pairs of stop consonants (/b/-/p/ and /g/-/k/) were articulated by a man and by a woman speaker in two syllabic contexts (CV monosyllables or VCV bisyllables). The bisyllables were uttered at three speaking rates: slow, medium and fast. The materials were edited on a videotape and presented on a TV screen without sound. After each trial, participants had to choose between several written possibilities what they had perceived. Percentage of correct identifications reached 42% on average for the four consonants. Errors mostly consisted of voicing confusions (37%). Place of articulation confusions occurred in only 8% of the cases. Correct identifications were more numerous for bilabials than for velars, particularly for monosyllables. Voiced consonants were better identified than voiceless in both syllabic contexts, but especially for velars. This suggests that some voicing distinction is possible on the basis of visual cues.





ReportDOI
01 Jan 2000
TL;DR: This paper investigated the effect of vowel duration on the perception of post-vocalic word-final consonants in the English production of four native Spanish speakers and found that the degree of variation in the vowel lengths with respect to voicing was much less than the degree of difference exhibited in native English, and similar to the variation produced in native Spanish.
Abstract: An abstract of the thesis of Becky Jean George for the Master of Arts in TESOL presented July 2, 1996. Title: Investigating Vowel Duration as a Perceptual Cue to Voicing in the English of Native Spanish Speakers. Researchers in the cognitive sciences, and in particular those in acoustic phonetics, investigate the acoustic properties in the speech signal that enable listeners to perceive particular speech sounds. Temporal cues have been found to convey information about the linguistic content of an utterance. One acoustic characteristic that is particularly well documented in American English is the difference in vowel duration preceding voiced and voiceless consonants, which has been found to play a role in the perception of the voicing of postvocalic word-final consonants. Research on vowel duration and its role in the perception of the voicing distinction of the following consonant has primarily involved data from native English speakers. The purpose of the present study was to investigate the vowel durations preceding word-final voiced and voiceless stops in the English production of four native Spanish speakers. This study sought to determine if differences in vowel duration are exhibited preceding voiced and voiceless consonants in the English production of the native Spanish speakers, and to determine if the vowel durations affected the perception of the voicing distinction of the postvocalic stop by four native English speakers. A significant effect of voicing on the vowel durations in the English production of the native Spanish speakers was found. However, the degree of variation in the vowel lengths with respect to voicing was much less than the degree of difference exhibited in native English, and similar to the variation produced in native Spanish. The average mean difference in length with respect to the voicing of the following consonant was 17.8 msec. in the present study. In native English the mean difference between vowels preceding voiced and voiceless consonants ranges from 79 msec. to 92 msec., and in Spanish the average mean difference is 18 msec. Statistical analysis performed to quantify the contribution of vowel duration to the perception of the voicing distinction found only a minimal effect. It was concluded that although the cue of vowel duration variation was present in the speech signal of this data, the listeners generally did not utilize it as a cue to the voicing distinction of the following stops.
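
The thesis's central comparison, mean vowel duration before voiced versus voiceless codas, is straightforward to replicate once tokens are annotated; a minimal sketch with invented example durations:

```python
import numpy as np

# Hypothetical annotated tokens: (vowel duration in ms, coda voiced?)
tokens = [(142.0, True), (120.5, False), (155.3, True),
          (118.0, False), (149.8, True), (125.1, False)]

voiced = np.array([d for d, v in tokens if v])
voiceless = np.array([d for d, v in tokens if not v])

# The thesis reports a mean difference of ~17.8 ms for these L2 speakers,
# versus 79-92 ms in native English and ~18 ms in native Spanish.
print(f"mean difference: {voiced.mean() - voiceless.mean():.1f} ms")
```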