
Showing papers on "Voice" published in 1976


Journal ArticleDOI
Lynn A. Streeter
01 Jan 1976-Nature
TL;DR: Earlier work has shown that voicing distinctions for stops in word-initial position can usually be characterised by differences in voice onset time (VOT), where VOT is defined as the time between the stop release burst and the onset of vocal cord vibration (voicing).
Abstract: VOICING differences distinguish two or more stop consonants in nearly all the world's languages. It has been found¹ that voicing distinctions for stops in word-initial position can usually be characterised by differences in voice onset time (VOT), where VOT is defined as the time between the stop release burst and the onset of vocal cord vibration (voicing). For example, the English phonemes /p/ and /b/ differ in this way; voicing follows release in /p/, while voicing is approximately simultaneous with release in /b/.

349 citations
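
The VOT definition quoted in the entry above can be written out as a small calculation. This is only an illustrative sketch, not the method of the paper: the function names, the use of millisecond timestamps, and the 25 ms category boundary are all assumptions introduced here, and real boundaries vary by language, place of articulation, and listener.

```python
def voice_onset_time_ms(release_burst_ms: float, voicing_onset_ms: float) -> float:
    """Return VOT: onset of vocal cord vibration minus stop release burst.

    Positive values mean voicing lags the release (as described for /p/);
    values near zero mean voicing is roughly simultaneous with release
    (as described for /b/); negative values mean voicing precedes release.
    """
    return voicing_onset_ms - release_burst_ms


def classify_stop(vot_ms: float, boundary_ms: float = 25.0) -> str:
    """Label a word-initial stop by VOT.

    boundary_ms is a hypothetical threshold chosen for illustration only.
    """
    return "voiceless" if vot_ms > boundary_ms else "voiced"


if __name__ == "__main__":
    # Hypothetical timestamps (ms from the start of a recording).
    vot = voice_onset_time_ms(release_burst_ms=100.0, voicing_onset_ms=160.0)
    print(vot, classify_stop(vot))
```

Keeping the boundary as a parameter rather than a constant reflects that the category boundary is an empirical quantity, not part of the VOT definition itself.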


Journal ArticleDOI
TL;DR: In this paper, a series of perception experiments were conducted to determine if a brief stimulus in which only the spectral information at onset is preserved provides sufficient cues for identification of place of articulation across vowel contexts, and if it does, to define further the nature and size of the spectral window.
Abstract: In this series of perception experiments, we have attempted (a) to determine if a brief stimulus in which only the spectral information at onset is preserved provides sufficient cues for identification of place of articulation across vowel contexts, and (b) if it does, to define further the nature and size of the spectral window. Subjects were randomly presented with synthetically produced stimuli consisting of a 5‐ or 10‐msec noise burst followed by a brief voiced interval containing three formant transitions with onset and offset characteristics appropriate to the consonants [b, d, g] in the environment of the vowels [a, i, u], as well as stimuli with steady second‐ and third‐formant transitions. The length of the voiced interval was systematically varied from 40 to 5 msec. The results indicate that an onset spectrum consisting of the burst plus the initial 5 or 10 msec of voicing provides sufficient cues for the identification of the stop consonant, and that vocalic information can be reliably derived from these brief stimuli containing only one or two glottal pulses. [Research approved by an NIH grant.]

252 citations


Journal ArticleDOI
TL;DR: Results of this analysis showed a developmental pattern of change primarily for the voiceless stops in the form of increased correspondence between perceptual identification categories and production VOT values.
Abstract: The acoustic cue voice onset time (VOT) was used to study development of the voicing contrast in 10 two-year-old children, 10 six-year-old children, and 20 adults. Thirty utterances of the words be...

151 citations


Journal ArticleDOI
TL;DR: This article found that discrimination across a boundary was better than discrimination between stimuli that were both on one side of the boundary, and there was generalization of the voiced-voiceless distinction from labial to velar syllables.
Abstract: Monkeys were presented with synthetic speech stimuli in a shock-avoidance situation. On the basis of their behavior, perceptual boundaries were determined along the physical continua between /ba/ and /pa/, and /ga/ and /ka/, that were close to the human boundaries between voiced and voiceless consonants. As is the case with humans, discrimination across a boundary was better than discrimination between stimuli that were both on one side of the boundary, and there was generalization of the voiced-voiceless distinction from labial to velar syllables. Unlike humans, the monkeys showed large shifts in boundary when the range of stimuli was varied.

86 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigated the production of the word-initial stop consonants /p t k/ and /b d g/ in the speech of 37 deaf and six normally-hearing adolescents.

56 citations


Journal ArticleDOI
TL;DR: The apparent discontinuity between the earlier sound classes and the later‐appearing comfort sounds may derive from the fact that a new combination of features is formed in comfort sound, each feature deriving from a previously existing sound class.
Abstract: There is said to be a discontinuity in development of vocalization at the time when comfort sounds (cooing) first emerge. In the present study, vocalizations produced by two normal female infants before and after the emergence of comfort sounds were studied. The vocalizations were first classified as cry, discomfort, vegetative (coughing, burping), and, from 6 to 8 weeks of age onwards, comfort sounds. Examples from each class were selected randomly for auditory and spectrographic analysis. The features of voicing, breath direction, and vowel‐like vs consonant‐like were documented. The two infants differed little from one another with respect to these features. Cry and discomfort sounds were both predominantly vowel‐like, voiced and egressive. Vegetative sounds were predominantly consonant‐like, voiceless and ingressive. The features voiced and egressive were incorporated with the feature consonant‐like in comfort sounds. Thus, the apparent discontinuity between the earlier sound classes and the later‐appearing comfort sounds may derive from the fact that a new combination of features is formed in comfort sound, each feature deriving from a previously existing sound class. This new combination coincides with emergence of the ability to express pleasure. [Work supported by NINDS.]

33 citations


Journal ArticleDOI
TL;DR: In this article, a study was conducted to determine the effect of coarticulation on the perception of consonant and vowel features and to compare the magnitude of R to L and L to R co-articulatory effects.

31 citations


Journal ArticleDOI
TL;DR: Results indicate that there exist feature detector mechanisms that are tuned to respond to the information specifying phonetic feature values best in particular acoustic environments, but that the extent of this selective tuning is limited.

20 citations


Proceedings ArticleDOI
12 Apr 1976
TL;DR: The Speech Communications Group at SPERRY UNIVAC Defense Systems is developing a linguistically-oriented procedure for recognizing words, phrases, and natural sentences by computer that will be extended from the recognition of several-word noun phrases to the understanding of more natural sentences.
Abstract: The Speech Communications Group at SPERRY UNIVAC Defense Systems is developing a linguistically-oriented procedure for recognizing words, phrases, and natural sentences by computer. The major components of the current speech recognition system perform acoustic and phonetic analysis, phonetic segmentation, and lexical matching and scoring. The acoustic processing is based on a linear-predictive spectral analysis of the speech signal. Sounds are classified by manner, place, and voicing using formant frequencies and other spectral functions, as well as information about syllable boundaries and nuclei. A linear sequence of analysis segments is created, and matched against the lexicon using a scoring matrix that ranks analysis-lexical segment pairs by their expected confusions. Word sequences are progressively formed and ranked against the entire input to determine the most likely phrases spoken. When the recognition system was tested on a 31-word vocabulary from two male speakers, single word recognition scores of 95% correct were obtained when the task syntax was used. Preliminary results for recognizing connected word sequences from three male speakers range from 54 to 74% for a task with constrained word order. Current plans for enhancing the recognition system include the incorporation of components for phonological rules, speaker normalization, and prosodic guidelines. By adding more powerful procedures for syntactic and semantic analysis, the system will be extended from the recognition of several-word noun phrases to the understanding of more natural sentences.

7 citations


Journal ArticleDOI
TL;DR: In this paper, a feature detector system sensitive to VOT was investigated with a binaural adaptation-dichotic testing paradigm, where a nonboundary voiced stop was paired with one of a set of voiceless stops.
Abstract: Certain properties of a feature detector system sensitive to VOT were investigated with a binaural adaptation‐dichotic testing paradigm. On each dichotic test trial, a nonboundary voiced stop was paired with one of a set of voiceless stops. The voiceless stops varied in VOT, from values close to the phonetic boundary to values well within the voiceless category. The relative effectiveness of each of the voiceless stimuli in competing for processing with the voiced stimulus was assessed before adaptation and after adaptation with voiceless stops with a range of VOT values. During both pre‐ and postadaptation, the number of correct voicing responses when targeting for the voiceless stop, and the number of voicing intrusions when targeting for the voiced stop, systematically varied as a function of the VOT value of the voiceless stimulus. In addition, the VOT value of the adapting stimulus determined the amount of adaptation obtained. These results indicate first, that the output of the detector is a graded ...

6 citations


Journal ArticleDOI
Abstract: Spectrographic measurements were made of voice onset time for 161 tokens of Spanish voiceless stops from one individual's conversational speech in an attempt to replicate laboratory findings on VOT. In running Spanish, occlusive allophones of voiced‐stop phonemes occur only in absolute initial position and after nasal consonants. In other environments, voiced‐stop phonemes are phonetically voiced fricatives. Therefore the contrast between voiced‐ and voiceless‐stop phoneme categories is maintained not only by the presence or absence of voicing, but also by the presence of frication (voiced phonemes) or its absence (i.e., closure for voiceless phonemes). Due to the limited distribution of voiced‐stop allophones, these are scarcely represented in the present sample, and their VOT values are not reported. In spite of VOT's “reduced” work load, however, the VOT values for voiceless stops in this study conform quite closely to earlier observations of production and perception of Spanish stop phonemes. VOT values for voiceless stops in utterance‐initial and post‐nasal positions are not significantly different from those for stops in other positions.

Journal ArticleDOI
TL;DR: In this paper, the authors examine the acoustic cues that contribute to the perception of the voicing difference in /zi/ and /si/; rather than simply varying the acoustic signal along a single dimension and observing the effect on perception, they covaried changes along two acoustic dimensions in a factorial manner.
Abstract: The acoustic cues that contribute to the perception of the voicing difference in /zi/ and /si/ were the focus of the present experiments. Rather than simply varying the acoustic signal along a single dimension and observing the effect on perception, changes along two acoustic dimensions were covaried in a factorial manner. The time between the onset of the syllable and the onset of vocal‐cord vibration called voice‐onset time (VOT) was covaried with the fundamental frequency (F0). Observers were asked to indicate where each stimulus fell on a scale from /zi/ to /si/. The results showed that both VOT and F0 contribute to the perception of voicing. Sounds were judged as more /zi/‐like with decreases in VOT and with decreases in the F0. The frequency contour of F0 during the syllable had no effect beyond that accounted for by the frequency of F0 at the onset of vocal cord vibration. Other experiments showed that the role of F0 could not be attributed to the possibility that there was less energy at the first...
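
The factorial covariation of VOT and F0 described in the entry above amounts to crossing every level of one dimension with every level of the other. The sketch below shows that crossing only; the function name and the specific VOT and F0 levels are hypothetical, since the abstract does not state the values used.

```python
from itertools import product


def factorial_design(vot_levels_ms, f0_levels_hz):
    """Cross every VOT level with every F0 level (full factorial design).

    Returns one stimulus specification per (VOT, F0) combination.
    """
    return [
        {"vot_ms": vot, "f0_hz": f0}
        for vot, f0 in product(vot_levels_ms, f0_levels_hz)
    ]


if __name__ == "__main__":
    # Hypothetical levels, for illustration only.
    stimuli = factorial_design([0, 20, 40], [100, 120])
    for stimulus in stimuli:
        print(stimulus)
```

With three VOT levels and two F0 levels this yields six stimuli, so the contribution of each cue can be assessed at every level of the other.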

Journal ArticleDOI
TL;DR: In this paper, a forced-choice intelligibility test was conducted on nine esophageal and three normal speakers reading lists of stop and fricative-initial consonant-vowel nonsense syllables.
Abstract: Tape recordings of nine esophageal and three normal speakers reading lists of stop‐ and fricative‐initial consonant‐vowel nonsense syllables were judged by eighteen listeners in a forced‐choice intelligibility test. Esophageal stops and fricatives were significantly less intelligible than the normal productions. Error analysis of esophageal stops revealed 78.4% cognate voicing errors, the majority of errors occurring with voiceless stops. Measures of voice onset time (VOT) showed insufficient lag for the esophageal voiceless stops. In addition, post‐burst aspiration was negligible for these voiceless stops. Burst amplitude was significantly higher for voiceless apical and velar stops than for their voiced cognates. Analysis of esophageal fricative errors revealed 59.2% cognate voicing errors. Proportion of voicing in steady state frication was calculated for all productions. Correct perception of voicing was found to be primarily a function of this measure. Using this proportion, 82% of the voicing judgments could be predicted. [This investigation was supported by NINCDS research grant no. NS 08041 and HRA predoctoral training grant no. MCT‐000202‐21‐0.]

Journal ArticleDOI
TL;DR: In this paper, a 5½-year-old boy who substituted voiced for voiceless initial stops and /d/ for initial fricatives was tested for speech sound identification and discrimination.
Abstract: Tests of speech sound identification and discrimination were administered to a 5½‐year‐old boy who substituted voiced for voiceless initial stops and /d/ for initial fricatives. Stimuli were taken from the Abramson‐Lisker VOT series for bilabial, alveolar, and velar stops. Discrimination (as measured by a verbal “same”‐“not the same” task) was well above chance for pairs of speech sounds which spanned the typically found voicing boundary for adult listeners. Identification (as measured by a previously trained picture‐pointing task) was, however, random. Normal controls were near ceiling on both tasks. Three months later, following considerable improvement in the child's production of initial voiceless stops, identification tests were repeated. The child's identification had improved only to a level of approximately 65% correct. Possible reasons for the discrepancy between the child's levels of performance on the two types of tasks are discussed. [Work supported by NIH.]

Journal ArticleDOI
TL;DR: It is shown that in final position in Hindi stops, such a neutralization between unaspirated and aspirated stops was absent, and the intelligibility of aspirated sounds in Hindi was the highest on the perception scale while unaspirated sounds had a lower rank.
Abstract: The purpose of this study is to examine the predictive power of recent theories of aspiration. These theories (lag of voicing) predict that aspirated sounds will be perceived as unaspirated in word-final position, as in the case of Korean stops. Contrary to the predictions of these theories, our experiment showed that in final position in Hindi stops, such a neutralization between unaspirated and aspirated stops was absent. What is even more interesting is that the intelligibility of aspirated sounds in Hindi was the highest on the perception scale while unaspirated had a lower rank.

Journal ArticleDOI
TL;DR: The authors report shifts in the phoneme boundary along synthetic voicing and place of articulation continua as a function of stimulus range, and consider the implications for the interpretation of adaptation boundary shifts.
Abstract: Brady and Darwin (see Darwin, C.J. The Perception of Speech, in E.C. Carterette and M.P. Friedman, Handbook of Perception, Academic, New York (in press), Vol. 7) report shifts in the phoneme boundary along a synthetic voicing continuum as a function of the range of stimuli presented within a test. The present study reports comparable shifts along synthetic voicing and place of articulation continua. Implications for the interpretation of adaptation boundary shifts are considered.

Journal ArticleDOI
TL;DR: This paper examined vowel duration as a consonant voicing cue in naturally produced speech and found no clear relation between the length of the preceding vowel and the perception of the voicing characteristic of the final consonant.
Abstract: Several studies state that the duration of the vowel preceding the consonant is a significant cue to the voicing characteristic of that consonant. The CVC's used in these studies have generally been prepared on the Pattern Playback. The present study was designed to examine vowel duration as a consonant voicing cue in naturally produced speech. Several CVC's were recorded by both authors as stressed citation forms. Vowel durations were measured and parts were removed from: the endmost section of the vowel, the middle of the vowel, or the end and the middle of the vowel. At least for these stimuli produced by these speakers, no clear relation is seen between the length of the preceding vowel and the perception of the voicing characteristic of the final consonant.

Journal ArticleDOI
TL;DR: The authors found that prevoicing can be a sufficient voicing cue for Spanish listeners when it alone provides positive voicing information in synthetic speech, but that it is not a necessary cue in real speech, where other properties must supply positive voicing information in its absence.
Abstract: Prevoicing, or glottal vibration preceding articulatory release, characterizes only the voiced member of a contrasting voiced‐voiceless, word‐initial stop consonant pair in Spanish. Is prevoicing, then, a sufficient or necessary voicing cue for the Spanish listener? Eight monolingual Peruvian‐Spanish listeners labeled either "voiced" or "voiceless" synthetically produced, syllable‐initial stops for which voice‐onset time (VOT) varied from 40 msec of prevoicing to 40 msec of voicing lag. Seven out of eight listeners divided the series into voiced and voiceless portions within the prevoiced region and for six at least 15 msec of prevoicing was required for "voiced" judgments greater than 75% of the time, suggesting that prevoicing can be a sufficient voicing cue when only that provides positive voicing information in Spanish using synthetic speech. Prevoicing was edited from naturally‐produced, Spanish, word‐initial voiced stops and presented for identification with unedited originals and voiceless minimal pairs to eight Peruvian‐Spanish monolinguals. Removing prevoicing did not consistently induce "voiceless" judgments of edited words, suggesting that prevoicing is not a necessary voicing cue in real speech. Other properties must supply positive voicing information in its absence.

Journal ArticleDOI
TL;DR: In this article, peak intraoral air pressure values were obtained from 16 normal adult speakers for the phonemes /p/ and /b/ in syllable-initial and syllable-final word positions, embedded in a sentence, and a voiceless/voiced cognate ratio was computed from the mean pressure values for each subject.
Abstract: Peak intraoral air pressure values were obtained from 16 normal adult speakers for the phonemes /p/ and /b/ in syllable-initial and syllable-final word positions, embedded in a sentence. A voiceless/voiced cognate ratio, as an expression of differential aerodynamic impedance, was computed from the mean pressure values for each subject. The results suggest that the mechanisms and structures responsible for the occurrence of voicing appear to function differently when initiating a word than when terminating one.
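
The voiceless/voiced cognate ratio in the entry above is described as being computed from each subject's mean pressure values. A minimal sketch of that arithmetic follows, assuming the ratio is the mean peak pressure for /p/ divided by the mean peak pressure for /b/ within one word position; the function name, the units, and the sample values are hypothetical and do not come from the paper.

```python
from statistics import mean


def cognate_ratio(voiceless_pressures, voiced_pressures):
    """Ratio of mean peak intraoral pressure: voiceless (/p/) over voiced (/b/).

    Both arguments are repeated measurements from one subject in one
    word position. The unit (for example cm H2O) is an assumption here
    and cancels out in the ratio.
    """
    return mean(voiceless_pressures) / mean(voiced_pressures)


if __name__ == "__main__":
    # Hypothetical measurements for one subject, syllable-initial position.
    print(cognate_ratio([6.0, 8.0], [5.0, 5.0]))
```

Computing the ratio separately for syllable-initial and syllable-final tokens is what allows the two positions to be compared, as the abstract's conclusion does.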