
Showing papers on "Voice" published in 1974


Journal ArticleDOI
TL;DR: The experiments with synthetic speech compare the role of VOT and the presence or absence of a significant formant transition following voicing onset as cues for the voiced‐voiceless distinction and indicate that there is a significant trading relationship between these two cues.
Abstract: Previous research on acoustic cues responsible for the voiced‐voiceless distinction in prestressed English plosives has emphasized the importance of voicing onset time with respect to plosive release (VOT). Voiced plosives in English normally have a short VOT (less than 20–30 msec) and a significant formant transition is present following voice onset. Voiceless plosives in prestressed position, on the other hand, have relatively long VOT's (greater than about 50 msec) and the formant transitions are essentially completed prior to voice onset. Our experiments with synthetic speech compare the role of VOT and the presence or absence of a significant formant transition following voicing onset as cues for the voiced‐voiceless distinction. The data indicate that there is a significant trading relationship between these two cues. The presence or absence of a rapid spectral change following voice onset produces up to 15‐msec change in the location of the perceived phoneme boundary as measured in terms of absolute VOT. One can speculate that the auditory system may be predisposed to detect the presence or absence of a rapid spectrum change as a general property of acoustic inputs. If this is the case, then the acquisition of the voiced‐voiceless distinction in infants may be conditioned initially by the presence or absence of this property at the onset of voicing rather than by absolute VOT.

311 citations


Journal ArticleDOI
TL;DR: Noncategorical perception of the voicing distinction, reflected by an improvement in discrimination within phonetic categories, was obtained for the group of listeners who experienced both the sequential identification procedure and the 4IAX discrimination test.
Abstract: Native speakers of English identified and then discriminated between stimuli which varied in voice onset time (VOT). One group of listeners identified a randomized sequence of stimuli; another group identified an ordered sequence of stimuli, in which stimuli from the VOT continuum were presented in a consecutive order. Half of the Ss in each group then received one of two discrimination formats: the ABX discrimination test in which X was identified with A or with B, or 4IAX test of paired similarity in which two pairs of stimuli—one pair always the same and one pair always different—were presented on each trial. Noncategorical perception of the voicing distinction, reflected by an improvement in discrimination within phonetic categories, was obtained for the group of listeners who experienced both the sequential identification procedure and the 4IAX discrimination test. The results are interpreted as providing evidence for separate auditory and phonetic levels of discrimination in speech perception.

182 citations


Journal ArticleDOI
TL;DR: This study was designed to examine the status of voice onset time (VOT) in identification and production of word‐initial voiced and voiceless labial, apical, and velar stop consonants for 20 English‐speaking adults.
Abstract: This study was designed to examine the status of voice onset time (VOT) in identification and production of word‐initial voiced and voiceless labial, apical, and velar stop consonants for 20 English‐speaking adults. Synthetic speech stimuli were constructed for four continua including BEES/PEAS, BEAR/PEAR, DIME/TIME, and GOAT/COAT. VOT values for 30 productions of the same words examined in the identification task were determined from spectrographic measurements. Analyses of the perceptual data revealed significant differences among labial, apical, and velar stops for the VOT 50% crossover and lower and upper limits of the phoneme boundary, but not for boundary width. In production of voiced and voiceless stops, reliable differences for mean VOT were shown between all cognates and among places of articulatory constriction within voicing category. The latter finding was primarily related to the labial/velar comparisons. Variability among individual speakers was demonstrated in the percentage of voiced stops associated with VOT values in the lead portion of the continuum, whereas all subjects evidenced productions in the short lag range. Comparisons between identification and production demonstrated high consistency for VOT characteristics in that few productions coincided with the perceptual phoneme boundary or contrasting voicing category.

107 citations
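The "VOT 50% crossover" reported in the abstract above can be illustrated with a short sketch. This is not the authors' procedure, which the abstract does not describe. It assumes the crossover is estimated by linear interpolation between the two adjacent continuum steps that bracket 50% "voiceless" responses. The function name `crossover_50` and the data values are hypothetical.

```python
def crossover_50(vot_ms, pct_voiceless):
    """Estimate the VOT (ms) at which 'voiceless' responses reach 50%.

    Assumes vot_ms is sorted ascending and pct_voiceless rises with VOT.
    Uses linear interpolation between the two bracketing continuum steps.
    """
    points = list(zip(vot_ms, pct_voiceless))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= 50.0 <= y1 and y0 != y1:
            return x0 + (50.0 - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("identification function never crosses 50%")


# Hypothetical identification data for one continuum (not from the paper).
vot_steps = [0, 10, 20, 30, 40]       # VOT in ms
voiceless = [0, 10, 40, 80, 100]      # percent 'voiceless' responses
boundary = crossover_50(vot_steps, voiceless)  # -> 22.5
```

Interpolation is only one possible estimator. Fitting a logistic or probit function to the identification data is a common alternative, and it also yields a boundary width.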


Journal ArticleDOI
TL;DR: Two models of the interaction of phonetic features in speech perception were used to predict Ss' identification functions for a bi-dimensional series of synthetic CV syllables. The results suggest that the phonetic features of place and voicing in stop consonants are not processed independently but show a mutual dependency.

34 citations


Journal ArticleDOI
TL;DR: The authors found that the adaptation effects are partly attributable to the presence or absence of formant transitions after voicing onset, rather than to VOT per se. The results may reflect adaptation of a detector sensitive to a weighted combination of two hypothetical cues: VOT and the duration of voiced transitions.

33 citations


Journal ArticleDOI
TL;DR: This paper studied the durations of initial and final consonant clusters in monosyllabic and bisyllabic words within a frame sentence from spectrograms of readings by three speakers.
Abstract: The durations of initial and final consonant clusters in monosyllabic and bisyllabic words within a frame sentence were studied from spectrograms of readings by 3 speakers. The durations of consonants within a cluster varied with the features of the consonant and its phonetic environment, such as voicing and point and manner of articulation. A durational model was proposed based on two mechanisms. An articulatory mechanism was attributed to effects involving coarticulation and restrictions in the motion of the tongue and lips during a cluster. Shortening of consonants in clusters seemed to arise from the shorter distances that the articulators travel in clusters. Lengthening of consonants before fricatives and voiced consonants and aspiration effects were noted. The other factor was a phonological mechanism, related to the use of duration as an acoustic cue in consonant perception. Single consonants varying only in the voicing characteristic had substantial durational differences which could aid in distinguishing them. However, phonological restrictions, such as common voicing among stops and fricatives, arise in the clusters, and the redundancies allow the durational differences to become less.

31 citations


Journal ArticleDOI
TL;DR: Two major acoustical correlates of voicing, previously conflated in the Voice Onset Time concept, are examined: the time from the burst to the first voicing pulse, and the relative position in the formant transition at which voicing begins, measured as the amount of voiced first formant (F1) transition.

28 citations


Journal ArticleDOI
TL;DR: Identification of CV syllables was studied in a backward masking paradigm in order to examine two types of interactions observed between dichotically presented speech sounds: the feature sharing effect and the lag effect.

18 citations


PatentDOI
TL;DR: In this article, the channel vocoder is modified to permit simultaneous transmission of both voiced and unvoiced portions of a speech sound by connecting the pitch source to the lowest channel at all times, and including an additional decision circuit to independently transmit the voiced/unvoiced decision.
Abstract: An improved technique and apparatus for increasing the quality of voice transmission through vocoders, particularly channel vocoders. The vocoder is modified to permit the simultaneous transmission of both voiced and unvoiced portions of a speech sound by connecting the pitch source to the lowest channel at all times, and including an additional decision circuit to independently transmit the voiced/unvoiced decision. With this connection, it is possible to reproduce sounds such as the voiced fricatives which contain simultaneous voiced and unvoiced portions. In addition, the threshold for voicing can be adjusted for less sensitivity while providing for low level voicing such as nasal murmurs and voice bars associated with stop consonants, or plosive unvoicing.

17 citations


Journal ArticleDOI
R. Fink
TL;DR: Responses indicate that the child's perceptual categorization of these phones undergoes a change during his internalization of the rules of English orthography, i.e. at the age of 7 or 8, and the value of the feature of voicing assigned to these stops may not be the same for all places of articulation.
Abstract: Stimuli containing voiced or voiceless stops occurring after [s] were presented in the form of a spelling test to child and adult subjects. Responses indicate that the child's perceptual categorization of these phones undergoes a change during his internalization of the rules of English orthography, i.e. at the age of 7 or 8. Results also suggest that the value of the feature of voicing assigned to these stops may not be the same for all places of articulation.

6 citations


Journal ArticleDOI
TL;DR: The paper describes an economical (< 600-dollar) hardware realization of a 4-kHz digital linear predictive speech synthesizer that requires, at most, a CPU overhead of about 40 percent real time. A hybrid synthesis procedure permits the use of formant concatenation techniques and reduces the coefficient storage required to specify vowels/voiced consonants by about 60 percent.
Abstract: Speech analysis/synthesis algorithms utilizing linear prediction coefficients have certain advantages over those employing formant-based techniques. For example, 4-kHz speech samples may be synthesized using a basic sequence of 10 multiply/adds followed by a single addition of the current sample of the excitation function. Real-time software synthesis of 4-kHz speech is possible (using this technique) on certain 16-b minicomputers, but the central processing unit (CPU) overhead may approach 100 percent. We describe an economical (< 600-dollar) hardware realization of a 4-kHz digital linear predictive speech synthesizer which requires, at most, a CPU overhead of about 40 percent real time. The device is constructed of standard TTL/MOS logic and consists (essentially) of a high speed 2's complement multiplier/adder capable of calculating a 26-b product (10-b speech samples, 16-b coefficients) in 0.33 μs, and a dual shift register. In addition, a procedure is discussed which enables the device to be used both as a formant synthesizer for vowels or voiced consonant production, and as a predictive synthesizer for other speech sounds. This procedure, hybrid synthesis, permits the utilization of formant concatenation techniques and reduces the coefficient storage required to specify vowels/voiced consonants by about 60 percent.
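The "10 multiply/adds followed by a single addition of the current sample of the excitation function" in the abstract above can be sketched as a tenth-order all-pole recursion. This is a minimal floating-point illustration, not the paper's fixed-point hardware design. The function name `lpc_synthesize`, the sign convention s[n] = e[n] + sum of a[k]·s[n-k], and the coefficient values are assumptions.

```python
def lpc_synthesize(excitation, coeffs):
    """All-pole linear predictive synthesis.

    Each output sample is the weighted sum of the previous len(coeffs)
    outputs (one multiply/add per coefficient) plus the current
    excitation sample. The sign convention is an assumption.
    """
    history = [0.0] * len(coeffs)  # most recent output first
    output = []
    for e in excitation:
        acc = 0.0
        for a, past in zip(coeffs, history):
            acc += a * past        # one multiply/add per coefficient
        sample = acc + e           # single addition of the excitation
        output.append(sample)
        history = [sample] + history[:-1]
    return output


# Hypothetical 10-coefficient predictor driven by a unit impulse.
coeffs = [0.5] + [0.0] * 9
impulse_response = lpc_synthesize([1.0, 0.0, 0.0, 0.0], coeffs)
# -> [1.0, 0.5, 0.25, 0.125]
```

With 10 coefficients, the inner loop performs exactly 10 multiply/adds per output sample, matching the operation count in the abstract. The hardware described there performs the same recursion with 10-b samples and 16-b coefficients.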

Journal ArticleDOI
TL;DR: In this article, the authors performed identification and discrimination tasks with synthetic speech stimuli varying in equal steps of voicing onset time (VOT) from voicing lead to voicing lag. Measurements of VOT were taken from spectrograms of initial voiced and voiceless labial stops.
Abstract: Subjects performed in identification and discrimination tasks with synthetic speech stimuli varying in equal steps of voicing onset time (VOT) from voicing lead to voicing lag. Measurements of VOT were taken from spectrograms of initial voiced and voiceless labial stops. Average identification crossover and discrimination maxima differed significantly for the two monolingual groups. Within each group, values of VOT for voiced and voiceless tokens in production straddled the group location of identification crossover and discrimination maximum. Identification crossover matched discrimination maximum for individual bilinguals, but these locations varied across individuals, distributing themselves between the monolingual Spanish and English values. There was no significant difference in the values of VOT for voiced and voiceless tokens in Spanish words spoken by bilinguals and Spanish monolinguals. However, there was a significant difference in values of VOT for voiced and voiceless tokens in Spanish words s...

Journal ArticleDOI
TL;DR: A new CVC multiple-choice articulation test was developed for use in phoneme and feature confusion analyses. The test was given to a sample of hypacusic listeners and was applied in low-pass filtering experiments with normal-hearing listeners.
Abstract: A new CVC multiple‐choice articulation test was developed for use in phoneme and feature confusion analyses. The phoneme differentiation test consists of 200 test items: four tokens each of 22 initial consonant contrasts, 13 final consonant contrasts, and 15 syllable nucleus contrasts. The major difference between the phoneme differentiation test and previous closed‐response tests is that for each stimulus word the alternatives were designed to provide contrasts in only one phonemic feature at a time. Thus the set of contrasts for each stimulus is different from the set of contrasts for every other stimulus. Voicing, nasality, manner, and place are contrasted for consonant stimuli. Place, openness, and intrinsic duration are contrasted for vowel stimuli. The test was given to a sample of hypacusic listeners, and was applied in low‐pass filtering experiments with normal‐hearing listeners. The findings for both clinical and experimental groups are in general agreement with those of previous studies. Several...

01 Jan 1974
TL;DR: This article analyzed the perceptual pattern of Hindi aspirated consonants as spoken and recognized by native speakers and examined the predictive role of phonetic sciences in the light of the recent theory of aspiration propounded by Kim.
Abstract: The aims are (a) to analyze the perceptual pattern of Hindi aspirated consonants as spoken and recognized by native speakers; (b) to examine the predictive role of phonetic sciences in the light of the recent theory of aspiration propounded by Kim (1970); and (c) to determine the phonological phenomena involving aspiration in Hindi. In Hindi, as in other Indo-Aryan languages, the consonant system has many contrasts involving aspiration and voicing, as is shown in the following table. (Table headings: Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated, Voiced Aspirated; row: Stops.)