Showing papers on "Voice" published in 1978

Journal Article•DOI•

Perceptibility of phonetic features in fluent speech.

Ronald A. Cole, Jola Jakimik, William E. Cooper

01 Jul 1978-Journal of the Acoustical Society of America

TL;DR: A series of experiments examined listeners' ability to detect mispronounced words in a short story and showed that prestressed word-initial stop consonants are more perceptible than other consonants.

Abstract: A series of experiments examined listeners’ ability to detect mispronounced words in a short story. Mispronunciations were produced by changing a single consonant segment in a word to produce a (phonologically permissible) nonsense word. The results of six different experiments showed that prestressed word‐initial stop consonants are more perceptible than other consonants. For example, mispronunciations produced by changing the voicing of a word‐initial stop (e.g., “boy” to “poy”) were detected about 70% of the time, while changes in voicing of a word‐initial fricative (e.g., “voice” to “foice”) were detected about 38% of the time. Mispronunciations produced by changing the place of articulation of a prestressed word‐initial stop were most detectable of all (80% to 90% detection) for three different speakers. A change in place of articulation of a word‐initial stop (e.g., “baby” to “daby”) was detected as often as a change in both place of articulation and voicing (e.g., “baby” to “taby”). Finally, it was found that a mispronunciation was detected about twice as often in word‐initial as in word‐final position in one‐syllable words for both stops and nasals. The results suggest that listeners pay special attention to word‐initial stop consonants in natural continuous speech.

103 citations

Journal Article•DOI•

A mixed‐source model for speech compression and synthesis

John Makhoul, R. Viswanathan, Richard Schwartz, A. W. F. Huggins

01 Dec 1978-Journal of the Acoustical Society of America

TL;DR: In this article, an excitation source model for speech compression and synthesis is presented that allows the degree of voicing to be varied continuously by mixing voiced and unvoiced excitations in a frequency-selective manner.

Abstract: This paper presents an excitation source model for speech compression and synthesis that allows the degree of voicing to be varied continuously by mixing voiced (pulse) and unvoiced (noise) excitations in a frequency‐selective manner. The mix is achieved by dividing the speech spectrum into two regions, with the pulse source exciting the low‐frequency region and the noise source exciting the high‐frequency region. The degree of voicing is specified by a parameter Fc, which corresponds to the cut‐off frequency between the voiced and unvoiced regions. For speech compression applications, Fc can be extracted automatically from the speech spectrum and transmitted. Experiments performed with the new model indicate its power in synthesizing natural sounding voiced fricatives and in largely eliminating the “buzzy” quality of vocoded speech. A functional definition of buzziness and naturalness is given in terms of the model.
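
The two-region split described in this abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function name `mixed_excitation`, its parameters, and the use of sums of sinusoids (harmonics of a pitch `f0` below the cut-off, random-phase components above it) are all assumptions made for illustration. A real vocoder would more plausibly filter a pulse train and a noise source.

```python
import math
import random


def mixed_excitation(n_samples, fs, f0, fc, n_noise=32, seed=0):
    """Hypothetical sketch of a mixed-source excitation.

    Below the cut-off fc the signal is built from harmonics of f0 (standing
    in for the pulse source); above fc it is built from random-phase
    sinusoids (standing in for the noise source).
    """
    rng = random.Random(seed)
    nyquist = fs / 2.0
    # Voiced region: harmonics of f0 that lie at or below the cut-off.
    n_harm = int(min(fc, nyquist) // f0)
    harmonics = [k * f0 for k in range(1, n_harm + 1)]
    # Unvoiced region: random components between fc and the Nyquist frequency.
    noise = []
    if fc < nyquist:
        noise = [(rng.uniform(fc, nyquist), rng.uniform(0.0, 2.0 * math.pi))
                 for _ in range(n_noise)]
    signal = []
    for n in range(n_samples):
        t = n / fs
        voiced = sum(math.cos(2.0 * math.pi * f * t) for f in harmonics)
        unvoiced = sum(math.cos(2.0 * math.pi * f * t + p) for f, p in noise)
        signal.append(voiced + unvoiced)
    return signal, harmonics, noise


# Fc = 1000 Hz: harmonics of a 100 Hz pitch up to 1 kHz, noise-like above.
signal, harmonics, noise = mixed_excitation(160, fs=8000, f0=100, fc=1000)
```

In this sketch, setting `fc` to 0 gives a fully unvoiced excitation and setting it to the Nyquist frequency gives a fully voiced one, which mirrors the abstract's idea of a continuously variable degree of voicing.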

83 citations

Journal Article•DOI•

Temporal coordination of phonation and articulation in a case of verbal apraxia: A voice onset time study ☆

Frances J. Freeman¹, Elaine S. Sands¹, Katherine S. Harris¹•Institutions (1)

City University of New York¹

01 Jul 1978-Brain and Language

TL;DR: Investigation of voice onset time in stop production demonstrated that the VOTs of the apraxic subject differed markedly from those of normal subjects, yielding a compression of the two categories and a marked overlap.

82 citations

Journal Article•DOI•

Investigating the MESA (Multipoint Electrotactile Speech Aid): The transmission of segmental features of speech

David W. Sparks, Patricia K. Kuhl, Alice E. Edmonds, Gary P. Gray

01 Jan 1978-Journal of the Acoustical Society of America

TL;DR: Analysis of the data demonstrates that the tactile transform enables receivers to achieve excellent recognition of vowels in CVC context and the consonantal features of voicing and nasality, which leads to recognition performance in the combined condition (visual plus tactual) which far exceeds either reception condition in isolation.

Abstract: Four normal‐hearing young adults have been extensively trained in the use of a tactile speech‐transmission system. Subjects were tested in the recognition of various phonetic elements, including vowels and stop, nasal, and fricative consonants, under three receiving conditions: visual reception alone (lipreading), tactile reception alone, and tactile plus visual reception. Subjects were artificially deafened using earplugs and white noise, and all speech tokens were presented live voice. Analysis of the data demonstrates that the tactile transform enables receivers to achieve excellent recognition of vowels in CVC context and the consonantal features of voicing and nasality. This, in combination with high recognition of vowels and the consonantal feature place of articulation through visual reception, leads to recognition performance in the combined condition (visual plus tactual) which far exceeds either reception condition in isolation.

73 citations

Journal Article•DOI•

The devoicing of voiced fricatives

Mark Haggard¹•Institutions (1)

University of Nottingham¹

01 Apr 1978-Journal of Phonetics

TL;DR: In this article, it is shown that the glottis is at least partially open at each position of articulation, but it is not established how much of this opening is cause and how much effect.

72 citations

Journal Article•

Effects of Place of Articulation and Vowel Environment on "Voiced" Stop Consonant Production.

Bruce L. Smith

01 Jan 1978-Glossa

61 citations

Proceedings Article•DOI•

A mixed-source model for speech compression and synthesis

John Makhoul¹, R. Viswanathan, Richard Schwartz, A. W. F. Huggins•Institutions (1)

BBN Technologies¹

10 Apr 1978

TL;DR: An excitation source model for speech compression and synthesis is presented, which allows for a degree of voicing by mixing voiced (pulse) and unvoiced (noise) excitations in a frequency-selective manner.

Abstract: This paper presents an excitation source model for speech compression and synthesis, which allows for a degree of voicing by mixing voiced (pulse) and unvoiced (noise) excitations in a frequency-selective manner. The mix is achieved by dividing the speech spectrum into two regions, with the pulse source exciting the low-frequency region and the noise source exciting the high-frequency region. A parameter Fc determines the degree of voicing by specifying the cut-off frequency between the voiced and unvoiced regions. For speech compression applications, Fc can be extracted automatically from the speech spectrum and transmitted. Experiments using the new model indicate its power in synthesizing natural sounding voiced fricatives, and in largely eliminating the "buzzy" quality of vocoded speech. A functional definition of buzziness and naturalness is given in terms of the model.

58 citations

Journal Article•DOI•

Voicing cues in English final stops

Catherine G. Wolf¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Oct 1978-Journal of Phonetics

TL;DR: The authors investigated the relation between the acoustic characteristics of final stop syllables and the perception of the voicing distinction and found that the formant transitions, closure, burst, and vowel duration are important in determining whether a stimulus is heard as voiced or voiceless.

57 citations

Journal Article•DOI•

Range effect in the perception of voicing

Susan A. Brady, C. J. Darwin

01 May 1978-Journal of the Acoustical Society of America

TL;DR: The location of the voicing boundary in the perception of initial stop consonants is shown to vary according to the range of voice-onset times used in a block of trials, and may provide a metric for assessing the auditory tolerance of phonological categories.

Abstract: The location of the voicing boundary in the perception of initial stop consonants is shown to vary according to the range of voice‐onset times used in a block of trials and according to the order in which blocks covering different ranges are presented. Although these range effects introduce methodological complications into the interpretation of adaptation experiments, they appear to be qualitatively different from adaptation effects and, it is suggested, may provide a metric for assessing the auditory tolerance of phonological categories.

49 citations

Journal Article•DOI•

In qualified defense of VOT.

Leigh Lisker¹•Institutions (1)

University of Pennsylvania¹

01 Oct 1978-Language and Speech

TL;DR: The VOT measure has been said to provide the single most nearly adequate physical basis for separating homorganic stop categories across a variety of languages, granted that other features may also be involved.

Abstract: The VOT measure has been said to provide the single most nearly adequate physical basis for separating homorganic stop categories across a variety of languages, granted that other features may also be involved. That transition duration affects perceived voicing of synthesized initial stops of one specific language, English, has suggested the hypothesis by Stevens and Klatt (1974) that a detector responsive to rapid formant-frequency shifts after voice onset better explains the child's acquisition of the contrast than does some mechanism which responds to VOT directly. If such a detector is part of our biological equipment, then it seems remarkably underutilized in language, for the hypothesis asserts that basic to voicing perception is whether laryngeal signal is or is not present during the interval in which the stop-vowel transition occurs. In effect, the “archetypical” voiceless stop is aspirated. Not only do many languages not possess voiceless aspirates, but even in English aspiration is severely res...

38 citations

Journal Article•DOI•

Stimulus dominance and ear dominance in the perception of dichotic voicing contrasts.

Bruno H. Repp¹•Institutions (1)

Haskins Laboratories¹

01 May 1978-Brain and Language

TL;DR: The range and reliability of the laterality effects obtained, as well as certain other methodological features, make the present tests promising as tools for assessing individual differences in ear dominance.

Journal Article•DOI•

Laryngeal control for voicing distinction in Japanese consonant production

Hajime Hirose, Tatsujiro Ushijima

01 Jan 1978-Phonetica

TL;DR: It was found that there were apparent reciprocal patterns in the posterior cricoarytenoid (PCA) and the interarytenoid (INT) in terms of significant negative correlation, and active control of PCA for voicelessness was demonstrated.

Abstract: The aim of the present study was to investigate the laryngeal adjustments for voiced versus voiceless distinction in Japanese consonant production by means of laryngeal electromyography (EMG) and fiberoptic observation. Multichannel EMG recordings were taken of a Japanese subject and the data were computer-processed to obtain the averaged activity patterns of the five intrinsic laryngeal muscles with special reference to the voicing distinction in consonant production in various phonetic environments. It was found that there were apparent reciprocal patterns in the posterior cricoarytenoid (PCA) and the interarytenoid (INT) in terms of significant negative correlation, and active control of PCA for voicelessness was demonstrated. The patterns of the thyroarytenoid and the lateral cricoarytenoid were different from that of INT even though these two muscles are usually classified as the members of the adductor group, and their activity levels were apparently influenced by the phonetic environment. A possible contribution of the cricothyroid (CT) to the voicing distinction was also pointed out but further investigations on acoustic parameters seem to be mandatory in more critical interpretation of CT activity in speech.

Journal Article•DOI•

Some substantive universals in atomic phonology

Daniel A. Dinnsen¹, Fern Marja Eckman²•Institutions (2)

Indiana University¹, University of Wisconsin–Milwaukee²

01 May 1978-Lingua

TL;DR: In this paper, the authors elucidate some of the principles governing cross-linguistic variation in such phonological processes as Terminal Devoicing and Intervocalic Voicing and show that the theory of atomic phonology provides a correct characterization of these processes and their associated constraints.

Journal Article•DOI•

Manner of vowel termination as a perceptual cue to the voicing status of postvocalic stop consonants

Donal O’Kane¹•Institutions (1)

University of Essex¹

01 Oct 1978-Journal of Phonetics

TL;DR: In this article, an experimental investigation of a hypothesized perceptual cue in the determination of the voicing status of postvocalic English stop consonants was conducted, and no empirical support for Parker's hypothesis was found.

Proceedings Article•DOI•

Votrax real time hardware for phoneme synthesis of speech

R. Gagnon¹•Institutions (1)

University of Rochester¹

10 Apr 1978

TL;DR: A real time phonetic voice synthesizer roughly the size of a small hi-fi amplifier has been developed that accepts a string of phoneme commands, each consisting of 8 bits, and simulates the transfer function of the human vocal tract.

Abstract: A real time phonetic voice synthesizer roughly the size of a small hi-fi amplifier has been developed. It accepts a string of phoneme commands, each consisting of 8 bits. 6 bits determine the phoneme uttered while 2 bits determine the inflection associated with that phoneme. The synthesizer contains an active filter network which simulates the transfer function of the human vocal tract. This analog network is excited by both voicing and fricative sound sources. The sound sources and the vocal tract filter transfer function are dynamically manipulated in response to the numerous phoneme command sequences to produce articulatory synthesis by rule.
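
As a rough illustration of the 8-bit command format this abstract describes (6 bits for the phoneme, 2 bits for the inflection), the following Python sketch packs and unpacks one command byte. The abstract does not say which bits carry which field, so placing the inflection in the two high bits is an assumption, and the helper names `encode_command` and `decode_command` are hypothetical.

```python
def encode_command(phoneme, inflection):
    """Pack a 6-bit phoneme code and a 2-bit inflection into one byte.

    Assumed layout (not stated in the abstract): inflection in the two
    high bits, phoneme code in the six low bits.
    """
    if not 0 <= phoneme < 64:
        raise ValueError("phoneme code must fit in 6 bits")
    if not 0 <= inflection < 4:
        raise ValueError("inflection must fit in 2 bits")
    return (inflection << 6) | phoneme


def decode_command(byte):
    """Split one command byte back into (phoneme, inflection)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("command must fit in 8 bits")
    phoneme = byte & 0x3F            # low 6 bits: one of 64 phoneme codes
    inflection = (byte >> 6) & 0x03  # high 2 bits: one of 4 inflection levels
    return phoneme, inflection


# A string of phoneme commands is then just a sequence of such bytes.
commands = bytes(encode_command(p, i) for p, i in [(5, 2), (17, 0), (63, 3)])
decoded = [decode_command(b) for b in commands]
```

Whatever the real bit layout, six bits allow 64 distinct phoneme codes and two bits allow four inflection levels per phoneme.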

Journal Article•DOI•

Integration of Place and Voicing Information in the Identification of Synthetic Stop Consonants.

Gregg C. Oden¹•Institutions (1)

University of Wisconsin-Madison¹

01 Apr 1978-Journal of Phonetics

TL;DR: The fuzzy logical model provided a good account for the data of this experiment and implies that place and voicing feature information are evaluated independently before being integrated during phoneme identification.

Journal Article•DOI•

Categorical features in speech perception and production

Louis Goldstein

01 May 1978-Journal of the Acoustical Society of America

TL;DR: Multidimensional scaling analyses of three types of English consonant confusions are reported: consonant substitutions in spontaneous speech errors, CV perceptual confusions, and VC perceptual confusions.

Abstract: Multidimensional scaling analyses of three types of English consonant confusions are reported: consonant substitutions in spontaneous speech errors, CV perceptual confusions, and VC perceptual confusions. Two data sets of each type are analyzed to assess reliability. Three reliable dimensions emerge in all data sets, corresponding to voicing, stop/fricative, and place of articulation. Representation of consonants in terms of categorical phonological features exhaustively describes what is common to the configurations of different data types, even though there is reliable detail within each data type that is not captured by categorical features. Such features can be viewed as groupings of speech sounds common to various perception and production processes.

Journal Article•DOI•

Voice assimilation in Dutch: Some refinements

Daniel Brink¹•Institutions (1)

University of California, Berkeley¹

01 Jan 1978-Acta Linguistica Hafniensia

TL;DR: The analysis of Dutch voice assimilation presented in Hubers and Kooij (1973; hereafter, H&K) represents a considerable improvement over previous generative accounts.

Abstract: The analysis of Dutch voice assimilation presented in Hubers and Kooij (1973; hereafter, H&K) represents a considerable improvement over previous generative accounts. Especially the suggestion that two distinctive features, [± Vce] (voicing) and [± Tns] (tenseness), rather than just one, [± Vce], should be used in the analysis has enabled the authors to achieve a much more detailed phonetic description of this phenomenon than had previously been possible. However, in spite of these improvements, certain facts have remained unaccounted for in their presentation. In this paper I will show that, by altering slightly the underlying forms assumed and the phonological rules required, it is possible to account naturally for these other facts while retaining all of the basic advantages of H&K's approach.

Book•

The perception of voicing contrasts in Thai and English

Susan Lea Donald

01 Jan 1978

Proceedings Article•DOI•

Improvement of voicing decisions by use of context

E. Neuburg

01 Apr 1978

TL;DR: For a number of popular voicing statistics (zero-crossing rate, spectral slope, and low-frequency energy), the voicing decision is improved by use of context, in fact by use of just the previous segment.

Abstract: Voicing decisions in speech compression or recognition procedures are usually made in a context-free manner on successive fixed-length segments of speech. For a number of popular voicing statistics (zero-crossing rate, spectral slope, and low-frequency energy), the voicing decision is improved by use of context, in fact by use of just the previous segment. For each statistic, instead of looking for a threshold that selects voiced segments, we use two thresholds, one if the last segment was called voiced and the other if the last segment was unvoiced. A typical improvement obtained by allowing this 'hysteresis' in the voicing decision is a 15 percent drop in error rate.
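
The two-threshold rule described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's procedure: the function name, the threshold values, and the assumption that a larger statistic (for example low-frequency energy) means "more voiced" are all hypothetical.

```python
def voicing_with_hysteresis(stats, thr_after_voiced, thr_after_unvoiced,
                            start_voiced=False):
    """Label each segment voiced (True) or unvoiced (False).

    The threshold applied to a segment depends on the decision made for
    the previous segment. Assumes a larger statistic means "more voiced".
    """
    decisions = []
    previous = start_voiced
    for value in stats:
        threshold = thr_after_voiced if previous else thr_after_unvoiced
        previous = value > threshold
        decisions.append(previous)
    return decisions


# Made-up per-segment low-frequency energies (arbitrary units).
energies = [0.1, 0.6, 0.45, 0.45, 0.2, 0.45]
with_context = voicing_with_hysteresis(energies, 0.4, 0.5)
context_free = [e > 0.5 for e in energies]
```

With these made-up values, the two middle segments at 0.45 stay voiced because the preceding segment was voiced, whereas a single context-free threshold of 0.5 would flip them to unvoiced.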

Journal Article•DOI•

Production of white tone from white noise and voiced speech from whisper

Richard M. Warren¹, James A. Bashford¹•Institutions (1)

University of Wisconsin-Madison¹

01 May 1978-Bulletin of the psychonomic society

TL;DR: A new class of tonal sounds can be generated by repeating brief sections of noise over and over without intervening silence. When the repeated waveform is white noise, a "white tone" with a rich distinctive timbre and no noise-like quality is heard over a considerable range of repetition rates.

Abstract: A new class of tonal sounds can be generated by repeating brief sections of noise over and over without intervening silence. When the repeated waveform is white noise, a “white tone” with a rich distinctive timbre and no noise-like quality is heard over a considerable range of repetition rates. If the noise is a whispered vowel rather than white noise, repetition of a sample equal in duration to a single glottal pulse during voicing can generate a “whisper tone” sounding like a voiced version of the vowel. Whispered discourse can be converted to an intelligible voiced monotone by repetition of regularly spaced samples drawn from the whispered speech.
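
The construction described here, one brief noise segment repeated with no gap, is easy to sketch in Python. The function name `white_tone`, the Gaussian noise source, and the sampling rate and segment length in the example are assumptions for illustration; the abstract does not specify them.

```python
import random


def white_tone(n_samples, period, seed=0):
    """Repeat one noise segment of `period` samples back to back.

    The result is exactly periodic, so its repetition rate is
    fs / period for a sampling rate fs.
    """
    rng = random.Random(seed)
    # One brief section of white (here Gaussian) noise.
    segment = [rng.gauss(0.0, 1.0) for _ in range(period)]
    # Tile it without intervening silence.
    return [segment[n % period] for n in range(n_samples)]


fs = 8000      # assumed sampling rate in Hz
period = 40    # 40 samples at 8 kHz gives a 200 Hz repetition rate
tone = white_tone(800, period)
```

Replacing the random segment with a stretch of whispered speech one glottal period long would be the analogous sketch for the "whisper tone" the abstract mentions.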

Journal Article•DOI•

The perception of voice onset time in Polish

Michael J. Mikoś, Patricia A. Keating, Barbara J. Moslin

01 May 1978-Journal of the Acoustical Society of America

TL;DR: This article showed that speakers who rarely if ever produce long-lag stops themselves place their category boundary between the short- and long-lag regions, thus showing a dissociation between production and perception. Further work will vary the test conditions and/or subjects' response categories to determine if a boundary exists between the prevoiced and short-lag regions of the continuum.

Abstract: Polish is traditionally described as using the prevoiced and short‐lag categories to contrast its voiced and voiceless stops. However, Moslin and Keating [J. Acoust. Soc. Am. 62, S27 (A) (1977)] have shown that some speakers of Polish make use of the long‐lag, aspirated voicing category. Preliminary results for six speakers of Polish on a /da/‐/ta/ continuum with VOT from −20 to +80 ms indicate that all speakers, regardless of how they produce their apical stops, show a labeling boundary and discrimination peak at about 35 ms. That is, speakers who rarely if ever produce long‐lag stops themselves place their category boundary between the short‐ and long‐lag regions, thus showing a dissociation between production and perception. Further work will vary the test conditions and/or subjects' response categories to determine if a boundary exists between the prevoiced and short‐lag regions of the continuum.

Journal Article•DOI•

Breton (Treger dialect)

S. Hewitt¹•Institutions (1)

University of Cambridge¹

01 Jun 1978-Journal of the International Phonetic Association

TL;DR: In this article, the authors present a broad transcription of the Treger dialect, where stressed vowels are half long unless before fortis (voiceless, double) consonants or consonant clusters, where they are short.

Abstract: Fairly broad transcription. Stress is strong. Unstressed vowels are short; stressed vowels are half long unless before fortis (voiceless, double) consonants or consonant clusters, where they are short. Adjacent vowels are in hiatus and thus form two syllables. w, j are consonantal except when final or before a consonant, where they represent the second element of falling closing diphthongs. , a = a in W. Treger, a in E. Treger. Contingent nasality before nasal consonants is not marked. e, a are e, a reduced towards ə except in the slowest, clearest forms of speech. θ = rounded ə. Lenis obstruent devoicing in final pausal position and in sandhi is marked .; fortis obstruent voicing in sandhi is marked ˅. h is a lenis, usually unvoiced, with some voicing possible between vowels and next to liquids; in final pausal position or in sandhi = x. m is a fortis. ɲ is a fortis; there is usually a j-glide between it and a preceding vowel. r is a light flap or trill; with some speakers it is ɻ; in some parts of Brittany it is R or B, but not in Treger; when written r it is not usually heard except in slow, clear forms of speech. l may be heard velarized in some districts, but not in Treger. t, d, n may be somewhat advanced towards a dental position. p, t, k may have slight aspiration except after s.

Journal Article•DOI•

Glottalized stops in K'ekchi (Maya)

S. Pinkerton

01 Nov 1978-Journal of the Acoustical Society of America

TL;DR: The authors measured stop duration (VOT and closure) and intraoral air pressure for comparison of the production of glottalized stops with that of nonglottalised stops in K'ekchi.

Abstract: Stop duration (VOT and closure) and intraoral air pressure were measured for comparison of the production of glottalized stops with that of nonglottalized stops. The pitch and duration of the preceding and following vowels were also measured. Subjects read natural language minimal pairs in which glottalized and nonglottalized stops contrasted in word initial, medial, and final positions. The results establish preliminary acoustic and physiological variables by which glottalized stops in K'ekchi may be characterized and distinguished from such stops in other languages. These glottalized stops had a significantly greater VOT than their nonglottalized counterparts. However, /b′/ had two productions in free variation: (1) a voicing lead and (2) a zero VOT. Glottalized /t′/, /k′/, and /q′/ exhibited a greater positive air pressure than their nonglottalized counterparts. /b′/ exhibited a zero air pressure, thus demonstrating that the inventory of glottalized stops in K'ekchi consists of one bilabial implosive and three ejectives. The pitch of a vowel following a glottalized stop began at a lower frequency than that of a vowel following a nonglottalized stop except when the vowel was found before /q′/ and /q/. The pitch in these latter cases was the same.

Journal Article•DOI•

The reliability of closure features as cues to medial stop voicing in English

L. Lisker

01 Nov 1978-Journal of the Acoustical Society of America

TL;DR: This paper reviews evidence that stop closure duration is a cue to the voicing of medial stops in English trochees: /b/ closures are regularly shorter than /p/ closures in words such as rabid and rapid, and this difference has perceptual/phonetic significance.

Abstract: The evidence that closure duration is a cue to the voicing of medial stops in English trochees is as convincing as any we have for other acoustic features considered to be factors governing the linguistic interpretation of speech signals. Measurements of natural speech show /b/ closures to be regularly shorter than /p/ closures in words such as rabid rapid, and there are experimental data to indicate that this difference has perceptual/phonetic significance. Another closure feature, glottal pulsing, also plays a role in the /b/‐/p/ distinction in medial position. New data gathered to test the reliability of these two features as cues to the intelligibility of naturally produced tokens of rabid rapid indicate (1) stop closure duration does not suffice to separate /b/ from /p/ across speakers, (2) the phonetic effect of manipulating silent “closure” differs greatly for different tokens of the source word produced by a single speaker, and (3) the effect of replacing buzz with silence in natural tokens of rab...
