Showing papers on "Voice published in 1999"

PDF

Open Access

Book•

Phonetics and Phonology of Tense and Lax Obstruents in German

[...]

Michael Jessen¹•Institutions (1)

15 Jan 1999

TL;DR: This book claims that the Jakobsonian feature tense was rejected prematurely, and it is shown that tense and voice differ in their invariant properties and basic correlates, but that they share a number of other correlates, including F0 onset and closure duration.

...read moreread less

Abstract: Knowing that the so-called voiced and voiceless stops in languages like English and German do not always literally differ in voicing, several linguists — among them Roman Jakobson — have proposed that dichotomies such as fortis/lenis or tense/lax might be more suitable to capture the invariant phonetic core of this distinction. Later it became the dominant view that voice onset time or laryngeal features are more reasonable alternatives. However, based on a number of facts and arguments from current phonetics and phonology this book claims that the Jakobsonian feature tense was rejected prematurely. Among the theoretical aspects addressed, it is argued that an acoustic definition of distinctive features best captures the functional aspects of speech communication, while it is also discussed how the conclusions are relevant for formal accounts, such as feature geometry. The invariant of tense is proposed to be durational, and its ‘basic correlate’ is proposed to be aspiration duration. It is shown that tense and voice differ in their invariant properties and basic correlates, but that they share a number of other correlates, including F0 onset and closure duration. In their stop systems languages constitute a typology between the selection of voice and tense , but in their fricative systems languages universally tend towards a syncretism involving voicing and tenseness together. Though the proposals made here are intended to have general validity, the emphasis is on German. As part of this focus, an acoustic study and a transillumination study of the realization of /p,t,k,f,s/ vs. /b,d,g,v,z/ in German are presented.

...read moreread less

181 citations

Book•

Articulatory and Phonological Impairments: A Clinical Focus

[...]

Jacqueline Ann Bauman-Wängler

20 Oct 1999

TL;DR: In this article, the authors present a framework for the diagnosis and treatment of articulation disorders, including speech sound forms and sounds in context, as well as their application in the development of children's speech.

...read moreread less

Abstract: All chapters include a "Chapter Outline," "Summary," and "References." Preface. 1. Clinical Framework: Basic Terms and Concepts. Articulation and Articulation Disorders. Phonetics and Its Relationship to Articulation Disorders. Speech Sounds versus Phonemes: Clinical Application. Phonology and Phonological Disorders. Phonetics versus Phonology: Form and Function. Articulation Disorders and Phonological Disorders. 2. Articulatory Phonetics: Speech Sound Form. Vowels versus Consonants. Sounds in Context: Coarticulation and Assimilation. 3. Phonetics Transcription and Diacritics. Phonetic Transcription as a Notational System. Why Use Phonetic Transcription? Diacritics. Clinical Implications. 4. Theoretical Considerations. Phonology. Distinctive Feature Theories. Generative Phonology. Natural Phonology. Linear versus Nonlinear Phonologies. 5. Normal Phonological Development. Aspects of Structural and Functional Development. Aspects of Perceptual Development. Prelinguistic Stages: Before the First Word. Transition from Babbling to First Words. The First Fifty Words. The Preschool Child. The School-Age Child. 6. Appraisal: Collection of Data. Evaluation by the Clinician. Initial Impression. Articulation Tests. Spontaneous Speech Sample. Evaluation of the Speech Mechanism. Selection of Additional Assessment Measures. Special Considerations. Summary of the Data. 7. Diagnosis: Phonetic versus Phonemic Emphasis. Preliminary Analysis: Inventory and Distribution of Speech Sounds. Decision-Making: Primarily Phonetic Emphasis. Decision-Making: Primarily Phonemic Emphasis. Measures of Severity and Intelligibility. 8. Therapy for Phonetic Errors. Decision Making: When to Use a Phonetic Approach. Therapy Sequence. Individual Sound Errors. Misarticulations of: s-sounds, sh-sounds, k- and g-sounds, l-sounds, r-sounds including central vowels with r-coloring, th-sounds. Other Sound Errors. Voicing Problems, Misarticulations of f- and v-Sounds, Affricates, and Consonant Clusters. 9. Treatment of Phonemic Errors. Treatment Principles. Minimal Pair Contrast Therapies. Cycles Training. Metaphon Therapy. Phonemic Disorders with Concurrent Language Problems. Therapeutic Suggestions. The Child with an Emerging Phonological System. Therapeutic Suggestions. Treatment of Multiple Vowel Errors. 10. Articulatory/Phonological Disorders in Selected Populations. Development Apraxia of Speech: A Disorder of Speech motor Control. Motor Speech Disorders: Cerebral Palsy. Clefting: Cleft Palate and Cleft Lip. Mental Disability. Hearing Impairment. Motor Speech Disorders: Acquired Apraxia of Speech. Motor Speech Disorders: The Dysarthrias. Glossary. References. Index.

...read moreread less

135 citations

Journal Article•DOI•

Phonological Segments and Features as Planning Units in Speech Production

[...]

Ardi Roelofs¹•Institutions (1)

University of Exeter¹

01 Apr 1999-Language and Cognitive Processes

TL;DR: The author reports four experiments that examined phonological processes in spoken word production using the WEAVER model of word-form encoding, in which a serial encoding of segments is followed by a parallel activation of features.

...read moreread less

Abstract: The author reports four experiments that examined phonological processes in spoken word production. A form-preparation paradigm was applied to the question of whether phonological features can be preplanned to facilitate spoken word production. In Experiment 1, monosyllabic words were produced in sets different in form, or in sets sharing either the initial segment or initial segments differing only in voicing. Only shared initial segments yielded facilitation. A similar pattern of results was observed when the sets were matched for the following vowel (Experiment 2), when words were produced in response to pictured objects (Experiment 3), and when place of articulation rather than voicing was manipulated (Experiment 4). The special status of identity suggests that segments are planning units independent of their features. The results are explained in terms of the WEAVER model of word-form encoding, in which a serial encoding of segments is followed by a parallel activation of features. A WEAVER simulation of the experiments is presented which supports these claims.

...read moreread less

124 citations

Journal Article•DOI•

Effects of syllable-initial voicing and speaking rate on the temporal characteristics of monosyllabic words

[...]

J. Sean Allen¹, Joanne L. Miller•Institutions (1)

Northeastern University¹

30 Aug 1999-Journal of the Acoustical Society of America

TL;DR: It was concluded that the traditional method of creating VOT continua for perceptual experiments, although not perfect, approximates natural speech by capturing the basic trade-off between VOT and vowel duration in syllable-initial voiced versus voiceless stop consonants.

...read moreread less

Abstract: Two speech production experiments tested the validity of the traditional method of creating voice-onset-time (VOT) continua for perceptual studies in which the systematic increase in VOT across the continuum is accompanied by a concomitant decrease in the duration of the following vowel. In experiment 1, segmental durations were measured for matched monosyllabic words beginning with either a voiced stop (e.g., big, duck, gap) or a voiceless stop (e.g., pig, tuck, cap). Results from four talkers showed that the change from voiced to voiceless stop produced not only an increase in VOT, but also a decrease in vowel duration. However, the decrease in vowel duration was consistently less than the increase in VOT. In experiment 2, results from four new talkers replicated these findings at two rates of speech, as well as highlighted the contrasting temporal effects on vowel duration of an increase in VOT due to a change in syllable-initial voicing versus a change in speaking rate. It was concluded that the traditional method of creating VOT continua for perceptual experiments, although not perfect, approximates natural speech by capturing the basic trade-off between VOT and vowel duration in syllable-initial voiced versus voiceless stop consonants.

...read moreread less

93 citations

Italian s -voicing and the structure of the phonological word

[...]

Marc van Oostendorp

15 Dec 1999

46 citations

Glottal spreading bias in Germanic

[...]

Gregory K. Iverson, Milwaukee, Joseph C. Salmons

01 Jan 1999

TL;DR: This paper showed that laryngeal assimilations in German and English provide evidence for positing instead the feature [spread glottis] as the phonologically relevant privative feature.

...read moreread less

Abstract: Much theoretical phonology in the 1990s has focused on the characterization of voicing assimilations, nearly always assuming presence of the feature [voice] versus its absence in order to distinguish voiced obstruents from voiceless. While [voice] is uncontestably at play in Romance and Slavic, as well as in many other languages we show here that laryngeal assimilations in German and English provide evidence for positing instead the feature [spread glottis] as the phonologically relevant privative feature. The description of German and English laryngeal patterns in terms of [spread glottis] also simplifies our understanding of German final fortition (Auslautverhartung) and related phenomena

...read moreread less

43 citations

Journal Article•DOI•

Contributions of temporal encodings of voicing, voicelessness, fundamental frequency, and amplitude variation to audio-visual and auditory speech perception

[...]

Andrew Faulkner¹, Stuart Rosen•Institutions (1)

University College London¹

30 Aug 1999-Journal of the Acoustical Society of America

TL;DR: The timing of voiced and voiceless excitation appears to be the major temporal cues to consonant identity.

...read moreread less

Abstract: Auditory and audio-visual speech perception was investigated using auditory signals of invariant spectral envelope that temporally encoded the presence of voiced and voiceless excitation, variations in amplitude envelope and F0. In experiment 1, the contribution of the timing of voicing was compared in consonant identification to the additional effects of variations in F0 and the amplitude of voiced speech. In audio-visual conditions only, amplitude variation slightly increased accuracy globally and for manner features. F0 variation slightly increased overall accuracy and manner perception in auditory and audio-visual conditions. Experiment 2 examined consonant information derived from the presence and amplitude variation of voiceless speech in addition to that from voicing, F0, and voiced speech amplitude. Binary indication of voiceless excitation improved accuracy overall and for voicing and manner. The amplitude variation of voiceless speech produced only a small increment in place of articulation scores. A final experiment examined audio-visual sentence perception using encodings of voiceless excitation and amplitude variation added to a signal representing voicing and F0. There was a contribution of amplitude variation to sentence perception, but not of voiceless excitation. The timing of voiced and voiceless excitation appears to be the major temporal cues to consonant identity.

...read moreread less

42 citations

Journal Article•DOI•

Stop consonants in Yanyuwa and Yindjibarndi: locus equation data

[...]

Marija Tabain¹, Andrew Butcher²•Institutions (2)

Macquarie University¹, Flinders University²

01 Oct 1999-Journal of Phonetics

TL;DR: In this article, the slope values from the F2 transition data for stop consonants in two Australian Aboriginal languages, Yanyuwa and Yindjibarndi, were used in order to determine the amount of anticipatory coarticulation between the consonant and the vowel in CV syllables.

...read moreread less

40 citations

Journal Article•DOI•

The operation of rendaku in the Japanese specifically language- impaired: A preliminary investigation.

[...]

Suzy E. Fukuda¹, Shinji Fukuda•Institutions (1)

McGill University¹

18 Feb 1999-Folia Phoniatrica Et Logopaedica

TL;DR: The data indicate that Japanese specifically language-impaired SLI children did not in fact voice most of the initial obstruents of the second member in non-frequent and novel compounds, whereas the age-matched non-SLI children did voice the appropriate obstruent of all the compounds and the younger non- SLI children voiced some initial Obstruents.

...read moreread less

Abstract: Rendaku is a well-documented phenomenon in Japanese phonology in which a word-initial voiceless obstruent becomes voiced when it is the second member of a compound (e.g., ori + kami --> origami 'paper folding'). It was hypothesized that Japanese specifically language-impaired (SLI) children who appear to rely on explicit declarative memory as opposed to implicit procedural memory to learn language would have difficulty forming such compounds: word-initial voiceless obstruents would remain unvoiced in the second members of non-frequent and novel compounds. Six Japanese SLI children, ranging in age from 8;9 to 12;1, 6 age-matched non-SLI children and 4 younger non-SLI children were given a word formation task involving three different categories of compounds. A significant difference in performance between the groups was found. The data indicate that the SLI children did not in fact voice most of the initial obstruents of the second member in non-frequent and novel compounds, whereas the age-matched non-SLI children did voice the appropriate obstruents of all the compounds and the younger non-SLI children voiced some initial obstruents of all the compounds. Qualitative differences in the responses provide evidence that the SLI children did not have or were unable to construct a productive procedural rule of voicing.

...read moreread less

29 citations

DOI•

Assimilation to the unmarked

[...]

Eric Baković¹•Institutions (1)

Rutgers University¹

01 Jan 1999

TL;DR: In this article, the root-or stem-controlled voicing assimilation of Yiddish is discussed. But the root and stem of affixes in the root morpheme remains constant while the value of the feature in affixe alternates with the root.

...read moreread less

Abstract: Assimilation is often controlled by a segment in a particular position. For instance, vowel harmony is often rootor stem-controlled, meaning that the value of the harmonic feature in the root morpheme (more accurately, the stem of affixation) remains constant while the value of the feature in affixes alternates to agree with the root. Similarly, in voicing assimilation, the value of [voice] often remains constant in an onset while a coda alternates to agree with the onset; voicing assimilation is thus often onset-controlled. An example of onset-controlled voicing assimilation comes from Yiddish (Katz 1987, Lombardi 1996, 1999). Final obstruents contrast in voicing, but adopt a following initial obstruent’s value of the feature in compounds.

...read moreread less

25 citations

Vowel duration in Scottish English speaking children

[...]

Nigel Hewlett, Ben Matthews, James M. Scobbie

01 Jan 1999

TL;DR: The authors found a robust Scottish Vowel Length Rule pattern for four of the subjects, with a minimal Voicing Effect, in children with two non-Scottish English speaking parents this pattern was either absent or less definite.

...read moreread less

Abstract: In most English accents vowel length is approximately 50% greater before a voiced consonant than before its voiceless cognate (the ‘Voicing Effect’). In Scottish English it is conditioned by the ‘Scottish Vowel Length Rule’. The lengthening environments of this rule overlap with those of the Voicing Effect. The phonetic details of the Scottish Vowel Length Rule and its relationship with the Voicing Effect are uncertain. Its influence on the speech of younger speakers is also not known. In this study, tokens of /i/ and /¬/ were measured in minimal pairs produced by seven Scottish English speaking children aged 6-9 years. Some pairs tested for a Voicing Effect, others for a Scottish Vowel Length effect. Results suggested a robust Scottish Vowel Length pattern for four of the subjects, with a minimal Voicing Effect. However, in children with two non-Scottish English speaking parents this pattern was either absent or less definite.

...read moreread less

Journal Article•DOI•

Contextual influences on the internal structure of phonetic categories: A distinction between lexical status and speaking rate

[...]

J. Sean Allen¹, Joanne L. Miller¹•Institutions (1)

Northeastern University¹

30 Aug 1999-Journal of the Acoustical Society of America

TL;DR: The results indicated that lexical status has a more limited and qualitatively different effect on the category's best exemplars than does the acoustic-phonetic factor of speaking rate.

...read moreread less

Abstract: A series of experiments examined the effects of an acoustic‐phonetic contextual factor, speaking rate, and a higher‐order linguistic contextual factor, lexical status, on the internal structure of a voicing category, specified by voice‐onset‐time (VOT). In keeping with previous results, speaking rate fundamentally altered the structure of the voiceless category, not only affecting the perception of stimuli in the voiced–voiceless category boundary region but also altering which tokens were rated as the best exemplars of the voiceless category. In contrast, the effect of lexical status was more limited. Although (as expected) lexical status also affected the perception of stimuli in the category boundary region, this effect disappeared in the region of the best‐rated exemplars. This distinction between the effects of speaking rate and lexical status on the internal structure of the voiceless category mirrors the effects of these factors in speech production: It is well known that speaking rate alters the V...

...read moreread less

Dissertation•DOI•

Nasals on my mind : The phonetic and the cognitive approach to the phonology of nasality.

[...]

Stefan Ploch

01 Jan 1999

TL;DR: In this article, the authors compare two approaches to the phonology of nasality: the phonetic approach, which is discussed in part 1, and the cognitive approach (part 2), which is argued to be the more empirical one.

...read moreread less

Abstract: This thesis compares two approaches to the phonology of nasality and consists therefore of two main parts: the phonetic approach, which is discussed in part 1, and the cognitive approach (part 2). This is to say that this thesis investigates how the Language Acquisition Device employs nasality to define vocalic or consonantal systems of contrast, on the one hand, and phonotactic constraints and phonological processes, on the other. Ultimately, the phonetic approach is rejected, while the cognitive view is argued to be the more empirical one. Part 1, which deals with the phonetic approach, has three chapters. In chapter 1, I show after a brief introduction to Popper's evolutionary view of research and empiricism, that the assumption that the phonologial behaviour of nasality or any other phonetically defined notion is phonetically motivated or grounded (the 'Phonetic Hypothesis', 'PH') is flawed. Chapter 2 investigates feature theories, e.g. underspecification and feature geometry, and discusses the metatheoretical problems these framework have due to the assumption of the PH. This demonstrates that phonological processes involving 'nasality' cannot be explained by the employment of features. In Chapter 3,1 look at the commonly held view that there is a phonetically motivated phonologically relevant link between nasality and vocalic height or consonantal place of articulation (the 'Heightmyth', 'HM'). Part 2 of this thesis shows in four chapters how a cognitive account avoids the metatheoretical problems of the phonetic approach. In addition, it introduces a new proposal in relation to the acquisitional role of phonology: Chapter 4 provides an introduction to Government Phonology ('GP') and, more specifically, to GP's subtheories dealing with melody: (Revised) Element Theory and the Theory of Generative Constraints. This chapter demonstrates that there are languages with phonetically oral vowels which can phonetically nasalise following oral consonants. In chapter 5, I put forward evidence for the merger of Kaye, Lowenstamm & Vergnaud's L- and N-element into one new element (new) L. The main advantages of such a move are that it helps to keep overgeneration down and that it provides the basis for a integrated account for the cross-linguistically attested phenomena of nasality-induced voicing, Dahl's and Meinhof's Law. Chapter 6 investigates Quebec French nasal vowels, Montpelier VN-sequences and English NC-clusters and proposes a unified account for them. This analysis includes a cognitive explanation of the French version of the Heightmyth, i.e. for the observation that French vowels may not be high. Finally, in chapter 7, I demonstrate that the view that the PH is mistaken points to a new insight: Acoustic cues do not only contain much phonologically useless packaging in addition to phonologically relevant material, but also underdetermine the phonological representation. In other words, acoustic cues do not always contain all the information necessary to determine the internal representation of a segment. This is due to a phenomenon I have labelled 'acoustic cue overlap'. I can show for a number of Turkic vowel systems that they could not be acquired without the help of phonological processes (I- and U-harmony). Similarly, even though phonetically defined cues like 'voiced' or 'voiceless' for segments do not contain much useful information in relation to the phonological behaviour of the segments involved, there is cross-linguistic evidence for my claim that many consonant systems (including those exhibiting voiced-voiceless contrasts) could not be acquired without the helping, i.e. disambiguating, hand of phonology. All in all, the cognitive approach to phonology will not only be shown to be more empirical than the phonetic approach but also to be much more insightful. (Abstract shortened by ProQuest.).

...read moreread less

Journal Article•DOI•

Ensemble responses of the auditory nerve to normal and whispered stop consonants.

[...]

Hanna E. Stevens¹, Robert E. Wickesberg¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 May 1999-Hearing Research

TL;DR: It is suggested that the encoding of either a whispered or a normal stop consonant results in the same temporal pattern in the ensemble response.

...read moreread less

Journal Article•DOI•

Guaraní voiceless stops in oral versus nasal contexts : An acoustical study

[...]

Rachel Walker¹•Institutions (1)

University of Southern California¹

01 Jun 1999-Journal of the International Phonetic Association

TL;DR: In this article, the authors investigated voiceless stops in Guarani that are described as transparent to nasal harmony and found that the velar stop exhibits the longest VOT of all places of articulation, and its VOT remains unaffected by oral/nasal context.

...read moreread less

Abstract: This acoustic study investigates voiceless stops in Guarani that are described as transparent to nasal harmony. Voiceless stops in oral versus nasal contexts are examined in relation to theoretical issues of locality and phonetic implementation. First, the oral/nasal and voicing properties of the stops are considered in connection to proposals in phonological theory that feature spreading produces strictly continuous spans of a spreading property. The stops are discovered to display the acoustic attributes of voiceless oral obstruents; no evidence of nasal airflow energy was observed during closure nor was the closure fully voiced. These results suggest that strict continuity of a spreading featural property is not always found in the phonetic output. Second, the timing of the voicelessness is addressed. Interestingly, the duration of voicelessness of the stops [p, t] remains the same across oral/nasal environments, although the voiceless interval shifts to persevere longer into the following vowel in nasal contexts. The velar stop does not exhibit an increased perseverance of voicelessness after closure release-it displays the longest VOT of all places of articulation, and its VOT remains unaffected by oral/nasal context. It is suggested that incorporating a notion of conflicting realizational requirements in models of phonetic implementation is important in interpreting these results.

...read moreread less

Patent•

Speech recognition device

[...]

Ichiro Akahori, Norihide Kitaoka, Hideo Miyauchi, Hiroshi Ono, Yoshitaka Ozaki, Kunio Yokoi, 教英北岡, 宏大野, 英夫宮内, 義隆尾崎, 邦雄横井, 一郎赤堀 - Show less +8 more

13 May 1999

TL;DR: In this paper, a speech judgement unit 20 judges that a monosyllable speech recognition result is given priority when the length of voicing of a user, the length between voicing and voicing, the loudness of the voice, and the pitch of voice are all within prescribed ranges.

...read moreread less

Abstract: PROBLEM TO BE SOLVED: To increase recognition reliability of monosyllable speech recognition and recognition reliability of continuous syllable speech recognition. SOLUTION: A speech judgement unit 20 judges that a monosyllable speech recognition result is given priority when the length of voicing of a user, the length between voicing and voicing, the loudness of the voice, and the pitch of the voice are all within prescribed ranges; when it is judged that the monosyllable speech recognition result is given priority, the speech recognition result of a monosyllable recognition unit 16 is outputted from a recognition result output unit 26 and when it is judged that a continuous syllable speech recognition result is given priority, the speech recognition result of a continuous syllable recognition unit 22 is outputted from the recognition result output unit 26. COPYRIGHT: (C)2006,JPO&NCIPI

...read moreread less

Journal Article•DOI•

Voice onset times and burst frequencies of four velar stop consonants in Gujarati

[...]

Manish K. Rami, Joseph Kalinowski, Andrew Stuart, Michael P. Rastatter

23 Nov 1999-Journal of the Acoustical Society of America

TL;DR: Differences in mean VOT and burst frequency as a function of voicing and aspiration were examined and found no significant difference between mean burst frequencies of aspirated and unaspirated stops.

...read moreread less

Abstract: Voice onset times (VOT) and burst frequencies of two aspirated (i.e., /kh/, /gh/) and two unaspirated (i.e., /k/, /g/) Gujarati velar stop consonants were investigated in an effort to provide characteristic acoustic information. Stop consonants in a monosyllabic vowel–consonant–vowel (VCV) production were obtained from eight native speakers of Gujarati. Differences in mean VOT and burst frequency as a function of voicing and aspiration were examined. A significant voicing by aspiration effect was found for VOT (p=0.026). The two voiced stops, while not significantly different from each other (p=0.278), had significantly shorter VOTs than voiceless stops. The aspirated /kh/ had a significantly longer VOT than the unaspirated /k/ (p=0.0013). With respect to burst frequency, voiced stops had significantly higher burst frequencies than voiceless stops (p=0.002). There was no significant difference between mean burst frequencies of aspirated and unaspirated stops (p=0.058).

...read moreread less

Journal Article•DOI•

Detection of consonant voicing: A module for a hierarchical speech recognition system

[...]

Jeung-Yoon Choi

30 Aug 1999-Journal of the Acoustical Society of America

TL;DR: The results in this study suggest that acoustic cues selected by considering the representation and production of speech may provide reliable criteria for determining consonant voicing.

...read moreread less

Abstract: This research describes a module for detecting consonant voicing in a hierarchical speech recognition system. In this system, acoustic cues are used to infer values of features that describe phonetic segments. A first step in the process is examining consonant production and conditions for phonation, to find acoustic properties that may be used to infer consonant voicing. These are examined in different environments to determine a set of reliable acoustic cues. These acoustic cues include fundamental frequency, difference in amplitudes of the first two harmonics, cutoff first formant frequency, and residual amplitude of the first harmonic, around consonant landmarks. Classification experiments are conducted on hand and automatic measurements of these acoustic cues for isolated and continuous speech utterances. Voicing decisions are obtained for each consonant landmark, and are compared with lexical and perceived voicing for the consonant. Performance is found to improve when measurements at the closure and release are combined. Training on isolated utterances gives classification results for continuous speech that is comparable to training on continuous speech. The results in this study suggest that acoustic cues selected by considering the representation and production of speech may provide reliable criteria for determining consonant voicing.

...read moreread less

Proceedings Article•DOI•

Speech enhancement using voice source models

[...]

A. Yasmin¹, Paul Fieguth, Li Deng•Institutions (1)

University of Waterloo¹

01 Jan 1999

TL;DR: This work has chosen one of the most common excitation models, the four-parameter LF model of Fant, Liljencrants and Lin (1985), and applied it to the enhancement of individual voiced phonemes and shows that the LF model yields a substantial improvement in performance.

...read moreread less

Abstract: Autoregressive (AR) models have been shown to be effective models of the human vocal tract during voicing. However the most common model of speech for enhancement purposes, AR process excited by white noise, fails to capture the periodic nature of voiced speech. Speech synthesis researchers have long recognized this problem and have developed a variety of sophisticated excitation models, however these models have yet to make an impact in speech enhancement. We have chosen one of the most common excitation models, the four-parameter LF model of Fant, Liljencrants and Lin (1985), and applied it to the enhancement of individual voiced phonemes. Comparing the performance of the conventional white-noise-driven AR, an impulsive-driven AR, and AR based on the LF model shows that the LF model yields a substantial improvement, on the order of 1.3 dB.

...read moreread less

Journal Article•DOI•

Voice onset time in speech produced by inexperienced signers during simultaneous communication.

[...]

Nicholas Schiavetti¹, Robert L. Whitehead², Dale Evan Metz¹, Natalie Moore³•Institutions (3)

State University of New York System¹, National Technical Institute for the Deaf², University of Memphis³

01 Jan 1999-Journal of Communication Disorders

TL;DR: VOT contrasts were enhanced in SC but followed English voicing rules and varied appropriately with place of articulation, similar to the voicing contrast results reported for clear speech by Picheny, Durlach, and Braida (1986) and for experienced signers using SC by Schiavetti, Whitehead, Metz,Whitehead, and Mignerey (1996).

...read moreread less

Synthesizing systematic variation at boundaries between vowels and obstruents

[...]

Sebastian Heid, Sarah Hawkins

01 Jan 1999

TL;DR: In this paper, the authors assess whether natural sounding excitation near segment boundaries enhances the intelligibility of formant synthesis and find that synthesized phrases proved more intelligible in noise when excitation at fricative boundaries and in voiced stop closures was structurally appropriate.

...read moreread less

Abstract: This work assesses whether natural-sounding excitation near segment boundaries enhances the intelligibility of formant synthesis. Excitation type at fricative-vowel (FV) and vowelfricative (VF) boundaries and durations of voicing in voiced stop closures are described for one male speaker of British English. Most VF boundaries have mixed aperiodic and periodic excitation, whereas most FV boundaries change abruptly from aperiodic to periodic excitation. Syllable stress, vowel height, and final/non-final position within the phrase influenced the incidence and duration of mixed excitation. Voicing in stop closures varied in well-understood ways. Synthesized phrases proved more intelligible in noise when excitation at fricative boundaries and in voiced stop closures was structurally appropriate. Implications for formant synthesis are discussed.

...read moreread less

Journal Article•DOI•

Voicing judgements by chinchillas trained with a reward paradigm.

[...]

Kevin K. Ohlemiller¹, Leifann B Jones¹, Arnold F. Heidbreder¹, William W. Clark¹, James D. Miller¹ - Show less +1 more•Institutions (1)

Central Institute for the Deaf¹

01 Apr 1999-Behavioural Brain Research

TL;DR: Experiments were performed to replicate and extend previous findings of similar categorization of voiced/voiceless consonant-vowel syllables by humans and chinchillas and suggested that listeners were attending to different phonetic cues in a manner that depended on the listener, rather than on species.

...read moreread less

Journal Article•DOI•

The role of F1 in the perception of voice onset time and voice offset time

[...]

Jörgen Pind¹•Institutions (1)

University of Iceland¹

30 Jun 1999-Journal of the Acoustical Society of America

TL;DR: An experiment is reported showing that this greater effect that vowel quantity has on the perception of VOffT than on the Perception of VOT cannot be explained by the effect of F1 frequency at vowel offset.

...read moreread less

Abstract: An important speech cue is that of voice onset time (VOT), a cue for the perception of voicing and aspiration in word-initial stops Preaspiration, an [h]-like sound between a vowel and the following stop, can be cued by voice offset time, a cue which in most respects mirrors VOT In Icelandic VOffT is much more sensitive to the duration of the preceding vowel than is VOT to the duration of the following vowel This has been explained by noting that preaspiration can only follow a phonemically short vowel Lengthening of the vowel, either by changing its duration or by moving the spectrum towards that appropriate for a long vowel, will thus demand a longer VOffT to cue preaspiration An experiment is reported showing that this greater effect that vowel quantity has on the perception of VOffT than on the perception of VOT cannot be explained by the effect of F1 frequency at vowel offset

...read moreread less

Lexical Classes in Japanese: a Reply to Rice

[...]

Junko Ito, Armin Mester, Jaye Padgett

01 Oct 1999

TL;DR: Ito, Mester, and Padgett as discussed by the authors made an argument based on the interaction of compound voicing and postnasal voicing in Japanese, that the [voice] specification of certain nasal- obstruent clusters, though redundant, is phonologically active.

...read moreread less

Abstract: Phonology at Santa Cruz, Vol. 6, 1999, pp. 39-46. Lexical Classes in Japanese: a Reply to Rice Junko Ito, Armin Mester, and Jaye Padgett University of California, Santa Cruz Ito, Mester, and Padgett (1995) make an argument, based on the interaction of compound voicing and postnasal voicing in Japanese, that the [voice] specification of certain nasal- obstruent clusters, though redundant, is phonologically active. The argument presupposes a well known division of the Japanese lexicon into separate strata—native or Y AMATO , S INO - J APANESE , sound-symbolic or M IMETIC , and more recent borrowings. Recently, Rice (1997) has attempted to cast doubt on the argument for active redundant [voice], by questioning the motivation for the relevant lexical strata. The intention of this squib is to address her arguments, and show that the posited stratal divisions for Japanese are indeed well motivated. 1 Background Compound voicing (Rendaku) involves the voicing of initial obstruents in second compound members meeting the right structural conditions, as shown in (1) (see Ito, Mester, and Padgett 1995, Rice 1997, and works cited there for further examples). ori + kami 6 origami ‘paper folding’ Rendaku is blocked when the targeted word already contains a voiced obstruent, as shown in (2). • iro + tabi 6 • irotabi ‘white socks’ * • irodabi This blocking, known as Lyman’s Law, is a reflex of a more general prohibition on roots containing two voiced obstruents, such as *dabi, *baga, etc. Both the constraint on roots and Lyman’s Law are attributed to an Obligatory Contour Effect involving [voice] by Ito and Mester (1986). The argument for active redundant [voice] is based on the fact that postnasal obstruents also block Rendaku, as shown in (3). take + tombo 6 taketombo ‘bamboo dragonfly’ (a toy) *takedombo This fact is significant, because voicing in postnasal obstruents is predictable (in Y AMATO words—see below). There are no words such as *tompo or *unsari next to actual tombo ‘dragonfly’ and unzari ‘disgusted’, etc. Within traditional generative theories of underspecification (see Steriade 1995 for references and an overview), this implies that postnasal voicing is underlyingly absent and therefore phonologically inactive. In fact, the © 1999 by Junko Ito, Armin Mester, and Jaye Padgett

...read moreread less

Journal Article•DOI•

Re-voicing Arianna (and laments): two women respond

[...]

Suzanne G. Cusick¹•Institutions (1)

University of Virginia¹

01 Aug 1999-Early Music

Book•

A Simple Guide to Better Voicing: For Teachers and Professional Voice Users

[...]

Eml Yiu

01 Jan 1999

Journal Article•DOI•

La duración consonantica en castellano

[...]

Laura del Barrio Estévez, Sergio Tornel Castells

15 Dec 1999

TL;DR: In this article, the influence of three different factors (stress, location on the syllable, and pause vicinity) on the duration of Spanish consonants in reading text has been studied.

...read moreread less

Abstract: This paper presents the results of an experimental analysis of the duration of Spanish consonants. The influence of three different factors-stress, location on the syllable, and pause vicinity- on the duration of consonants in reading text has been studied. Phonetic context and speech rate have also been controlled. The obtained results seem to indicate that the vicinity of a pause and the location of the consonant in the syllable have a systematic effect on all the consonants, while the stress only affects the duration of a reduced set of voiced allophones ([1], [r] and [n]). Some general conclusions about the mean duration of each consonant, and about the relationship between duration and voicing, manner and place of articulation, have also been established.

...read moreread less

Proceedings Article•

A measure of speech and pitch reliability from voicing

[...]

Frédéric Berthommier¹, Hervé Glotin•Institutions (1)

Grenoble Institute of Technology¹

01 Jan 1999

TL;DR: Speech Reference EPFL-CONF-82549 highlights the importance of knowing the carrier and removal status of canine coronavirus in the context of infectious disease.

...read moreread less

Abstract: Keywords: glotin ; speech Reference EPFL-CONF-82549 Record created on 2006-03-10, modified on 2017-05-10

...read moreread less

Patent•

Singing sound synthesizing device

[...]

Ota Shinichi, Nishimoto Tetsuo

22 Jan 1999

TL;DR: In this paper, text data are read out corresponding to melody data stored in a music information storage part 5 and voice parameters consisting of phonematic articulation coupling control data corresponding to the vocal sound of the text data, and a voicing parameter supply control part 7 interpolates the respective parameters and supplies them to formant waveform generation parts 81 to 8m at specific intervals of time to synthesize and output a singing voice corresponding to text.

...read moreread less

Abstract: PROBLEM TO BE SOLVED: To synthesize a more natural singing sound on the basis of text data SOLUTION: Text data are read out corresponding to melody data stored in a music information storage part 5 and voice parameters consisting of phonematic formant data and phonematic articulation coupling control data corresponding to the vocal sound of the text data are read out of a voice quality control information storage part 6; and a voicing parameter supply control part 7 interpolates the respective parameters and supplies them to formant waveform generation parts 81 to 8m at specific intervals of time to synthesize and output a singing voice corresponding to the text The speed of pitch variation at the time of a rise in interval is made slower than that at the time of a decrease Further, the pitch is held when a voiceless sound is generated and begins to be varied when a voiced sound is generated For consonant and vowel sounds, the target values of the formant frequencies of the vowels are varied

...read moreread less

Journal Article•DOI•

The perception of phonologically significant contrasts using speech envelope cues.

[...]

Liat Kishon-Rabin¹, Michal Nir-Dankner•Institutions (1)

Tel Aviv University¹

01 Jan 1999-Journal of basic and clinical physiology and pharmacology

TL;DR: Findings help to better understand speech perception performance of hearing-impaired individuals, including cochlear implant users, and may have practical implications for aural rehabilitation and sensory aids design for the Hebrew speaking population.

...read moreread less

Abstract: The observation that many cochlear implantees demonstrate high levels of speech recognition, despite limited or distorted spectral information, has motivated research on the importance of temporal information for the perception of speech. The purpose of this study was to measure the recognition of speech contrasts via only the speech envelope before and after training. Test stimuli consisted of eight segmental and two suprasegmental contrasts of the Hebrew Speech Pattern Contrast test using a binary forced-choice paradigm. Multiplying the speech waveform with white noise eliminated spectral information. Results show that stress, intonation and manner of articulation were very well perceived using only temporal information, whereas voicing and place of articulation were perceived above chance levels. Results also show that vowels were more susceptible to the removal of spectral information than consonants. These findings help to better understand speech perception performance of hearing-impaired individuals, including cochlear implant users. They may also have practical implications for aural rehabilitation and sensory aids design for the Hebrew speaking population.

...read moreread less