
Showing papers on "Voice published in 1993"


Journal ArticleDOI
TL;DR: The voiced/voiceless distinction for English utterance-initial stop consonants is primarily realized as differences in the voice onset time, which is largely signaled by the time between the stop burst and the onset of voicing.
Abstract: The voiced/voiceless distinction for English utterance‐initial stop consonants is primarily realized as differences in the voice onset time (VOT), which is largely signaled by the time between the stop burst and the onset of voicing. The voicing of stops has also been shown to affect the vowel’s F0 after release, with voiceless stops being associated with higher F0. When the VOT is ambiguous, these F0 ‘‘perturbations’’ have been shown to affect voicing judgments. This is to be expected of what can be considered a redundant feature, that is, that it should carry a distinction in cases where the primary feature is neutralized. However, when the voicing judgments were made as quickly as possible, an inappropriate F0 was found to slow response time even for unambiguous VOTs. This was true both of F0 contours and level F0 differences. These results reinforce the plausibility of tonogenesis, and they add further weight to the claim that listeners make full use of the signal given to them, even when overt labeling would seem to indicate otherwise.

124 citations
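
The cue-trading relation described here can be made concrete with a toy logistic model (not the authors' model; all weights and boundary values are invented for illustration) in which VOT dominates and onset F0 tips the decision only when VOT sits near the category boundary:

```python
# Toy cue-integration sketch: VOT as the primary cue to stop voicing,
# onset F0 as a weaker, redundant cue.  All numbers are illustrative.
import math

def p_voiceless(vot_ms, f0_onset_hz, vot_boundary=25.0, f0_ref=120.0,
                w_vot=0.4, w_f0=0.02):
    """Probability of a 'voiceless' response given VOT and onset F0."""
    z = w_vot * (vot_ms - vot_boundary) + w_f0 * (f0_onset_hz - f0_ref)
    return 1.0 / (1.0 + math.exp(-z))

# Unambiguous VOT: F0 barely moves the decision.
print(p_voiceless(60, 100), p_voiceless(60, 140))   # both near 1.0
# Ambiguous VOT, near the boundary: F0 tips the percept.
print(p_voiceless(25, 100), p_voiceless(25, 140))   # ~0.40 vs. ~0.60
```

Note that the reaction-time result goes beyond this: even at unambiguous VOTs a mismatched F0 slowed responses, which a pure decision rule like this does not capture.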


Journal ArticleDOI
TL;DR: In this paper, the use of spectral information at vowel onset, which constitutes a stronger cue to the voicing contrast in English than in French, was investigated in French-English bilinguals in order to determine whether the primary language in terms of early experience determines acoustic cue weighting.
Abstract: The use of spectral information at vowel onset, which constitutes a stronger cue to the voicing contrast in English than in French, was investigated in French-English bilinguals in order to determine whether the primary language in terms of early experience determines acoustic cue weighting. The /pen/-/ben/ minimal pair, meaningful in both languages, was used as a base for identification tests, which were presented with either an English or a French precursor word before each token. Two stimulus continua, formed of digitally-edited natural speech tokens, had an identical VOT range but varied in their [en] stem. In their production of the contrast, bilinguals showed clear evidence of code-switching but did not always produce monolingual-like VOTs in their weaker language. In perception, the code-switching effect was significant but small. The bilingual group with English as primary early language showed a greater effect of vowel onset characteristics, in conflicting-cue conditions, than the bilingual group ...

117 citations


Journal ArticleDOI
TL;DR: Four experiments addressing the role of attention in phonetic perception indicate that close attention to the speech signal is necessary for strong acoustic cues to achieve their full impact on phonetic labeling, while weaker acoustic cues (F0 onset frequency and vowel duration) achieve their full impact without close attention.

90 citations


Journal ArticleDOI
TL;DR: The speech of 7 children with phonological disorders was analyzed for imperceptible acoustic distinctions in seemingly homophonous word pairs; a shorter treatment period was observed for subjects judged to have productive knowledge of the contrast being trained, as compared with those who had none.
Abstract: The speech of 7 children with phonological disorders (4 who failed to produce an initial voicing contrast for stops and 3 who failed to produce the alveolar-velar stop contrast) was analyzed for imperceptible acoustic distinctions in seemingly homophonous word pairs. Subjects were audio/video recorded before and during treatment as they produced minimal pairs containing their error and correct sounds. Acoustic measures were VOT and CV locus equations. The presence of acoustic distinctions was taken as evidence for productive knowledge of the sound contrasts. Treatment was applied experimentally and progress was related to pretreatment productive knowledge inferred from acoustic distinctions. A shorter treatment period was observed for subjects judged to have productive knowledge of the contrast being trained, as compared with those who had no knowledge. One of the 4 subjects with initial voicing errors produced an acoustic distinction between voiced and voiceless stops and required the shortest treat...

72 citations
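
Of the two acoustic measures named above, the CV locus equation is the less familiar: a linear regression of F2 at vowel onset on F2 at the vowel midpoint across vowel contexts, whose slope and intercept characterize a stop's place of articulation. A minimal sketch, with invented formant values:

```python
# Fitting a CV locus equation: F2 at vowel onset regressed on F2 at the
# vowel midpoint across vowel contexts.  Formant values are invented.
import numpy as np

# Hypothetical (F2_midpoint, F2_onset) pairs in Hz for alveolar stop targets.
f2_mid   = np.array([2300, 1900, 1500, 1100,  900])
f2_onset = np.array([2100, 1850, 1600, 1400, 1300])

slope, intercept = np.polyfit(f2_mid, f2_onset, 1)
print(f"locus equation: F2_onset = {slope:.2f} * F2_mid + {intercept:.0f} Hz")
# Distinct (slope, intercept) pairs for alveolar vs. velar targets would
# count as acoustic evidence that a child encodes the place contrast.
```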


Journal ArticleDOI
TL;DR: The contextual effects of voiced/voiceless stops on the voice source of an adjacent vowel were examined for the first vowel in ‘CVCV utterances in German, English, Swedish, French, and Italian to yield insights into the control parameters which may be involved in regulating voicing oppositions in these languages.
Abstract: The contextual effects of voiced/voiceless stops on the voice source of an adjacent vowel were examined for the first vowel in 'CVCV utterances in German, English, Swedish, French, and Italian. The principal analysis technique involved interactive inverse filtering and parameterisation of the glottal waveform in terms of a four-parameter voice source model (the LF-model). This analysis procedure was supplemented by measures from narrow-band spectral sections of the speech output and by oral airflow recordings which allow inferences about the relative timing of glottal and supraglottal gestures. Results indicated that the voiced/voiceless nature of the consonant does yield differences in the voice source of the vowel. The most striking effects were found in the context of voiceless consonants, and cross-language differences did emerge in terms of directionality and degree. Extensive anticipatory effects were found for Swedish and for some speakers of English. Preceding the voiceless stop the vowel becomes increasingly breathy-voiced, and it would appear that the glottal abduction gesture is anticipated very early in the course of the vowel. Italian exhibited a similar tendency, though to a considerably lesser degree. The German data, on the other hand, showed certain strong carryover effects: Following the voiceless aspirated stop there was extensive breathy voicing. French showed little contextual variation in either direction. Rather surprisingly, the observed effects were not directly correlated with, or predictable from, the phonetic categories involved (voiced, voiceless unaspirated, and voiceless postaspirated). These results yield insights into the control parameters which may be involved in regulating voicing oppositions in these languages. Whereas the anticipatory effects observed might be consistent with a "timing" model of glottal control, the carryover effects cannot be explained in terms of timing alone and suggest that differences in tension settings of the laryngeal musculature may also be implicated.

54 citations
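
The LF-model named here is a standard four-parameter model of the glottal flow derivative. The sketch below (parameter values invented) enforces the main-excitation constraint E(Te) = -Ee and solves the return-phase constant implicitly, but for brevity skips the usual zero-net-flow constraint that would also fix the open-phase growth rate alpha:

```python
# Simplified LF-model glottal flow derivative.  Tp: flow peak, Te: main
# excitation, Ta: return-phase time constant, Tc: closure, Ee: excitation
# strength.  alpha is taken as given rather than solved for (see lead-in).
import numpy as np

def lf_pulse(fs=16000, Tp=0.004, Te=0.005, Ta=0.0003, Tc=0.008,
             Ee=1.0, alpha=300.0):
    t = np.arange(0.0, Tc, 1.0 / fs)
    wg = np.pi / Tp                       # open-phase sinusoid frequency
    # Solve eps from eps*Ta = 1 - exp(-eps*(Tc - Te)) by fixed-point iteration.
    eps = 1.0 / Ta
    for _ in range(50):
        eps = (1.0 - np.exp(-eps * (Tc - Te))) / Ta
    E0 = -Ee / (np.exp(alpha * Te) * np.sin(wg * Te))   # so that E(Te) = -Ee
    open_phase = E0 * np.exp(alpha * t) * np.sin(wg * t)
    return_phase = -(Ee / (eps * Ta)) * (np.exp(-eps * (t - Te))
                                         - np.exp(-eps * (Tc - Te)))
    return t, np.where(t <= Te, open_phase, return_phase)

t, e = lf_pulse()
print(len(t), round(float(e.min()), 3))   # samples per pulse, approx. -Ee
```

In such a model, the increasingly breathy voicing reported before voiceless stops corresponds chiefly to a larger Ta (a slower return phase, hence steeper spectral tilt).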


Journal ArticleDOI
TL;DR: The results suggest that neutralization occurs when semantic information is present, but that a voicing contrast is realized when it is absent.
Abstract: The present study examined regressive voice assimilation in Catalan in an attempt to determine a systematic explanation of complete versus incomplete voicing neutralization. Two types of contexts were constructed. In one type, semantic information was present to bias the meaning of target words. In the other type, no semantic information was present. The results showed that vowel duration distinguished underlying voicing in the neutral context only. The results suggest that neutralization occurs when semantic information is present, but that a voicing contrast is realized when it is absent.

53 citations


Journal ArticleDOI
TL;DR: In this paper, the authors explore the basis of language-set effects for short-lag voice onset time (VOT) voicelessness in Spanish/English perceptual sets and find that VOT was not an overriding cue to the voicing feature for these listeners; acoustic analysis, however, did not reveal dimensions that would reliably differentiate the short-lag Spanish /t/ tokens that were predominantly identified as "t" from those that were ambiguous between "t" and "d".

51 citations


Journal ArticleDOI
TL;DR: It was concluded that voicing sibilant phonemes, or word sounds, does cause the subject to adopt the CSS, and that a single sibilant word sound does not give a reliable indication of the smallest speaking vertical dimension.
Abstract: The purpose of this investigation was to determine whether the production of sibilant sounds involved adopting a jaw position that corresponded to the closest vertical speaking space (CSS), by analysis of the smallest vertical excursion of the mandible during the performance of different phonetic exercises. A further objective was to establish the variability in the CSS produced by individual sibilant phonemes. Thirty young adult subjects had their CSS determined during three separate phonetic tests, using a kinesiograph (Sirognathograph, Siemens A.G., Bensheim, Germany) and a Bio-Pak (BioResearch Associates Inc., Milwaukee, WI) jaw-tracking software program. The first test was a general phonetic articulation test containing all the sounds of the English language and specifically including all six sibilant word sounds. The second phonetic test contained the six sibilant sounds making up a short sentence. The third test included six single words, each expressing a different sibilant sound. No statistically significant difference among the mean CSS values determined in each of the three exercises was demonstrable. A phonetic test containing all sibilant sounds produced a CSS equivalent to that of a test containing all speech sounds. The vertical component of the CSS was also independent of the form or duration of the phonetic tests containing the sibilant word sounds used in this investigation. The CSS determined for 5 of the individual sibilant phonemes in the third exercise differed (p < 0.05) from that calculated for the three complete exercises. It was concluded that voicing sibilant phonemes, or word sounds, does cause the subject to adopt the CSS. (ABSTRACT TRUNCATED AT 250 WORDS)

46 citations


Journal ArticleDOI
TL;DR: Normal subjects' VOTs were significantly shorter at the fast rate of speech relative to the slow/normal rate, as expected, and the nonfluent aphasic patients produced voiced and voiceless consonants with somewhat overlapping VOT distributions, indicating an impairment in temporal integration in these subjects.

36 citations


Journal ArticleDOI
TL;DR: The interaction between lexical acquisition and acquisition of initial voiceless stops was studied in two normally developing children by acoustically examining the token-by-token accuracy of initial voiceless stop targets in different lexical items.
Abstract: The interaction between lexical acquisition and acquisition of initial voiceless stops was studied in two normally developing children, aged 1;9 and 1;10, by acoustically examining the token-by-token accuracy of initial voiceless stop targets in different lexical items. Production accuracy was also examined as it related to the frequency of usage of different words, as well as the time when they entered the children's lexicons. Fewer than half of the words in the children's lexicons had tokens representing the emergence of accurate voiceless stop production prior to the session at which the voicing contrast was achieved. These words were primarily ‘old’ words that had been in the children's lexicons from the beginning of data collection, as opposed to ‘new’ words, first produced in later recording sessions. Findings are discussed in reference to the ‘lexical diffusion’ model of sound change and within the framework of nonlinear underspecification theory.

30 citations


Journal ArticleDOI
TL;DR: The findings are interpreted as supporting the hypothesis that speakers use their hearing to calibrate mechanisms of speech production by monitoring the relations between their articulations and their acoustic output.
Abstract: Voice‐onset time (VOT) and syllable duration were measured for the English plosives in /Cʌd/ (C=consonant) context spoken by four postlingually deafened recipients of multichannel (Ineraid) cochlear implants. Recordings were made of their speech before, and at intervals following, activation of the speech processors of their implants. Three patients reduced mean syllable duration following activation. Using measures of VOT and syllable duration from speakers with normal hearing [Volaitis and Miller, J. Acoust. Soc. Am. 92, 723–735 (1992)] and from the subjects of this study, VOT is shown to vary approximately linearly with syllable duration over the ranges produced here. Therefore, the VOT of each token was adjusted for the change in syllable duration of that token relative to the mean syllable duration in the first baseline session. This variable, labeled VOTc, was used to evaluate the effects on voicing of the speakers’ renewed access to the voicing contrast provided by their implants. Preimplant, all four speakers characteristically uttered voiced plosives with too‐short VOT, compared to the measures for hearing speakers. Voiceless plosive mean VOT was also abnormally short for two of the speakers, and close to normal for the remaining two. With some hearing restored, subjects made relatively few errors with respect to voicing when identifying plosives in listening tests, and three of the four speakers lengthened VOTc. The findings are interpreted as supporting the hypothesis that speakers use their hearing to calibrate mechanisms of speech production by monitoring the relations between their articulations and their acoustic output.
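
The duration adjustment reduces to one line of arithmetic: given the approximately linear VOT-syllable-duration relation, each token's VOT is projected back to the baseline mean syllable duration. A sketch under that assumption (the slope and the measurements are invented, not values from the paper):

```python
# Sketch of the VOTc adjustment: correct each token's VOT for the deviation
# of its syllable duration from the baseline mean, using an assumed linear
# VOT-vs-duration slope.  All numbers are invented for illustration.
def vot_adjusted(vot_ms, syl_dur_ms, baseline_mean_dur_ms, slope=0.15):
    """VOTc: token VOT re-expressed at the baseline syllable duration."""
    return vot_ms - slope * (syl_dur_ms - baseline_mean_dur_ms)

# A post-activation token whose syllable shortened from a 400-ms baseline:
print(vot_adjusted(vot_ms=55, syl_dur_ms=320, baseline_mean_dur_ms=400))
# -> 67.0 ms: the VOT this token would have at the baseline duration.
```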

01 Sep 1993
TL;DR: The authors reconstruct the Kiranti, or East Himalayish, subgroup of Tibeto-Burman, with divergent evolutions depending on the place of articulation; the Eastern group shows a Germanic-style devoicing, with the voicing opposition transphonologised into aspiration.
Abstract: Phonological reconstruction of the Kiranti or East Himalayish subgroup of Tibeto-Burman. Voiced and unvoiced series are reconstructed, with divergent evolutions depending on the place of articulation. The Eastern group shows a Germanic devoicing, with the voicing opposition transphonologised into aspiration. Development of a partial glottal series (ɓ, ɗ) is proposed to explain the voicing "flip-flop" observed between western and southern subgroups in the dental and bilabial orders.

Journal ArticleDOI
TL;DR: The authors measured formant transitions in terms of the articulatory kinematics involved in moving from a consonant to a vowel and found that the formant onsets of voiceless fricatives are more dependent on vowel context than those of voiced fricatives.
Abstract: Formant transitions provide context‐dependent acoustic cues that can be interpreted in terms of the articulatory kinematics involved in moving from a consonant to a vowel. Formant frequencies were measured at identified acoustic landmarks for eight English fricatives preceding front, back, and back‐rounded vowels. Formant onset times designated the point when the energy increased most rapidly and evidence of the first formant was first observed. Comparing the two‐dimensional representation of F2×F3 onset frequencies along the voicing dimension showed the voiceless fricatives to be more dependent on vowel context. The onset frequencies for voiced fricatives reflect a more extreme supraglottal posture, while the voiceless fricative measures can be considered to be at a point closer to the vowel because voicing begins at a later time relative to the oral release gesture. Formant structure in the noise before the release, to the extent that it is visible in the consonantal interval prior to voicing onset, can...
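
The measurement pipeline implied here can be sketched: find the vowel onset as the frame where energy rises fastest, then estimate formants at that frame from LPC root angles. The LPC order, window sizes, and frequency floor below are illustrative choices, not the authors' settings:

```python
# Sketch: locate the onset frame by the steepest energy rise, then read
# formant estimates there from the angles of LPC polynomial roots.
import numpy as np
from scipy.linalg import solve_toeplitz

def lpc(frame, order=12):
    """Autocorrelation-method LPC: polynomial [1, -a1, ..., -ap]."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    a = solve_toeplitz((r[:order], r[:order]), r[1:order + 1])
    return np.concatenate(([1.0], -a))

def formants_at(frame, fs):
    """Rough formant estimates (Hz) at one frame, lowest first."""
    roots = np.roots(lpc(frame * np.hamming(len(frame))))
    roots = roots[np.imag(roots) > 0]          # one root per conjugate pair
    freqs = np.sort(np.angle(roots) * fs / (2 * np.pi))
    return freqs[freqs > 90][:3]               # drop near-DC roots; F1-F3

def onset_frame(x, fs, win=0.010, hop=0.005):
    """Start sample of the frame where short-time energy rises fastest."""
    w, h = int(win * fs), int(hop * fs)
    e = np.array([np.sum(x[i:i + w] ** 2) for i in range(0, len(x) - w, h)])
    return (int(np.argmax(np.diff(e))) + 1) * h, w

# Usage on a fricative-vowel token: i, w = onset_frame(x, fs)
# print(formants_at(x[i:i + w], fs))           # F1-F3 near vowel onset
```

A production system would additionally screen root candidates by bandwidth; this sketch keeps only the frequency filter.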

Journal ArticleDOI
TL;DR: In this paper, the authors investigated durational differences in syllable-final nasals in Japanese, English, and Korean, and examined the mora hypothesis for the mora nasal /n/ in Japanese.
Abstract: This study investigates durational differences in syllable-final nasals in Japanese, English, and Korean, and examines the mora hypothesis in Japanese. The phenomenon that syllable-final nasals are longer when followed by a voiced consonant than when followed by a voiceless consonant was observed in languages of different timing categories, i.e. Japanese (mora-timed), Korean (syllable-timed), and English (stress-timed). However, syllable-final nasals in Japanese (the mora nasal /n/) are set apart with respect to their moraic status: syllable-final nasals (moraic) are clearly differentiated in duration from syllable-initial nasals (non-moraic).


Proceedings ArticleDOI
27 Apr 1993
TL;DR: A speech training system for deaf children which integrates acoustic and several types of instrumentally measured articulatory data: palatography, nasal vibration, airflow, and the presence/absence of voicing is described.
Abstract: The authors describe a speech training system for deaf children which integrates acoustic and several types of instrumentally measured articulatory data: palatography, nasal vibration, airflow, and the presence/absence of voicing. The system presents these in both a technical and a motivating game format. It was designed to be used both with a teacher's guidance and (within certain limits) by the children alone. The games are proving to be highly motivating to the children, and encourage them to experiment with their speech production.

Dissertation
01 Dec 1993
TL;DR: In this paper, an experimental study of two major distinctive features, emphasis and voicing, in the plosives of Yemeni Spoken Arabic is presented, investigating some of their acoustic, perceptual, and aerodynamic correlates and aiming, at least in part, to identify the language-specific aspects of these phonetic phenomena.
Abstract: This is an experimental study of two major distinctive features: emphasis and voicing in the plosives of Yemeni Spoken Arabic. It investigates some of their acoustic, perceptual and aerodynamic correlates and aims, at least in part, to identify the language-specific aspects insofar as these phonetic phenomena are concerned. It falls into two related parts. Part One consists of two chapters. Chapter one gives a general background of YSA with special reference to the phonemic significance of emphasis and voicing in the plosives and their interaction with various contextual factors and positions. Phonological definitions of these features are given. Various theoretical approaches are also dealt with. The syllable structure and the stress patterns in both Modern Standard Arabic and Yemeni Spoken Arabic are presented. Chapter two reviews critically some of the hypotheses and interpretations of voicing mechanisms and the factors affecting their realizations in various languages. Some of the relevant aspects reviewed are voice onset time in various languages, formant transitions, closure durations, temporal relationships between consonants and vowels, categorical perception and the phoneme boundary, and aerodynamic factors and their role in the production of plosives. The two features are also reviewed in relation to vocalic context, place of articulation, stress, gemination and phonetic position. Part Two consists of four chapters representing the main body of this study. Chapter three is an investigation of the acoustic characteristics of the voiced/voiceless and emphatic/nonemphatic categories in words embedded in a contextual frame sentence. Chapter four is a perceptual investigation of the above contrasts by means of synthetically generated speech using the Klatt Synthesizer. It examines the role played by VOT, the relative onset time between the release and the onset of voicing, in the accurate identification of the voicing cognates. Another experiment attempts to evaluate the role of the second formant, particularly its onset frequency and steady-state portion, in the emphatic/nonemphatic distinction. The relationships between perception and production are described and the theory of 'categorical perception' in relation to our data is also discussed. Chapter five investigates aerodynamic patterns and aerodynamically derived estimates of articulation for the emphatic/nonemphatic and the voiced/voiceless consonants in two experiments. Since there are several variables involved in this investigation, the results in both experiments are subjected to analyses of variance to obtain the effects of the independent variables on the dependent ones. In chapter six the findings of the previous three chapters are summarized. Some implications for foreign language teaching and learning are also discussed. The study ends with a section on the limitations and suggestions for future research.

Journal ArticleDOI
TL;DR: The authors examined the contextual effects on VOT in Greek of the post-consonantal vowel, the stress pattern, and the distance of the stress from the initial stop consonant.
Abstract: Although Lisker and Abramson (1967) found no effect of the following vowel on the VOT of a stop consonant, Port and Rotunno (1979) found VOT to have greater values for voiceless stops followed by tense than by lax vowels. The purpose of the present study was to obtain a complete database on the VOT characteristics of voiced and voiceless initial stop consonants in Greek, and to examine the contextual effects on VOT of the post‐consonantal vowel, the stress pattern, and the distance of the stress from the initial stop consonant. The question here was whether the vowel effects found by Port and Rotunno for English would be seen in Greek, a language whose two stop categories have voicing lead and medium lag. Speakers read isolated disyllabic and trisyllabic words of four stress patterns. The utterance‐initial stops /p, t, k, b, d, g/ were followed by the five vowels of Greek, /a, e, i, o, u/. Results indicated that both voicing lead and voicing lag increased for stops followed by higher than by lower v...

Journal ArticleDOI
TL;DR: The analysis and testing have led to several conclusions concerning the control of the articulators for this speaker: production of obstruent consonants was a particular problem, whereas sonorant consonants were less of a problem (70% correct), and vowel errors were less prevalent.
Abstract: One practical aim of this research is to determine how best to use speech recognition techniques for augmenting the communication abilities of dysarthric speakers. As a first step toward this goal, the following kinds of analyses and tests have been performed on words produced by several dysarthric speakers: a closed‐set intelligibility test based on Kent et al. [J. Speech Hear. Disord. 54, 482–499 (1989)]; an open intelligibility test; critical listening and transcription; acoustic analysis of selected utterances; and an evaluation of the recognition of words by a commercial speech recognizer. The data from one speaker have been examined in detail. The analysis and testing have led to several conclusions concerning the control of the articulators for this speaker: production of obstruent consonants was a particular problem (only 30% of syllable‐initial obstruents were produced with no error), whereas sonorant consonants were less of a problem (70% correct). Of the obstruent errors, most were voicing errors, but place errors for alveolars (particularly fricatives) were also high, and these consonants were produced inconsistently, as inferred from acoustic analysis and from low scores from the recognizer for words with these consonants. In comparison, vowel errors were less prevalent. Implications for the use of a speech recognizer for augmenting this speaker’s communication abilities are discussed.

Patent
03 Sep 1993
TL;DR: In this paper, a phoneme model is used at the tail of a word and a diphone model elsewhere, in order to recognize continuous speech produced by voicing words in succession.
Abstract: PURPOSE: To accurately recognize even speech produced by voicing words continuously, without increasing the processing load at word boundaries, when using context-dependent recognition units. CONSTITUTION: The recognition units are diphones, obtained by subdividing each phoneme according to the following phoneme, together with phonemes that do not depend on the following phoneme. A word dictionary 3 is constructed so as to use a phoneme model at the tail of a word and a diphone model elsewhere. A recognition network 4 is generated from the word dictionary, model parameters, and grammatical information to recognize continuous speech. The parameters of each phoneme model are found by averaging the parameters of its diphone models.
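
A minimal sketch of the dictionary scheme this describes, with invented data structures: diphone models word-internally, a context-independent phoneme model at the word tail, the latter's parameters averaged over its diphone models:

```python
# Invented illustration of the patent's dictionary scheme: diphone units
# word-internally, an averaged phoneme model at the word tail.
import numpy as np

# Hypothetical diphone model parameters, keyed "phoneme+following-phoneme".
diphone_params = {
    "k+a": np.array([1.0, 0.2]), "a+t": np.array([0.5, 0.9]),
    "t+a": np.array([0.3, 0.7]), "a+k": np.array([0.6, 0.8]),
}

def phoneme_model(ph):
    """Word-tail phoneme model: average of that phoneme's diphone models."""
    ps = [v for k, v in diphone_params.items() if k.startswith(ph + "+")]
    return np.mean(ps, axis=0)

def word_units(phonemes):
    """Diphone units word-internally, a phoneme model at the word tail."""
    diphones = [f"{a}+{b}" for a, b in zip(phonemes, phonemes[1:])]
    return diphones, phoneme_model(phonemes[-1])

units, tail = word_units(["k", "a", "t", "a"])
print(units, tail)   # ['k+a', 'a+t', 't+a'] plus averaged model for final /a/
```

This keeps word-boundary processing cheap: the tail unit does not depend on the next word's initial phoneme.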

Patent
13 Jul 1993
TL;DR: In this article, the authors present a speech answering control for a speech recognition and answering system with an ultterance speed measuring instrument 16 which measures the utterance speed of the input speech and an answering control part 17 which prepares expression forms differing in the number of characters for answer sentences having the same meaning and controls the expression forms of the answer sentence voiced by the voice answer output part 18 according to the voicing speed measured by the measuring device 16.
Abstract: PURPOSE: To provide a highly practical speech recognizing and answering device which enables natural, smooth interaction between a human being and a machine and processes information through effective speech input. CONSTITUTION: The device, which recognizes an input speech with a speech recognition part 13 and utters an answer sentence for the recognition result from a speech answer output part 18, is equipped with an utterance-speed measuring instrument 16 that measures the utterance speed of the input speech, and a speech answering control part 17 that prepares expression forms differing in the number of characters for answer sentences having the same meaning and controls the expression form of the answer sentence uttered by the voice answer output part 18 according to the speed measured by the utterance-speed measuring instrument 16.
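
The control logic amounts to a lookup keyed by measured speaking rate; a minimal sketch, with invented thresholds (in syllables per second) and answer texts:

```python
# Invented illustration of the patent's idea: several expression forms of
# the same answer, differing in length, chosen by the user's speaking rate.
ANSWER_FORMS = {
    "full":  "The departure time of that train is 9:30 in the morning.",
    "plain": "It leaves at 9:30 a.m.",
    "terse": "9:30 a.m.",
}

def select_answer(speech_rate, slow=4.0, fast=7.0):
    """Fast speakers get shorter answers; slow speakers get fuller ones."""
    if speech_rate >= fast:
        return ANSWER_FORMS["terse"]
    if speech_rate <= slow:
        return ANSWER_FORMS["full"]
    return ANSWER_FORMS["plain"]

print(select_answer(3.5))   # slow speaker -> full-sentence answer
print(select_answer(8.2))   # fast speaker -> terse answer
```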

Journal ArticleDOI
TL;DR: In this paper, an algorithm was proposed to locate landmarks caused by closures and releases of obstruent consonants (sounds produced with a pressure buildup behind a constriction) flanked by sonorants.
Abstract: Locating landmarks, or acoustically important points, in an utterance is the first step in a proposed method for feature‐based speech recognition. The algorithm developed here is designed to locate landmarks caused by closures and releases of obstruent consonants (sounds produced with a pressure buildup behind a constriction) flanked by sonorants. Two characteristics of obstruents are: (1) voicing diminishes or stops completely at the onset and (2) noise is generated during the constricted interval (at the release in the case of stops and affricates). The algorithm thus monitors voicing changes by keeping track of low‐frequency signal energy and locates a landmark wherever a rapid change occurs. It also monitors higher frequencies for the presence of noise to aid in the detection of voiceless stop and affricate releases. With appropriate selection of time windows, smoothing intervals, and frequency bands, sonorant/obstruent boundaries for stops, fricatives, and affricates could be detected with only a few percent error. Semivowels and creaky voicing sometimes mistakenly cause a landmark to be detected, but more detailed analysis of the characteristics of these erroneous landmarks may overcome this problem. [Work supported by NSF.]
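
A rough sketch of the core loop, with illustrative band edges, window sizes, and threshold rather than the paper's values:

```python
# Landmark sketch: track low-band (voicing) energy for abrupt changes and
# keep a high band available for noise checks.  Parameters are illustrative.
import numpy as np
from scipy.signal import butter, sosfilt

def band_energy_db(x, fs, lo, hi, win=0.016, hop=0.008):
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    y = sosfilt(sos, x)
    w, h = int(win * fs), int(hop * fs)
    e = [10 * np.log10(np.sum(y[i:i + w] ** 2) + 1e-12)
         for i in range(0, len(y) - w, h)]
    return np.array(e), h

def landmarks(x, fs, thresh_db=9.0):
    """(time_s, label) pairs where low-band energy changes rapidly."""
    e_low, hop = band_energy_db(x, fs, 100, 400)    # voicing band
    e_high, _ = band_energy_db(x, fs, 2000, 6000)   # frication band (unused here)
    d = np.diff(e_low)
    hits = np.flatnonzero(np.abs(d) > thresh_db)
    return [(i * hop / fs, "release" if d[i] > 0 else "closure") for i in hits]

# Usage: x, fs = <utterance>; print(landmarks(x, fs))
# A fuller version would consult e_high to confirm voiceless stop and
# affricate releases, as the abstract describes.
```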

Journal ArticleDOI
TL;DR: Results showed distinct deficit profiles for each subject, consisting of patterns of defective stimulus control relations underlying persistent substitution between voiced and unvoiced consonants in the speech and writing of two children.
Abstract: This study attempted to analyze defective stimulus control relations underlying persistent substitution between voiced and unvoiced consonants in the speech and writing of two children. A series of 20 tests was administered repeatedly. Some tests consisted of matching-to-sample tasks, with dictated words, printed words, or pictures as samples. Comparison stimuli were arranged in pairs of printed words or pictures, such that the only difference in their corresponding spoken words was the voicing of one consonant phoneme. In other tests, a stimulus (dictated word, printed word, or picture) was presented, and the subject was required to emit an oral response (repeat the dictated word, read the printed word, or name the picture) or a written response (write to dictation, copy the word, or write a picture name). Other tests required the subjects to make a same/different distinction in pairs of dictated words that did or did not differ in the voicing of a single phoneme. Results showed distinct deficit profiles for each subject, consisting of patterns of defective stimulus control relations. The subjects were able, however, to distinguish between voiced and unvoiced sounds and to produce these sounds.

Journal ArticleDOI
TL;DR: This article applied recurrent neural networks to the processing of time-warped sequences, in particular modelling how listeners distinguish between phonetic categories in the context of changing speech rate, and applied a more detailed speech representation to model the effects of both speaking rate and syllable structure.
Abstract: We apply recurrent neural networks to the processing of time-warped sequences, particularly, modelling how listeners distinguish between phonetic categories in the context of changing speech rate. In an earlier paper (AbuBakar & Chater, 1993), we modelled the effects of speaking rate on the perception of voicing contrasts specified by voice-onset-time (VOT) in syllable-initial stop consonants using a simple coding procedure. In the present investigation, we apply a more detailed speech representation to model the effects of both speaking rate and syllable structure on the syllable-initial distinction between a stop consonant /b/ and a semivowel /w/ cued by the duration of the formant transitions. In the first set of experiments, we constructed nine pairs of /ba/-/wa/ syllables varying in syllable duration and transition values. In another set of experiments, we compressed these syllables and added syllable-final transitions appropriate for a stop consonant /d/ to produce a second set of syllables (/bad/-/wad/...
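
For concreteness, here is a toy Elman-style forward pass showing how a recurrent network naturally consumes variable-length, i.e. time-warped, input. The weights are random and training is omitted, so the printed outputs are arbitrary; it is a structure sketch only, not the authors' network:

```python
# Toy Elman network forward pass over variable-length acoustic frames.
# Untrained (random weights): outputs are arbitrary; structure sketch only.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 3, 8                    # e.g., F1, F2, amplitude per frame
W_in  = rng.normal(0, 0.5, (n_hid, n_in))
W_rec = rng.normal(0, 0.5, (n_hid, n_hid))
w_out = rng.normal(0, 0.5, n_hid)

def run_rnn(frames):
    """One step per frame; final hidden state scored as P(/b/ vs. /w/)."""
    h = np.zeros(n_hid)
    for x in frames:
        h = np.tanh(W_in @ x + W_rec @ h)
    return 1.0 / (1.0 + np.exp(-(w_out @ h)))

fast_transition = rng.normal(size=(8, n_in))    # short sequence, /b/-like
slow_transition = rng.normal(size=(25, n_in))   # long sequence, /w/-like
print(run_rnn(fast_transition), run_rnn(slow_transition))
```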

Patent
26 Mar 1993
TL;DR: In this paper, the authors simplify the unit-selection process by using tables that, for each unit speech used in synthesizing an arbitrary word, hold data selected from diverse continuous utterances, thereby reducing unnaturalness.
Abstract: PURPOSE: To simplify the selection process, by means of various tables, when unnaturalness is reduced by using data selected from diverse continuous utterances for each of the unit speeches used in speech synthesis for synthesizing an arbitrary word. CONSTITUTION: Speech parameters obtained by analyzing natural speech voiced continuously in advance, the correspondence between unit speeches and the speech parameters, and the phoneme series of the utterances are stored in a unit speech data table 4. The table is consulted for each unit speech, according to the phoneme and rhythm information of an input character string, to select the best unit speech data according to a determined selection criterion; the speech parameters of the selected unit speech data, extracted (6) from the speech parameters 7 according to the information in the unit speech data table, are then used to synthesize the speech.
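
A minimal sketch of the table-driven selection, with invented units, contexts, and parameters:

```python
# Invented illustration of the patent's unit speech data table: each unit
# maps to candidate parameter sets taken from diverse continuous utterances,
# tagged with the phoneme context in which each was originally voiced.
UNIT_TABLE = {
    "a": [{"params": [1.0, 0.3], "context": ("k", "t")},
          {"params": [0.9, 0.5], "context": ("t", "#")}],
    "t": [{"params": [0.2, 0.8], "context": ("a", "a")},
          {"params": [0.4, 0.6], "context": ("#", "o")}],
}

def select_unit(unit, left, right):
    """Pick the candidate whose recorded context best matches the target."""
    def score(cand):
        cl, cr = cand["context"]
        return (cl == left) + (cr == right)
    return max(UNIT_TABLE[unit], key=score)["params"]

# Parameters for the (hypothetical) target sequence /a t a/ in context:
targets  = ["a", "t", "a"]
contexts = [("k", "t"), ("a", "a"), ("t", "#")]
print([select_unit(u, l, r) for u, (l, r) in zip(targets, contexts)])
```

Because each candidate was excised from real connected speech, choosing the context-matching one reduces the unnaturalness a single fixed unit per phoneme would cause.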

Journal ArticleDOI
TL;DR: This article investigated the difficulty experienced by American-English listeners in identifying Hindi dental and retroflex stop consonants in different voicing conditions, and found that the difficulty was affected by which speaker produced the contrasts and, to a lesser extent, by the vowel context.
Abstract: Training listeners to perceive consonantal contrasts that do not occur in their native language has proved to be difficult. Cross‐language training studies usually produce about 10% improvement in performance. This improvement has not transferred to related material in different linguistic contexts. The present research had four aims: (1) to investigate the difficulty experienced by American‐English listeners in identifying Hindi dental and retroflex stop consonants in different voicing conditions, (2) to test a new, computer‐based, interactive training method, (3) to examine transfer of training to new voicing conditions, to a new vowel context, and to the voice of a new speaker, and (4) to test the hypothesis that increasing stimulus variability (in this case, training with one versus two voicing conditions) increases transfer of training. Subjects had differential difficulty identifying dental versus retroflex consonants produced in different voicing conditions. Further, this relative difficulty was affected by which speaker produced the contrasts and, to a lesser extent, by the vowel context. The computer‐based training improved subjects’ consonant identification. However, this improvement showed little transfer to new stimuli. Finally, increasing stimulus variability during training did not affect transfer of training. [Work supported by NIDCD.]


Journal Article
TL;DR: The MLP-based pattern element aid gave significantly better performance in the reception of consonantal voicing contrasts from speech in pink noise than that achieved with conventional amplification; consequently, it also gave better overall performance in audio-visual consonant identification.
Abstract: Two new developments in speech pattern processing hearing aids will be described. The first development is the use of compound speech pattern coding. Speech information which is invisible to the lipreader was encoded in terms of three acoustic speech factors; the voice fundamental frequency pattern, coded as a sinusoid, the presence of aperiodic excitation, coded as a low-frequency noise, and the wide-band amplitude envelope, coded by amplitude modulation of the sinusoid and noise signals. Each element of the compound stimulus was individually matched in frequency and intensity to the listener's receptive range. Audio-visual speech receptive assessments in five profoundly hearing-impaired listeners were performed to examine the contributions of adding voiceless and amplitude information to the voice fundamental frequency pattern, and to compare these codings to amplified speech. In both consonant recognition and connected discourse tracking (CDT), all five subjects showed an advantage from the addition of amplitude information to the fundamental frequency pattern. In consonant identification, all five subjects showed further improvements in performance when voiceless speech excitation was additionally encoded together with amplitude information, but this effect was not found in CDT. The addition of voiceless information to voice fundamental frequency information did not improve performance in the absence of amplitude information. Three of the subjects performed significantly better in at least one of the compound speech pattern conditions than with amplified speech, while the other two performed similarly with amplified speech and the best compound speech pattern condition. The three speech pattern elements encoded here may represent a near-optimal basis for an acoustic aid to lipreading for this group of listeners. The second development is the use of a trained multi-layer-perceptron (MLP) pattern classification algorithm as the basis for a robust real-time voice fundamental frequency extractor. This algorithm runs on a low-power digital signal processor which can be incorporated in a wearable hearing aid. Aided lipreading for speech in noise was assessed in the same five profoundly hearing-impaired listeners to compare the benefits of conventional hearing aids with those of an aid which provided MLP-based fundamental frequency information together with speech+noise amplitude information. The MLP-based pattern element aid gave significantly better performance in the reception of consonantal voicing contrasts from speech in pink noise than that achieved with conventional amplification and consequently, it also gave better overall performance in audio-visual consonant identification.(ABSTRACT TRUNCATED AT 400 WORDS)
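
The compound coding itself is easy to sketch: F0 rendered as a sinusoid, voiceless excitation as low-frequency noise, and both amplitude-modulated by the wide-band envelope. The frame rate, noise smoothing, and contours below are invented, and a real aid would additionally map frequency and intensity into the listener's receptive range:

```python
# Sketch of compound speech-pattern coding: per-frame F0 (Hz), amplitude
# envelope (0-1), and a voicing flag are rendered as a sinusoid (voiced)
# or smoothed noise (voiceless), both amplitude-modulated.
import numpy as np

def encode(f0_hz, amp, voiced, fs=16000, frame_s=0.010):
    n = int(frame_s * fs)
    rng = np.random.default_rng(0)
    out, phase = [], 0.0
    for f0, a, v in zip(f0_hz, amp, voiced):
        if v:                                  # periodic: F0 as a sinusoid
            ph = phase + 2 * np.pi * f0 * np.arange(1, n + 1) / fs
            out.append(a * np.sin(ph))
            phase = ph[-1] % (2 * np.pi)
        else:                                  # aperiodic: crude low-pass noise
            noise = np.convolve(rng.normal(size=n), np.ones(32) / 32, "same")
            out.append(a * noise)
    return np.concatenate(out)

# One second: falling F0 with a voiceless stretch mid-utterance.
f0     = np.linspace(180, 120, 100)
amp    = np.hanning(100)
voiced = np.array([True] * 40 + [False] * 20 + [True] * 40)
print(encode(f0, amp, voiced).shape)           # (16000,)
```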