
Showing papers on "Voice" published in 2002


Journal ArticleDOI
TL;DR: This study examines acoustic and aerodynamic characteristics of consonants in standard Korean and in Cheju, an endangered Korean language, and suggests that the fricative /s/ is better categorized as "lenis" rather than "aspirated".

370 citations


Journal ArticleDOI
TL;DR: The perceptual dominance of f0 over VOT for lax stops is consistent with the size of the f0 differences in word- (and phrase-) initial position, as well as the prominent role of the resulting tonal patterns in Korean intonational phonology.

138 citations


Journal ArticleDOI
TL;DR: Acoustic analysis was used to examine whether speech errors arise at the lexical, segmental, or sub-featural level in speech production; the results provide evidence for the psychological reality of phonological segments and words as units in the speech production process.

133 citations


Journal ArticleDOI
TL;DR: Portuguese fricatives were analyzed in ways designed to enhance the description of the language and to increase the understanding of fricative production mechanisms.

128 citations


Journal ArticleDOI
TL;DR: In this article, the authors present experimental results that support the view that German has underlying [spread glottis] stops, not [voice] stops, and that the intervocalic voiced stops arise because of passive voicing of the non-[spread glottis] stops.
Abstract: It is well known that initially and when preceded by a word that ends with a voiceless sound, German so-called ‘voiced’ stops are usually voiceless, that intervocalically both voiced and voiceless stops occur and that syllable-final (obstruent) stops are voiceless. Such a distribution is consistent with an analysis in which the contrast is one of [voice] and syllable-final stops are devoiced. It is also consistent with the view that in German the contrast is between stops that are [spread glottis] and those that are not. On such a view, the intervocalic voiced stops arise because of passive voicing of the non-[spread glottis] stops. The purpose of this paper is to present experimental results that support the view that German has underlying [spread glottis] stops, not [voice] stops.

128 citations


Journal ArticleDOI
TL;DR: Results find both a quantity and a voicing effect on vowel durations, though these two effects differ as to how they interact with stress and focus.

115 citations


Journal ArticleDOI
TL;DR: It is argued that extensive larynx lowering and vocal fold slackening can explain the specifics of the voicing feature in Xhosa, and suggested that “slack voice” is a more appropriate term for the relevant Xhosa sounds than “breathy voice”.

57 citations


Journal ArticleDOI
TL;DR: The results show that auditory speech perception performance of children with cochlear implants reaches an asymptote at 76% between 4 and 6 years of implant use, and the hierarchy in speech pattern contrast perception and production was similar between the implanted and normal-hearing children.
Abstract: The purpose of the present study was twofold: 1) to compare the hierarchy of perceived and produced significant speech pattern contrasts in children with cochlear implants, and 2) to compare this hierarchy to developmental data of children with normal hearing. The subjects included 35 prelingual hearing-impaired children with multichannel cochlear implants. The test materials were the Hebrew Speech Pattern Contrast (HeSPAC) test and the Hebrew Picture Speech Pattern Contrast (HePiSPAC) test for older and younger children, respectively. The results show that 1) auditory speech perception performance of children with cochlear implants reaches an asymptote at 76% (after correction for guessing) between 4 and 6 years of implant use; 2) all implant users perceived vowel place extremely well immediately after implantation; 3) most implanted children perceived initial voicing at chance level until 2 to 3 years after implantation, after which scores improved by 60% to 70% with implant use; 4) the hierarchy of phonetic-feature production paralleled that of perception: vowels first, voicing last, and manner and place of articulation in between; and 5) the hierarchy in speech pattern contrast perception and production was similar between the implanted and the normal-hearing children, with the exception of the vowels (possibly because of the interaction between the specific information provided by the implant device and the acoustics of the Hebrew language). The data reported here contribute to our current knowledge about the development of phonological contrasts in children who were deprived of sound in the first few years of their lives and then developed phonetic representations via cochlear implants. The data also provide additional insight into the interrelated skills of speech perception and production.

54 citations


Journal ArticleDOI
TL;DR: The aim is to provide a comprehensive account of the various interactions between consonantal voicing, vowel height and consonant place on a range of acoustic attributes of vowels and stops, in order to propose an explanation for such effects and to compare the present results and interpretations with previous explanations and data.
Abstract: This paper reports an acoustic study of CV sequences in Italian (where C is /b, d, g, p, t, k/ and V is one of the seven Italian vowels in stressed position). It explores the effects of vowel height, consonantal voicing, and place of articulation on a number of acoustic attributes of vowels (duration, f0, F1), and on the duration of the preceding stop closure, VOT and RVOT (defined as the interval from C release to the acoustic vowel onset). The aim is to provide, for Italian, a comprehensive account of all the various interactions between consonantal voicing, vowel height and consonant place on the above acoustic attributes in order to propose an explanation for such effects, and to compare the present results and interpretations with previous explanations and with previous data on Italian and other languages.

51 citations


01 Jan 2002
TL;DR: In this paper, it is argued that the attested rankings correspond to patterns that arise through simple sound changes from phonetic patterns, while the unattested rankings cannot have such an origin.
Abstract: In many languages, a voiceless obstruent cannot occur after a nasal. In some languages such a sequence is avoided through voicing assimilation, but in others through deletion of the nasal or of the obstruent, or changing the nasal into an oral stop. This variety of responses can be expressed in OT through different constraint rankings. Other ways of avoiding such a sequence, such as epenthesis or metathesis, are not attested, even though the corresponding constraint rankings are possible in principle. It is shown that such gaps in factorial typology are pervasive, and cannot be addressed through ad hoc revisions of the constraints or representations. It is argued that the attested rankings correspond to patterns that arise through simple sound changes from phonetic patterns, while the unattested rankings cannot have such an origin. This approach suggests that phonetics influences not only markedness in phonology, but also constraint ranking.

49 citations


Journal ArticleDOI
TL;DR: It is confirmed that Swedish children show an early tendency to vary vowel durations according to final consonant voicing, followed only six months later by a stage at which the intrinsic influence of vowel identity grows relatively more robust.
Abstract: Vowel durations typically vary according to both intrinsic (segment-specific) and extrinsic (contextual) specifications. It can be argued that such variations are due to both predisposition and cognitive learning. The present report utilizes acoustic phonetic measurements from Swedish and American children aged 24 and 30 months to investigate the hypothesis that default behaviors may precede language-specific learning effects. The predicted pattern is the presence of final consonant voicing effects in both languages as a default, and subsequent learning of intrinsic effects most notably in the Swedish children. The data, from 443 monosyllabic tokens containing high-front vowels and final stop consonants, are analyzed in statistical frameworks at group and individual levels. The results confirm that Swedish children show an early tendency to vary vowel durations according to final consonant voicing, followed only six months later by a stage at which the intrinsic influence of vowel identity grows relatively more robust. Measures of vowel formant structure from selected 30-month-old children also revealed a tendency for children of this age to focus on particular acoustic contrasts. In conclusion, the results indicate that early acquisition of vowel specifications involves an interaction between language-specific features and articulatory predispositions associated with phonetic context.

PatentDOI
Ning Bi1, Andrew P. Dejaco1
TL;DR: In this paper, a speech processing system modifies various aspects of input speech according to a user-selected one of various preprogrammed voice fonts, each specifying a manner of modifying one or more of the received signals (i.e., formants, voicing, pitch, gain).
Abstract: A speech processing system modifies various aspects of input speech according to a user-selected one of various preprogrammed voice fonts. Initially, the speech converter receives a formants signal representing an input speech signal and a pitch signal representing the input signal's fundamental frequency. One or both of the following may also be received: a voicing signal comprising an indication of whether the input speech signal is voiced, unvoiced, or mixed, and/or a gain signal representing the input speech signal's energy. The speech converter also receives user selection of one of multiple preprogrammed voice fonts, each specifying a manner of modifying one or more of the received signals (i.e., formants, voicing, pitch, gain). The speech converter modifies at least one of the formants, voicing, pitch, and/or gain signals as specified by the selected voice font.
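To make the per-font modification concrete, here is a minimal sketch of the kind of transform such a voice font could specify, assuming per-frame pitch and gain tracks have already been extracted; the font names, fields, and values are hypothetical illustrations, not taken from the patent.

```python
import numpy as np

# Hypothetical voice-font table: each font specifies how to modify the received
# parameter tracks (only pitch and gain here; formant/voicing edits would be analogous).
VOICE_FONTS = {
    "deep":  {"pitch_scale": 0.8, "gain_db": +2.0},
    "child": {"pitch_scale": 1.5, "gain_db": 0.0},
    "soft":  {"pitch_scale": 1.0, "gain_db": -6.0},
}

def apply_voice_font(pitch_hz, gain_db, font_name):
    """Return modified per-frame pitch (Hz) and gain (dB) tracks for the selected font."""
    font = VOICE_FONTS[font_name]
    new_pitch = np.asarray(pitch_hz, dtype=float) * font["pitch_scale"]
    new_gain = np.asarray(gain_db, dtype=float) + font["gain_db"]
    return new_pitch, new_gain

# Example: lower a flat 200 Hz contour by 20% and boost its gain by 2 dB.
pitch, gain = apply_voice_font([200.0] * 5, [-20.0] * 5, "deep")
```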

Journal ArticleDOI
TL;DR: Three of the four acoustic measures contributed significantly to the discriminant models that differentiated accurately perceived TE and laryngeal samples, and the values for each measure were higher/longer for the TE group.
Abstract: Tracheoesophageal (TE) speakers often have difficulty producing the voiced/voiceless distinction. This limitation has been attributed to use of the pharyngoesophageal segment as the phonatory source...
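As a rough illustration of how a discriminant model over a handful of acoustic measures can separate two speaker groups, here is a sketch using scikit-learn; the feature names and numbers are invented placeholders, not the study's measures or data.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical per-token acoustic measures (columns could be, e.g., VOT, vowel
# duration, voicing duration, and closure duration in ms); values are made up.
X = np.array([
    [45.0, 180.0, 60.0, 90.0],   # TE speaker token
    [50.0, 190.0, 65.0, 95.0],   # TE speaker token
    [25.0, 140.0, 40.0, 70.0],   # laryngeal speaker token
    [22.0, 135.0, 38.0, 68.0],   # laryngeal speaker token
])
y = np.array(["TE", "TE", "laryngeal", "laryngeal"])

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict([[40.0, 170.0, 55.0, 85.0]]))  # classify a new token
```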

Patent
Ari Heikkinen1
05 Apr 2002
TL;DR: In this paper, a method is presented to minimize the effect of pitch jitter in the voicing determination of sinusoidal speech coders during voiced speech, in which the pitch of the input signal is normalized to a fixed value prior to voicing determination in the analysis frame.
Abstract: In an embodiment of the invention, a method is presented to minimize the effect of pitch jitter in the voicing determination of sinusoidal speech coders during voiced speech. In the method, the pitch of the input signal is normalized to a fixed value prior to voicing determination in the analysis frame. After that, conventional voicing determination approaches can be used on the normalized signal. In experiments, the method has been shown to improve the performance of sinusoidal speech coders during jittery voiced speech by increasing the accuracy of voicing classification decisions for speech signals.
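A minimal sketch of the idea, assuming a per-frame pitch estimate is already available: time-scale the frame so its pitch maps to a fixed reference value, then apply an ordinary autocorrelation-based voicing decision to the normalized frame. The interpolation-based resampling and the threshold are illustrative choices, not the patent's specific procedure.

```python
import numpy as np

def normalize_pitch(frame, f0_hz, f0_ref=100.0):
    """Time-scale a frame so its estimated pitch maps to a fixed reference pitch."""
    ratio = f0_hz / f0_ref                  # ratio > 1 stretches the frame, lowering its pitch
    n_out = int(round(len(frame) * ratio))
    x_old = np.linspace(0.0, 1.0, num=len(frame))
    x_new = np.linspace(0.0, 1.0, num=n_out)
    return np.interp(x_new, x_old, frame)

def voicing_decision(frame, fs, f0_ref=100.0, threshold=0.4):
    """Normalized-autocorrelation voicing decision around the reference pitch lag.
    Assumes the (normalized) frame spans several reference pitch periods."""
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if ac[0] <= 0:
        return False
    lag = int(round(fs / f0_ref))           # expected pitch lag after normalization
    peak = np.max(ac[int(0.8 * lag):int(1.2 * lag) + 1]) / ac[0]
    return peak > threshold
```

After normalization, every voiced frame is compared against a single expected lag rather than a drifting one, which is the intuition behind removing pitch jitter before the voicing decision.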

Journal ArticleDOI
01 May 2002-Lingua
TL;DR: By allowing co-articulation, speech rate and register to play a role in the sound component of the grammar, this analysis accounts for the gradience, and for the great deal of cross-dialectal and dialect-internal variation, exhibited by Spanish spirantization.

Journal ArticleDOI
TL;DR: An investigation of speech recognition features related to voicing (functions that indicate whether the vocal folds are vibrating) shows that voicing features and spectral information are complementary, and that improved speech recognition performance is obtained by combining the two sources of information.

Journal ArticleDOI
TL;DR: The authors examined whether or not stress is a factor in the likelihood of frication and devoicing of coda /b, d, ɡ/ in Spanish dialects.
Abstract: In Spanish, /b, d, ɡ/ are usually spirantized to voiced approximants in all syllabic contexts after a continuant sound. However, in North-Central Peninsular Spanish (NCS), spirantization interacts with coda devoicing, yielding voiceless fricatives. In the majority of cases, coda /b, d, ɡ/ occur in stressed syllables. This work examines whether or not stress is a factor in the likelihood of frication and devoicing of coda /b, d, ɡ/ in this dialect. An acoustic study was conducted of nine native speakers from NCS. These speakers were tested on nonce words with /b, d, ɡ/ in coda position in both stressed and unstressed syllables. Measurements were made of vowel and consonant duration, presence and absence of frication and voicing, and voicing duration. The results show that frication is more likely in stressed syllables than in unstressed syllables. This suggests that in stressed syllables, a higher subglottal pressure produces higher airflow across the glottis, thereby favoring frication. In turn, frication inhibits voicing due to conflicting aerodynamic requirements between the two. We conclude that stress is a factor in spirantization and that it may indirectly affect the voicing properties of /b, d, ɡ/.

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the effect of audiovisual incongruent monosyllables in French for both voiced and voiceless stop consonants, using two levels of auditory intensity (70 dB vs 40 dB).
Abstract: When presented with an auditory /b/ dubbed onto a visual /g/, listeners sometimes perceive a fused phoneme like /d/, while with the reverse presentation they experience a combination such as /bg/. This phenomenon, reported by McGurk and MacDonald (1976), is here investigated in French for both voiced and voiceless stop consonants, using two levels of auditory intensity (70 dB vs 40 dB). In a first experiment, audiovisual incongruent monosyllables (A/bi/ V/gi/, A/gi/ V/bi/, A/ki/ V/pi/, A/pi/ V/ki/) uttered by a male and a female speaker were recorded and dubbed using analog technology. In a second experiment, the same syllables articulated by the male speaker were recorded and dubbed using digital technology. In a third experiment, the same materials as in the second experiment were used but the presentation procedure of the experimental items was changed: audiovisual incongruent trials were mixed with congruent ones. In the three experiments, the role of voicing and of auditory intensity ...

Proceedings Article
01 Jan 2002
TL;DR: This work attempts to directly exploit prosodic correlates in acoustic modeling of speech for large vocabulary recognition by comparing two methods for using the fundamental frequency and voicing parameters.
Abstract: Prosody has long been studied as a knowledge source in speech processing. We attempt to directly exploit prosodic correlates in acoustic modeling of speech for large vocabulary recognition. We compare two methods for using the fundamental frequency and voicing parameters. The more complex approach starts by modeling prosodic classes and using a representation of their recognized sequences as acoustic features. The simpler approach adds suitably normalized raw values to the conventional mel cepstral coefficients in the observation vectors. The simpler approach achieves modest accuracy gains on the HUB-5 Eval-2001 test set.
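A sketch of the simpler approach, under the assumption that per-utterance z-normalization of log-f0 is an acceptable stand-in for "suitably normalized raw values"; the librosa calls are real, but frame alignment and the toy signal are simplified for illustration.

```python
import numpy as np
import librosa

sr = 16000
t = np.arange(sr) / sr
y = 0.5 * np.sin(2 * np.pi * 150.0 * t)        # 1 s of a 150 Hz tone as a stand-in signal
hop = 256

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, hop_length=hop)            # (13, T)
f0, voiced_flag, _ = librosa.pyin(y, fmin=60, fmax=400, sr=sr, hop_length=hop)

T = min(mfcc.shape[1], len(f0))
logf0 = np.log(np.where(np.isnan(f0[:T]), 1.0, f0[:T]))     # unvoiced frames -> log(1) = 0
logf0 = (logf0 - logf0.mean()) / (logf0.std() + 1e-8)       # per-utterance z-normalization
voiced = voiced_flag[:T].astype(float)

# Append the normalized f0 track and a voicing flag to each MFCC observation vector.
features = np.vstack([mfcc[:, :T], logf0[None, :], voiced[None, :]])          # (15, T)
```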

Proceedings ArticleDOI
28 Oct 2002
TL;DR: Evaluation using informal subjective listening tests and a few objective parameters indicates that the speech quality (intelligibility and naturalness) of the designed speech coder is better than that of the FS1015 standard coder.
Abstract: An average 16 kb/s low and variable bit rate speech coder based on the local cosine transform (LCT) algorithm for two-way conversational speech is designed for the first time in this paper. The result of a voice activity detector (VAD) based on a support vector machine (SVM) and the classification method of the voicing modes of the GSM half-rate standard for active speech are adopted in the design of the variable bit rate coder. The moderately voiced mode and the strongly voiced mode are combined into a single voicing mode, named the moderately and strongly voiced mode. A few segment vector quantizers of the LCT coefficients are employed for each voicing mode and for silence frames (background noise), and the LBG algorithm is applied to design the codebooks. A fast tree search technique is used to select the vector of the LCT coefficients for each segment. Evaluation using informal subjective listening tests and a few objective parameters indicates that the speech quality (intelligibility and naturalness) of the designed speech coder is better than that of the FS1015 standard coder. The new coder is also more robust than the FS1015 standard coder, which makes it suitable for speech coding in a variety of environments.

Journal ArticleDOI
TL;DR: The phonetic manifestation of distinctive plosive types and click accompaniments in Xhosa was investigated with measurements of voice onset time (VOT), closure duration, voicing during closure, and burst amplitude.
Abstract: The phonetic manifestation of distinctive plosive types and click accompaniments in Xhosa was investigated with measurements of voice onset time (VOT), closure duration, voicing during closure, and burst amplitude. There is a high degree of interspeaker as well as token-to-token variability in the voiceless unaspirated plosives and clicks concerning their pronunciation with or without audible ejection. The plosives are much more frequently ejective than the corresponding clicks. If present, ejection is manifested by increased VOT, burst amplitude, or both. Duration of voicing during closure is substantial only in the implosive, but not in the 'voiced' plosives and clicks. After nasals the percentage of voicing during closure is high in 'voiced' plosives due to the very short closure duration found in that context; in the post-nasal 'voiced' clicks closure is mostly reduced to zero. Aspirated plosives and clicks in Xhosa show VOT values that are on average relatively long when compared to other languages. Closure duration tends to be shorter in aspirated plosives and clicks than in other categories.
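For readers unfamiliar with the reported measures, here is a small worked example of how they are typically derived from annotated landmark times; the millisecond values are hypothetical, not taken from the Xhosa data.

```python
# Hypothetical landmark times (ms) for one annotated stop token.
closure_onset = 120.0      # oral closure begins
burst_release = 210.0      # stop burst (release)
voicing_onset = 265.0      # onset of periodic voicing
voiced_in_closure = 30.0   # ms of voicing observed during the closure

closure_duration = burst_release - closure_onset                          # 90 ms
vot = voicing_onset - burst_release                                       # +55 ms (long-lag / aspirated range)
percent_closure_voicing = 100.0 * voiced_in_closure / closure_duration    # ~33% voiced closure
```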

01 Jan 2002
TL;DR: A new automatic analysis method based on a speech production process expressed by an autoregressive model with an exogenous input (ARX model) and a mathematical voicing source model is proposed, with the aim of establishing a method of analyzing voicing source and formant parameters with high accuracy for the speech of not only males but also females and children.
Abstract: In this study, we propose a new automatic analysis method based on a speech production process expressed by an autoregressive model with an exogenous input (ARX model) and a mathematical voicing source model, with the aim of establishing a method of analyzing voicing source and formant parameters with high accuracy for the speech of not only males but also females and children. The features of the proposed method are as follows. 1) The formant parameters of high-pitched speech can be stably estimated by placing voicing source pulse trains that correspond to multiple pitches in the analysis window. 2) Low formant frequencies, which are easily affected by voicing source characteristics, can be estimated with high accuracy. 3) The estimation error caused by the incompleteness of the model can be reduced by introducing a prefilter that takes into account the spectral tilt of the voicing source.
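The core of ARX analysis is a linear least-squares fit of vocal-tract (AR) coefficients given an assumed voicing-source input. The sketch below shows a generic ARX estimator in that spirit; it omits the paper's multi-pulse analysis window and spectral-tilt prefilter, and the function and parameter names are my own.

```python
import numpy as np

def fit_arx(y, u, na=10, nb=2):
    """Least-squares fit of an ARX model
        y[n] + a1*y[n-1] + ... + a_na*y[n-na] = b0*u[n] + ... + b_nb*u[n-nb] + e[n],
    where y is the speech frame and u an assumed-known voicing-source signal.
    Returns the AR (formant-related) coefficients a and the input coefficients b."""
    n0 = max(na, nb)
    rows, targets = [], []
    for n in range(n0, len(y)):
        past_y = [-y[n - i] for i in range(1, na + 1)]
        past_u = [u[n - j] for j in range(0, nb + 1)]
        rows.append(past_y + past_u)
        targets.append(y[n])
    theta, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(targets), rcond=None)
    return theta[:na], theta[na:]
```

Formant frequencies and bandwidths can then be read off the roots of the estimated AR polynomial, which is where conditioning on an explicit source model helps stabilize estimates for high-pitched speech.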

Journal ArticleDOI
01 Jan 2002
TL;DR: This article proposes the Dispersion-Focalisation Theory (DFT), which predicts vowel systems using two competing perceptual constraints weighted by two parameters, λ and α: increasing auditory distances between vowel spectra (dispersion), and increasing the perceptual salience of each spectrum through formant proximities (focalisation).
Abstract: In the research field initiated by Lindblom & Liljencrants in 1972, we illustrate the possibility of giving substance to phonology, predicting the structure of phonological systems with nonphonological principles, be they listener-oriented (perceptual contrast and stability) or speaker-oriented (articulatory contrast and economy). We proposed for vowel systems the Dispersion-Focalisation Theory (Schwartz et al., 1997b). With the DFT, we can predict vowel systems using two competing perceptual constraints weighted with two parameters, λ and α. The first aims at increasing auditory distances between vowel spectra (dispersion); the second aims at increasing the perceptual salience of each spectrum through formant proximities (focalisation). We also introduced new variants based on research in physics - namely, phase space (λ,α) and polymorphism of a given phase, or superstructures in phonological organisations (Vallee et al., 1999) - which allow us to generate 85.6% of the 342 UPSID systems with 3 to 7 vowel qualities. No similar theory for consonants seems to exist yet. Therefore we present in detail a typology of consonants, and then suggest ways to explain the predominance of plosives over fricatives and of voiceless over voiced consonants by i) comparing them with language acquisition data at the babbling stage and looking at the capacity to acquire relatively different linguistic systems in relation to the main degrees of freedom of the articulators; ii) showing that the places “preferred” for each manner are at least partly conditioned by the morphological constraints that facilitate or complicate, make possible or impossible, the needed articulatory gestures, e.g. the complexity of the articulatory control for voicing and the aerodynamics of fricatives. A rather strict coordination between the glottis and the oral constriction is needed to produce acceptable voiced fricatives (Mawass et al., 2000). We determine that the region where the combinations of Ag (glottal area) and Ac (constriction area) values result in a balance between the voice and noise components is indeed very narrow. We thus demonstrate that some of the main tendencies in the phonological vowel and consonant structures of the world’s languages can be explained partly by sensorimotor constraints, and argue that phonology can actually take part in a theory of Perception-for-Action-Control.
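A schematic rendering of the DFT cost being minimized may help; the exact perceptual distance metric and focalization terms in Schwartz et al. (1997b) are more elaborate, so the formula below should be read as a simplified sketch of the two weighted constraints rather than the theory's precise definition.

```latex
% Schematic DFT energy: dispersion penalizes perceptually close vowel pairs,
% focalization (weighted by \alpha) rewards vowels whose formants converge;
% \lambda weights the second spectral dimension in the distance metric.
E \;=\; \underbrace{\sum_{i<j}\frac{1}{d_{ij}^{2}}}_{\text{dispersion}}
\;-\; \alpha \underbrace{\sum_{i}\frac{1}{\bigl(F'_{2,i}-F_{1,i}\bigr)^{2}}}_{\text{focalization}},
\qquad
d_{ij}^{2} \;=\; \bigl(F_{1,i}-F_{1,j}\bigr)^{2} \;+\; \lambda\,\bigl(F'_{2,i}-F'_{2,j}\bigr)^{2}.
```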

Journal ArticleDOI
TL;DR: The data support the influence of both general auditory abilities and unique speech processes on categorical perception of speech, with different category boundaries for speech and non-speech stimuli in Hebrew and across languages.
Abstract: The nature of the mechanism responsible for the categorical labeling of stimuli is not clear. One hypothesis suggests that categorization is limited by the 'natural sensitivities' of the auditory system. The alternative hypothesis suggests that categorization is mediated by a special speech mode and is influenced by how speech is produced. The present study attempts to provide some insight into this dilemma by evaluating categorical perception (CP) in speech and non-speech stimuli and across languages. Specifically, the goals of the present study were (1) to compare phonetic boundaries of Hebrew voicing to categorical boundaries (CB) of a two-tone complex which varies in the relative timing of the two tones (TOT) [TOT stimuli are considered to be the non-speech analog to voice-onset time (VOT)], and (2) to re-establish the CB values of the non-speech analog to voicing in American-English speakers using the same TOT continua as the Hebrew speakers and to compare them to the CB of Hebrew-speaking subjects. Our assumption was that if CP is mediated by basic auditory sensitivity then we expect similar CB for speech and non-speech stimuli and no effect of language on CB. If, however, a special speech code determines CP, then phonetic boundaries are expected to be different from the CB of non-speech stimuli and across languages. Of particular interest is the special case of Hebrew, whose voiced-voiceless distinction in production is very different from that in English. Twelve Hebrew-speaking adults and 12 American-English speaking adults participated in this study. Stimuli consisted of (a) a two-tone complex continuum that varied in the relative onset time of the lower tone from a lead of -100 ms to a lag of +50 ms in 10 ms steps, and (b) a /ba-pa/ continuum which varied in VOT values similar to (a). Subjects identified TOT stimuli as belonging to one of three categories: leading, simultaneous, or lagging. VOT stimuli were labeled as /ba/ or /pa/. Results show (a) a different phonetic boundary for Hebrew voicing compared to published data on English voicing, (b) different category boundaries for speech and non-speech stimuli in Hebrew, (c) a phonetic boundary for Hebrew voicing that does not align with the VOT values of production, and (d) very similar CB for TOT stimuli in Hebrew- and American-English-speaking subjects. The data support the influence of both general auditory abilities and unique speech processes on categorical perception of speech.
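As an illustration of how a category boundary (CB) is read off an identification function, the sketch below fits a logistic curve to labeling proportions along a VOT continuum and takes its 50% crossover; the continuum steps and response rates are invented for illustration, not the study's data.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical proportion of /pa/ responses at each VOT step (ms) of a /ba-pa/ continuum.
vot_ms = np.array([-60, -40, -20, 0, 10, 20, 30, 40, 50], dtype=float)
prop_pa = np.array([0.02, 0.05, 0.08, 0.20, 0.45, 0.80, 0.93, 0.97, 0.99])

def logistic(x, boundary, slope):
    return 1.0 / (1.0 + np.exp(-slope * (x - boundary)))

(boundary, slope), _ = curve_fit(logistic, vot_ms, prop_pa, p0=[10.0, 0.1])
print(f"estimated category boundary: {boundary:.1f} ms VOT")
```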

01 Jan 2002
TL;DR: Steriade et al. as mentioned in this paper found that sounds which are less perceptible are more likely to be altered than more salient sounds, the rationale being that the loss of information resulting from a change in a sound which is difficult to perceive is not as great as the loss resulting from a change in a more salient sound.
Abstract: It has been hypothesized that sounds which are less perceptible are more likely to be altered than more salient sounds, the rationale being that the loss of information resulting from a change in a sound which is difficult to perceive is not as great as the loss resulting from a change in a more salient sound. Kohler (1990) suggested that the tendency to reduce articulatory movements is countered by perceptual and social constraints, finding that fricatives are relatively resistant to reduction in colloquial German. Kohler hypothesized that this is due to the perceptual salience of fricatives, a hypothesis which was supported by the results of a perception experiment by Hura, Lindblom, and Diehl (1992). These studies showed that the relative salience of speech sounds is relevant to explaining phonological behavior. An additional factor is the impact of different acoustic environments on the perceptibility of speech sounds. Steriade (1997) found that voicing contrasts are more common in positions where more cues to voicing are available. The P-map, proposed by Steriade (2001a, b), allows the representation of varying salience of segments in different contexts. Many researchers have posited a relationship between speech perception and phonology. The purpose of this paper is to provide experimental evidence for this relationship, drawing on the case of Turkish /h/ deletion.

Proceedings ArticleDOI
13 May 2002
TL;DR: The knowledge-based acoustic parameters (APs) optimized within the EBS framework were compared to mel-frequency cepstral coefficients in an HMM-based recognition system; the results showed that the APs achieve higher recognition accuracy.
Abstract: In this paper, we discuss an event-based recognition system (EBS) which is based on phonetic feature theory and acoustic phonetics. First, acoustic events related to the manner phonetic features are extracted from the speech signal. Second, based on the manner acoustic events, information related to the place phonetic features and voicing are extracted. Most recently, we focused on place and voicing information needed to distinguish among the stop consonants /t,d,p,b/. Using the E-set utterances from the TI46 database, EBS achieved 75.7% overall word accuracy. Further, the knowledge-based acoustic parameters (APs) optimized within the EBS framework were compared to the mel-frequency cepstral coefficients in an HMM-based recognition system. The results on the E-set task showed that the APs achieve a higher recognition accuracy.

Dissertation
23 Nov 2002
TL;DR: In this paper, it is argued that voice, and, more generally, all features associated with "voice onset time" (VOT) are not segmental features; rather, VOT-values and length contrasts are assigned similar representations.
Abstract: In all phonological models of syllable structure, 'sonority', and, in particular, one of its main correlates — voice(lessness) — are intrinsic properties of segments, as opposed, for example, to length, which also plays a major role in syllable structure, and was shown to be a prosodic effect by autosegmental phonology, thanks to the notion of skeletal positions and the Obligatory Contour Principle. This has particular importance today, since the segmental nature of sonority may naturally be viewed as evidence for 'output-based' and non-representational approaches to the syllable. The basic claim here is that voice, and, more generally, all features associated with 'voice onset time' (VOT) — voice, voicelessness and aspiration (henceforth VOT-values) — are not segmental features; rather, VOT-values and length contrasts are to be assigned similar representations. It is proposed that phonological words are characterized by two parallel curves which follow from the association with the skeleton of two autonomous and antinomic tiers: the O-tier, where 'onsets' are the roots of consonants, is supposed to stand for (articulatory) 'tension'; the N-tier, where 'nuclei' are the roots of vowels, represents (perceptual) 'sonority'. VOT-values and length contrasts are, as it were, contextual allophones of such abstract invariants: aspiration and voice emerge from O-spreading to the following N-slot, and from N-spreading to the preceding O-slot respectively; consonantal and vocalic length results from O-spreading to the preceding N-slot, and from N-spreading to the following O-slot respectively. The representation of VOT-values and length in terms of O/N interactions provides a simple and straightforward solution to at least six problems: (a) why can no segment contain the sole 'feature' [voiced] or [aspirated]? (b) why do gemination and voice behave as the poles of the same 'strength scale'? (c) why are voice contrasts much more frequent among consonants than among vowels? (d) why is compensatory lengthening impossible before a vowel? (e) why are both initial aspiration and final voicing 'edge-specific' marked phenomena? (f) why does voicing normally take place in intervocalic position, but fail to occur either word-initially or after a coda? Finally, voicing and vowel lengthening are shown to be alternative lenition strategies. Beyond its explanatory power, the hypothesis of O/N interactions has important implications on cognitive grounds. By denying any symbolic status to aspiration and voice, we are led to reduce the number of segmental primitives. By assuming that both VOT-values and length contrasts are segmental effects of onset and nucleus weight, defined as the number of slots onsets and nuclei are associated with, we are assigning a representational basis to syllables: 'syllables' exist wherever VOT and/or length contrasts may emerge. This runs counter to the claims of output-based approaches, where syllables emerge from smaller units. A contrario, the present theory is likely to lend phonological support to ideas grounded quite independently, in brain studies, like MacNeilage's distinction between frame and content. In particular, the assumed autonomy of syllabic structure, i.e. of VOT/length, vis-a-vis segmental material proper is consonant with "the idea that speech production branches into metrical and segmental processes, and that syllabic frames are conceptually separable from their phonemic content".

Journal ArticleDOI
TL;DR: In this paper, the authors examined whether fricatives at morpheme junctures in OE compounds and quasi-compounds conform to the rule of voicing between voiced sounds that applies morpheme-internally, i.e., whether a voiced or a voiceless fricative is to be expected in such words.
Abstract: Old English fricatives at points of morpheme juncture are studied to determine whether they conform to the rule of voicing between voiced sounds that applies morpheme-internally. Should we expect a voiced or a voiceless fricative in words like OE heorð-weorod, Wulfweard, and stīðlīce? The evidence examined regards chiefly compounds and quasi-compounds (the latter comprising both forms bearing clear derivational affixes and ‘obscured’ compounds, those in which the deuterotheme has lost its lexical independence), though a small amount of evidence in regard to voicing before inflectional suffixes is considered. Evidence is derived from place-names, personal names, and common nouns, on the basis of Modern English standard pronunciation, assimilatory changes in Old English, modern dialect forms, post-Conquest and nonstandard Old English spellings, and analogous conditioning for the loss of OE /x/. A considerable preponderance of the evidence indicates that in compounds as well as in quasi-compounds, fricatives were voiced at the end of the prototheme when a voiced sound followed, but not a voiceless one. It follows from the evidence that there was no general devoicing of fricatives in syllable-final position in Old English, despite Anglo-Saxon scribes' use of for etymological [ɣ] in occasional spellings like and . Old English spellings of this kind need be taken to imply nothing more than a tendency for and to be used interchangeably in noninitial positions, due to the noncontrastive distribution of the sounds they represent everywhere except morpheme-initially. Rare early Middle English spellings of this kind may or may not have a phonological basis, but they cannot plausibly be taken to evidence a phonological process affecting /v, ð, z/.

DOI
11 Apr 2002
TL;DR: In this article, the authors investigated the influence of phonetic identity of the segments involved, context and speaker on the durational variability in Vowel-Consonant-Vowel (VCV) sequences.
Abstract: The paper investigates durational variability in Vowel-Consonant-Vowel (VCV) sequences, due to factors such as the phonetic identity of the segments involved, the context and the speaker. The speech material consisted of real Greek words containing VCV sequences with V=/i,a/ and C=/p,t,s/. The influence of the three factors on the durations of Vowel 1 (V1), Vowel 2 (V2), the Consonant (C), and of the total VCV sequences is examined. In addition, further analyses are reported for VOT and VTT (voice termination time, i.e., carryover voicing during the consonant). The results show a significant influence of all three parameters on the durations examined. The empirical findings are discussed with reference to relevant literature.

Journal ArticleDOI
TL;DR: Hansen et al. as mentioned in this paper examined the use of a prototype testing system that employs synthesized speech to deliver questions on reading and listening comprehension tests for individuals with visual impairments.
Abstract: A \"Self-Voicing\" Test for Individuals with Visual Impairments Eric G. Hansen, Moon J. Lee, Douglas C. Forer For test takers who are visually impaired (that is, are blind or have low vision), using human readers during tests can have several disadvantages. The problems may include an inconsistent quality of reading, the test taker's anxiety and embarrassment at having the reader reread the material, the reader's mistakes in recording answers, fatigue caused by the slowness and intensity of the reader/test-taker interaction, and a greater need for extra testing time. A computer-based testing system that is operable by keyboard input and speech output (synthesized and/or prerecorded) may reduce or eliminate the need for a human reader for some test takers who are visually impaired. This study examined the use of a prototype testing system that employs synthesized speech to deliver questions on reading and listening comprehension tests. The system is termed a \"self-voicing\" test because it provides the speech output capability within a testing application itself, rather through the use of a distinct assistive technology, such as screen reader software. With funding from the Educational Testing Service (ETS), the Graduate Record Examinations (GRE) program, and the Test of English as a Foreign Language (TOEFL) program, researchers investigated the use of speech output technology for tests for individuals with visual impairments.