Topic

Voice

About: Voice is a research topic. Over the lifetime, 2393 publications have been published within this topic receiving 56637 citations.


Papers
Patent
TL;DR: A voicing probability determination method estimates the percentage of voiced and unvoiced energy for each harmonic within each band of a speech signal spectrum by comparing the original spectrum, harmonic by harmonic, with a synthetic spectrum generated on the assumption that speech is purely voiced.
Abstract: A voicing probability determination method is provided for estimating a percentage of unvoiced and voiced energy for each harmonic within each of a plurality of bands of a speech signal spectrum. Initially, a synthetic speech spectrum is generated based on the assumption that speech is purely voiced. The original and synthetic speech spectra are then divided into a plurality of bands. The synthetic and original speech spectra are compared harmonic by harmonic, and a voicing determination is made based on this comparison. In one embodiment, each harmonic of the original speech spectrum is assigned a voicing decision as either completely voiced or unvoiced by comparing the difference with an adaptive threshold. If the difference for a harmonic is less than the adaptive threshold, the corresponding harmonic is declared voiced; otherwise the harmonic is declared unvoiced. The voicing probability for each band is then computed based on the amount of energy in the voiced harmonics in that decision band. Alternatively, the voicing probability for each band is determined based on a signal-to-noise ratio for each of the bands, which is determined based on the collective differences between the original and synthetic speech spectra within the band.
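The first embodiment described in the abstract above can be illustrated with a minimal sketch. This is not the patented implementation: the function name `band_voicing_probabilities`, the normalized squared difference used as the per-harmonic distance, and the fixed `threshold` (standing in for the adaptive threshold, whose rule the abstract does not give) are all assumptions made for illustration.

```python
def band_voicing_probabilities(orig_mag, synth_mag, band_edges, threshold=0.2):
    """Sketch of a per-harmonic voicing decision and per-band voicing probability.

    orig_mag, synth_mag: per-harmonic magnitudes of the original and the
        synthetic (purely voiced) spectra, same length.
    band_edges: harmonic indices delimiting the bands, e.g. [0, 2, 4].
    threshold: hypothetical fixed value replacing the adaptive threshold.
    """
    energy = [a * a for a in orig_mag]

    # Declare a harmonic voiced when the original and synthetic spectra
    # differ by less than the threshold (assumed distance measure).
    voiced = []
    for a, s, e in zip(orig_mag, synth_mag, energy):
        diff = (a - s) ** 2 / max(e, 1e-12)
        voiced.append(diff < threshold)

    # Voicing probability of a band = share of its energy in voiced harmonics.
    probs = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band_energy = sum(energy[lo:hi])
        voiced_energy = sum(e for e, v in zip(energy[lo:hi], voiced[lo:hi]) if v)
        probs.append(voiced_energy / band_energy if band_energy > 0 else 0.0)
    return voiced, probs


voiced, probs = band_voicing_probabilities(
    [1.0, 1.0, 2.0, 1.0], [1.0, 0.9, 0.5, 1.0], [0, 2, 4]
)
```

In this toy input the third harmonic deviates strongly from the synthetic spectrum, so it is marked unvoiced and the second band's voicing probability drops below that of the first band.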

18 citations

Dissertation
01 Jan 2013
TL;DR: In this paper, an acoustic and physiological analysis of the consonant system in Bininj Gun-wok (BGW), an Australian language spoken in North Western Arnhem Land, is presented.
Abstract: This thesis is an acoustic and physiological phonetic analysis of the consonant system in Bininj Gun-wok (BGW), an Australian language spoken in North Western Arnhem Land. The primary aim of this thesis is to provide a detailed phonetic description of an Australian language looking at the articulation of intervocalic stops and nasals. This investigation examines a number of phonological contrasts in the language that have not had prior phonetic investigation. The analysis is divided into three experiments, the first two of which focus on differences in length and strength between stop series in BGW. The third experiment examines patterns of coarticulation within nasals. The materials used consist of two corpora with a total of 24 first language speakers of BGW. Corpus I includes five speakers of the Kuninjku variety and Corpus II includes 19 speakers of the Kunwinjku variety, all recorded under field conditions in Western Arnhem Land. Corpus I is made up of acoustic recordings and Corpus II, physiological recordings with associated time-aligned audio. An important phonological feature of BGW is a two stop series that contrasts for length. The two stops in the series, which are all matched for place of articulation, are phonologically classed as lenis or fortis. The primary focus of this study is to determine the phonetic realisations of these stop categories. The secondary focus of this study is to examine patterns of coarticulation between nasals and stops in BGW, as nasalisation can mask the acoustic cues that are needed to perceive place of articulation. Earlier cross-linguistic studies have consistently shown that duration is a key difference between stop categories within a language. This is particularly true for languages that do not use voicing as a cue to the contrast. In the current study, acoustic analysis is used to measure duration and for analyses of burst characteristics of BGW stops.
An articulatory analysis investigates differences in strength and also the prevalence and timing of voicing between the stop series. Findings show that there is a clear durational difference between lenis and fortis stops. Voice onset time differences are dependent on place of articulation rather than reliably signalling between stop categories. In addition, there is a clear difference in strength in terms of peak intra-oral pressure. In the study, medial homorganic articulations are separated into three categories termed lenis, fortis and geminated consonants. These represent short intra-morphemic stops, long intra-morphemic stops and long inter-morphemic stops respectively. Fortis stops and geminate clusters do not differ in terms of duration. There are, however, measurable differences between them, including pressure (pressure measured over time), showing that duration and pressure are independent. The timing of the pressure peak is similar for lenis and fortis stops, yet geminates show a delay in the intra-oral pressure peak. Across languages, anticipatory nasalisation is thought to be under direct control of the speaker. Carry-over nasalisation in contrast has proven to be a result of bio-mechanical inertia. The secondary focus of this thesis is an examination of nasalisation and directionality of nasal assimilation in BGW as well as the durational aspects of nasals in clusters. Aerodynamic results show that the rise of the nasal airflow, in medial nasals, is delayed to be almost coincident with the oral occlusion. The inference is that the velum is closed during the preceding vowel and opens quickly at the onset of the nasal. In a cluster of nasals followed by a stop, the nasal has a greater duration than the stop. In clusters of stops followed by nasals, it is the stop that has the greater duration. This suggests strengthening in a medial position.
The post-tonic medial position is prosodically eminent, as this is where the majority of phonetic contrasts are found for Bininj Gun-wok and Australian languages in general. This investigation into medial consonants in BGW represents the first major phonetic investigation into stop articulation in an Australian language and provides key support for this proposition.

18 citations

Journal ArticleDOI
TL;DR: In this paper, a computer is used to display speech waveforms in the time and frequency domains, and then the speech can be played back in its natural form, and also after it has been analyzed and synthesized.
Abstract: Speech produced by individual audience participants will be stored on a computer. Speech waveforms in the time and frequency domains will then be displayed by the computer. The speech will be played back in its natural form, and also after it has been analyzed and synthesized. Various transforms of the speech will be implemented and played back. For example, the speech will be speeded up and slowed down. The gender of the voice will be changed by altering the pitch periods. The speech will be made to seem as if the talker had a cold by changing the voicing of the consonant sounds.

18 citations

Journal ArticleDOI
TL;DR: This study shows that the lowering of fundamental frequency (F0) after stops differs cross-linguistically within the voiceless unaspirated VOT category, implying that the two types of voiceless unaspirated stops are produced by different mechanisms.
Abstract: That fundamental frequency (F0) is lower following voiced stops than voiceless ones is assumed to be an automatic consequence of the articulatory gesture which yields the voicing distinction in stops. In this study it is shown that this F0‐perturbatory effect differs cross‐linguistically within the VOT category voiceless unaspirated. F0 was measured after initial voiceless unaspirated stops ([b̥, d̥, g̥]) in naturally produced English tokens, and similar data were collected from other languages. The results show that F0 is low after these stops compared to F0 following [p, t, k] or [pʰ, tʰ, kʰ]. This result implies that the glottal gesture differs for [b̥, d̥, g̥] and [p, t, k], even though both are voiceless and unaspirated. This implication was tested by reducing intra‐oral air pressure during the production of stops. The results support the claim that the mechanism of production differs for the two types of voiceless unaspirated stops.

18 citations

Journal ArticleDOI
01 Jan 2009
TL;DR: The authors showed that postnasal devoicing in Shekgalagari is a categorical process, i.e., devoiced stops do not differ from underlying voiceless stops in any of the durational, voicing and tonal parameters analyzed.
Abstract: Like other languages of the Sotho-Tswana subgroup of Bantu, Shekgalagari exhibits a process of post-nasal devoicing, a phenomenon which has been at the center of the debate on the phonetic grounding of phonology. The existence of post-nasal devoicing has been questioned, and it has been claimed that it is phonetically unnatural. In this paper, we provide instrumental data that post-nasal devoicing actually exists in Shekgalagari and suggest that it is not necessarily phonetically unnatural. Acoustic and laryngographic data indicate that post-nasal devoicing is a categorical process, i.e., devoiced stops do not differ from underlying voiceless stops in any of the durational, voicing and tonal parameters analyzed. Voiced stops differ from devoiced and voiceless stops in all these parameters. Secondly, the results show that in Shekgalagari (as in Tswana) voiceless stops do not have longer voicing into the closure postnasally than postvocally, in contrast with the findings for most languages. These results undercut the claim that the tendency towards postnasal obstruent voicing is present in all languages. We argue that the two patterns, postnasal voicing and devoicing, may not be as antagonistic as has been assumed, and that both may be derived from a common source, variations in the relative timing of the nasal and oral gestures.

18 citations


Network Information
Related Topics (5)
Speech perception: 12.3K papers, 545K citations, 85% related
Speech processing: 24.2K papers, 637K citations, 78% related
First language: 23.9K papers, 544.4K citations, 75% related
Sentence: 41.2K papers, 929.6K citations, 75% related
Noise: 110.4K papers, 1.3M citations, 74% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    102
2022    248
2021    56
2020    73
2019    81
2018    88