Topic
Voice
About: Voice is a research topic. Over its lifetime, 2,393 publications have appeared on this topic, receiving 56,637 citations.
Papers published on a yearly basis
Papers
21 Dec 2000
TL;DR: In this paper, a voicing determination algorithm for classifying a speech signal segment as voiced or unvoiced is presented. The algorithm is based on a normalized autocorrelation in which the length of the window is proportional to the pitch period.
Abstract: This invention presents a voicing determination algorithm for classification of a speech signal segment as voiced or unvoiced. The algorithm is based on a normalized autocorrelation where the length of the window is proportional to the pitch period. The speech segment to be classified is further divided into a number of sub-segments, and the normalized autocorrelation is calculated for each sub-segment. If a certain number of the normalized autocorrelation values is above a predetermined threshold, the speech segment is classified as voiced. To improve the performance of the voicing determination algorithm on unvoiced-to-voiced transients, the normalized autocorrelations of the last sub-segments are emphasized. The performance of the voicing decision algorithm can be further enhanced by also utilizing any available lookahead information.
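The voiced/unvoiced decision described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented algorithm: the function names, the sub-segment count, the threshold, and the voting rule are assumptions, and the described emphasis on the last sub-segments and the use of lookahead are omitted for brevity.

```python
import numpy as np

def normalized_autocorrelation(segment, lag):
    """Normalized autocorrelation of a segment at a given lag (the pitch period)."""
    x = segment[: len(segment) - lag]
    y = segment[lag:]
    denom = np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))
    return float(np.sum(x * y) / denom) if denom > 0 else 0.0

def is_voiced(segment, pitch_period, n_sub=4, threshold=0.5, min_votes=3):
    """Classify a segment as voiced when enough sub-segments have a
    normalized autocorrelation above the threshold (simple voting rule)."""
    sub_len = len(segment) // n_sub
    votes = 0
    for i in range(n_sub):
        # Extend each sub-segment by one pitch period so the lag fits inside it.
        sub = segment[i * sub_len : (i + 1) * sub_len + pitch_period]
        if len(sub) > pitch_period:
            if normalized_autocorrelation(sub, pitch_period) > threshold:
                votes += 1
    return votes >= min_votes
```

A periodic signal with period equal to the assumed pitch period yields autocorrelation values near 1 and is classified voiced, while white noise yields values near 0 and is classified unvoiced.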
14 citations
TL;DR: The sensitivity of the P3a to the stress manipulation suggests that prosodic rather than temporal salience captures attention in unattended speech sounds.
Abstract: This study addressed whether temporally salient (e.g., word onset) or prosodically salient (e.g., stressed syllables) information serves as a cue to capture attention in speech sound analysis. In an auditory oddball paradigm, 16 native English speakers were asked to ignore binaurally presented disyllabic speech sounds and watch a silent movie while ERPs were recorded. Four types of phonetic deviants were employed: a deviant syllable that was either stressed or unstressed and that occurred in either the first or second temporal position. The nature of the phonetic change (a change from a voiced consonant to its corresponding unvoiced consonant) was kept constant. MMNs were observed for all deviants. In contrast, the P3a was only seen when the deviance occurred on stressed syllables. The sensitivity of the P3a to the stress manipulation suggests that prosodic rather than temporal salience captures attention in unattended speech sounds.
14 citations
01 Apr 1987
TL;DR: A two-channel, speech and electroglottograph (EGG) approach to speech analysis is suggested to aid the automatic processing of speech.
Abstract: Attempts to measure the synthetic quality of speech usually consider two factors, intelligibility and naturalness, each involving subjective and objective characteristics. To generate high-quality synthetic speech, spectral distortion should be avoided, and spectral continuity and formant tracking should be handled well. Glottal-related factors are discussed, including proper modeling of (1) glottal excitation waveforms and (2) the effects of source-tract interaction in synthesizers. Accurate detection of voiced/unvoiced/silent segments in the speech waveform and of the fundamental frequency of voicing are also major concerns. We present both formal and informal listener evaluations of three synthesizers: LPC, formant, and articulatory. Finally, we suggest a two-channel, speech and electroglottograph (EGG) approach to speech analysis to aid the automatic processing of speech.
14 citations
22 May 2011
TL;DR: Limited acoustic-phonetic information derived primarily by processing the excitation source information in the speech signal is used to improve the performance of detection of manner of articulation from a baseline phone recognition system.
Abstract: Reliable acoustic-phonetic (AP) information derived from the speech signal can be used to detect and correct errors in the output of a phone recognizer. In this paper, limited acoustic-phonetic information derived primarily by processing the excitation source information in the speech signal is used to improve the performance of detection of manner of articulation from a baseline phone recognition system. A context-independent HMM-based monophone system without any language information is used as the baseline system for this purpose. The performance of the phone recognizer in terms of its ability to detect the manners of articulation is studied. The errors in the hypothesis of the manner of articulation of phones are corrected using AP information such as voicing, voice bar and frication. It is shown that significant improvement can be achieved by using simple or limited AP information.
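The error-correction idea in the abstract, overriding a phone recognizer's manner-of-articulation hypothesis when it contradicts acoustic-phonetic evidence such as voicing and frication, can be illustrated with a toy rule-based sketch. The manner labels, cue names, and correction rules below are illustrative assumptions, not the rules used in the paper.

```python
def correct_manner(hypotheses, ap_cues):
    """Per-frame correction of manner hypotheses using acoustic-phonetic cues.

    hypotheses: list of manner labels from the baseline recognizer.
    ap_cues: list of dicts with boolean 'voicing' and 'frication' evidence.
    """
    corrected = []
    for manner, cues in zip(hypotheses, ap_cues):
        if manner == "vowel" and not cues["voicing"]:
            # A vowel hypothesis without voicing evidence is implausible:
            # reassign based on the frication cue.
            manner = "fricative" if cues["frication"] else "silence"
        elif manner == "fricative" and cues["voicing"] and not cues["frication"]:
            # Voiced and non-fricated frames are better explained as sonorants.
            manner = "sonorant"
        corrected.append(manner)
    return corrected
```

For example, a frame hypothesized as a fricative but showing voicing without frication would be relabeled as a sonorant, while the recognizer's hypothesis is kept whenever the cues are consistent with it.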
13 citations
TL;DR: This article exploited the McGurk effect to examine whether visual information for place of articulation also shifts the best-exemplar range for voiceless stop consonants, following Green and Kuhl's (1989) demonstration of effects of visual place of articulation on the location of voicing boundaries.
Abstract: Previous work has demonstrated that the graded internal structure of phonetic categories is sensitive to a variety of contextual factors. One such factor is place of articulation: The best exemplars of voiceless stop consonants along auditory bilabial and velar voice onset time (VOT) continua occur over different ranges of VOTs (Volaitis & Miller, 1992). In the present study, we exploited the McGurk effect to examine whether visual information for place of articulation also shifts the best-exemplar range for voiceless consonants, following Green and Kuhl's (1989) demonstration of effects of visual place of articulation on the location of voicing boundaries. In Experiment 1, we established that /p/ and /t/ have different best-exemplar ranges along auditory bilabial and alveolar VOT continua. We then found, in Experiment 2, a similar shift in the best-exemplar range for /t/ relative to that for /p/ when there was a change in visual place of articulation, with auditory place of articulation held constant. These findings indicate that the perceptual mechanisms that determine internal phonetic category structure are sensitive to visual, as well as to auditory, information.
13 citations