Topic

Viseme

About: Viseme is a research topic. Over its lifetime, 865 publications have been published within this topic, receiving 17,889 citations.

Papers

Patent

Speech recognizer and speech recognition method

Ou Giyoutou, Owa Kunihiko, Shosakai Makoto

16 Jul 2008

TL;DR: A speech recognizer with less throughput and high recognition performance, especially for tone languages: tone information for a selected label is extracted from the fundamental frequency of the input speech, and the label is corrected on the basis of that tone information and the content of a pattern list held in advance.

Abstract: PROBLEM TO BE SOLVED: To provide a speech recognizer that has less throughput and high recognition performance especially for speech recognition of a tone language. SOLUTION: The speech recognizer: extracts a fundamental frequency from input speech, and acoustically analyses the input speech; selects one of plural speech recognition results obtained by speech recognition, and outputs a label string indicating the selected speech recognition result; selects at least one label in the output label string on the basis of a pattern list held in advance; and extracts tone information indicating tone of the selected label on the basis of the fundamental frequency extracted from the input speech, and corrects the selected label on the basis of the extracted tone information and content of the pattern list.

3 citations

Journal Article

Starting Characteristics of Speech Sounds

R. O. Drew, E. W. Kellogg

01 Jan 1940 - SMPTE Motion Imaging Journal

TL;DR: A study, based on a large number of oscillograms, of how rapidly speech sounds rise in amplitude. The most important observation is that the human voice can start several of the vowel sounds so abruptly that the first wave is from 40 to 80 percent of the final amplitude, although this is rare.

Abstract: In view of its bearing on the design of ground noise reduction systems, a study was undertaken to determine how sudden or rapid are the increases in amplitude of the speech sounds that must be recorded in dialogue. A large number of oscillograms were taken, a number of which are reproduced herewith. The most important observation is that the human voice can start several of the vowel sounds in such a way that the first wave is from 40 to 80 percent of the final amplitude, or in other words with a suddenness comparable to that of keying an oscillator, but this is rare, being for all practical purposes confined to a few of the more open vowel sounds, when not preceded by any consonant, and only true of certain individuals, and depending on the manner of releasing the breath. Progressive build-up at rates which would carry the modulation from zero to 100 percent in 0.05 second is frequent, while the great majority of syllables start more gradually than this.

2 citations

Journal Article

A viseme recognition system using lip curvature and neural networks to detect Bangla vowels

Nahid Akhter

30 Dec 2017 - Journal of Telecommunication, Electronic and Computer Engineering

Abstract: This thesis report is submitted in partial fulfilment of the requirements for the degree of Master of Science in Computer Science and Engineering, 2016.

2 citations

Proceedings Article

Prosody for Mandarin speech recognition: a comparative study of read and spontaneous speech.

Yu Ting Yeung¹, Yao Qian², Tan Lee, Frank K. Soong²

The Chinese University of Hong Kong¹, Microsoft²

22 Sep 2008

TL;DR: A comparative study of read and spontaneous Mandarin speech in the context of automatic speech recognition, in which multi-space distribution (MSD) is applied to model partially continuous F0 contours.

Abstract: In this paper, we present a comparative study between spontaneous speech and read Mandarin speech in the context of automatic speech recognition. We focus on analysis and modeling of prosodic features, based on a unique speech corpus that contains similar amounts of read and spontaneous speech data from the same group of speakers. Statistical analysis is carried out on tone contours and duration of syllable and subsyllable units. Speech recognition experiments are performed to evaluate the effectiveness of different approaches to incorporate prosodic features into acoustic modeling. A key problem being addressed is how to deal with the unvoiced frames where F0 values are unavailable. We apply the technique of Multispace distribution (MSD) to model partially continuous F0 contours. For spontaneous speech, the tonal-syllable error rate is reduced from the MFCC baseline of 64.8% to 59.4% with the MSD based prosody model. For read speech, the performance improves from 46.0% to 36.4%.

2 citations

Patent

Speech processing system and method for recognizing speech samples from a speaker with an Oriyan accent when speaking English

Suman Bhattacharya¹

Tata Consultancy Services¹

13 Mar 2013

TL;DR: A speech processing system and method for Oriya English in which speech samples comprising both vowel and consonant sounds form a speech corpus, and the values of their speech parameters are compared with those of accent-neutral English to assess phonetic variation and mother tongue influence.

Abstract: Method(s) and system(s) for speech processing of second language speech are described. According to the present subject matter, the system(s) implement the described method(s) for speech processing of Oriya English. The method for speech processing includes receiving a plurality of speech samples of Oriya English to form a speech corpus, where the plurality of speech samples comprise sounds of both vowels and consonants and a plurality of speech parameters are associated with each of the plurality of speech samples. The method also includes determining values of the plurality of speech parameters for each of the plurality of speech samples and identifying the difference between the values of each of the plurality of speech parameters and a corresponding value of accent neutral English. Further, the method includes articulating governing language rules based on the identifying to assess phonetic variation and mother tongue influence in sounds of vowels and consonants of Oriya English.

2 citations

Performance

Metrics

Papers: 884
Citations: 19,235

No. of papers in the topic in previous years
Year	Papers
2023	7
2022	12
2021	13
2020	39
2019	19
2018	22
