Proceedings Article

Spoken word recognition system for unlimited adult male speakers

TLDR
An online automatic spoken word recognition system was developed for research on automatic speech recognition. When 13 different speakers uttered 51 city names, the recognition score was 94% with the speakers permitted to repeat their utterances three times.
Abstract
An online automatic spoken word recognition system has been developed for research on automatic speech recognition. In this system, the spoken word is first frequency-analysed with a filter bank of single-tuned, low-selectivity filters. Three major local peaks in the spectrum and the amplitude of the speech wave are extracted every 10 ms. The frequencies of two local peaks are used to classify the vowels; the frequencies of three local peaks, their movements, and the amplitude are used to classify the semi-vowels and consonants. Input speech is thus transformed, every 10 ms, into a sequence of notations expressing phonemes or phoneme groups. This sequence is then transformed into possible phonemic strings, henceforth called input words, which are convenient for comparison with the contents of the dictionary. The Hamming distance between each input word and each dictionary item is computed, with the notations of phonemes and phoneme groups expressed as 9-bit binary vectors. The dictionary item nearest to any one of the input words is selected as the output of the recognition system. The experiments were carried out with the utterances of five speakers, from whose utterances the standard patterns for the P1, P2 and P3 distributions had been made. The recognition score was 96% for 20 city names involving all kinds of phonemes. When the speech samples were increased to 166 city names and the possible combinations of phonemes were added to every word, 82% of the utterances of three speakers were correctly recognized. Next, 13 different speakers uttered 51 city names with large mutual distances; the recognition score was 94% when the speakers were permitted to repeat their utterances three times.
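
The dictionary-matching step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the 9-bit code table `PHONEME_CODES`, the example words, and the rule of comparing only strings of equal length are all assumptions made for illustration, since the abstract gives neither the actual codes nor how length mismatches are handled.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical 9-bit codes for a few phoneme notations. The real code
# table used in the paper is not given in the abstract.
PHONEME_CODES: Dict[str, int] = {
    "a": 0b000000111,
    "i": 0b000111000,
    "u": 0b111000000,
    "k": 0b101010101,
    "s": 0b010101010,
    "n": 0b110011001,
}


def encode(word: Sequence[str]) -> List[int]:
    """Map a phonemic string to a list of 9-bit integer codes."""
    return [PHONEME_CODES[p] for p in word]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count differing bits between two equal-length code sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def recognize(
    input_words: Sequence[Sequence[str]],
    dictionary: Dict[str, Sequence[str]],
) -> Tuple[str, int]:
    """Return the dictionary item nearest to any candidate input word.

    Assumption: only candidates and items of equal length are compared.
    """
    best = None
    for candidate in input_words:
        cand_code = encode(candidate)
        for name, phonemes in dictionary.items():
            if len(phonemes) != len(candidate):
                continue
            d = hamming(cand_code, encode(phonemes))
            if best is None or d < best[1]:
                best = (name, d)
    if best is None:
        raise ValueError("no dictionary item of comparable length")
    return best


# Toy dictionary and two candidate phonemic strings for one utterance.
dictionary = {
    "naka": ["n", "a", "k", "a"],
    "suki": ["s", "u", "k", "i"],
    "kusa": ["k", "u", "s", "a"],
}
input_words = [["n", "a", "k", "i"], ["s", "a", "k", "a"]]
result = recognize(input_words, dictionary)
```

Here `result` holds the name of the nearest dictionary item and its distance. Because several candidate input words are tried per utterance, a misclassified phoneme in one candidate can be compensated for by another candidate that lies closer to the correct dictionary entry.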


Citations
Patent

Speech processing apparatus and methods

TL;DR: In this article, a memory address (ADR) is generated as a function of the position coordinate value (Xp, Yp, Zp) of points on a path in a mathematical space from frequency spectra (D(K)) of the speech occurring in successive time intervals respectively.
Proceedings Article

Large vocabulary speaker-independent Japanese speech recognition system

TL;DR: This paper describes a speaker-independent, large-vocabulary speech recognition system based on phoneme recognition, which employs LPC cepstrum coefficients as the feature parameters and a statistical distance measure between an input pattern and the phoneme reference templates.
Proceedings Article

A speaker independent word recognition system based on phoneme recognition for a large size (212 words) vocabulary

TL;DR: This paper describes a speaker-independent spoken word recognition system for a large vocabulary; results are reported for training samples of the 212 words uttered by 10 male and 10 female speakers.
Journal Article

Recognition of vowels in Japanese words using spectral local peaks

TL;DR: In this article, the frequency distribution of spectral local peaks in every frame was used for Japanese vowel recognition, achieving a recognition rate of 90.8% for designed samples and 85.3% for test samples uttered by 5 males.

A speaker independent word recognition system based on phoneme recognition

Shozo Makino
TL;DR: In this paper, a speaker-independent spoken word recognition system for a large vocabulary is described, in which speech is analyzed by a filter bank and 11 features are extracted from its logarithmic spectrum every 10 ms.
References
Journal Article

Discrete-word recognition utilizing a word dictionary and phonological rules

TL;DR: This paper describes a discrete-word recognition system that utilizes a word dictionary and phonological rules, together with a segmentation method that uses a duration dictionary.