An Improved Speech Segmentation Quality Measure: the R-value

Open Access Proceedings Article

- pp 1851-1854

TLDR

A new quality measure, the R-value, is introduced that indicates how close a segmentation algorithm’s performance is to an ideal point of operation. It is motivated by the finding that established measures are insensitive to random boundary insertion, so a better hit rate can be achieved simply by over-segmenting.

Abstract:

Phone segmentation in ASR is usually performed indirectly by Viterbi decoding of HMM output. Direct approaches also exist, e.g., blind speech segmentation algorithms. In either case, the performance of automatic speech segmentation algorithms is often measured with automated evaluation algorithms, and the resulting scores are used to optimize a segmentation system’s performance. However, evaluation approaches reported in the literature were found to be lacking. We have also determined that increases in phone boundary detection rates are often due to increased over-segmentation levels and not to algorithmic improvements: simply adding random boundaries yields a better hit rate under current quality measures. Since established measures were found to be insensitive to this type of random boundary insertion, a new R-value quality measure is introduced that indicates how close a segmentation algorithm’s performance is to an ideal point of operation.
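The abstract does not state the formula. The sketch below uses the definition usually quoted for the R-value in later segmentation work: with hit rate HR and over-segmentation OS both in percent, r1 = sqrt((100 - HR)^2 + OS^2), r2 = (-OS + HR - 100) / sqrt(2), and R = 1 - (|r1| + |r2|) / 200. Treat that definition as an assumption to be checked against the paper itself. The function names and the boundary counts are hypothetical and chosen only to illustrate the effect the abstract describes.

```python
import math


def hit_rate(n_hit: int, n_ref: int) -> float:
    """Percentage of reference boundaries that were detected."""
    return 100.0 * n_hit / n_ref


def over_segmentation(n_det: int, n_ref: int) -> float:
    """Percentage of surplus detected boundaries relative to the reference."""
    return 100.0 * (n_det / n_ref - 1.0)


def r_value(hr: float, os_: float) -> float:
    """R-value under the assumed definition; 1.0 at the ideal point HR=100, OS=0."""
    # r1: distance from the ideal operating point (HR=100, OS=0).
    r1 = math.hypot(100.0 - hr, os_)
    # r2: signed distance from the line OS = HR - 100 (no insertions).
    r2 = (-os_ + hr - 100.0) / math.sqrt(2.0)
    return 1.0 - (abs(r1) + abs(r2)) / 200.0


# Illustrative counts (not from the paper): 100 reference boundaries.
n_ref = 100
# System A: 100 detections, 80 of them hits.
careful = r_value(hit_rate(80, n_ref), over_segmentation(100, n_ref))
# System B: 200 detections, 95 hits (higher hit rate through over-segmentation).
padded = r_value(hit_rate(95, n_ref), over_segmentation(200, n_ref))
print(f"A: HR=80%, OS=0%   -> R={careful:.3f}")
print(f"B: HR=95%, OS=100% -> R={padded:.3f}")
```

In this example, system B has the higher hit rate but a much lower R-value, because doubling the number of detected boundaries moves it far from the ideal operating point.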

Citations

Journal Article

Multilingual processing of speech via web services

Thomas Kisler, +2 more

- 01 Sep 2017 -

Computer Speech & Language

TL;DR: Five multilingual web services for speech science operational since 2012 are described and the benefits and drawbacks of the new paradigm as well as the experiences with user acceptance and implementation problems are discussed.

Book Chapter

Phoneme Recognition on the TIMIT Database

Carla Lopes, +1 more

TL;DR: Speech recognition based on phones is attractive because it is inherently free from vocabulary limitations. The performance of large-vocabulary ASR systems depends on the quality of the phone recognizer, so research teams continue to develop phone recognizers to enhance their performance as much as possible.

Journal Article

Pre-linguistic segmentation of speech into syllable-like units

Okko Räsänen, +2 more

- 01 Feb 2018 -

Cognition

TL;DR: The present study investigates the feasibility of speech segmentation into syllable-like chunks without any a priori linguistic knowledge, and shows that the sonority fluctuation in speech is highly informative of syllable and word boundaries in all three cases without any language-specific tuning of the model.

Proceedings Article

Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation.

Felix Kreuk, +2 more

TL;DR: In this article, a self-supervised representation learning model is proposed for unsupervised phoneme boundary detection, which is optimized to identify spectral changes in the signal using the Noise-Contrastive Estimation principle.

Journal Article

Phonetic segmentation of speech signal using local singularity analysis

Vahid Khanagha, +3 more

- 01 Dec 2014 -

Digital Signal Processing

TL;DR: A two-stage segmentation algorithm is developed that is significantly more accurate than state-of-the-art ones; the local singularity analysis it builds on conveys relevant information about the local dynamics of the speech signal that can be used for the task of phonetic segmentation.

References

Journal Article

Robust speaker change detection

Jitendra Ajmera, +2 more

- 26 Jul 2004 -

IEEE Signal Processing Letters

TL;DR: In this article, the authors present a criterion which can be used to identify speaker changes in an audio stream without such tuning, which consists of calculating the log likelihood ratio (LLR) of two models with the same number of parameters.

An HMM-based system for automatic segmentation and alignment of speech

Kåre Sjölander

TL;DR: A system for automatic time-aligned phone transcription of spoken Swedish has been developed using a speech recording and an orthographic transcription of the words spoken in the recording to generate a phone-level segmentation without manual intervention.

Proceedings Article

A new text-independent method for phoneme segmentation

G. Aversano, +2 more

TL;DR: A new approach to text-independent speech segmentation is proposed, based on critical-band perceptual analysis and an original algorithm for identifying phoneme boundaries. The results are promising: the method gives 74% correct segmentation without over-segmentation.

Proceedings Article

Finding Maximum Margin Segments in Speech

Y.G. Estevan, +2 more

TL;DR: Initial analyses show that MMC is a promising method for the automatic detection of sub-phonetic information in the speech signal and is highly competitive with existing unsupervised methods for the automatic detection of phoneme boundaries.

Proceedings Article

"Blind" speech segmentation: automatic segmentation of speech without linguistic knowledge

Manish Sharma, +1 more

TL;DR: A new automatic speech segmentation procedure, called "Blind" speech segmentation, is presented, which involves finding the optimal number of sub-word segments in the given speech sample before locating the sub-word segment boundaries.

Related Papers (5)

On the relation between maximum spectral transition positions and phone boundaries.

TIMIT Acoustic-Phonetic Continuous Speech Corpus

Finding Maximum Margin Segments in Speech

Unsupervised optimal phoneme segmentation: Objectives, algorithm and comparisons

A new text-independent method for phoneme segmentation