The ICSI RT07s Speaker Diarization System

doi:10.1007/978-3-540-68585-2_47

Book Chapter

- pp 509-519

TLDR

The ICSI speaker diarization system presented in this paper automatically performs both speaker segmentation and clustering without any prior knowledge of the identities or the number of speakers, using standard speech processing components and techniques such as HMMs, agglomerative clustering, and the Bayesian Information Criterion.

Abstract:

In this paper, we present the ICSI speaker diarization system. This system was used in the 2007 National Institute of Standards and Technology (NIST) Rich Transcription evaluation. The ICSI system automatically performs both speaker segmentation and clustering without any prior knowledge of the identities or the number of speakers. Our system uses "standard" speech processing components and techniques such as HMMs, agglomerative clustering, and the Bayesian Information Criterion. However, we have developed the system with an eye towards robustness and ease of portability. Thus, we have avoided the use of any sort of model that requires training on "outside" data, and we have attempted to develop algorithms that require as little tuning as possible. The system is similar to last year's system [1] except in three respects: we used the most recent available version of the beam-forming toolkit, we implemented a new speech/non-speech detector that does not require models trained on meeting data, and we performed our development on a much larger set of recordings.
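To make the combination of agglomerative clustering and the Bayesian Information Criterion concrete, the following is a minimal illustrative sketch and not the ICSI implementation. It assumes each cluster is modelled by a single full-covariance Gaussian and uses the textbook ΔBIC form associated with the reference "Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion"; the actual system is HMM-based and may use a different BIC variant. The function names `delta_bic` and `agglomerative_cluster` and the `penalty_weight` parameter are hypothetical.

```python
import numpy as np


def delta_bic(x1, x2, penalty_weight=1.0):
    """Delta-BIC for modelling x1 and x2 with two Gaussians versus one.

    x1, x2: arrays of shape (n_frames, n_dims).
    Positive values favour keeping the clusters separate;
    negative values favour merging them.
    """
    n1, d = x1.shape
    n2 = x2.shape[0]
    n = n1 + n2
    pooled = np.vstack([x1, x2])

    def logdet_cov(x):
        # Maximum-likelihood (biased) covariance; slogdet is numerically stable.
        cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    # Log-likelihood gain from using two Gaussians instead of one.
    gain = 0.5 * (n * logdet_cov(pooled)
                  - n1 * logdet_cov(x1)
                  - n2 * logdet_cov(x2))
    # Penalty for the extra mean vector and covariance matrix.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty


def agglomerative_cluster(segments, penalty_weight=1.0):
    """Greedily merge the pair with the lowest Delta-BIC until none is negative."""
    clusters = [np.asarray(s, dtype=float) for s in segments]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = delta_bic(clusters[i], clusters[j], penalty_weight)
                if best is None or score < best[0]:
                    best = (score, i, j)
        score, i, j = best
        if score >= 0:
            break  # No remaining pair is better explained by a single model.
        merged = np.vstack([clusters[i], clusters[j]])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```

Because the stopping rule is the sign of ΔBIC, the number of speakers does not have to be specified in advance, which matches the abstract's claim that no prior knowledge of the speaker count is needed. In this sketch `penalty_weight` is the only tunable value.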

Citations

Delay based optimisation of an integrated online call recording speaker diarisation and identification system

A Real-Time Speech Enhancement Front-End for Multi-Talker Reverberated Scenarios

New Insights into Hierarchical Clustering and Linguistic Normalization for Speaker Diarization

An Efficient Speaker Diarization using Privacy Preserving Audio Features Based of Speech/Non Speech Detection

References

Extrapolation, Interpolation, and Smoothing of Stationary Time Series

Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion

A robust speaker clustering algorithm

Approaches and applications of audio diarization

Robust speaker change detection
