An Open-source State-of-the-art Toolbox for Broadcast News Diarization

doi:10.21437/INTERSPEECH.2013-383

Open Access, Proceedings Article

Mickael Rouvier, +4 more

- pp 1477-1481

Abstract:

This paper presents the LIUM open-source speaker diarization toolbox, mostly dedicated to broadcast news. This tool includes both Hierarchical Agglomerative Clustering using well-known measures such as BIC and CLR, and the new ILP clustering algorithm using i-vectors. Diarization systems are tested on the French evaluation data from ESTER, ETAPE and REPERE campaigns.
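The BIC-based Hierarchical Agglomerative Clustering named in the abstract can be illustrated with a small sketch. This is not the toolbox's implementation: it assumes each cluster is modeled by a single diagonal-covariance Gaussian (a simplification chosen to keep the example dependency-free), and the names `log_det_diag`, `delta_bic`, `hac_bic`, `make_segment` and the penalty weight `lam` are hypothetical. The sign convention is also an assumption: here a negative ΔBIC means one shared Gaussian explains both clusters well enough, so they are merged.

```python
import math
import random


def log_det_diag(frames):
    """Log-determinant of a diagonal covariance: sum of log per-dimension variances."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for d in range(dim):
        mean = sum(f[d] for f in frames) / n
        var = sum((f[d] - mean) ** 2 for f in frames) / n
        total += math.log(max(var, 1e-10))  # floor avoids log(0)
    return total


def delta_bic(a, b, lam=1.0):
    """Delta-BIC between two clusters of feature frames (assumed convention:
    negative favors merging, positive favors keeping them apart)."""
    n_a, n_b = len(a), len(b)
    n = n_a + n_b
    dim = len(a[0])
    # Likelihood gain of modeling a and b separately instead of jointly.
    gain = 0.5 * (n * log_det_diag(a + b)
                  - n_a * log_det_diag(a)
                  - n_b * log_det_diag(b))
    # Separate models cost one extra mean and variance per dimension.
    penalty = 0.5 * lam * (2 * dim) * math.log(n)
    return gain - penalty


def hac_bic(clusters, lam=1.0):
    """Greedy bottom-up clustering: merge the pair with the lowest delta-BIC
    until no pair has a negative score."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = delta_bic(clusters[i], clusters[j], lam)
                if best is None or score < best[0]:
                    best = (score, i, j)
        if best[0] >= 0:
            break  # every remaining pair looks like different speakers
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def make_segment(mean, n, rng):
    """Synthetic 2-D 'speaker' frames, for illustration only."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(n)]


rng = random.Random(0)
segments = [make_segment(0.0, 200, rng),   # speaker A
            make_segment(0.0, 200, rng),   # speaker A again
            make_segment(10.0, 200, rng)]  # speaker B
print([len(c) for c in hac_bic(segments)])
```

On this synthetic input the two segments drawn from the same distribution should end up in one cluster while the third stays separate. On real audio the frames would be acoustic features such as MFCCs, and `lam` would need tuning on development data.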

Figures

Table 1: Tools and features of the LIUM SpkDiarization v8.0.

Table 2: Single- and cross-show DER on the ESTER 2, ETAPE and REPERE 2012 test corpora.

Citations

Proceedings Article

The Second DIHARD Diarization Challenge: Dataset, Task, and Baselines.

Neville Ryant, +6 more

TL;DR: The second edition of the DIHARD challenge was designed to improve the robustness of speaker diarization systems to variation in recording equipment, noise conditions, and conversational domain.

Posted Content

Europarl-ST: A Multilingual Corpus For Speech Translation Of Parliamentary Debates

Javier Iranzo-Sánchez, +7 more

- 08 Nov 2019 -

arXiv: Computation and Language

TL;DR: This paper presents a novel multilingual speech translation (SLT) corpus of paired audio-text samples from and into 6 European languages, for a total of 30 translation directions, compiled from the debates held in the European Parliament between 2008 and 2012.

Proceedings Article

Convolutional Neural Network for speaker change detection in telephone speaker diarization system

Marek Hrúz, +1 more

TL;DR: The final results for the speaker diarization system indicate that CNN-based speaker change detection is beneficial, with a relative improvement in diarization error rate of 28%.

Posted Content

The Third DIHARD Diarization Challenge

Neville Ryant, +8 more

- 02 Dec 2020 -

arXiv: Audio and Speech Processing

TL;DR: This paper introduces the third DIHARD challenge, part of a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain.

Posted Content

VoxLingua107: a Dataset for Spoken Language Recognition

Jörgen Valk, +1 more

- 25 Nov 2020 -

arXiv: Audio and Speech Processing

TL;DR: This paper generates semi-random search phrases from language-specific Wikipedia data that are then used to retrieve videos from YouTube for 107 languages and uses the data to build language recognition models for several spoken language identification tasks.

References

Journal Article

Front-End Factor Analysis for Speaker Verification

Najim Dehak, +4 more

- 01 May 2011 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: This paper extends previous work by proposing a new speaker representation for speaker verification: a low-dimensional speaker- and channel-dependent space, defined using a simple factor analysis and named the total variability space because it models both speaker and channel variabilities.

Automatic Segmentation, Classification and Clustering of Broadcast News Audio

M. A. Siegler

TL;DR: This work describes the problems faced in adapting a system built to recognize one utterance at a time to a task that requires recognition of an entire half hour show, and shows that a priori knowledge of acoustic conditions and speakers in the broadcast data is not required for segmentation.

Journal Article

Video shot boundary detection: Seven years of TRECVid activity

Alan F. Smeaton, +2 more

- 01 Apr 2010 -

Computer Vision and Image Understanding

TL;DR: This paper presents an overview of the TRECVid shot boundary detection task, a high-level overview of the most significant approaches taken, and a comparison of performances, focusing on one year (2005) as an example.

Proceedings Article

The ESTER Phase II Evaluation Campaign for the Rich Transcription of French Broadcast News

Sylvain Galliano, +5 more

TL;DR: This paper gives the final results of the ESTER evaluation campaign which started in 2003 and ended in January 2005, to evaluate automatic broadcast news rich transcription systems for the French language.

Proceedings Article

The ESTER 2 Evaluation Campaign for the Rich Transcription of French Radio Broadcasts

Sylvain Galliano, +2 more

TL;DR: The aim of this campaign was to evaluate automatic rich transcription systems for French radio broadcasts through the evaluation of audio event detection and tracking, orthographic transcription, and information extraction.
