Open Access · Proceedings Article

An Open-source State-of-the-art Toolbox for Broadcast News Diarization

Abstract
This paper presents the LIUM open-source speaker diarization toolbox, mostly dedicated to broadcast news. The toolbox includes both Hierarchical Agglomerative Clustering (HAC) using well-known measures such as the Bayesian Information Criterion (BIC) and the Cross Likelihood Ratio (CLR), and a new Integer Linear Programming (ILP) clustering algorithm using i-vectors. The diarization systems are tested on French evaluation data from the ESTER, ETAPE and REPERE campaigns.
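
To make the BIC-based agglomerative clustering mentioned in the abstract more concrete, here is a minimal sketch of the general technique. It is not the toolbox's code: it assumes one-dimensional features and a single Gaussian per cluster, and the names `delta_bic`, `bic_hac` and the penalty weight `lam` are hypothetical. Real systems typically use multi-dimensional acoustic features.

```python
import math


def _ml_variance(xs):
    """Maximum-likelihood variance, floored so that log() stays finite."""
    n = len(xs)
    mean = sum(xs) / n
    return max(sum((x - mean) ** 2 for x in xs) / n, 1e-6)


def delta_bic(a, b, lam=1.0):
    """Delta-BIC for merging clusters a and b (lists of 1-D features).

    Each cluster is modeled by one Gaussian. A negative value means the
    merged model is preferred, i.e. the clusters likely share a speaker.
    """
    n_a, n_b = len(a), len(b)
    n = n_a + n_b
    gain = 0.5 * (
        n * math.log(_ml_variance(a + b))
        - n_a * math.log(_ml_variance(a))
        - n_b * math.log(_ml_variance(b))
    )
    # Penalty: 0.5 * (number of free parameters) * log(n).
    # For a 1-D Gaussian there are 2 free parameters (mean, variance).
    penalty = lam * 0.5 * 2 * math.log(n)
    return gain - penalty


def bic_hac(segments, lam=1.0):
    """Bottom-up clustering: repeatedly merge the pair with the lowest
    delta-BIC, and stop once no pair has a negative delta-BIC."""
    clusters = [list(s) for s in segments]
    while len(clusters) > 1:
        score, i, j = min(
            (delta_bic(clusters[i], clusters[j], lam), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if score >= 0:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Calling `bic_hac` on a list of per-segment feature lists returns the merged clusters. Raising `lam` strengthens the penalty term and therefore favours more merging (fewer clusters), while lowering it keeps more clusters separate.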

