
Vincent Wan

Researcher at Toshiba

Publications: 68
Citations: 2355

Vincent Wan is an academic researcher at Toshiba whose work focuses on speech synthesis and acoustic modelling. He has an h-index of 29 and has co-authored 67 publications receiving 2287 citations. His previous affiliations include the University of Sheffield.

Papers
Proceedings ArticleDOI

Support vector machines for speaker verification and identification

TL;DR: A new technique for normalising the polynomial kernel is developed and used to achieve performance comparable to other classifiers on the YOHO database.
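The normalisation itself is a small computation. As an illustrative sketch (one standard way to normalise a polynomial kernel, not necessarily the exact technique developed in the paper), the kernel can be rescaled so that every point has unit length in the induced feature space: K'(x, y) = K(x, y) / sqrt(K(x, x) K(y, y)).

```python
import numpy as np

def poly_kernel(X, Y, degree=3, coef0=1.0):
    """Polynomial kernel K(x, y) = (x . y + coef0) ** degree."""
    return (X @ Y.T + coef0) ** degree

def normalised_poly_kernel(X, Y, degree=3, coef0=1.0):
    """Unit-norm normalisation of the polynomial kernel:
    K'(x, y) = K(x, y) / sqrt(K(x, x) * K(y, y)).
    Every point is mapped onto the unit sphere in the induced feature
    space, so kernel values are bounded in [-1, 1].
    """
    K = poly_kernel(X, Y, degree, coef0)
    kx = (np.einsum('ij,ij->i', X, X) + coef0) ** degree  # K(x, x) per row of X
    ky = (np.einsum('ij,ij->i', Y, Y) + coef0) ** degree  # K(y, y) per row of Y
    return K / np.sqrt(np.outer(kx, ky))
```

A Gram matrix computed this way can be passed to an off-the-shelf SVM, e.g. scikit-learn's SVC(kernel='precomputed').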
Journal ArticleDOI

Speaker verification using sequence discriminant support vector machines

TL;DR: This paper presents a text-independent speaker verification system using support vector machines (SVMs) with score-space kernels and introduces a technique called spherical normalization that preconditions the Hessian matrix.
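As a hedged sketch of the idea: spherical normalisation is often described as projecting each score-space vector onto the surface of a hypersphere, which bounds the vectors' norms and so keeps the kernel matrix, and hence the SVM optimiser's Hessian, better conditioned. The mapping assumed below, phi(x) = [x, r] / sqrt(||x||^2 + r^2), is illustrative; the paper's exact formulation may differ.

```python
import numpy as np

def spherical_normalise(scores, radius=1.0):
    """Project score-space vectors onto the surface of a hypersphere.

    Assumed mapping (an illustrative sketch, not necessarily the exact
    formulation in the paper):
        phi(x) = [x, radius] / sqrt(||x||^2 + radius^2)
    Appending a constant component and rescaling gives every vector unit
    length, avoiding the ill-conditioning caused by raw score vectors
    whose norms vary over orders of magnitude.
    """
    scores = np.atleast_2d(scores)                      # (n, d)
    extra = np.full((scores.shape[0], 1), radius)       # constant component
    z = np.hstack([scores, extra])                      # (n, d + 1)
    return z / np.linalg.norm(z, axis=1, keepdims=True)
```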
Proceedings ArticleDOI

The AMI System for the Transcription of Speech in Meetings

TL;DR: The AMI transcription system for speech in meetings, developed in collaboration by five research groups, includes generic techniques such as discriminative and speaker-adaptive training, vocal tract length normalisation, heteroscedastic linear discriminant analysis, maximum likelihood linear regression, and phone-posterior-based features, as well as techniques designed specifically for meeting data.
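Of the listed techniques, maximum likelihood linear regression (MLLR) is the simplest to illustrate: the Gaussian means of the acoustic model are adapted with a shared affine transform mu' = A mu + b, estimated by maximum likelihood on a small amount of adaptation data. The sketch below shows only the application step, with hypothetical names; estimating A and b requires accumulating EM statistics and is omitted.

```python
import numpy as np

def mllr_adapt_means(means, A, b):
    """Apply a previously estimated MLLR mean transform.

    means : (n_gaussians, dim) array of original Gaussian means
    A     : (dim, dim) regression matrix
    b     : (dim,) bias vector
    Returns the adapted means mu' = A @ mu + b for every Gaussian.
    """
    return means @ A.T + b
```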
Journal ArticleDOI

Transcribing Meetings With the AMIDA Systems

TL;DR: Gives an overview of the AMIDA systems for transcribing conference and lecture-room meetings, developed for participation in the Rich Transcription evaluations conducted by the National Institute of Standards and Technology in 2007 and 2009.
Journal ArticleDOI

Speech and crosstalk detection in multichannel audio

TL;DR: Tests on a large corpus of recorded meetings show classification accuracies of up to 96% and automatic speech recognition performance close to that obtained with ground-truth segmentation.