Chris Bartels

Researcher at University of Maryland, College Park

Publications - 32

Citations - 910

Chris Bartels is an academic researcher at the University of Maryland, College Park. The author has contributed to research on graphical models and dynamic Bayesian networks, has an h-index of 13, and has co-authored 32 publications receiving 846 citations. Previous affiliations of Chris Bartels include the University of Washington and Apple Inc.

Papers

Journal Article

Graphical model architectures for speech recognition

Jeff A. Bilmes, +1 more

- 26 Sep 2005 -

IEEE Signal Processing Magazine

TL;DR: This discussion employs dynamic Bayesian networks (DBNs) and a DBN extension using the Graphical Model Toolkit's (GMTK's) basic template, a dynamic graphical model representation that is more suitable for speech and language systems.

Proceedings Article

Articulatory Feature-Based Methods for Acoustic and Audio-Visual Speech Recognition: Summary from the 2006 JHU Summer workshop

Karen Livescu, +14 more

TL;DR: This work reports on investigations into the use of articulatory features (AFs) for observation and pronunciation models in speech recognition, and investigates a model having multiple streams of AF states with soft synchrony constraints, for both audio-only and audio-visual recognition.

Proceedings Article

Submodular subset selection for large-scale speech training data

Kai Wei, +4 more

TL;DR: This work applies a novel data selection technique based on constrained submodular function maximization to subselecting a large set of acoustic data to train automatic speech recognition (ASR) systems and shows that training data can be reduced significantly, and that the technique outperforms both random selection and a previously proposed selection method utilizing comparable resources.

Proceedings Article

DBN based multi-stream models for audio-visual speech recognition

John N. Gowdy, +3 more

TL;DR: This work presents a model based on dynamic Bayesian networks (DBNs) that integrates information from multiple audio and visual streams, and reports an absolute improvement of about 4% in word accuracy in the -4 to 10 dB average case when using two audio streams and one video stream.

Posted Content

Voices Obscured in Complex Environmental Settings (VOICES) corpus

Colleen Richey, +13 more

- 13 Apr 2018 -

arXiv: Sound

TL;DR: The Voices Obscured in Complex Environmental Settings (VOICES) corpus is a large-scale dataset of speech recorded by far-field microphones in noisy room conditions. Audio was recorded in furnished rooms, with background noise played in conjunction with foreground speech selected from the LibriSpeech corpus.
