Jort F. Gemmeke
Researcher at Katholieke Universiteit Leuven
Publications - 95
Citations - 6306
Jort F. Gemmeke is an academic researcher from Katholieke Universiteit Leuven. The author has contributed to research in topics: Speech processing & Acoustic model. The author has an h-index of 23 and has co-authored 95 publications receiving 4398 citations. Previous affiliations of Jort F. Gemmeke include Google & Radboud University Nijmegen.
Papers
Proceedings ArticleDOI
Audio Set: An ontology and human-labeled dataset for audio events
Jort F. Gemmeke,Daniel P. W. Ellis,Dylan Freedman,Aren Jansen,Wade Lawrence,R. Channing Moore,Manoj Plakal,Marvin Ritter +7 more
TL;DR: This paper describes the creation of Audio Set, a large-scale dataset of manually annotated audio events that aims to bridge the gap in data availability between image and audio research and to substantially stimulate the development of high-performance audio event recognizers.
Proceedings ArticleDOI
CNN architectures for large-scale audio classification
Shawn Hershey,Sourish Chaudhuri,Daniel P. W. Ellis,Jort F. Gemmeke,Aren Jansen,R. Channing Moore,Manoj Plakal,Devin Platt,Rif A. Saurous,Bryan Seybold,Malcolm Slaney,Ron Weiss,Kevin W. Wilson +12 more
TL;DR: In this paper, the authors used various CNN architectures to classify the soundtracks of a dataset of 70M training videos (5.24 million hours) with 30,871 video-level labels.
Posted Content
CNN Architectures for Large-Scale Audio Classification
Shawn Hershey,Sourish Chaudhuri,Daniel P. W. Ellis,Jort F. Gemmeke,Aren Jansen,R. Channing Moore,Manoj Plakal,Devin Platt,Rif A. Saurous,Bryan Seybold,Malcolm Slaney,Ron Weiss,Kevin W. Wilson +12 more
TL;DR: This work uses various CNN architectures to classify the soundtracks of a dataset of 70M training videos with 30,871 video-level labels. It also investigates varying the size of both the training set and the label vocabulary, finding that analogs of the CNNs used in image classification do well on the authors' audio classification task, and that larger training and label sets help up to a point.
Journal ArticleDOI
Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition
TL;DR: The results show that the hybrid system performed substantially better than source separation or missing data mask estimation at lower signal-to-noise ratios (SNRs), achieving up to 57.1% accuracy at SNR = -5 dB.
Journal ArticleDOI
Compressive Sensing for Missing Data Imputation in Noise Robust Speech Recognition
TL;DR: This paper introduces a novel non-parametric, exemplar-based method for reconstructing clean speech from noisy observations, based on techniques from the field of Compressive Sensing, which can impute missing features using larger time windows such as entire words.