Journal Article
Robust and efficient multiple alignment of unsynchronized meeting recordings
T. J. Tsai, Andreas Stolcke
TL;DR: This paper proposes a way to generate a single high-quality audio recording of a meeting using no equipment other than participants' personal devices. The recordings are aligned with an adaptive audio fingerprint based on spectrotemporal eigenfilters, whose design is learned on the fly in a totally unsupervised way to perform well on the data at hand.
Abstract:
This paper proposes a way to generate a single high-quality audio recording of a meeting using no equipment other than participants’ personal devices. Each participant in the meeting uses their mobile device as a local recording node, and they begin recording whenever they arrive, in an unsynchronized fashion. The main problem in generating a single summary recording is to temporally align the various audio recordings in a robust and efficient manner. We propose a way to do this using an adaptive audio fingerprint based on spectrotemporal eigenfilters, where the fingerprint design is learned on the fly in a totally unsupervised way to perform well on the data at hand. The adaptive fingerprints require only a few seconds of data to learn a robust design, and they require no tuning. Our method uses an iterative, greedy two-stage alignment algorithm which finds a rough alignment using indexing techniques, and then performs a more fine-grained alignment based on Hamming distance. Our proposed system achieves greater than 99% alignment accuracy on challenging alignment scenarios extracted from the ICSI meeting corpus, and it outperforms five other well-known and state-of-the-art fingerprint designs. We conduct extensive analyses of the factors that affect the robustness of the adaptive fingerprints, and we provide a simple heuristic that can be used to adjust the fingerprint’s robustness according to the amount of computation we are willing to perform.
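The ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the eigenfilters are taken to be the top eigenvectors of the covariance of stacked spectrogram frames, bits are obtained by thresholding the change in each filter output over a short time lag, and the offset is found by a brute-force search over mean Hamming distance. The indexing-based rough alignment stage is omitted entirely. All function names and parameters (`context`, `n_filters`, `delta`, `min_overlap`) are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _stack_frames(spec, context):
    """Flatten each run of `context` consecutive frames into one vector."""
    # sliding_window_view yields shape (n_positions, n_bands, context).
    windows = sliding_window_view(spec, context, axis=0)
    return windows.reshape(windows.shape[0], -1)


def learn_eigenfilters(spec, context=8, n_filters=16):
    """Learn filters, unsupervised, from a (n_frames, n_bands) spectrogram.

    Assumption: filters are the top eigenvectors of the covariance of the
    stacked frames (plain PCA).
    """
    X = _stack_frames(spec, context)
    X = X - X.mean(axis=0)
    cov = X.T @ X / X.shape[0]
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    return eigvecs[:, ::-1][:, :n_filters].T


def fingerprint(spec, filters, context=8, delta=4):
    """Return a boolean array with one row of bits per frame position."""
    feats = _stack_frames(spec, context) @ filters.T
    # Assumption: each bit records whether a filter output rose over `delta` frames.
    return feats[delta:] > feats[:-delta]


def best_offset(fp_a, fp_b, min_overlap=50):
    """Brute-force search for k such that fp_b[j] lines up with fp_a[j + k].

    Returns (k, mean Hamming distance per bit over the overlapping region).
    """
    best_k, best_dist = None, np.inf
    for k in range(-(len(fp_b) - min_overlap), len(fp_a) - min_overlap + 1):
        a = fp_a[max(k, 0):min(len(fp_a), k + len(fp_b))]
        b = fp_b[max(-k, 0):max(-k, 0) + len(a)]
        if len(a) < min_overlap:
            continue
        dist = np.mean(a != b)  # fraction of differing bits
        if dist < best_dist:
            best_k, best_dist = k, dist
    return best_k, best_dist
```

In this sketch the filters would be learned from a short stretch of one recording and then applied to every recording, so that all fingerprints share the same bit layout. Because each bit compares two filter outputs, subtracting the training mean at fingerprinting time is unnecessary: it cancels in the difference.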
Citations
Journal Article
Recognition of Activities of Daily Living Based on Environmental Analyses Using Audio Fingerprinting Techniques: A Systematic Review.
Ivan Miguel Pires, Rui Santos, Nuno Pombo, Nuno M. Garcia, Francisco Flórez-Revuelta, Susanna Spinsante, Rossitza Goleva, Eftim Zdravevski
TL;DR: Reviews audio fingerprinting techniques, published between 2002 and 2017, that can be used with acoustic data acquired from mobile devices.
Proceedings Article
Known Artist Live Song ID: A Hashprint Approach.
TL;DR: Presents a system for known-artist live song identification, together with empirical evidence of its feasibility. The proposed system improves the mean reciprocal rank from 0.68 to 0.79 while reducing the average runtime per query from 10 seconds to 0.9 seconds.
Journal Article
Known-Artist Live Song Identification Using Audio Hashprints
TL;DR: This paper proposes a multistep approach to address the problem of live song identification for popular bands by representing the audio as a sequence of binary codes called hashprints, derived from a set of spectrotemporal filters that are learned in an unsupervised artist-specific manner.
Journal Article
Multiresolution alignment for multiple unsynchronized audio sequences using sequential Monte Carlo samplers
TL;DR: Proposes a multiresolution algorithm for aligning multiple unsynchronized audio sequences with sequential Monte Carlo samplers, using a model-based approach and a score function analogous to those of similarity-based methods.
Audio Hashprints: Theory & Application
TL;DR: This talk introduces a method for learning a mapping from a continuous time-series signal to a sequence of discrete symbols that is suitable for reverse-indexing and efficient pairwise comparison, and investigates the performance of the proposed hashprints on two different audio search tasks.
References
Journal Article
Robust Real-Time Face Detection
Paul A. Viola, Michael Jones
TL;DR: Describes a face detection framework capable of processing images extremely rapidly while achieving high detection rates; in real-time use, the detector runs at 15 frames per second.
Proceedings Article
Robust real-time face detection
Paul A. Viola, Michael Jones
TL;DR: A new image representation called the “Integral Image” is introduced which allows the features used by the detector to be computed very quickly and a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions.
Proceedings Article
Locality-sensitive hashing scheme based on p-stable distributions
TL;DR: A novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem under lp norm, based on p-stable distributions that improves the running time of the earlier algorithm and yields the first known provably efficient approximate NN algorithm for the case p<1.
Proceedings Article
Spectral Hashing
TL;DR: Shows that the problem of finding a best code for a given dataset is closely related to graph partitioning and is NP-hard. Relaxing the problem yields a spectral method whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian.
Proceedings Article
A Highly Robust Audio Fingerprinting System.
Jaap A. Haitsma, Ton Kalker
TL;DR: Describes an audio fingerprinting system that uses the fingerprint of an unknown audio clip as a query on a fingerprint database containing the fingerprints of a large library of songs, so that the audio clip can be identified.