Robust and efficient multiple alignment of unsynchronized meeting recordings

doi:10.1109/TASLP.2016.2526787

Journal Article

T. J. Tsai, +1 more

- 01 May 2016 -

IEEE Transactions on Audio, Speech, and ...

- Vol. 24, Iss: 5, pp 833-845

TLDR

This paper proposes a way to generate a single high-quality audio recording of a meeting using no equipment other than participants' personal devices. The recordings are aligned using an adaptive audio fingerprint based on spectrotemporal eigenfilters, whose design is learned on the fly, in a fully unsupervised way, to perform well on the data at hand.

Abstract:

This paper proposes a way to generate a single high-quality audio recording of a meeting using no equipment other than participants’ personal devices. Each participant in the meeting uses their mobile device as a local recording node and begins recording upon arrival, so the recordings are unsynchronized. The main problem in generating a single summary recording is to temporally align the various audio recordings in a robust and efficient manner. We propose a way to do this using an adaptive audio fingerprint based on spectrotemporal eigenfilters, where the fingerprint design is learned on the fly, in a fully unsupervised way, to perform well on the data at hand. The adaptive fingerprints require only a few seconds of data to learn a robust design, and they require no tuning. Our method uses an iterative, greedy two-stage alignment algorithm that first finds a rough alignment using indexing techniques and then performs a more fine-grained alignment based on Hamming distance. Our proposed system achieves over 99% alignment accuracy on challenging alignment scenarios extracted from the ICSI meeting corpus, and it outperforms five other well-known and state-of-the-art fingerprint designs. We conduct extensive analyses of the factors that affect the robustness of the adaptive fingerprints, and we provide a simple heuristic that can be used to adjust the fingerprint’s robustness according to the amount of computation we are willing to perform.
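The pipeline described in the abstract can be illustrated with a minimal Python sketch for a single pair of recordings. It is not the authors' implementation: the abstract does not give the fingerprint parameters, so the function names, the context length, the number of filters, the delta lag, and the search radius below are all assumptions. Eigenfilters are taken here to be the top principal directions of stacked spectrogram frames, bits are the sign of each filter output's change over time, an offset-voting table stands in for the "indexing techniques", and the iterative, greedy handling of more than two recordings is omitted.

```python
from collections import Counter, defaultdict

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _stack_context(spec, context):
    """Flatten each run of `context` consecutive frames into one row."""
    windows = sliding_window_view(spec, context, axis=0)  # (n, bands, context)
    return windows.reshape(windows.shape[0], -1)


def learn_eigenfilters(spec, context=8, n_filters=16):
    """Assumed design: top principal directions of stacked spectrogram frames."""
    x = _stack_context(spec, context)
    x = x - x.mean(axis=0)
    cov = x.T @ x / x.shape[0]
    _, vecs = np.linalg.eigh(cov)        # eigenvalues come back in ascending order
    return vecs[:, ::-1][:, :n_filters]  # keep the largest-variance directions


def fingerprint(spec, filters, context=8, delta=4):
    """One row of bits per frame: sign of each filter output's change over `delta` frames."""
    feats = _stack_context(spec, context) @ filters
    return (feats[delta:] - feats[:-delta]) > 0


def rough_offset(fp_a, fp_b):
    """Stage 1 (assumed form of indexing): exact-match lookup plus offset voting."""
    weights = 1 << np.arange(fp_a.shape[1])  # pack each row of bits into an integer key
    keys_a, keys_b = fp_a @ weights, fp_b @ weights
    index = defaultdict(list)
    for i, key in enumerate(keys_a):
        index[int(key)].append(i)
    votes = Counter()
    for j, key in enumerate(keys_b):
        for i in index.get(int(key), ()):
            votes[i - j] += 1
    if not votes:
        raise ValueError("no matching fingerprints between the two recordings")
    return votes.most_common(1)[0][0]


def mean_hamming(fp_a, fp_b, offset):
    """Fraction of differing bits where fp_b[j] is compared against fp_a[j + offset]."""
    a0, b0 = max(offset, 0), max(-offset, 0)
    n = min(len(fp_a) - a0, len(fp_b) - b0)
    if n <= 0:
        return 1.0  # no overlap: treat as the worst possible distance
    return float(np.mean(fp_a[a0:a0 + n] != fp_b[b0:b0 + n]))


def align(fp_a, fp_b, radius=5):
    """Stage 2: refine the rough offset by minimizing Hamming distance nearby."""
    center = rough_offset(fp_a, fp_b)
    candidates = range(center - radius, center + radius + 1)
    return min(candidates, key=lambda o: mean_hamming(fp_a, fp_b, o))
```

Given two spectrogram-like arrays of shape `(n_frames, n_bands)`, one would learn the filters from one recording, fingerprint both with the same filters, and call `align(fp_a, fp_b)`. The returned offset is in frames and means that frame `j` of the second recording lines up with frame `j + offset` of the first.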
