An industrial-strength audio search algorithm

doi:10.5072/ZENODO.243872

01 Jan 2004, pp. 582-588

TL;DR: In this article, the authors developed and commercially deployed a flexible audio search engine that is noise and distortion resistant, computationally efficient, and massively scalable, capable of quickly identifying a short segment of music captured through a cellphone microphone in the presence of foreground voices and other dominant noise, and through voice codec compression.

Abstract: We have developed and commercially deployed a flexible audio search engine. The algorithm is noise and distortion resistant, computationally efficient, and massively scalable, capable of quickly identifying a short segment of music captured through a cellphone microphone in the presence of foreground voices and other dominant noise, and through voice codec compression, out of a database of over a million tracks. The algorithm uses a combinatorially hashed time-frequency constellation analysis of the audio, yielding unusual properties such as transparency, in which multiple tracks mixed together may each be identified. Furthermore, for applications such as radio monitoring, search times on the order of a few milliseconds per query are attained, even on a massive music database.
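As a rough illustration of what "combinatorially hashed time-frequency constellation analysis" can look like, the following sketch pairs spectrogram peaks into hashes and scores a query by voting on a consistent time offset. The abstract does not give these details, so the pairing rule, the parameters `fan_out` and `max_dt`, the function names, and the toy peak lists are all assumptions made for illustration, not the deployed algorithm.

```python
from collections import Counter, defaultdict


def pair_hashes(peaks, fan_out=3, max_dt=50):
    """Yield ((f1, f2, dt), t1) for each anchor peak paired with later peaks.

    peaks is a list of (time, frequency_bin) tuples. fan_out and max_dt are
    illustrative parameters, not values taken from the paper.
    """
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break
            if dt > 0:
                yield (f1, f2, dt), t1
                paired += 1
                if paired == fan_out:
                    break


def build_index(tracks):
    """Map each hash to the (track_id, anchor_time) pairs where it occurs."""
    index = defaultdict(list)
    for track_id, peaks in tracks.items():
        for h, t in pair_hashes(peaks):
            index[h].append((track_id, t))
    return index


def best_match(index, query_peaks):
    """Vote on (track_id, time offset); a true match piles up on one offset."""
    votes = Counter()
    for h, t_query in pair_hashes(query_peaks):
        for track_id, t_db in index.get(h, ()):
            votes[(track_id, t_db - t_query)] += 1
    if not votes:
        return None
    (track_id, offset), score = votes.most_common(1)[0]
    return track_id, offset, score


# Toy data: the query is track "a" starting 5 time units in.
tracks = {
    "a": [(0, 10), (5, 20), (9, 30), (14, 15), (20, 40)],
    "b": [(0, 11), (4, 22), (8, 33), (13, 17), (21, 44)],
}
index = build_index(tracks)
query = [(0, 20), (4, 30), (9, 15), (15, 40)]
print(best_match(index, query))  # ('a', 5, 6)
```

Because each hash depends only on two peak frequencies and their time difference, a query excerpt produces the same hashes wherever it starts in the track; the offset vote then recovers the alignment.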

Citations

Patent

Device and method for analyzing an information signal

Juergen Herre, Eric Allamanche, Oliver Hellmuth, Thorsten Kastner

09 May 2005

TL;DR: Identification results are prepared for a series of fingerprints taken from successive blocks of an information signal, each result associating a block of information units with a predetermined information entity; hypotheses formed from these results are then tested to make an assertion about the signal.

Abstract: In order to analyze an information signal, which has a series of blocks of information units, whereby a number of successive blocks of the series of blocks depicts an information entity, identification results for successive fingerprints are prepared (12) while using a series of fingerprints for the series of blocks, whereby an identification result depicts an association of a block of information units with a predetermined information entity. After this, at least two hypotheses are formed (14) from the identification results for the successive fingerprints. A first hypothesis is an assumption for the association of the series of blocks with a first information entity, and the second hypothesis is an assumption for the association of the series of blocks with the second information entity. Afterwards, different hypotheses are tested (16) in order to obtain a test result on the basis of which an assertion concerning the information signal is made (20). This results in obtaining a meaningful and reliable continuous-time analysis of an information signal.

94 citations

Patent

Systems and methods for sound recognition

Aaron Master, Timothy P. Stonehocker, Benjamin John Levitt, Jun Huang, Keyvan Mohajer

04 May 2010

TL;DR: Systems and methods for recognizing sounds are provided: user input relating to one or more sounds is received from a computing device, and a processor executes instructions to discriminate the one or more sounds, extract music features from them, analyze those features using one or more databases, and obtain information about the features based on the analysis.

Abstract: Systems and methods for recognizing sounds are provided herein. User input relating to one or more sounds is received from a computing device. Instructions, which are stored in memory, are executed by a processor to discriminate the one or more sounds, extract music features from the one or more sounds, analyze the music features using one or more databases, and obtain information regarding the music features based on the analysis. Further, information regarding the music features of the one or more sounds may be transmitted to display on the computing device.

74 citations

Patent

Systems and methods for enabling natural language processing

Keyvan Mohajer

21 May 2013

TL;DR: Systems and methods are described for searching databases by sound data input, giving service providers a search technology that returns results in a fast and accurate manner.

Abstract: Systems and methods for searching databases by sound data input are provided herein. A service provider may have a need to make their database(s) searchable through search technology. However, the service provider may not have the resources to implement such search technology. The search technology may allow for search queries using sound data input. The technology described herein provides a solution addressing the service provider's need, by giving a search technology that furnishes search results in a fast, accurate manner. In further embodiments, systems and methods to monetize those search results are also described herein.

72 citations

Proceedings Article

Auditeur: a mobile-cloud service platform for acoustic event detection on smartphones

Shahriar Nirjon¹, Robert F. Dickerson¹, Philip Asare¹, Qiang Li¹, Dezhi Hong¹, John A. Stankovic¹, Pan Hu², Guobin Shen², Xiaofan Jiang³•Institutions (3)

University of Virginia¹, Microsoft², Intel³

25 Jun 2013

TL;DR: A user study is presented to demonstrate that novice programmers can implement the core logic of interesting apps with Auditeur in less than 30 minutes, using only 15 - 20 lines of Java code.

Abstract: Auditeur is a general-purpose, energy-efficient, and context-aware acoustic event detection platform for smartphones. It enables app developers to have their app register for and get notified on a wide variety of acoustic events. Auditeur is backed by a cloud service to store user contributed sound clips and to generate an energy-efficient and context-aware classification plan for the phone. When an acoustic event type has been registered, the smartphone instantiates the necessary acoustic processing modules and wires them together to execute the plan. The phone then captures, processes, and classifies acoustic events locally and efficiently. Our analysis on user-contributed empirical data shows that Auditeur's energy-aware acoustic feature selection algorithm is capable of increasing the device lifetime by 33.4%, sacrificing less than 2% of the maximum achievable accuracy. We implement seven apps with Auditeur, and deploy them in real-world scenarios to demonstrate that Auditeur is versatile, 11.04% - 441.42% less power hungry, and 10.71% - 13.86% more accurate in detecting acoustic events, compared to state-of-the-art techniques. We present a user study to demonstrate that novice programmers can implement the core logic of interesting apps with Auditeur in less than 30 minutes, using only 15 - 20 lines of Java code.

67 citations

Proceedings Article

Drone sound detection

Jozsef Mezei¹, Viktor Fiaska¹, Andras Molnar¹•Institutions (1)

Óbuda University¹

01 Nov 2015

TL;DR: Small UAV (Unmanned Aerial Vehicle) systems are increasingly popular and their price is falling quickly, and they can be misused for smuggling, observation, violation of privacy, etc.

Abstract: Small UAV (Unmanned Aerial Vehicle) systems are now very popular. Many of them appear on video sharing portals, and the media carry many articles on their advantages and disadvantages. “Drones” can be used for many useful tasks, but the risk is increasing with their growing popularity. There are articles about unethical and unlawful uses of them: they can be used for smuggling, observation, violation of privacy, etc. As a consequence of the growing popularity, their price is falling quickly. High-quality cameras are also widespread, and it is easy to install one on a commercially available drone. These factors are becoming increasingly critical.

47 citations

References

Journal Article

Content-based classification, search, and retrieval of audio

E. Wold¹, T. Blum, D. Keislar, J. Wheaton•Institutions (1)

University of California, Berkeley¹

01 Sep 1996, IEEE MultiMedia

TL;DR: The audio analysis, search, and classification engine described here reduces sounds to perceptual and acoustical features, which lets users search or retrieve sounds by any one feature or a combination of them, by specifying previously learned classes based on these features.

Abstract: Many audio and multimedia applications would benefit from the ability to classify and search for audio based on its characteristics. The audio analysis, search, and classification engine described here reduces sounds to perceptual and acoustical features. This lets users search or retrieve sounds by any one feature or a combination of them, by specifying previously learned classes based on these features, or by selecting or entering reference sounds and asking the engine to retrieve similar or dissimilar sounds.

1,147 citations

Proceedings Article

A Highly Robust Audio Fingerprinting System.

Jaap A. Haitsma, Ton Kalker

01 Jan 2002

TL;DR: An audio fingerprinting system identifies an unknown audio clip by using its fingerprint as a query on a fingerprint database that contains the fingerprints of a large library of songs.

Abstract: Imagine the following situation. You’re in your car, listening to the radio and suddenly you hear a song that catches your attention. It’s the best new song you have heard for a long time, but you missed the announcement and don’t recognize the artist. Still, you would like to know more about this music. What should you do? You could call the radio station, but that’s too cumbersome. Wouldn’t it be nice if you could push a few buttons on your mobile phone and a few seconds later the phone would respond with the name of the artist and the title of the music you’re listening to? Perhaps even sending an email to your default email address with some supplemental information. In this paper we present an audio fingerprinting system, which makes the above scenario possible. By using the fingerprint of an unknown audio clip as a query on a fingerprint database, which contains the fingerprints of a large library of songs, the audio clip can be identified. At the core of the presented system are a highly robust fingerprint extraction method and a very efficient fingerprint search strategy, which enables searching a large fingerprint database with only limited computing resources.

911 citations

Proceedings Article

MACS: music audio characteristic sequence indexing for similarity retrieval

Cheng Yang¹•Institutions (1)

Stanford University¹

21 Oct 2001

TL;DR: The algorithm tries to capture the intuitive notion of similarity perceived by humans: two pieces are similar if they are fully or partially based on the same score, even if they were performed by different people or at different speed.

Abstract: We present a prototype method of indexing raw-audio music files in a way that facilitates content-based similarity retrieval. The algorithm tries to capture the intuitive notion of similarity perceived by humans: two pieces are similar if they are fully or partially based on the same score, even if they are performed by different people or at different speed. Local peaks in signal power are identified in each audio file, and a spectral vector is extracted near each peak. Nearby peaks are selectively grouped together to form "characteristic sequences" which are used as the basis for indexing. A hashing scheme known as "locality-sensitive hashing" is employed to index the high-dimensional vectors. Retrieval results are ranked based on the number of final matches filtered by some linearity criteria.
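Two of the steps named in this abstract, finding local peaks in signal power and hashing a high-dimensional vector with locality-sensitive hashing, can be sketched as follows. The abstract does not say which peak criterion or which LSH family is used, so the strict-maximum window test, the random-hyperplane (sign of projection) hash, and all names and parameters here are assumptions; the grouping into characteristic sequences and the linearity filter are not covered.

```python
import random


def local_peaks(power, radius=2):
    """Return indices whose value is a strict maximum within +/- radius samples.

    power is a list of per-frame power values; radius is an illustrative choice.
    """
    peaks = []
    for i, p in enumerate(power):
        lo, hi = max(0, i - radius), min(len(power), i + radius + 1)
        neighbours = power[lo:i] + power[i + 1:hi]
        if all(p > q for q in neighbours):
            peaks.append(i)
    return peaks


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_key(vector, hyperplanes):
    """Hash a vector to a bit tuple: one bit per hyperplane, set by the sign
    of the projection. Vectors at a small angle tend to share a key."""
    return tuple(
        int(sum(w * x for w, x in zip(plane, vector)) >= 0.0)
        for plane in hyperplanes
    )


power = [0, 1, 5, 1, 0, 2, 7, 2, 0]
print(local_peaks(power))  # [2, 6]

planes = make_hyperplanes(dim=4, n_bits=8)
spectral_vector = [1.0, 0.5, -0.2, 0.3]
louder_copy = [2.0 * x for x in spectral_vector]
print(lsh_key(spectral_vector, planes) == lsh_key(louder_copy, planes))  # True
```

The sign-of-projection hash ignores overall scale, so the same spectral shape played louder lands in the same bucket; that property is a consequence of the hash family assumed here, not a claim about the paper's scheme.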

75 citations