
Topic

Audio search engine

About: Audio search engine is a research topic. Over its lifetime, 12 publications have been published within this topic, receiving 895 citations.
Papers

Proceedings Article
01 Jan 2003
TL;DR: The algorithm is noise and distortion resistant, computationally efficient, and massively scalable, capable of quickly identifying a short segment of music captured through a cellphone microphone in the presence of foreground voices and other dominant noise, out of a database of over a million tracks.
Abstract: We have developed and commercially deployed a flexible audio search engine. The algorithm is noise and distortion resistant, computationally efficient, and massively scalable, capable of quickly identifying a short segment of music captured through a cellphone microphone in the presence of foreground voices and other dominant noise, and through voice codec compression, out of a database of over a million tracks. The algorithm uses a combinatorially hashed time-frequency constellation analysis of the audio, yielding unusual properties such as transparency, in which multiple tracks mixed together may each be identified. Furthermore, for applications such as radio monitoring, search times on the order of a few milliseconds per query are attained, even on a massive music database.

648 citations
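
The abstract above describes a combinatorially hashed time-frequency constellation. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes spectrogram peaks have already been extracted as `(time_frame, frequency_bin)` pairs, and the names `pair_hashes`, `build_index`, `best_match`, `fan_out`, and `max_dt` are hypothetical. Each anchor peak is paired with a few later peaks, the pair is reduced to a hash `(f1, f2, dt)`, and a query is matched by voting on consistent time offsets.

```python
from collections import Counter, defaultdict

def pair_hashes(peaks, fan_out=3, max_dt=50):
    """Yield ((f1, f2, dt), t1) for each anchor peak paired with later peaks.

    peaks: iterable of (time_frame, frequency_bin) tuples (assumed input).
    fan_out and max_dt are illustrative values, not taken from the paper.
    """
    peaks = sorted(peaks)  # order by time, then frequency
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # later peaks are even further away
            if dt > 0:
                yield (f1, f2, dt), t1
                paired += 1
                if paired == fan_out:
                    break

def build_index(tracks):
    """Map each hash to the (track_id, anchor_time) pairs where it occurs."""
    index = defaultdict(list)
    for track_id, peaks in tracks.items():
        for h, t in pair_hashes(peaks):
            index[h].append((track_id, t))
    return index

def best_match(index, query_peaks):
    """Return (track_id, time_offset, votes) for the most consistent match."""
    votes = Counter()
    for h, t_query in pair_hashes(query_peaks):
        for track_id, t_track in index.get(h, ()):
            # A true match produces many hashes with the same offset.
            votes[(track_id, t_track - t_query)] += 1
    if not votes:
        return None
    (track_id, offset), count = votes.most_common(1)[0]
    return track_id, offset, count

tracks = {
    "a": [(0, 10), (5, 20), (9, 30), (14, 15), (20, 40)],
    "b": [(0, 11), (4, 22), (8, 33), (13, 17), (19, 44)],
}
index = build_index(tracks)
# The query is an excerpt of track "a" that starts 5 frames in.
query = [(0, 20), (4, 30), (9, 15), (15, 40)]
result = best_match(index, query)
```

Because the hash stores only frequencies and a time difference, it does not depend on where the excerpt starts, which is why the vote over offsets can locate the excerpt inside the track.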


DOI
01 Jan 2004
Abstract: We have developed and commercially deployed a flexible audio search engine. The algorithm is noise and distortion resistant, computationally efficient, and massively scalable, capable of quickly identifying a short segment of music captured through a cellphone microphone in the presence of foreground voices and other dominant noise, and through voice codec compression, out of a database of over a million tracks. The algorithm uses a combinatorially hashed time-frequency constellation analysis of the audio, yielding unusual properties such as transparency, in which multiple tracks mixed together may each be identified. Furthermore, for applications such as radio monitoring, search times on the order of a few milliseconds per query are attained, even on a massive music database.

96 citations


Proceedings Article
01 Jan 2000
TL;DR: A speech recognition based audio search engine for indexing spoken documents found on the World Wide Web, focusing on the speech recognition and retrieval aspects, and the results of retrieval experiments demonstrate that the system can index effectively.
Abstract: We have developed a speech recognition based audio search engine for indexing spoken documents found on the World Wide Web. Our site (http://www.compaq.com/speechbot) indexes around 20 news and talk radio shows covering a wide range of topics, speaking styles and acoustic conditions from a selection of public Web sites with multimedia archives. In this paper, we describe our system and its performance, focusing on the speech recognition and retrieval aspects. We describe our training procedure in some detail and report our historical error rate since the site launch. We also investigate the impact of Out Of Vocabulary (OOV) words. Finally, we report the results of retrieval experiments, which demonstrate that our system can index effectively.

53 citations
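
The abstract above mentions the impact of Out Of Vocabulary (OOV) words. As a rough illustration of how an OOV rate is commonly computed, here is a small sketch; the function name `oov_rate`, the lowercase whitespace tokenization, and the toy vocabulary are all assumptions and are not taken from the paper.

```python
def oov_rate(tokens, vocabulary):
    """Fraction of tokens that the recognizer vocabulary does not contain.

    tokens: list of words from a reference transcript (assumed input).
    vocabulary: set of lowercase words known to the recognizer (assumed).
    """
    if not tokens:
        return 0.0
    missing = sum(1 for word in tokens if word.lower() not in vocabulary)
    return missing / len(tokens)

vocabulary = {"the", "news", "radio", "show"}
tokens = "The radio show covered SpeechBot".split()
rate = oov_rate(tokens, vocabulary)  # "covered" and "SpeechBot" are OOV
```

An OOV word can never be recognized correctly, so it also cannot be found by a query that uses it, which is one reason the rate matters for retrieval.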


Proceedings Article
12 Apr 2000
TL;DR: This site indexes several talk and news radio shows covering a wide range of topics and speaking styles from a selection of public Web sites with multimedia archives, and shows that, even if the transcription is inaccurate, it can still achieve good retrieval performance for typical user queries.
Abstract: We have developed an audio search engine incorporating speech recognition technology. This allows indexing of spoken documents from the World Wide Web when no transcription is available. This site indexes several talk and news radio shows covering a wide range of topics and speaking styles from a selection of public Web sites with multimedia archives. Our Web site is similar in spirit to normal Web search sites; it contains an index, not the actual multimedia content. The audio from these shows suffers in acoustic quality due to bandwidth limitations, coding, compression, and poor acoustic conditions. The shows are typically sampled at 8 kHz and transmitted, RealAudio compressed, at 6.5 kbps. Our word-error rate results using appropriately trained acoustic models show remarkable resilience to the high compression, though many factors combine to increase the average word-error rates over standard broadcast news benchmarks. We show that, even if the transcription is inaccurate, we can still achieve good retrieval performance for typical user queries (69%). Because the archive is large - over 5000 hours of content (and growing at a rate of 100 hours per week), totaling 47 million words and growing rapidly - we measure performance in terms of the precision of the top-ranked matches returned to the user.

39 citations
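
The abstract above reports word-error rates and measures retrieval by the precision of the top-ranked matches. The sketch below shows the standard definitions of both metrics under simple assumptions (whitespace tokenization, binary relevance); the function names and the sample data are hypothetical and do not reproduce the paper's evaluation.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)

def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top k returned documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc in ranked_ids[:k] if doc in relevant_ids)
    return hits / k

wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
p_at_2 = precision_at_k(["d1", "d2", "d3", "d4"], {"d1", "d3"}, k=2)
```

Precision at the top of the ranking needs relevance judgments only for the returned documents, which suits a large and growing archive where judging every document for recall is impractical.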


Journal ArticleDOI
TL;DR: A method is proposed for automatic fine-scale audio description that draws inspiration from ontological sound description methods such as Schaeffer's Objets Sonores and Smalley's Spectromorphology for complete automation of audio description at the level of sound objects for indexing and retrieving sound segments within Internet audio documents.
Abstract: In this article, a method is proposed for automatic fine-scale audio description that draws inspiration from ontological sound description methods such as Schaeffer's Objets Sonores and Smalley's Spectromorphology. The goal is complete automation of audio description at the level of sound objects for indexing and retrieving sound segments within Internet audio documents. To automatically segment audio documents into acoustic lexemes, a hidden Markov model is employed. It is demonstrated that the symbol stream of cluster labels, generated by the Viterbi algorithm, constitutes a detailed description of audio as a sequence of spectral archetypes. The ASCII base-64 encoding scheme maps cluster indices to one-character symbols that are segmented into 8-gram sequences for indexing in a relational database. To illustrate the methods, the essential components of an audio search engine are described: the automatic cataloguer, the retrieval engine and the query language. The results of experiments that test the accu...

27 citations
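
The abstract above says cluster indices are mapped to one-character base-64 symbols and segmented into 8-gram sequences. A minimal sketch of that step follows. It assumes the standard base-64 alphabet (A-Z, a-z, 0-9, `+`, `/`), at most 64 clusters, and overlapping (sliding-window) 8-grams; the abstract does not state whether the 8-grams overlap, and the names `labels_to_symbols` and `ngrams` are hypothetical.

```python
import string

# Standard base-64 alphabet; assumed to be the one the abstract refers to.
B64 = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def labels_to_symbols(labels):
    """Map a sequence of cluster indices (0..63) to a symbol string."""
    for i in labels:
        if not 0 <= i < len(B64):
            raise ValueError(f"cluster index out of range: {i}")
    return "".join(B64[i] for i in labels)

def ngrams(symbols, n=8):
    """Return every overlapping n-character window of the symbol string."""
    return [symbols[i:i + n] for i in range(len(symbols) - n + 1)]

symbols = labels_to_symbols([0, 1, 26, 52, 62, 63])
windows = ngrams("ABCDEFGHIJ", n=8)
```

Fixed-length strings of this kind can be stored in an ordinary indexed text column, which fits the relational database mentioned in the abstract.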


Network Information
Related Topics (5)
Web navigation

14.9K papers, 389.6K citations

82% related
XML

26.6K papers, 393.3K citations

81% related
Web modeling

21.8K papers, 467.5K citations

81% related
Web page

50.3K papers, 975.1K citations

80% related
Web query classification

11.9K papers, 339.3K citations

80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2021    1
2019    1
2016    1
2013    1
2012    1
2006    1

Top Attributes


Topic's top 5 most impactful authors

Jean-Manuel Van Thong

2 papers, 92 citations

Avery Li-Chun Wang

2 papers, 744 citations

Beth Logan

2 papers, 92 citations

Pedro J. Moreno

2 papers, 92 citations

Christian Frisson

1 paper, 4 citations