Topic

Audio signal processing

About: Audio signal processing is a research topic. To date, 21,463 publications on this topic have received 319,597 citations. The topic is also known as audio processing and acoustic signal processing.


Papers
Patent
09 Apr 1993
TL;DR: This patent presents a digital audio workstation for the audio portions of video programs, which combines audio editing capability with the ability to immediately display the video images associated with the audio program.
Abstract: The invention disclosed herein is a digital audio workstation for the audio portions of video programs. It combines audio editing capability with the ability to immediately display video images associated with the audio program. The invention detects an operator's indication of a point or segment of audio information and uses it to retrieve and display the video images that correspond to the indicated audio programming. Another aspect of the invention is a labeling and notation system for recorded digitized audio or video information. The system provides a means of storing in association with a particular point of the audio or video information a digitized voice or textual message for later reference regarding that information.

226 citations

Proceedings Article
05 Jun 2000
TL;DR: This paper restricts its considerations to the case where only a single microphone recording of the noisy signal is available and proposes a method based on temporal quantiles in the power spectral domain, which is compared with pause detection and recursive averaging.
Abstract: Elimination of additive noise from a speech signal is a fundamental problem in audio signal processing. In this paper we restrict our considerations to the case where only a single microphone recording of the noisy signal is available. The algorithms which we investigate proceed in two steps. First, the noise power spectrum is estimated. A method based on temporal quantiles in the power spectral domain is proposed and compared with pause detection and recursive averaging. The second step is to eliminate the estimated noise from the observed signal by spectral subtraction or Wiener filtering. The database used in the experiments comprises 6034 utterances of German digits and digit strings by 770 speakers in 10 different cars. Without noise reduction, we obtain an error rate of 11.7%. Quantile-based noise estimation and Wiener filtering reduce the error rate to 8.6%. Similar improvements are achieved in an experiment with artificial, non-stationary noise.
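
The two-step scheme in this abstract (estimate the noise power spectrum, then suppress it) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it assumes the input is already a power spectrogram of shape (frames, bins), and the quantile `q=0.5`, the gain `floor`, and all function names are hypothetical choices that the abstract does not specify. The Wiener gain uses the common approximation that the clean-speech power is the observed power minus the noise estimate.

```python
import numpy as np

def quantile_noise_psd(power_spec, q=0.5):
    """Estimate the noise power per frequency bin as the q-th quantile
    over time. Assumption: each bin is noise-only in a large enough share
    of the frames that a low-to-middle quantile reflects the noise level."""
    return np.quantile(power_spec, q, axis=0)

def wiener_gain(power_spec, noise_psd, floor=0.05):
    """Wiener-style gain 1 - N/|X|^2, clipped from below at `floor`
    (a hypothetical value) so that no bin is zeroed out entirely."""
    eps = 1e-12  # guards against division by zero in silent bins
    gain = 1.0 - noise_psd / np.maximum(power_spec, eps)
    return np.maximum(gain, floor)

def spectral_subtraction(power_spec, noise_psd, floor=0.05):
    """Subtract the noise estimate from the power spectrum, keeping at
    least `floor` times the observed power in every bin."""
    return np.maximum(power_spec - noise_psd, floor * power_spec)

# Toy spectrogram: 10 frames, 4 bins, noise power 1 everywhere,
# with "speech" (power 101) present in the last 3 frames only.
spec = np.ones((10, 4))
spec[7:, :] = 101.0
noise = quantile_noise_psd(spec)
gain = wiener_gain(spec, noise)
cleaned = spectral_subtraction(spec, noise)
```

In the toy data the speech occupies a minority of the frames, so the temporal median recovers the noise level without any explicit pause detection, which is the appeal of the quantile approach over the alternatives the abstract compares it with.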

226 citations

Patent
05 Jan 2006
TL;DR: This patent describes a search method and apparatus for digital audio files that allows a user to navigate the files by generating speech sounds related to the information of each audio file, to facilitate searching and playback.
Abstract: A search method and apparatus for digital audio files is provided that allows a user to navigate the audio files by generating speech sounds related to the information of the audio files to facilitate searching and playback. The method and apparatus search for audio files in a portable digital audio player in combination with an automobile audio system through speech sounds, by utilizing text-to-speech processing and by prompting a response from the user to the generated speech sounds. The text-to-speech technology is utilized to generate the speech sounds based on tag data of the audio files. On hearing the speech sounds, the user gives instructions for searching the files without being distracted from driving the automobile.

226 citations

Proceedings Article
22 Apr 1998
TL;DR: This paper explains how the Informedia system takes advantage of the closed captioning frequently broadcast with the news, how it extracts timing information by aligning the closed-captions with the result of the speech recognition, and how the system integrates closed-caption cues with the results of image and audio processing.
Abstract: The Informedia Digital Library Project allows full content indexing and retrieval of text, audio and video material. Segmentation is an integral process in the Informedia digital video library. The success of the Informedia project hinges on two critical assumptions: that we can extract sufficiently accurate speech recognition transcripts from the broadcast audio and that we can segment the broadcast into video paragraphs, or stories, that are useful for information retrieval. In previous papers we have shown that speech recognition is sufficient for information retrieval of pre-segmented video news stories. We now address the issue of segmentation and demonstrate that a fully automatic system can extract story boundaries using available audio, video and closed-captioning cues. The story segmentation step for the Informedia Digital Video Library splits full-length news broadcasts into individual news stories. During this phase the system also labels commercials as separate "stories". We explain how the Informedia system takes advantage of the closed captioning frequently broadcast with the news, how it extracts timing information by aligning the closed-captions with the result of the speech recognition, and how the system integrates closed-caption cues with the results of image and audio processing.
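
The idea of extracting timing information by aligning closed captions with speech recognition output can be illustrated with a small sketch. The abstract does not say which alignment algorithm Informedia uses; the version below substitutes Python's standard `difflib.SequenceMatcher` as a stand-in, and the function name `align_captions` and the `(word, start_time)` input format are assumptions made for illustration only.

```python
from difflib import SequenceMatcher

def align_captions(caption_words, asr_words):
    """Transfer recognizer timestamps onto caption words.

    caption_words: list of words from the closed captions (no timing).
    asr_words: list of (word, start_time_sec) pairs from a recognizer
               (hypothetical format).
    Returns a list of (caption_word, time_or_None) pairs; a caption word
    that matches no recognized word keeps None.
    """
    cap_tokens = [w.lower() for w in caption_words]
    asr_tokens = [w.lower() for w, _ in asr_words]
    matcher = SequenceMatcher(a=cap_tokens, b=asr_tokens, autojunk=False)
    times = [None] * len(caption_words)
    # Each matching block is a run of identical words in both sequences.
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            times[block.a + k] = asr_words[block.b + k][1]
    return list(zip(caption_words, times))

captions = ["The", "president", "spoke", "today"]
recognized = [("the", 0.0), ("present", 0.4), ("spoke", 0.9), ("today", 1.3)]
aligned = align_captions(captions, recognized)
```

Here the recognizer misheard "president", so that caption word receives no timestamp, while its neighbours anchor it between 0.0 s and 0.9 s. A real system would interpolate such gaps and tolerate much noisier transcripts.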

224 citations

Patent
20 Mar 1995
TL;DR: This patent provides a system for broadcasting audio music together with lyrics that are displayed and highlighted substantially simultaneously with their occurrence in the accompanying music; the system includes an audio music source that provides a data output and an analog audio signal output.
Abstract: A system for broadcasting audio music and broadcasting lyrics for display and highlighting substantially simultaneously with the occurrence of the lyrics in the accompanying audio music is provided. The system includes an audio music source that provides a data output and an analog audio signal output. A computer receives the data output by the music source and generates lyric text data and lyric timing commands. A subcarrier generator generates a subcarrier signal carrying the lyric text data and lyric timing commands. An FM transmitter broadcasts a composite signal that combines the analog output of the music source with the subcarrier signal. A lyric display unit receives the composite signal, separates and decodes the subcarrier signal, and displays and highlights lyrics according to the lyric text data and lyric timing commands decoded from the subcarrier signal.

224 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 81% related
Feature (computer vision): 128.2K papers, 1.7M citations, 79% related
Robustness (computer science): 94.7K papers, 1.6M citations, 78% related
Noise: 110.4K papers, 1.3M citations, 77% related
Image segmentation: 79.6K papers, 1.8M citations, 77% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    19
2022    63
2021    217
2020    525
2019    659
2018    597