Topic

Audio signal processing

About: Audio signal processing is a research topic. Over its lifetime, 21,463 publications have been published within this topic, receiving 319,597 citations. The topic is also known as audio processing and acoustic signal processing.


Papers
Journal Article
TL;DR: An automatic algorithm for accurate detection of breaths in speech or song signals based on a template matching approach is presented, which yielded a correct identification rate of 98% with a specificity of 96% on a database of speech and songs containing several hundred breath sounds.
Abstract: Automatic detection of predefined events in speech and audio signals is a challenging and promising subject in signal processing. One important application of such detection is removal or suppression of unwanted sounds in audio recordings, for instance in the professional music industry, where the demand for quality is very high. Breath sounds, which are present in most song recordings and often degrade the aesthetic quality of the voice, are an example of such unwanted sounds. Another example is bad pronunciation of certain phonemes. In this paper, we present an automatic algorithm for accurate detection of breaths in speech or song signals. The algorithm is based on a template matching approach and consists of three phases. In the first phase, a template is constructed from mel-frequency cepstral coefficient (MFCC) matrices of several breath examples and their singular value decompositions, to capture the characteristics of a typical breath event. Next, in the initial processing phase, each short-time frame is compared to the breath template and marked as breathy or nonbreathy according to predefined thresholds. Finally, an edge detection algorithm, based on various time-domain and frequency-domain parameters, is applied to demarcate the exact boundaries of each breath event and to eliminate possible false detections. Evaluation of the algorithm on a database of speech and songs containing several hundred breath sounds yielded a correct identification rate of 98% with a specificity of 96%.

68 citations
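
The frame-marking phase of the breath detector described above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it assumes per-frame feature vectors (for example MFCCs) have already been computed, builds the template as a plain mean instead of using singular value decompositions, compares frames by cosine similarity, and replaces the edge detection stage with a minimum-duration check. The names `mean_template`, `mark_breathy`, `breath_segments`, and the threshold values are all hypothetical.

```python
import math


def mean_template(examples):
    """Average equal-length feature vectors into one template vector.

    Assumption: a plain mean stands in for the SVD-based template in the paper.
    """
    count = len(examples)
    return [sum(column) / count for column in zip(*examples)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mark_breathy(frames, template, threshold=0.9):
    """Flag each frame whose similarity to the template meets the threshold."""
    return [cosine_similarity(frame, template) >= threshold for frame in frames]


def breath_segments(flags, min_frames=2):
    """Group consecutive flagged frames into (start, end) pairs, end exclusive.

    Runs shorter than min_frames are dropped as likely false detections
    (a crude stand-in for the paper's edge detection phase).
    """
    segments, start = [], None
    for index, flag in enumerate(flags):
        if flag and start is None:
            start = index
        elif not flag and start is not None:
            if index - start >= min_frames:
                segments.append((start, index))
            start = None
    if start is not None and len(flags) - start >= min_frames:
        segments.append((start, len(flags)))
    return segments
```

The threshold and minimum run length would need tuning on real recordings; the values here are placeholders.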

Patent
16 May 2001
TL;DR: A digital audio playback device with a wireless communication link is configured so that a computing platform may, automatically or upon user request, copy, add, or remove digital audio content or other information, such as playlists.
Abstract: A digital audio playback device that includes a wireless communication link to enable it to interact and synchronize with computing platforms as well as other mobile digital audio players and fixed digital audio players. A digital audio playback device is configured such that the computing platform may automatically or upon user request copy, add or remove digital audio content or other information, such as playlists. In addition, digital content on the digital audio playback device can be synchronized with a computing platform. In one embodiment of the invention, the digital audio playback device is configured to enable wireless communication among other digital playback devices and/or a computing platform to allow synchronization and control.

68 citations
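
One way to picture the copy, add, and remove behaviour in the abstract above is as a set comparison between the computing platform's library and the device's contents. The sketch below assumes a one-way mirror from platform to device and identifies tracks by a string ID; both are illustrative assumptions, and `plan_sync` is a hypothetical name, not something the patent defines.

```python
def plan_sync(platform_tracks, device_tracks):
    """Return the changes needed to make the device mirror the platform.

    Assumption: tracks are identified by hashable IDs and the platform
    library is treated as the source of truth.
    """
    platform = set(platform_tracks)
    device = set(device_tracks)
    return {
        "copy": sorted(platform - device),    # on the platform, missing from device
        "remove": sorted(device - platform),  # on the device, no longer on platform
    }


plan = plan_sync(["a.mp3", "b.mp3", "c.mp3"], ["b.mp3", "d.mp3"])
print(plan)
```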

Patent
26 Nov 1976
TL;DR: A compressed analog signal is converted into a digital signal by an analog-to-digital converter; after transmission and conversion back to analog, it is expanded in a manner complementary to the compressor operation, thus reconstructing the analog signal.
Abstract: This invention relates to an electronic system and a method for storing and distributing audio signals over existing communication lines. The system comprises a compressor for compressing in a predetermined manner the waveform amplitude of an input analog signal, thereby forming a compressed analog signal. The compressed analog signal is then converted into a digital signal by an analog-to-digital converter. A digital interface subsystem stores and retrieves selected ones of the digital signals for transmission over a communications line. At a remote end of the communications line, the digital signal is converted back to its analog compressed signal representation by a digital-to-analog converter. The compressed analog signal is then expanded in a manner complementary to the compressor operation, thus reconstructing the analog signal. A selector generator is provided at the remote end of the communications line for generating a command signal over the communications line to command the digital interface subsystem to select the desired one of the stored digital signals.

68 citations
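
The compress, digitize, restore, and expand chain in the abstract above can be illustrated with a small companding sketch. The patent does not name a specific compression curve; the mu-law curve with mu = 255 and the 8-bit uniform quantizer used here are assumptions chosen only because they are a familiar example of complementary compression and expansion.

```python
import math

MU = 255.0  # assumed companding constant; the patent does not specify a curve


def compress(x, mu=MU):
    """Compress an amplitude x in [-1, 1] with the mu-law curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y, mu=MU):
    """Inverse of compress: recover the original amplitude."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize(y, bits=8):
    """Uniform analog-to-digital conversion of y in [-1, 1] to an integer code."""
    levels = (1 << bits) - 1
    return round((y + 1.0) / 2.0 * levels)


def dequantize(code, bits=8):
    """Digital-to-analog conversion back to a value in [-1, 1]."""
    levels = (1 << bits) - 1
    return code / levels * 2.0 - 1.0


def transmit(x, bits=8):
    """Full chain: compress, digitize, convert back to analog, expand."""
    return expand(dequantize(quantize(compress(x), bits), bits))
```

For quiet signals the companded chain loses less precision than quantizing the raw amplitude directly, which is the usual motivation for compressing before conversion.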

Journal Article
TL;DR: A dynamic Bayesian network (DBN) is presented that jointly infers onsets and end times of the most prominent sound events in the space, along with an extension of the algorithm for covering large spaces with distributed microphone arrays.
Abstract: We propose a method for characterizing sound activity in fixed spaces through segmentation, indexing, and retrieval of continuous audio recordings. Regarding segmentation, we present a dynamic Bayesian network (DBN) that jointly infers onsets and end times of the most prominent sound events in the space, along with an extension of the algorithm for covering large spaces with distributed microphone arrays. Each segmented sound event is indexed with a hidden Markov model (HMM) that models the distribution of example-based queries that a user would employ to retrieve the event (or similar events). In order to increase the efficiency of the retrieval search, we recursively apply a modified spectral clustering algorithm to group similar sound events based on the distance between their corresponding HMMs. We then conduct a formal user study to obtain the relevancy decisions necessary for evaluation of our retrieval algorithm on both automatically and manually segmented sound clips. Furthermore, our segmentation and retrieval algorithms are shown to be effective in both quiet indoor and noisy outdoor recording conditions.

68 citations
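
As a rough illustration of inferring onsets and end times jointly, the sketch below decodes a two-state hidden Markov model (silent or active) over per-frame energies with the Viterbi algorithm. This is a deliberately reduced stand-in, not the paper's dynamic Bayesian network: the Gaussian emission model, the state means, `sigma`, and the `stay` probability are all assumed values for demonstration.

```python
import math


def viterbi_activity(energies, stay=0.9, quiet_mean=0.1, active_mean=1.0, sigma=0.3):
    """Decode the most likely 0/1 (silent/active) state for each frame."""
    if not energies:
        return []
    means = (quiet_mean, active_mean)
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)

    def log_emit(state, energy):
        # Gaussian log-likelihood, dropping the constant shared by both states.
        return -((energy - means[state]) ** 2) / (2.0 * sigma ** 2)

    # Uniform prior over the initial state (constant omitted).
    score = [log_emit(0, energies[0]), log_emit(1, energies[0])]
    back = []
    for energy in energies[1:]:
        new_score, pointers = [], []
        for state in (0, 1):
            candidates = [
                score[prev] + (log_stay if prev == state else log_switch)
                for prev in (0, 1)
            ]
            best = 0 if candidates[0] >= candidates[1] else 1
            pointers.append(best)
            new_score.append(candidates[best] + log_emit(state, energy))
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def events(path):
    """Convert a 0/1 path into (onset, end) frame index pairs, end exclusive."""
    found, start = [], None
    for index, state in enumerate(path):
        if state == 1 and start is None:
            start = index
        elif state == 0 and start is not None:
            found.append((start, index))
            start = None
    if start is not None:
        found.append((start, len(path)))
    return found
```

Because switching states is penalized, a single moderately loud frame is not reported as an event, whereas a sustained rise in energy is.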

Patent
Hyen O Oh, Hee Suk Pang, Dong Soo Kim, Jae Hyun Lim, Yang Won Jung, Hyo Jin Kim
13 Oct 2006
TL;DR: A method and apparatus for processing a signal that allow a signal with optimized transmission efficiency to be transmitted and received. Audio data is coded using a pilot reference value, which corresponds to a plurality of data, and a pilot difference value, which corresponds to the pilot reference value; both are obtained from the coded audio data to recover the audio data.
Abstract: The present invention relates to a method and apparatus for processing a signal. An object of the present invention, devised to solve the problem, lies in a method and apparatus for processing a signal that allow a signal having optimized signal transmission efficiency to be transmitted and received. According to an aspect of the present invention, there is provided a method of processing a signal including: receiving a broadcasting signal including audio data coded using a pilot reference value and a pilot difference value; demodulating the broadcasting signal in consideration of a scattered pilot, which varies over time, and a continual pilot, which is fixed over time, in a frame of the received broadcasting signal, and decoding the demodulated signal to obtain a broadcasting transmission stream; demultiplexing the broadcasting transmission stream to obtain coded audio data in an Internet protocol (IP) packet and an identifier for identifying a method of decoding the audio data; obtaining, from the coded audio data, the pilot reference value corresponding to a plurality of data and the pilot difference value corresponding to the pilot reference value; and obtaining the audio data using the pilot reference value and the pilot difference value.

68 citations
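
The idea of coding a group of data as one pilot reference value plus per-item pilot difference values can be sketched as below. The abstract does not say how the reference is chosen; using the rounded mean of the group is an assumption made here for illustration, and `pilot_encode` and `pilot_decode` are hypothetical names.

```python
def pilot_encode(values):
    """Represent integer values as one pilot reference plus per-value differences.

    Assumption: the pilot is the rounded mean of the group; the patent
    abstract does not state how the reference value is selected.
    """
    if not values:
        raise ValueError("need at least one value")
    pilot = round(sum(values) / len(values))
    return pilot, [value - pilot for value in values]


def pilot_decode(pilot, differences):
    """Recover the original values from the pilot reference and differences."""
    return [pilot + difference for difference in differences]


pilot, differences = pilot_encode([10, 12, 11, 11])
print(pilot, differences)
print(pilot_decode(pilot, differences))
```

When the values in a group are close together, the differences are small numbers, which is what makes this representation attractive for efficient transmission.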


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 81% related
Feature (computer vision): 128.2K papers, 1.7M citations, 79% related
Robustness (computer science): 94.7K papers, 1.6M citations, 78% related
Noise: 110.4K papers, 1.3M citations, 77% related
Image segmentation: 79.6K papers, 1.8M citations, 77% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    19
2022    63
2021    217
2020    525
2019    659
2018    597