Topic

Audio signal processing

About: Audio signal processing is a research topic. Over its lifetime, 21,463 publications have been published within this topic, receiving 319,597 citations. The topic is also known as audio processing and acoustic signal processing.


Papers
Patent
30 May 1990
TL;DR: In this patent, a method and apparatus for digital encoding are described for compressing the augmentation channel signals (chrominance and luminance signals for panel information, and high-frequency luminance and line difference signals) so that this information can be transmitted in a 3 MHz wide RF channel using a digital transmission scheme.
Abstract: A method and apparatus for digital encoding are described for compressing the augmentation channel signals (chrominance and luminance signals for panel information and high frequency luminance and line difference signal) so that this information can be transmitted in a 3 MHz wide RF channel using a digital transmission scheme such as QPSK. Analog signal components are sampled and converted to digital signals. Each of the signals is fed into a separate coder which reduces the number of bits/pixel required to reconstruct the original signal. Compression is achieved by quantization and removal of redundancy. The compression scheme is based on the use of DCT together with VLC. Each augmentation signal has its own coder, which is adapted to the unique statistics of this signal.

97 citations
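
The compression idea in the abstract above (a DCT followed by quantization, so that many coefficients become zero before entropy coding) can be sketched on a single one-dimensional block. This is only an illustration under stated assumptions: the block length of 8, the uniform step size, and the function names `dct_ii`, `idct_ii`, `quantize`, and `dequantize` are hypothetical and not taken from the patent, and the variable-length coding (VLC) stage is not implemented.

```python
import math


def dct_ii(block):
    """Orthonormal DCT-II of a 1-D block (assumed transform variant)."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i, x in enumerate(block))
        out.append(scale * acc)
    return out


def idct_ii(coeffs):
    """Inverse of dct_ii (an orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(total)
    return out


def quantize(coeffs, step):
    """Uniform quantization: map each coefficient to an integer level."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Reconstruct approximate coefficients from integer levels."""
    return [q * step for q in levels]


if __name__ == "__main__":
    samples = [100, 102, 104, 106, 108, 110, 112, 114]  # smooth toy block
    levels = quantize(dct_ii(samples), step=4.0)
    print("quantized levels:", levels)
    print("reconstruction:", [round(v, 2) for v in idct_ii(dequantize(levels, 4.0))])
```

For smooth input, most high-frequency levels quantize to zero, which is the redundancy a variable-length code would then exploit.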

Patent
04 May 2001
TL;DR: In this patent, an auditory scene is synthesized by applying two or more different sets of one or more spatial parameters (e.g., an inter-ear level difference (ILD), inter-ear time difference (ITD), and/or head-related transfer function (HRTF)) to two or more different frequency bands of a combined audio signal, where each different frequency band is treated as if it corresponded to a single audio source in the auditory scene.
Abstract: An auditory scene is synthesized by applying two or more different sets of one or more spatial parameters (e.g., an inter-ear level difference (ILD), inter-ear time difference (ITD), and/or head-related transfer function (HRTF)) to two or more different frequency bands of a combined audio signal, where each different frequency band is treated as if it corresponded to a single audio source in the auditory scene. In one embodiment, the combined audio signal corresponds to the combination of two or more different source signals, where each different frequency band corresponds to a region of the combined audio signal in which one of the source signals dominates the others. In this embodiment, the different sets of spatial parameters are applied to synthesize an auditory scene comprising the different source signals. In another embodiment, the combined audio signal corresponds to the combination of the left and right audio signals of a binaural signal corresponding to an input auditory scene. In this embodiment, the different sets of spatial parameters are applied to reconstruct the input auditory scene. In either case, transmission bandwidth requirements are reduced by reducing to one the number of different audio signals that need to be transmitted to a receiver configured to synthesize/reconstruct the auditory scene.

97 citations

Proceedings ArticleDOI
28 Jun 2009
TL;DR: It is shown that HHMM can handle audio events with recursive patterns to improve the classification performance, and a model fusion method is proposed to cover large variations often existing in healthcare audio events.
Abstract: Audio is a useful modality complement to video for healthcare monitoring. In this paper, we investigate the use of Hierarchical Hidden Markov Models (HHMMs) for healthcare audio event classification. We show that HHMM can handle audio events with recursive patterns to improve the classification performance. We also propose a model fusion method to cover large variations often existing in healthcare audio events. Experimental results from classifying key eldercare audio events show the effectiveness of the model fusion method for healthcare audio event classification.

97 citations

Journal ArticleDOI
TL;DR: An efficient approach for unsupervised audio stream segmentation and clustering via the Bayesian Information Criterion (BIC) is proposed, which is particularly successful for short segment turns of less than 2 s in duration.
Abstract: In many speech and audio applications, it is first necessary to partition and classify acoustic events prior to voice coding for communication or speech recognition for spoken document retrieval. In this paper, we propose an efficient approach for unsupervised audio stream segmentation and clustering via the Bayesian Information Criterion (BIC). The proposed method extends an earlier formulation by Chen and Gopalakrishnan. In our formulation, Hotelling's T²-statistic is used to pre-select candidate segmentation boundaries, followed by BIC to perform the segmentation decision. The proposed algorithm also incorporates a variable-size increasing window scheme and a skip-frame test. Our experiments show that we can improve the final algorithm speed by a factor of 100 compared to that in Chen and Gopalakrishnan's while achieving a 6.7% reduction in the acoustic boundary miss rate at the expense of a 5.7% increase in false alarm rate using DARPA Hub4 1997 evaluation data. The approach is particularly successful for short segment turns of less than 2 s in duration. The results suggest that the proposed algorithm is sufficiently effective and efficient for audio stream segmentation applications.

97 citations
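
The BIC segmentation decision mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a one-dimensional feature stream modeled by a single Gaussian per segment, and it scans every candidate boundary exhaustively; the T² pre-selection, variable-size window, and skip-frame test from the paper are not implemented. The names `delta_bic`, `best_boundary`, `margin`, and `penalty_weight` are hypothetical.

```python
import math


def log_var(xs):
    """Log of the maximum-likelihood variance (floored to avoid log(0))."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return math.log(max(var, 1e-12))


def delta_bic(xs, i, penalty_weight=1.0):
    """BIC gain of modeling xs[:i] and xs[i:] separately vs. one Gaussian.

    A positive value favors placing a segment boundary at index i.
    """
    n = len(xs)
    left, right = xs[:i], xs[i:]
    gain = 0.5 * (n * log_var(xs)
                  - len(left) * log_var(left)
                  - len(right) * log_var(right))
    # The split model has 2 extra parameters (mean and variance) in 1-D.
    penalty = 0.5 * penalty_weight * 2 * math.log(n)
    return gain - penalty


def best_boundary(xs, margin=5, penalty_weight=1.0):
    """Return (index, score) of the best boundary, or (None, 0.0) if none."""
    best_i, best_score = None, 0.0
    for i in range(margin, len(xs) - margin + 1):
        score = delta_bic(xs, i, penalty_weight)
        if score > best_score:
            best_i, best_score = i, score
    return best_i, best_score


if __name__ == "__main__":
    # Toy stream: the level jumps from about 0 to about 5 at index 100.
    wiggle = [0.1 * ((k * 7) % 5 - 2) for k in range(100)]
    stream = wiggle + [5.0 + w for w in wiggle]
    print(best_boundary(stream))
```

With multidimensional features, the log-variance terms would become log-determinants of covariance matrices and the penalty would count d + d(d+1)/2 parameters, where d is the feature dimension.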

Patent
30 Jan 1991
TL;DR: In this patent, a system is presented for selectively transmitting horizontal television lines including video and digital audio components, or equivalent digital data lines containing a plurality of audio program signals. The analog video window that follows the digital audio in the horizontal blanking interval of a conventional television signal is replaced with a plurality of time-division-multiplexed digital audio signals.
Abstract: A system is provided for selectively transmitting horizontal television lines including video and digital audio components or equivalent digital data lines containing a plurality of audio program signals. A conventional television signal transmission places a digital audio signal in the horizontal blanking interval, followed by analog video information. In accordance with the present invention, the "window" containing the analog video information is replaced with a plurality of digital audio signals that are time division multiplexed within the window. An additional audio channel is placed in the horizontal blanking interval, at the same location the audio is placed when video information is transmitted. Selector switches are provided in the encoder and decoder for processing a composite waveform as either a video signal with an associated audio channel, or as a multiple channel digital audio signal. Independent encryption and decryption of each of the multiple audio channels is provided.

97 citations


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
81% related
Feature (computer vision)
128.2K papers, 1.7M citations
79% related
Robustness (computer science)
94.7K papers, 1.6M citations
78% related
Noise
110.4K papers, 1.3M citations
77% related
Image segmentation
79.6K papers, 1.8M citations
77% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    19
2022    63
2021    217
2020    525
2019    659
2018    597