
Showing papers on "Audio signal processing published in 2009"


Journal ArticleDOI
TL;DR: An empirical feature analysis for audio environment characterization is performed, and the matching pursuit algorithm is used to obtain effective time-frequency features that yield higher recognition accuracy for environmental sounds.
Abstract: The paper considers the task of recognizing environmental sounds for the understanding of a scene or context surrounding an audio sensor. A variety of features have been proposed for audio recognition, including the popular Mel-frequency cepstral coefficients (MFCCs), which describe the audio spectral shape. Environmental sounds, such as chirpings of insects and sounds of rain, which are typically noise-like with a broad flat spectrum, may include strong temporal-domain signatures. However, only a few temporal-domain features have previously been developed to characterize such diverse audio signals. Here, we perform an empirical feature analysis for audio environment characterization and propose to use the matching pursuit (MP) algorithm to obtain effective time-frequency features. The MP-based method utilizes a dictionary of atoms for feature selection, resulting in a flexible, intuitive and physically interpretable set of features. The MP-based features are adopted to supplement the MFCC features and yield higher recognition accuracy for environmental sounds. Extensive experiments are conducted to demonstrate the effectiveness of these joint features for unstructured environmental sound classification, including listening tests to study human recognition capabilities. Our recognition system has been shown to produce performance comparable to that of human listeners.
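A minimal sketch of the matching pursuit idea described above: greedily pick, from a dictionary of unit-norm atoms, the atom most correlated with the current residual, then subtract its projection. The tiny hand-built cosine dictionary and all names here are illustrative assumptions, not the paper's actual Gabor atom set.

```python
import math

def make_atom(freq, n=64):
    """Unit-norm cosine atom; real MP dictionaries typically use Gabor atoms."""
    a = [math.cos(2 * math.pi * freq * i / n) for i in range(n)]
    norm = math.sqrt(sum(v * v for v in a))
    return [v / norm for v in a]

def matching_pursuit(signal, dictionary, n_atoms=2):
    """Greedily pick the atoms most correlated with the residual."""
    residual = list(signal)
    picks = []
    for _ in range(n_atoms):
        # inner product of the residual with every atom
        corrs = [sum(r * a for r, a in zip(residual, atom))
                 for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(corrs[k]))
        coef = corrs[best]
        picks.append((best, coef))
        # subtract the projection onto the chosen atom
        residual = [r - coef * a for r, a in zip(residual, dictionary[best])]
    return picks, residual

dictionary = [make_atom(f) for f in (2, 5, 9, 13)]
# synthetic "environmental sound": mostly atom 1 plus a bit of atom 3
signal = [3.0 * a + 0.5 * b for a, b in zip(dictionary[1], dictionary[3])]
picks, residual = matching_pursuit(signal, dictionary)
```

The selected atom indices and coefficients form the interpretable time-frequency feature set that the paper pairs with MFCCs.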

626 citations


Proceedings ArticleDOI
08 Dec 2009
TL;DR: A novel open-source affect and emotion recognition engine, which integrates all necessary components in one highly efficient software package, and which can be used for batch processing of databases.
Abstract: Various open-source toolkits exist for speech recognition and speech processing. These toolkits have brought a great benefit to the research community, i.e. speeding up research. Yet, no such freely available toolkit exists for automatic affect recognition from speech. We herein introduce a novel open-source affect and emotion recognition engine, which integrates all necessary components in one highly efficient software package. The components include audio recording and audio file reading, state-of-the-art paralinguistic feature extraction and plugable classification modules. In this paper we introduce the engine and extensive baseline results. Pre-trained models for four affect recognition tasks are included in the openEAR distribution. The engine is tailored for multi-threaded, incremental on-line processing of live input in real-time, however it can also be used for batch processing of databases.

408 citations


Patent
06 Jul 2009
TL;DR: In this paper, an orientation sensor detects an orientation of the speaker array and provides an orientation signal, which may be provided according to selection of display orientation, shape of touch input, image recognition of the listener, or the like.
Abstract: A device that provides an audio output includes a speaker array mechanically fixed to the device. The speaker array includes at least three speakers. An orientation sensor detects an orientation of the speaker array and provides an orientation signal. An audio receiver receives a number of audio signals that include spatial position information. An audio processor is coupled to the speakers, the orientation sensor, and the audio receiver. The audio processor receives the audio signals and the orientation signal, and selectively routes the audio signals to the speakers according to the spatial position information and the orientation signal such that the spatial position information is perceptible to a listener. The orientation signal may be provided by a compass, an accelerometer, an inertial sensor, or other device. The orientation signal may be provided according to selection of display orientation, shape of touch input, image recognition of the listener, or the like.

260 citations


Book
21 Mar 2009
TL;DR: An introduction to pitch estimation is given and a number of statistical methods for pitch estimation are presented, which include both single- and multi-pitch estimators based on statistical approaches, filtering methods based on both static and optimal adaptive designs, and subspace methods based on the principles of subspace orthogonality and shift-invariance.
Abstract: Periodic signals can be decomposed into sets of sinusoids having frequencies that are integer multiples of a fundamental frequency. The problem of finding such fundamental frequencies from noisy observations is important in many speech and audio applications, where it is commonly referred to as pitch estimation. These applications include analysis, compression, separation, enhancement, automatic transcription and many more. In this book, an introduction to pitch estimation is given and a number of statistical methods for pitch estimation are presented. The basic signal models and associated estimation theoretical bounds are introduced, and the properties of speech and audio signals are discussed and illustrated. The presented methods include both single- and multi-pitch estimators based on statistical approaches, like maximum likelihood and maximum a posteriori methods, filtering methods based on both static and optimal adaptive designs, and subspace methods based on the principles of subspace orthogonality and shift-invariance. The application of these methods to analysis of speech and audio signals is demonstrated using both real and synthetic signals, and their performance is assessed under various conditions and their properties discussed. Finally, the estimators are compared in terms of computational and statistical efficiency, generalizability and robustness. Table of Contents: Fundamentals / Statistical Methods / Filtering Methods / Subspace Methods / Amplitude Estimation
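A toy single-pitch estimator in the spirit of the statistical methods surveyed above: score each candidate fundamental by summing spectral magnitude at its first few harmonics (for sinusoids in white noise this approximates a maximum-likelihood estimator). The candidate grid, harmonic count and synthetic signal are illustrative assumptions.

```python
import math, cmath

def spectral_magnitude(x, freq, fs):
    """|DFT| of x evaluated at an arbitrary frequency (Goertzel-style sum)."""
    return abs(sum(v * cmath.exp(-2j * math.pi * freq * n / fs)
                   for n, v in enumerate(x)))

def estimate_pitch(x, fs, candidates, n_harmonics=3):
    """Return the candidate f0 whose harmonics carry the most energy."""
    def score(f0):
        return sum(spectral_magnitude(x, h * f0, fs)
                   for h in range(1, n_harmonics + 1))
    return max(candidates, key=score)

fs = 4000
# harmonic signal with a 100 Hz fundamental and two overtones
x = [math.sin(2 * math.pi * 100 * n / fs)
     + 0.5 * math.sin(2 * math.pi * 200 * n / fs)
     + 0.3 * math.sin(2 * math.pi * 300 * n / fs)
     for n in range(400)]
f0 = estimate_pitch(x, fs, candidates=range(60, 320, 20))
```

Summing over several harmonics is what distinguishes pitch estimation from plain peak picking: the fundamental wins because all of its harmonics contribute.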

221 citations


Patent
Hyun Jin Park1, Kwokleung Chan1
18 Nov 2009
TL;DR: In this article, uses of an enhanced sidetone signal in an active noise cancellation operation are described.
Abstract: Uses of an enhanced sidetone signal in an active noise cancellation operation are disclosed.

202 citations


Patent
Kwokleung Chan1
23 Oct 2009
TL;DR: In this article, the authors estimate the proximity of an audio source by transforming audio signals from a plurality of sensors to the frequency domain, and the amplitudes of the transformed audio signals are then determined.
Abstract: Estimating the proximity of an audio source (14, 15) is accomplished by transforming audio signals from a plurality of sensors (18, 20) to frequency domain. The amplitudes of the transformed audio signals are then determined. The proximity of the audio source is determined based on a comparison of the frequency domain amplitudes. This estimation permits a device (16) to differentiate between relatively distant audio sources (14) and audio sources (15) at close proximity to the device. The technique can be applied to mobile handsets, such as cellular phones or PDAs, hands-free headsets, and other audio input devices. Devices taking advantage of this "close proximity" detection are better able to suppress background noise and deliver an improved user experience.
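A hedged sketch of the idea in this patent: transform the signals from two sensors to the frequency domain and compare their amplitudes; a nearby source produces a much larger inter-microphone level difference than a distant one. The ratio threshold and synthetic signals below are invented for illustration.

```python
import math, cmath

def dft_mags(x):
    """Naive DFT magnitudes for the lower half of the spectrum."""
    n = len(x)
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(x)))
            for k in range(n // 2)]

def is_close(mic1, mic2, ratio_threshold=2.0):
    """True if the inter-sensor spectral level ratio suggests a near source."""
    e1 = sum(dft_mags(mic1)) + 1e-12
    e2 = sum(dft_mags(mic2)) + 1e-12
    return max(e1, e2) / min(e1, e2) > ratio_threshold

n, f = 64, 5
tone = [math.sin(2 * math.pi * f * i / n) for i in range(n)]
near_pair = (tone, [0.3 * v for v in tone])   # big level difference
far_pair = ([0.9 * v for v in tone], tone)    # nearly equal levels

close = is_close(*near_pair)
far = is_close(*far_pair)
```

A device could use this flag to suppress background noise when the talker is close, as the abstract describes.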

201 citations


Patent
David G. Conroy1, Barry J. Corlett1, Aram Lindahl1, Steve Schell1, Neil David Warren1 
22 Jul 2009
TL;DR: In this article, a media processing system and device with improved power usage characteristics, improved audio functionality and improved media security is provided, including an audio processing subsystem that operates independently of the host processor for long periods of time.
Abstract: A media processing system and device with improved power usage characteristics, improved audio functionality and improved media security is provided. Embodiments of the media processing system include an audio processing subsystem that operates independently of the host processor for long periods of time, allowing the host processor to enter a low power state while the audio data is being processed. Other aspects of the media processing system provide for enhanced audio effects such as mixing stored audio samples into real-time telephone audio. Still other aspects of the media processing system provide for improved media security due to the isolation of decrypted audio data from the host processor.

179 citations


Patent
Chad G. Seguin1
21 Jan 2009
TL;DR: In this article, a multi-band compressor has a band splitter that splits the input audio signal into a number of different band signals, each band signal is input to a respective compressor block, which is independently programmable so that its audio frequency response differs from a linear response in at least two non-overlapping windows of its input signal, and (b) differs from the frequency response of another one of the compressor blocks.
Abstract: An uplink or downlink audio processor contains a multi band compressor that receives an input, uplink or downlink, audio signal. The multi-band compressor has a band splitter that splits the input audio signal into a number of different band signals. Each band signal is input to a respective compressor block, which is independently programmable so that its audio frequency response (a) differs from a linear response in at least two non-overlapping windows of its input signal, and (b) differs from the frequency response of another one of the compressor blocks. Other embodiments are also described and claimed.
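A simplified sketch of the multi-band compressor structure described above, under invented parameters: a one-pole lowpass provides a two-band split (high band = input minus low band, so the bands sum back to the input), and each band gets an independently programmable static compression curve.

```python
import math

def one_pole_lowpass(x, alpha=0.2):
    """First-order IIR lowpass as a crude band splitter."""
    y, out = 0.0, []
    for v in x:
        y += alpha * (v - y)
        out.append(y)
    return out

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def compress_band(band, threshold, ratio):
    """Static compression: attenuate the band when its RMS exceeds threshold."""
    level = rms(band)
    if level <= threshold:
        return band
    gain = (threshold / level) ** (1.0 - 1.0 / ratio)
    return [gain * v for v in band]

def two_band_compressor(x, threshold=0.1, ratio=4.0):
    low = one_pole_lowpass(x)
    high = [v - l for v, l in zip(x, low)]   # complementary high band
    return [a + b for a, b in
            zip(compress_band(low, threshold, ratio),
                compress_band(high, threshold, ratio))]
```

Giving each band its own threshold and ratio is what lets the patent's compressor blocks differ from a linear response and from each other.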

168 citations


Patent
10 Mar 2009
TL;DR: In this paper, a secondary audio clip mixing profile is selected based upon the type of audio output device, such as a speaker or a headset, coupled to the electronic device, to substantially optimize audibility and user-perceived comfort.
Abstract: Various techniques for controlling the playback of secondary audio data on an electronic device are provided. In one embodiment, a secondary audio clip mixing profile is selected based upon the type of audio output device, such as a speaker or a headset, coupled to the electronic device. The selected mixing profile may define respective digital gain values to be applied to a secondary audio stream at each digital audio level of the electronic device, and may be customized based upon one or more characteristics of the audio output device to substantially optimize audibility and user-perceived comfort. In this manner, the overall user listening experience may be improved.

166 citations


Journal ArticleDOI
TL;DR: This paper proposes effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie, and extends the approach with a radial basis function neural network for audio classification.
Abstract: In the age of digital information, audio data has become an important part of many modern computer applications. Audio classification has become a focus in the research of audio processing and pattern recognition. Automatic audio classification is very useful for audio indexing, content-based audio retrieval and on-line audio distribution, but it is a challenge to extract the most common and salient themes from unstructured raw audio data. In this paper, we propose effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie. For these categories a number of acoustic features, including linear predictive coefficients, linear predictive cepstral coefficients and mel-frequency cepstral coefficients, are extracted to characterize the audio content. Support vector machines are applied to classify audio into their respective classes by learning from training data. The proposed method then extends the application of a radial basis function neural network (RBFNN) to the classification of audio. RBFNN enables a nonlinear transformation followed by a linear transformation to achieve a higher dimension in the hidden space. Experiments on different genres of the various categories illustrate that the classification results are significant and effective.
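A toy illustration of the feature-then-classifier pipeline described above. The actual system uses LPC/LPCC/MFCC features with SVM and RBFNN classifiers; as a stand-in, two cheap features (zero-crossing rate and energy) feed a nearest-centroid classifier, and all clips and class names are synthetic.

```python
import math

def features(clip):
    """Zero-crossing rate and mean energy, as simple acoustic descriptors."""
    zcr = sum(1 for a, b in zip(clip, clip[1:]) if a * b < 0) / len(clip)
    energy = sum(v * v for v in clip) / len(clip)
    return (zcr, energy)

def train_centroids(labelled_clips):
    """Average the feature vectors per class."""
    sums = {}
    for label, clip in labelled_clips:
        f = features(clip)
        s = sums.setdefault(label, [0.0, 0.0, 0])
        s[0] += f[0]; s[1] += f[1]; s[2] += 1
    return {lab: (s[0] / s[2], s[1] / s[2]) for lab, s in sums.items()}

def classify(clip, centroids):
    """Assign the class whose centroid is nearest in feature space."""
    f = features(clip)
    return min(centroids,
               key=lambda lab: (f[0] - centroids[lab][0]) ** 2
                             + (f[1] - centroids[lab][1]) ** 2)

def tone(freq, amp, n=256):
    return [amp * math.sin(2 * math.pi * freq * i / n) for i in range(n)]

# synthetic stand-ins: "music" = loud low-frequency, "news" = quiet high-frequency
train = [("music", tone(4, 1.0)), ("music", tone(5, 0.9)),
         ("news", tone(40, 0.2)), ("news", tone(45, 0.25))]
centroids = train_centroids(train)
label = classify(tone(42, 0.22), centroids)
```

Swapping the centroid rule for an SVM or RBFNN, and the two features for cepstral coefficients, recovers the structure of the paper's system.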

160 citations


Journal ArticleDOI
TL;DR: Examples include sampling rate conversion for software radio and between audio formats, biomedical imaging, lens distortion correction and the formation of image mosaics, and super-resolution of image sequences.
Abstract: Digital applications have developed rapidly over the last few decades. Since many sources of information are of analog or continuous-time nature, discrete-time signal processing (DSP) inherently relies on sampling a continuous-time signal to obtain a discrete-time representation. Consequently, sampling theories lie at the heart of signal processing devices and communication systems. Examples include sampling rate conversion for software radio and between audio formats, biomedical imaging, lens distortion correction and the formation of image mosaics, and super-resolution of image sequences.

Patent
17 Jun 2009
TL;DR: In this paper, a method and a control circuit for controlling an output of an audio signal of a battery-powered device are described. And in case of low charge level, the audio filter/gain parameter and/or the audio compression parameter are adjusted to reduce power consumption, thus allowing longer reproducing time by lower sound quality.
Abstract: A method and a control circuit for controlling an output of an audio signal of a battery-powered device are described. The charge condition of the battery is determined, and in case of a low charge level, the audio filter/gain parameter and/or the audio compression parameter are adjusted to reduce power consumption, allowing longer playback time at lower sound quality.
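An illustrative sketch of the control idea in this patent: map battery charge level to audio processing parameters so that low charge trades sound quality for playback time. The specific thresholds and parameter values are invented for illustration.

```python
def audio_params_for_charge(charge_pct):
    """Return (gain_db, lowpass_cutoff_hz, compression_ratio) for a charge level."""
    if charge_pct > 50:          # plenty of charge: full quality
        return (0.0, 20000, 1.0)
    if charge_pct > 20:          # medium charge: mild cuts save amplifier power
        return (-3.0, 12000, 2.0)
    return (-6.0, 8000, 4.0)     # low charge: aggressive power saving
```

Lowering gain and the lowpass cutoff reduces amplifier power draw, while heavier compression keeps the quieter output audible.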

Proceedings ArticleDOI
19 Apr 2009
TL;DR: Novel audio features that combine the high temporal accuracy of onset features with the robustness of chroma features are introduced and it is shown how previous synchronization methods can be extended to make use of these new features.
Abstract: The general goal of music synchronization is to automatically align the multiple information sources such as audio recordings, MIDI files, or digitized sheet music related to a given musical work. In computing such alignments, one typically has to face a delicate tradeoff between robustness and accuracy. In this paper, we introduce novel audio features that combine the high temporal accuracy of onset features with the robustness of chroma features. We show how previous synchronization methods can be extended to make use of these new features. We report on experiments based on polyphonic Western music demonstrating the improvements of our proposed synchronization framework.
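Synchronization methods of the kind extended above typically align two feature sequences (e.g. chroma or onset features) with dynamic time warping (DTW). This is a generic DTW sketch on scalar features, not the paper's system; the toy sequences are invented.

```python
def dtw_cost(a, b):
    """Classic DTW with unit steps; returns the minimal alignment cost."""
    inf = float("inf")
    d = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    d[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: skip in a, skip in b, advance both
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[len(a)][len(b)]

ref = [0, 0, 1, 2, 1, 0]
same_tempo = [0, 0, 1, 2, 1, 0]
stretched = [0, 0, 0, 1, 1, 2, 2, 1, 0]   # same contour, played slower
different = [2, 2, 0, 0, 2, 2]
```

A tempo-stretched rendition aligns at zero cost, while an unrelated sequence does not; the accuracy-versus-robustness tradeoff the paper addresses lives in the choice of features fed to this alignment.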

Journal ArticleDOI
TL;DR: A transition from mere association in beginner readers to more automatic, but still not “adult-like,” integration in advanced readers is indicated and evidence for an extended development of letter–speech sound integration is provided.
Abstract: In transparent alphabetic languages, the expected standard for complete acquisition of letter-speech sound associations is within one year of reading instruction. The neural mechanisms underlying the acquisition of letter-speech sound associations have, however, hardly been investigated. The present article describes an ERP study with beginner and advanced readers in which the influence of letters on speech sound processing is investigated by comparing the MMN to speech sounds presented in isolation with the MMN to speech sounds accompanied by letters. Furthermore, SOA between letter and speech sound presentation was manipulated in order to investigate the development of the temporal window of integration for letter-speech sound processing. Beginner readers, despite one year of reading instruction, showed no early letter-speech sound integration, that is, no influence of the letter on the evocation of the MMN to the speech sound. Only later in the difference wave, at 650 msec, was an influence of the letter on speech sound processing revealed. Advanced readers, with 4 years of reading instruction, showed early and automatic letter-speech sound processing as revealed by an enhancement of the MMN amplitude, however, at a different temporal window of integration in comparison with experienced adult readers. The present results indicate a transition from mere association in beginner readers to more automatic, but still not "adult-like," integration in advanced readers. In contrast to general assumptions, the present study provides evidence for an extended development of letter-speech sound integration.

Patent
16 Feb 2009
TL;DR: In this paper, a user preference processor (109) receives user preference feedback for the test audio signals and generates a personalization parameter for the user in response to the user preferences and a noise parameter for each noise component of at least one of the audio signals.
Abstract: An audio device is arranged to present a plurality of test audio signals to a user where each test audio signal comprises a signal component and a noise component. A user preference processor (109) receives user preference feedback for the test audio signals and generates a personalization parameter for the user in response to the user preference feedback and a noise parameter for the noise component of at least one of the test audio signals. An audio processor (113) then processes an audio signal in response to the personalization parameter and the resulting signal is presented to the user. The invention may allow improved characterization of a user thereby resulting in improved adaptation of the processing and thus an improved personalization of the presented signal. The invention may e.g. be beneficial for hearing aids for hearing impaired users.

Patent
Hyen O Oh1, Yang Won Jung1
29 Jul 2009
TL;DR: In this article, an apparatus for processing an audio signal and method thereof is described, which includes receiving, by an audio processing apparatus, a signal including a first data of a first block encoded with rectangular coding scheme and a second data of the second block encoded in non-rectangular coding scheme.
Abstract: An apparatus for processing an audio signal and method thereof are disclosed. The present invention includes receiving, by an audio processing apparatus, an audio signal including a first data of a first block encoded with rectangular coding scheme and a second data of a second block encoded with non-rectangular coding scheme; receiving a compensation signal corresponding to the second block; estimating a prediction of an aliasing part using the first data; and, obtaining a reconstructed signal for the second block based on the second data, the compensation signal and the prediction of aliasing part.

Journal ArticleDOI
TL;DR: A novel approach to applying text-based information retrieval techniques to music collections that represents tracks with a joint vocabulary consisting of both conventional words, drawn from social tags, and audio muswords, representing characteristics of automatically-identified regions of interest within the signal.
Abstract: In this paper we describe a novel approach to applying text-based information retrieval techniques to music collections. We represent tracks with a joint vocabulary consisting of both conventional words, drawn from social tags, and audio muswords, representing characteristics of automatically-identified regions of interest within the signal. We build vector space and latent aspect models indexing words and muswords for a collection of tracks, and show experimentally that retrieval with these models is extremely well-behaved. We find in particular that retrieval performance remains good for tracks by artists unseen by our models in training, and even if tags for their tracks are extremely sparse.
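A minimal vector-space retrieval sketch over a joint vocabulary of tag words and "muswords", mirroring the indexing idea above. The track names, tags and musword ids are all made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

tracks = {
    "track_a": {"rock": 3, "guitar": 2, "mw_17": 5, "mw_03": 1},
    "track_b": {"jazz": 4, "sax": 2, "mw_88": 6},
    # sparsely tagged track: retrieval can still match on muswords alone
    "track_c": {"mw_17": 4, "mw_03": 2},
}

def retrieve(query, tracks):
    """Rank tracks by cosine similarity to the query vector."""
    return sorted(tracks, key=lambda t: cosine(query, tracks[t]),
                  reverse=True)

ranking = retrieve({"rock": 1, "mw_17": 2}, tracks)
```

Because words and muswords share one vector space, the sparsely tagged track still ranks above the unrelated one, which is the behavior the paper reports for tracks with few social tags.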

Patent
06 Jul 2009
TL;DR: In this paper, an apparatus for generating at least one audio output signal representing a superposition of at least two different audio objects comprises a processor for processing an audio input signal to provide an object representation of the input signal, where this object representation can be generated by a parametrically guided approximation of original objects using an object downmix signal.
Abstract: An apparatus for generating at least one audio output signal representing a superposition of at least two different audio objects comprises a processor for processing an audio input signal to provide an object representation of the audio input signal, where this object representation can be generated by a parametrically guided approximation of original objects using an object downmix signal. An object manipulator individually manipulates objects using audio object based metadata referring to the individual audio objects to obtain manipulated audio objects. The manipulated audio objects are mixed using an object mixer for finally obtaining an audio output signal having one or several channel signals depending on a specific rendering setup.

Journal ArticleDOI
C. Joder1, Slim Essid1, Gael Richard1
TL;DR: A number of methods for early and late temporal integration are proposed and an in-depth experimental study on their interest for the task of musical instrument recognition on solo musical phrases is provided.
Abstract: Nowadays, it appears essential to design automatic indexing tools which provide meaningful and efficient means to describe the musical audio content. There is in fact a growing interest for music information retrieval (MIR) applications amongst which the most popular are related to music similarity retrieval, artist identification, musical genre or instrument recognition. Current MIR-related classification systems usually do not take into account the mid-term temporal properties of the signal (over several frames) and lie on the assumption that the observations of the features in different frames are statistically independent. The aim of this paper is to demonstrate the usefulness of the information carried by the evolution of these characteristics over time. To that purpose, we propose a number of methods for early and late temporal integration and provide an in-depth experimental study on their interest for the task of musical instrument recognition on solo musical phrases. In particular, the impact of the time horizon over which the temporal integration is performed will be assessed both for fixed and variable frame length analysis. Also, a number of proposed alignment kernels will be used for late temporal integration. For all experiments, the results are compared to a state of the art musical instrument recognition system.

Book
08 Sep 2009
TL;DR: This invaluable guide will provide audio, R&D and software engineers building systems or computer peripherals for speech enhancement with a comprehensive overview of the technologies, devices and algorithms required for modern computers and communication devices.
Abstract: Provides state-of-the-art algorithms for sound capture, processing and enhancement. Sound Capture and Processing: Practical Approaches covers the digital signal processing algorithms and devices for capturing sounds, mostly human speech. It explores the devices and technologies used to capture, enhance and process sound for the needs of communication and speech recognition in modern computers and communication devices. This book gives a comprehensive introduction to basic acoustics and microphones, with coverage of algorithms for noise reduction, acoustic echo cancellation, dereverberation and microphone arrays, charting the progress of such technologies from their evolution to the present-day standard. Sound Capture and Processing: Practical Approaches:
- Brings together the state-of-the-art algorithms for sound capture, processing and enhancement in one easily accessible volume
- Provides invaluable implementation techniques required to process algorithms for real-life applications and devices
- Covers a number of advanced sound processing techniques, such as multichannel acoustic echo cancellation, dereverberation and source separation
- Is generously illustrated with figures and charts to demonstrate how sound capture and audio processing systems work
- Includes an accompanying website containing Matlab code to illustrate the algorithms
This invaluable guide will provide audio, R&D and software engineers building systems or computer peripherals for speech enhancement with a comprehensive overview of the technologies, devices and algorithms required for modern computers and communication devices. Graduate students studying electrical engineering and computer science, and researchers in multimedia, cell phones and interactive systems, as well as acousticians, will also benefit from this book.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed derivative-based and wavelet-based approaches remarkably improve the detection accuracy.
Abstract: To improve a recently developed mel-cepstrum audio steganalysis method, we present in this paper a method based on Fourier spectrum statistics and mel-cepstrum coefficients, derived from the second-order derivative of the audio signal. Specifically, the statistics of the high-frequency spectrum and the mel-cepstrum coefficients of the second-order derivative are extracted for use in detecting audio steganography. We also design a wavelet-based spectrum and mel-cepstrum audio steganalysis. By applying support vector machines to these features, unadulterated carrier signals (without hidden data) and the steganograms (carrying covert data) are successfully discriminated. Experimental results show that the proposed derivative-based and wavelet-based approaches remarkably improve the detection accuracy. Between the two new methods, the derivative-based approach generally delivers a better performance.
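A sketch of the second-order-derivative idea above: the second difference acts as a high-pass filter (magnitude response 4·sin²(ω/2)), emphasizing the high-frequency region where the paper extracts its spectral statistics. The particular feature used here (mean magnitude over the upper part of a naive DFT) is an assumption for illustration.

```python
import math, cmath

def second_derivative(x):
    """Discrete second difference of a signal."""
    return [x[i + 1] - 2 * x[i] + x[i - 1] for i in range(1, len(x) - 1)]

def high_band_mean_mag(x):
    """Mean DFT magnitude over the upper part of the spectrum."""
    n = len(x)
    mags = [abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(x)))
            for k in range(n // 4, n // 2)]
    return sum(mags) / len(mags)

n = 64
low_tone = [math.sin(2 * math.pi * 2 * i / n) for i in range(n)]
high_tone = [math.sin(2 * math.pi * 20 * i / n) for i in range(n)]
f_low = high_band_mean_mag(second_derivative(low_tone))
f_high = high_band_mean_mag(second_derivative(high_tone))
```

The second difference strongly attenuates low-frequency content while boosting high frequencies, which is why derivative-domain statistics are sensitive to the noise-like perturbations introduced by data hiding.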

Proceedings ArticleDOI
04 Dec 2009
TL;DR: A new probabilistic model for polyphonic audio termed factorial scaled hidden Markov model (FS-HMM), which generalizes several existing models, notably the Gaussian scaled mixture model and the Itakura-Saito nonnegative matrix factorization (NMF) model is presented.
Abstract: We present a new probabilistic model for polyphonic audio termed Factorial Scaled Hidden Markov Model (FS-HMM), which generalizes several existing models, notably the Gaussian scaled mixture model and the Itakura-Saito Nonnegative Matrix Factorization (NMF) model. We describe two expectation-maximization (EM) algorithms for maximum likelihood estimation, which differ by the choice of complete data set. The second EM algorithm, based on a reduced complete data set and multiplicative updates inspired from NMF methodology, exhibits much faster convergence. We consider the FS-HMM in different configurations for the difficult problem of speech / music separation from a single channel and report satisfying results.

Journal ArticleDOI
TL;DR: The DSP audience is given some insight into the types of problems and challenges that face practitioners in audio forensic laboratories, and several of the frustrations and pitfalls encountered by signal processing experts when dealing with typical forensic material due to the standards and practices of the legal system.
Abstract: The field of audio forensics involves many topics familiar to the general audio digital signal processing (DSP) community, such as speech recognition, talker identification, and signal quality enhancement. There is potentially much to be gained by applying modern DSP theory to problems of interest to the forensics community, and this article is written to give the DSP audience some insight into the types of problems and challenges that face practitioners in audio forensic laboratories. However, this article must also present several of the frustrations and pitfalls encountered by signal processing experts when dealing with typical forensic material due to the standards and practices of the legal system.

MonographDOI
23 Mar 2009
TL;DR: This practically oriented text provides MATLAB examples throughout to illustrate the concepts discussed and to give the reader hands-on experience with important techniques, and is ideal for graduate students and practitioners working with speech or audio systems.
Abstract: Applied Speech and Audio Processing is a MATLAB-based, one-stop resource that blends speech and hearing research in describing the key techniques of speech and audio processing. This practically oriented text provides MATLAB examples throughout to illustrate the concepts discussed and to give the reader hands-on experience with important techniques. Chapters on basic audio processing and the characteristics of speech and hearing lay the foundations of speech signal processing, which are built upon in subsequent sections explaining audio handling, coding, compression, and analysis techniques. The final chapter explores a number of advanced topics that use these techniques, including psychoacoustic modelling, a subject which underpins MP3 and related audio formats. With its hands-on nature and numerous MATLAB examples, this book is ideal for graduate students and practitioners working with speech or audio systems.

PatentDOI
TL;DR: In this paper, two-channel input audio signals are processed to construct output audio signals by decomposing the input signal into a plurality of subband audio signals, and the output signal is synthesized from the generated subband signals.
Abstract: Two-channel input audio signals are processed to construct output audio signals by decomposing the two-channel input audio signals into a plurality of two-channel subband audio signals. Separately, in each of a plurality of subbands, at least three generated subband audio signals are generated by steering the two-channel subband audio signals into at least three generated signal locations. The output audio signals are synthesized from the generated subband audio signals. The steering applies differing construction rules in at least two of the plurality of subbands.

Patent
29 Jul 2009
TL;DR: In this article, a transfer function estimate of the electroacoustic channel is established, responsive to the second audio signal and part of the first audio signal, and filters are obtained with transfer functions based on the estimate.
Abstract: An electroacoustic channel soundfield is altered. An audio signal is applied by an electromechanical transducer to an acoustic space, causing air pressure changes therein. Another audio signal is obtained by a second electromechanical transducer, responsive to air pressure changes in the acoustic space. A transfer function estimate of the electroacoustic channel is established, responsive to the second audio signal and part of the first audio signal. The transfer function estimate is derived to be adaptive to temporal variations in the electroacoustic channel transfer function. Filters are obtained with transfer functions based on the transfer function estimate. Part of the first audio signal is filtered therewith.
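One common way to realize the adaptive transfer-function estimate described in this patent is a normalized LMS (NLMS) adaptive FIR filter: the loudspeaker signal is the reference and the microphone signal the desired response. This is a generic NLMS sketch, not the patent's implementation; the 2-tap echo path and all data are synthetic.

```python
import random

def nlms_identify(x, d, n_taps, mu=0.5, eps=1e-8):
    """Estimate an FIR channel from reference x and observation d via NLMS."""
    w = [0.0] * n_taps
    buf = [0.0] * n_taps
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]                       # newest sample first
        y = sum(wi * bi for wi, bi in zip(w, buf))  # filter output
        err = dn - y
        power = sum(b * b for b in buf) + eps       # input power normalization
        w = [wi + mu * err * bi / power for wi, bi in zip(w, buf)]
    return w

random.seed(1)
x = [random.uniform(-1, 1) for _ in range(2000)]    # loudspeaker signal
true_path = [0.5, -0.3]                             # unknown echo path
d = [true_path[0] * x[n] + (true_path[1] * x[n - 1] if n else 0.0)
     for n in range(len(x))]                        # microphone signal
w = nlms_identify(x, d, n_taps=2)
```

Once the estimate w has converged, filtering part of the first audio signal with it (as the abstract describes) predicts the channel's contribution at the microphone, and the step size mu lets the estimate track temporal variations.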

Proceedings ArticleDOI
28 Jun 2009
TL;DR: It is shown that HHMM can handle audio events with recursive patterns to improve the classification performance, and a model fusion method is proposed to cover large variations often existing in healthcare audio events.
Abstract: Audio is a useful modality complement to video for healthcare monitoring. In this paper, we investigate the use of Hierarchical Hidden Markov Models (HHMMs) for healthcare audio event classification. We show that HHMM can handle audio events with recursive patterns to improve the classification performance. We also propose a model fusion method to cover large variations often existing in healthcare audio events. Experimental results from classifying key eldercare audio events show the effectiveness of the model fusion method for healthcare audio event classification.

Patent
02 Sep 2009
TL;DR: In this paper, an audio system to calibrate an audio signal based on a wirelessly received signal, and a signal calibration method are provided, where a transceiver is connected to an external device to enable wireless communication between a main unit and the external device.
Abstract: An audio system to calibrate an audio signal based on a wirelessly received signal, and a signal calibration method are provided. The audio system includes a sound output unit to output a sound corresponding to a received audio signal. A transceiver is connected to an external device to enable wireless communication between a main unit and the external device. The external device converts the sound output from the sound output unit into an electric signal to generate a calibration audio signal. The main unit performs calibration on an audio signal to be played back through the sound output unit using the calibration audio signal.

Journal ArticleDOI
TL;DR: Speech recognition results in adult cochlear implant recipients revealed small but significant improvements with HR 120 for single syllable words and for 2 of 3 sentence recognition measures in noise, and 7 of 8 subjects preferred HR 120 over HR for listening in everyday life.
Abstract: Objective: HiRes (HR) 120 is a sound processing strategy purported to offer an increase in the precision of frequency-to-place mapping through the use of current steering. This within-subject study was designed to compare speech recognition as well as music and sound quality ratings for HR and HR 120 processing.

PatentDOI
TL;DR: In this paper, an audio plug that connects to a mating audio jack in an electronic device is used to establish a wired communications path between the accessory and the electronic device, and a microphone may be included in an accessory to capture sound for an associated electronic device.
Abstract: Electronic devices and accessories such as headsets for electronic devices are provided. A microphone may be included in an accessory to capture sound for an associated electronic device. Buttons and other user interfaces may be included in the accessories. An accessory may have an audio plug that connects to a mating audio jack in an electronic device, thereby establishing a wired communications path between the accessory and the electronic device. Path configuration circuitry may be used to selectively configure the path between the electronic device and accessory to support different operational modes. Analog audio lines in the wired path may convey left and right channel analog audio channels. When it is desired to convey power over the wired path, one of the analog audio channel lines may be converted to a power line. Audio functionality may be retained by simultaneously converting a unidirectional line into a bidirectional line using hybrids.