Segmenting Audio Signals into Auditory Events

Patent

TLDR

An audio signal is divided into auditory events, each of which tends to be perceived as separate and distinct, by calculating the spectral content of successive time blocks of the audio signal and marking an event boundary where the difference in spectral content between successive blocks exceeds a threshold.

Abstract:

In one aspect, the invention divides an audio signal into auditory events, each of which tends to be perceived as separate and distinct, by calculating the spectral content of successive time blocks of the audio signal, calculating the difference in spectral content between successive time blocks of the audio signal, and identifying an auditory event boundary as the boundary between successive time blocks when the difference in the spectral content between such successive time blocks exceeds a threshold. In another aspect, the invention generates a reduced-information representation of an audio signal by dividing an audio signal into auditory events, each of which tends to be perceived as separate and distinct, and formatting and storing information relating to the auditory events. Optionally, the invention may also assign a characteristic to one or more of the auditory events. Auditory events may be determined according to the first aspect of the invention or by another method.
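The boundary-detection steps named in the abstract (spectral content per block, difference between successive blocks, threshold test) can be sketched roughly as follows. This is only an illustrative sketch, not the patent's implementation: the block size, the Hann window, the unit-sum normalization, the summed absolute difference, the threshold value, and the function names `auditory_event_boundaries` and `event_spans` are all assumptions made for this example. The second helper shows one simple way the resulting events might be recorded as a reduced-information representation (start and end sample of each event).

```python
import numpy as np

def auditory_event_boundaries(signal, block_size=512, threshold=0.5):
    """Return indices of blocks that begin a new auditory event.

    All parameter values are illustrative assumptions, not taken from the patent.
    """
    n_blocks = len(signal) // block_size
    blocks = np.reshape(signal[:n_blocks * block_size], (n_blocks, block_size))

    # Spectral content of each time block (magnitude of a windowed FFT).
    window = np.hanning(block_size)
    spectra = np.abs(np.fft.rfft(blocks * window, axis=1))

    # Assumed normalization: scale each spectrum to unit sum so that a pure
    # level change does not by itself register as a spectral change.
    spectra = spectra / (spectra.sum(axis=1, keepdims=True) + 1e-12)

    # Difference in spectral content between successive blocks.
    diffs = np.abs(np.diff(spectra, axis=0)).sum(axis=1)

    # A boundary lies between block i and block i + 1 when the difference
    # exceeds the threshold; report the index of the later block.
    return [i + 1 for i, d in enumerate(diffs) if d > threshold]

def event_spans(boundaries, n_blocks, block_size=512):
    """Convert boundary block indices into (start_sample, end_sample) pairs."""
    edges = [0] + list(boundaries) + [n_blocks]
    return [(a * block_size, b * block_size) for a, b in zip(edges[:-1], edges[1:])]

if __name__ == "__main__":
    fs = 16000
    t = np.arange(2048) / fs
    # Four blocks of a 440 Hz tone followed by four blocks of a 3000 Hz tone.
    sig = np.concatenate([np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 3000 * t)])
    bounds = auditory_event_boundaries(sig)
    print(bounds, event_spans(bounds, len(sig) // 512))
```

With the two-tone test signal in the `__main__` section, the only large spectral change falls where the tone switches, so the sketch should report a single boundary there. Real material would likely need a tuned threshold and possibly a perceptually motivated spectral representation, neither of which the abstract specifies.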

Citations

Patent

Multichannel audio coding

Mark F. Davis

TL;DR: Describes improved decorrelation of multiple audio channels that are derived either from a monophonic audio channel or from multiple channels of audio, together with related auxiliary information from which multiple channels can be reconstructed.

Patent

Calculating and adjusting the perceived loudness and/or the perceived spectral balance of an audio signal

Alan J. Seefeldt

TL;DR: Describes the measurement and control of the perceived loudness and the perceived spectral balance of an audio signal, which is useful in one or more of: loudness-compensating volume control, automatic gain control, dynamic range control (including, for example, limiters, compressors, and expanders), dynamic equalization, and compensating for background noise interference in an audio playback environment.

PatentDOI

High quality time-scaling and pitch-scaling of audio signals

Brett G. Crockett

- 12 Feb 2002 -

Journal of the Acoustical Society of Ame...

TL;DR: An audio signal is analyzed using multiple psychoacoustic criteria to identify a region of the signal in which time-scaling and/or pitch-shifting processing would be inaudible or minimally audible.

Patent

Quality improvement techniques in an audio encoder

Wei-ge Chen, +2 more

TL;DR: An audio encoder dynamically selects between joint and independent coding of a multi-channel audio signal via an open-loop decision based on the energy separation between the coding channels and the disparity between excitation patterns of the separate input channels.

Patent

Multi-channel audio encoding and decoding

Naveen Thumpudi, +1 more

TL;DR: Describes architectures, techniques, and tools that improve the efficiency of multi-channel audio coding and decoding and that can be used in combination or independently.

References

Journal ArticleDOI

Linear prediction: A tutorial review

John Makhoul

TL;DR: This paper gives an exposition of linear prediction in the analysis of discrete signals, in which the signal is modeled as a linear combination of its past values and the present and past values of a hypothetical input to a system whose output is the given signal.

Journal ArticleDOI

Numerical recipes in C. The art of scientific computing

William H. Press, +3 more

- 01 Jan 1988 -

Mathematics of Computation

Book

Auditory Scene Analysis: The Perceptual Organization of Sound

Albert S. Bregman

TL;DR: Auditory Scene Analysis addresses the problem of hearing complex auditory environments, using a series of creative analogies to describe the process required of the human auditory system as it analyzes mixtures of sounds to recover descriptions of individual sounds.

Journal ArticleDOI

Speech analysis/Synthesis based on a sinusoidal representation

R.J. McAulay, +1 more

- 01 Aug 1986 -

IEEE Transactions on Acoustics, Speech, ...

TL;DR: A sinusoidal model for the speech waveform is used to develop a new analysis/synthesis technique that is characterized by the amplitudes, frequencies, and phases of the component sine waves, which forms the basis for new approaches to the problems of speech transformations including time-scale and pitch-scale modification, and midrate speech coding.

Journal ArticleDOI

Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones

Eric Moulines, +1 more

- 01 Dec 1990 -

Speech Communication

TL;DR: Reviews, in a common framework, several recently proposed algorithms for improving the voice quality of text-to-speech synthesis based on the concatenation of acoustical units, all of which rely on a pitch-synchronous overlap-add approach.

Related Papers (5)

High quality time-scaling and pitch-scaling of audio signals

Method for time aligning audio signals using characterizations based on auditory events

Coding method and apparatus for multiple channels of audio information representing three-dimensional sound fields

Multichannel audio coding

Parametric representation of spatial audio