
Showing papers by "Dolby Laboratories" published in 2003


PatentDOI
TL;DR: N audio output channels are derived from M audio input channels, where one or more of the input channels is associated with a spatial direction other than any spatial direction with which an output channel is associated, and at least one such input channel is mapped to a respective set of at least three output channels.
Abstract: M audio input channels, each associated with a spatial direction, are translated to N audio output channels, each associated with a spatial direction, wherein M and N are positive whole integers, M is three or more, and N is three or more, by deriving the N audio output channels from the M audio input channels, wherein one or more of the M audio input channels is associated with a spatial direction other than a spatial direction with which any of the N audio output channels is associated, and at least one of the one or more of the M audio input channels is mapped to a respective set of at least three of the N output channels. At least three output channels of a set may be associated with contiguous spatial directions.

132 citations


Patent
15 Aug 2003
TL;DR: In this article, an indication of the loudness of an audio signal containing speech and other types of audio material is obtained by classifying segments of audio information as either speech or non-speech.
Abstract: An indication of the loudness of an audio signal containing speech and other types of audio material is obtained by classifying segments of audio information as either speech or non-speech. The loudness of the speech segments is estimated and this estimate is used to derive the indication of loudness. The indication of loudness may be used to control audio signal levels so that variations in loudness of speech between different programs is reduced. A preferred method for classifying speech segments is described.
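As a rough illustration of the idea in this abstract, the sketch below measures level over only those segments flagged as speech and derives a gain toward a target level. The `Segment` type, the pre-computed `is_speech` flag, and the mean-square level measure are all assumptions made for illustration; the patent's own classifier and loudness estimator are not reproduced here.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    samples: Sequence[float]  # audio samples in this segment
    is_speech: bool           # output of a hypothetical speech classifier


def speech_loudness_db(segments: List[Segment]) -> float:
    """Mean-square level in dB, computed over speech segments only."""
    total = 0.0
    count = 0
    for seg in segments:
        if seg.is_speech:
            total += sum(x * x for x in seg.samples)
            count += len(seg.samples)
    if count == 0 or total == 0.0:
        return float("-inf")  # no speech energy found
    return 10.0 * math.log10(total / count)


def gain_to_target_db(segments: List[Segment], target_db: float) -> float:
    """Gain in dB that would bring the speech level to target_db."""
    return target_db - speech_loudness_db(segments)
```

Because non-speech segments are ignored, loud music or effects in a program would not change the gain applied in this sketch.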

108 citations


Patent
09 Jun 2003
TL;DR: A receiver in an audio coding system receives a signal conveying frequency subband signals representing an audio signal, and the subband signals are examined to assess one or more characteristics of the audio signal.
Abstract: A receiver in an audio coding system receives a signal conveying frequency subband signals representing an audio signal. The subband signals are examined to assess one or more characteristics of the audio signal. Spectral components are synthesized having the assessed characteristics. The synthesized spectral components are integrated with the subband signals and passed through a synthesis filterbank to generate an output signal. In one implementation, the assessed characteristic is temporal shape and noise-like spectral components are synthesized having the temporal shape of the audio signal.

58 citations


Patent
09 Jan 2003
TL;DR: An interactive spatialized audiovisual system links a plurality of remote user terminals and comprises a networked computer having an associated user database including user status information.
Abstract: An interactive spatialized audiovisual system links a plurality of remote user terminals. The system comprises a networked computer having an associated user database including user status information. Input means are provided at the computer for receiving a plurality of audio streams and associated locating data from the remote user terminals for virtually locating the users relative to one another within a virtual user environment such as a chat room environment. Selection means are provided for enabling selection of at least a first group of the audio streams in a first selection process based on status information in the user database. Output means output the selected group of audio streams and associated locating data for spatialization of the audio streams relative to a first listener-based audio reference frame which is substantially coherent with visual representations of the audio sources defined by the locating data at the first user terminal. Merging means are provided for merging at least some of the audio streams into a merged audio stream for transmittal to the first and other user terminals, with the merged audio stream being spatialized so as to provide a spatialized background audio effect in the audio reference frame at the user terminal.

54 citations


PatentDOI
TL;DR: In this article, an audio decoder synthesizes spectral components to replace the discarded spectral components and generates spectral components for individual channel signals from the coupled-channel signal, which substantially preserves the spectral energy of the original input signals.
Abstract: An audio encoder discards spectral components of an input signal and uses channel coupling to reduce the information capacity requirements of an encoded signal. Channel coupling represents selected spectral components of multiple channels of signals in a composite form. An audio decoder synthesizes spectral components to replace the discarded spectral components and generates spectral components for individual channel signals from the coupled-channel signal. The encoder provides scale factors in the encoded signal that improve the efficiency of the decoder to generate output signals that substantially preserve the spectral energy of the original input signals.

54 citations


Patent
02 Jan 2003
TL;DR: In this article, a perceptual encoder divides an audio signal into successive time blocks, each time block is divided into frequency bands, and a scale factor is assigned to each of ones of the frequency bands.
Abstract: A perceptual encoder divides an audio signal into successive time blocks, each time block is divided into frequency bands, and a scale factor is assigned to each of ones of the frequency bands. Bits per block increase with scale factor values and band-to-band variations in scale factor values. A preliminary scale factor for each of ones of the frequency bands is determined, and the scale factors for each of ones of the frequency bands are optimized, the optimizing including increasing the scale factor to a value greater than the preliminary scale factor value for one or more of the frequency bands such that the resulting increase in bit cost is the same as or less than the reduction in bit cost resulting from the decrease in band-to-band variations in scale factor values.

34 citations


Patent
Gary A. Demos
27 Jun 2003
TL;DR: In this paper, a method, system, and computer programs for improving the image quality of one or more predicted frames in a video image compression system, where each frame comprises a plurality of pixels, are presented.
Abstract: A method, system, and computer programs for improving the image quality of one or more predicted frames in a video image compression system, where each frame comprises a plurality of pixels (fig. 8, item 802). A picture region or macroblock of certain types of frames can be encoded by reference to one or more referenceable frames in some cases, and by reference to two or more referenceable frames in other cases (fig. 8, item 802’). Such encoding may include interpolation, such as an unequal weighting (fig. 8, item 820). The DC value or AC pixel values of a picture region may be interpolated as well, with or without weighting (fig. 8, item 818). A code pattern of such frames having a variable number of bidirectional predicted frames can be dynamically determined. Frames can be transmitted from an encoder to a decoder in a delivery order different from a display order. Sharpening and/or softening filters can be applied to a picture region of certain frames during motion vector compensated prediction (fig. 8, item 814’).
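The unequal weighting mentioned in this abstract can be pictured with a minimal sketch that blends the same picture region from two reference frames. The function name, the flat list representation of a region, and the convention that the two weights sum to one are assumptions for illustration, not details taken from the patent.

```python
from typing import List


def weighted_prediction(ref_a: List[float], ref_b: List[float],
                        weight_a: float) -> List[float]:
    """Blend two reference regions; ref_b receives weight 1 - weight_a."""
    if len(ref_a) != len(ref_b):
        raise ValueError("reference regions must have the same size")
    weight_b = 1.0 - weight_a
    return [weight_a * a + weight_b * b for a, b in zip(ref_a, ref_b)]


# Equal weighting would use weight_a = 0.5; here the first reference dominates.
region = weighted_prediction([100.0, 200.0], [200.0, 100.0], 0.75)
```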

25 citations


Patent
21 Mar 2003
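TL;DR: An audio signal is conveyed more efficiently by transmitting or recording a baseband of the signal with an estimated spectral envelope and a noise-blending parameter derived from a measure of the signal's noise-like quality; an estimated temporal envelope may also be included to adjust the temporal shape of the reconstructed signal.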
Abstract: An audio signal is conveyed more efficiently by transmitting or recording a baseband of the signal with an estimated spectral envelope and a noise-blending parameter derived from a measure of the signal’s noise-like quality. The signal is reconstructed by translating spectral components of the baseband signal to frequencies outside the baseband, adjusting phase of the regenerated components to maintain phase coherency, adjusting spectral shape according to the estimated spectral envelope, and adding noise according to the noise-blending parameter. Preferably, the transmitted or recorded signal also includes an estimated temporal envelope that is used to adjust the temporal shape of the reconstructed signal.

25 citations


Patent
08 Dec 2003
TL;DR: In this article, a coding and decoding apparatus is constructed so that the coding side transmits coded data together with identifying information for identifying the device of decoding the coded data, and the decoding side is capable of storing a number of decoding schemes so as to perform decoding based on one of the previously stored schemes.
Abstract: A coding and decoding apparatus is constructed so that the coding side transmits coded data together with identifying information for identifying the device of decoding the coded data, and the decoding side is capable of storing a number of decoding schemes so as to perform decoding based on one of the previously stored schemes. The apparatus further has devices for storing the received tools and tool-correspondent information which numerically represents the capacities of the tools so that it can make a comparison between the decoding capacity and the processing capacities of the tools to determine the possibility of the operations of the received tools. Further, a set of the tools are hierarchized so that the coded data produced by the n-ranked tool can be decoded by the (n+1)-ranked tool. Alternatively, the tools are defined in a hierarchical manner so that the decoding tools installed in the decoding apparatus will be able to assure the minimum quality and the requested decoding process can be performed by the received decoding tool. Further, the identification code of the decoding scheme used can be transmitted as required so that the decoding scheme can be expanded by transmitting the differential information from the basic decoding scheme.

21 citations


Patent
08 Dec 2003
TL;DR: In this article, a method of determining filter coefficients for filter stages in a multirate digital filter device to achieve a desired filter response is presented, including the step of determining a first plurality of evenly spaced sample points representing the desired response function on a logarithmic time scale, such that the sample points of the first plurality have an increasing spacing when viewed in a linear time scale.
Abstract: A method of determining filter coefficients for filter stages in a multirate digital filter device to achieve a desired filter response. The method includes the step of determining a first plurality of evenly spaced sample points representing the desired response function on a logarithmic time scale, such that the sample points of the first plurality have an increasing spacing when viewed in a linear time scale. The method further includes the step of determining filter coefficients for each filter stage from an associated group of sample points out of the first plurality of sample points.
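The sampling step described in this abstract can be sketched as follows: points evenly spaced in log time come out with growing gaps in linear time, and contiguous groups of them can then be associated with filter stages. The function names, the start and end times, and the even split of points across stages are assumptions for illustration; how the patent derives coefficients from each group is not shown.

```python
import math
from typing import List


def log_spaced_times(t_start: float, t_end: float, count: int) -> List[float]:
    """Sample times evenly spaced in log(t) between t_start and t_end."""
    if count < 2 or t_start <= 0.0 or t_end <= t_start:
        raise ValueError("need count >= 2 and 0 < t_start < t_end")
    step = (math.log(t_end) - math.log(t_start)) / (count - 1)
    return [math.exp(math.log(t_start) + k * step) for k in range(count)]


def group_for_stages(points: List[float], stages: int) -> List[List[float]]:
    """Split points into contiguous groups, one per hypothetical filter stage."""
    base, extra = divmod(len(points), stages)
    groups, start = [], 0
    for s in range(stages):
        size = base + (1 if s < extra else 0)
        groups.append(points[start:start + size])
        start += size
    return groups


times = log_spaced_times(1.0, 8.0, 4)           # roughly 1, 2, 4, 8
gaps = [b - a for a, b in zip(times, times[1:])]  # gaps grow in linear time
```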

18 citations


Patent
30 May 2003
TL;DR: In this paper, a method for generating audio information is proposed, where a set of subband signals each having one or more spectral components representing spectral content of an audio signal is extracted from an input signal and a modified subband signal is generated by substituting the synthesized spectral components for corresponding zero-valued spectral components.
Abstract: A method for generating audio information comprises: receiving an input signal and obtaining therefrom a set of subband signals each having one or more spectral components representing spectral content of an audio signal; identifying within the set of subband signals a particular subband signal in which one or more spectral components have a zero value and are quantized by a quantizer having a minimum quantizing level; generating one or more synthesized spectral components that correspond to the one or more zero-valued spectral components in the particular subband signal and that are scaled according to a scaling envelope based upon the minimum quantizing level; generating a modified set of subband signals by substituting the synthesized spectral components for corresponding zero-valued spectral components in the particular subband signal; and generating the audio information by applying a synthesis filterbank to the modified set of subband signals.
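A minimal sketch of the substitution step in this abstract: zero-valued spectral components in a subband are replaced with synthesized values whose magnitude is bounded by an envelope tied to the quantizer's minimum level. The flat envelope, the `scale` factor of 0.5, and the use of uniform noise are assumptions for illustration; the patent's actual scaling envelope may differ.

```python
import random
from typing import List, Optional


def fill_zero_components(components: List[float],
                         min_quant_level: float,
                         scale: float = 0.5,
                         rng: Optional[random.Random] = None) -> List[float]:
    """Replace zero-valued components with noise bounded by scale * min_quant_level."""
    if rng is None:
        rng = random.Random()
    bound = scale * min_quant_level  # assumed flat scaling envelope
    return [c if c != 0.0 else rng.uniform(-bound, bound) for c in components]


# Non-zero components pass through unchanged; zeros receive low-level noise.
modified = fill_zero_components([0.0, 3.0, 0.0, -2.0], min_quant_level=1.0,
                                rng=random.Random(0))
```

The modified subband signals would then go to the synthesis filterbank, which is outside the scope of this sketch.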

Patent
08 Jul 2003
TL;DR: An expanding quantizer is used to control the number of signal components that are quantized to zero, and arithmetic coding is used to efficiently code the quantized-to-zero coefficients.
Abstract: The perceived quality of audio signals obtained from very low bit-rate audio coding systems is improved by using expanding quantizers and arithmetic coding in a transmitter and using complementary compression and arithmetic decoding in a receiver. An expanding quantizer is used to control the number of signal components that are quantized to zero and arithmetic coding is used to efficiently code the quantized-to-zero coefficients. This allows a wider bandwidth and more accurately quantized baseband signal to be conveyed to the receiver, which regenerates an output signal by synthesizing the missing components.

Patent
Gary A. Demos
27 Jun 2003
TL;DR: In this paper, a method and system for improving the image quality of one or more predicted frames in a video image compression system, where each frame comprises a plurality of pixels, is presented.
Abstract: A method and system for improving the image quality of one or more predicted frames in a video image compression system, where each frame comprises a plurality of pixels. A picture region or macroblock of certain types of frames can be encoded by reference to two or more referenceable frames. Such encoding includes determining at least one macroblock within a bidirectional predicted frame (B) using direct mode prediction based on motion vectors from two or more predicted frames (P), wherein at least one such motion vector is scaled by a frame scale fraction of less than zero or greater than one.
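The role of a frame scale fraction outside the range zero to one can be sketched by scaling a motion vector according to where the B frame sits relative to the two frames the vector spans. Defining the fraction from display times, and the function names below, are assumptions for illustration rather than the patent's own formulation.

```python
from typing import Tuple


def frame_scale_fraction(t_ref: float, t_pred: float, t_b: float) -> float:
    """Position of the B frame at t_b relative to the span t_ref -> t_pred."""
    return (t_b - t_ref) / (t_pred - t_ref)


def scale_motion_vector(mv: Tuple[float, float],
                        fraction: float) -> Tuple[float, float]:
    """Scale a motion vector by the frame scale fraction."""
    return (mv[0] * fraction, mv[1] * fraction)


# The vector (6, -3) spans frames at display times 0 and 3.
inside = frame_scale_fraction(0.0, 3.0, 1.0)   # between 0 and 1: interpolation
beyond = frame_scale_fraction(0.0, 3.0, 4.0)   # greater than 1: extrapolation
before = frame_scale_fraction(0.0, 3.0, -1.0)  # less than 0: extrapolation backwards
mv_beyond = scale_motion_vector((6.0, -3.0), beyond)
```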

Proceedings ArticleDOI
C. Bauer
28 Sep 2003
TL;DR: It is shown that modified maximal matching algorithms guarantee stability of the switch and establish bounds on the average delay experienced by a packet when deployed in switches with a speedup of less than two.
Abstract: Input queued switches and the solution of input/output contention by scheduling algorithms have been widely investigated. Most research has focused on fixed-size packets. This contrasts with the variable packet sizes of IP networks. A switch has a speedup of S, S ≥ 1, if it works at a speed S times faster than the speed of the input links. The less complex and, with regard to a chosen metric, less optimal maximal matching algorithms have been widely researched. In this paper, we investigate how the class of maximal matching algorithms deployed in switches with a speedup of less than two can be modified to take into account the varying packet sizes. Using a novel model for the dynamics of maximal matching algorithms, we show that modified maximal matching algorithms guarantee stability of the switch and establish bounds on the average delay experienced by a packet.

Proceedings ArticleDOI
19 Nov 2003
TL;DR: NIMA’s Persistent Surveillance Office anticipates accelerating fielding of "264on2", a migration path to inject improved compression technologies, which will yield improved image quality and/or reduced bandwidths.
Abstract: The Department of Defense (DoD) Motion Imagery Standards Board (MISB) has adopted ITU-T Rec. H.264 (Baseline, Main, and Extended Profiles) to be the standard for applications constrained by low bandwidth channels (typically less than 1 Mb/s that may not be adequately supported by MPEG-2). H.264 will be carried over the MPEG-2 transport streams using ISO/IEC 13818-1:2000/FPDAM 3: "Information technology --Generic coding of moving pictures and associated audio: Systems, AMENDMENT 3: Transport of ISO/IEC 14496 part 10 [ITU-T H.264] video data over ISO/IEC 13818-1" (DRAFT). This is a part of a new principle called "Xon2" to support the "seamless" rollout of advanced video compression technologies without disrupting current and future operations and systems. "X" defines existing or future video compression technologies and "on2" refers to the use of MPEG-2 transport streams and files. The MISB predecessor VWG standardized on MPEG-2 (H.262) in 1996. "2on2" payloads have been successfully deployed using standards-compliant MPEG-2 compressed video elementary streams, audio elementary streams, and SMPTE standard 336M KLV encoded metadata as MPEG-2 private data streams in support of unmanned aerial vehicle (UAV) operations. Building on this baseline "2on2" capability, "Xon2" will provide a migration path to inject improved compression technologies, which will yield improved image quality and/or reduced bandwidths. NIMA’s Persistent Surveillance Office anticipates accelerating fielding of "264on2" using advanced video compression standard H.264 as described in this paper.

Patent
16 Dec 2003
TL;DR: In this paper, a perceptual encoder divides an audio signal into successive time blocks, each time block is divided into frequency bands, and a scale factor is assigned to each of ones of the frequency bands.
Abstract: A perceptual encoder divides an audio signal into successive time blocks, each time block is divided into frequency bands, and a scale factor is assigned to each of ones of the frequency bands. Bits per block increase with scale factor values and band-to-band variations in scale factor values. A preliminary scale factor for each of ones of the frequency bands is determined, and the scale factors for each of ones of the frequency bands are optimized, the optimizing including increasing the scale factor to a value greater than the preliminary scale factor value for one or more of the frequency bands such that the resulting increase in bit cost is the same as or less than the reduction in bit cost resulting from the decrease in band-to-band variations in scale factor values.

Patent
13 Aug 2003
TL;DR: An audio signal processor forms gaps or guard bands in sequences of blocks conveying encoded audio information and time-aligns the guard bands with video information, allowing for variations in processing or circuit delays so that the routing or switching of different streams of video information with embedded audio information does not result in a loss of any encoded audio blocks.
Abstract: An audio signal processor forms gaps or guard bands in sequences of blocks conveying encoded audio information and time aligns the guard bands with video information. The guard bands are formed to allow for variations in processing or circuit delays so that the routing or switching of different streams of video information with embedded audio information does not result in a loss of any encoded audio blocks.

Patent
Gary A. Demos
27 Jun 2003
TL;DR: In this paper, a method and system for improving the image quality of one or more predicted frames in a video image compression system, where each frame comprises a plurality of pixels, is presented.
Abstract: A method and system for improving the image quality of one or more predicted frames in a video image compression system, where each frame comprises a plurality of pixels. A picture region or macroblock of certain types of frames can be encoded by reference to two or more referenceable frames. Such encoding includes determining at least one macroblock within a bidirectional predicted frame (B) using direct mode prediction based on motion vectors from two or more predicted frames (P), wherein at least one such motion vector is scaled by a frame scale fraction of less than zero or greater than one.


Proceedings ArticleDOI
19 Oct 2003
TL;DR: In this paper, the authors analyzed how DC offsets/contaminants typically found in a signal acquisition system can affect head-related transfer functions (HRTFs) which are measured with the Golay code method of acoustic system identification.
Abstract: The paper analyzes how DC offsets/contaminants typically found in a signal acquisition system can affect head-related transfer functions (HRTFs) which are measured with the Golay code method of acoustic system identification. Loosely speaking, HRTFs are filters which describe the acoustic filtering that the head, torso, and external ear perform on a sound. HRTFs measured from a human or mannequin can be used to simulate spatial sound over headphones and speakers. We develop a model for the Golay-based HRTF system identification process and use typical values of system parameters estimated from an actual HRTF measurement session to show that effects due to DC contaminants are typically very small. From the same analysis, we also prove the surprising theoretical result that there exist pairs of Golay codes which can completely eliminate DC contaminants from the HRTF system identification process, no matter how large these contaminants are. Finally, we exhibit a pair of these "DC blocking Golay codes", and demonstrate how they can be used to construct other, arbitrarily long, "DC blocking Golay codes".
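For readers unfamiliar with Golay codes, the sketch below builds a complementary pair with the standard concatenation construction and checks the defining property: the two aperiodic autocorrelations sum to zero at every non-zero lag. This is the textbook construction, used here only as an assumption-laden illustration; it is not claimed to produce the particular "DC blocking" pair exhibited in the paper, nor to preserve that property.

```python
from typing import List, Tuple


def golay_pair(doublings: int) -> Tuple[List[int], List[int]]:
    """Build a complementary pair of length 2**doublings by concatenation."""
    a, b = [1], [1]
    for _ in range(doublings):
        # Standard step: (a, b) -> (a followed by b, a followed by -b)
        a, b = a + b, a + [-x for x in b]
    return a, b


def autocorrelation(seq: List[int]) -> List[int]:
    """Aperiodic autocorrelation for lags 0 .. len(seq) - 1."""
    n = len(seq)
    return [sum(seq[i] * seq[i + lag] for i in range(n - lag))
            for lag in range(n)]


def summed_autocorrelation(a: List[int], b: List[int]) -> List[int]:
    """Sum of the two autocorrelations; non-zero only at lag 0 for a Golay pair."""
    return [x + y for x, y in zip(autocorrelation(a), autocorrelation(b))]


a, b = golay_pair(3)  # two codes of length 8
total = summed_autocorrelation(a, b)
```

The single peak at lag zero is what lets a pair of codes recover a system's impulse response from the two measured responses.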

Patent
09 Jun 2003
TL;DR: A method for processing encoded audio information comprises receiving the encoded audio information and obtaining therefrom subband signals representing spectral content of an audio signal, examining some but not all of the subband signals to obtain an indication of temporal shape of the audio signal, and generating synthesized spectral components using a process that is adapted in response to the indication of temporal shape.
Abstract: A method for processing encoded audio information comprises: receiving the encoded audio information and obtaining therefrom subband signals representing spectral content of an audio signal; examining some but not all of the subband signals to obtain an indication of temporal shape of the audio signal; generating synthesized spectral components using a process that is adapted in response to the indication of temporal shape; combining respective synthesized spectral components and subband signal spectral components representing corresponding frequencies to generate a set of modified subband signals; and generating the audio information by applying a synthesis filterbank to the set of modified subband signals.