Showing papers by "Dolby Laboratories" published in 2007


Patent•
30 Mar 2007
TL;DR: Dynamic gain modifications are applied to an audio signal at least partly in response to auditory events and/or the degree of change in signal characteristics associated with auditory event boundaries.
Abstract: In one disclosed aspect, dynamic gain modifications are applied to an audio signal at least partly in response to auditory events and/or the degree of change in signal characteristics associated with said auditory event boundaries. In another aspect, an audio signal is divided into auditory events by comparing the difference in specific loudness between successive time blocks of the audio signal.
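
The second aspect above (dividing a signal into auditory events by comparing specific loudness between successive time blocks) can be illustrated with a minimal sketch. It assumes the per-block specific-loudness vectors have already been computed elsewhere; the function name, the sum-of-absolute-differences measure, and the fixed threshold are illustrative assumptions, not details taken from the patent.

```python
def find_event_boundaries(blocks, threshold):
    """Return (block_index, degree_of_change) for each detected boundary.

    `blocks` is a list of per-block specific-loudness vectors (one value per
    frequency band). How they are computed is outside this sketch.
    """
    boundaries = []
    for i in range(1, len(blocks)):
        previous, current = blocks[i - 1], blocks[i]
        # Assumed change measure: summed absolute difference across bands.
        change = sum(abs(c - p) for p, c in zip(previous, current))
        if change > threshold:
            boundaries.append((i, change))
    return boundaries
```

The returned degree of change could then drive a gain modification, as the first aspect suggests, but that mapping is not specified in the abstract.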

109 citations


Patent•
31 Jul 2007
TL;DR: A data structure defining a high dynamic range image comprises a tone map having a reduced dynamic range and HDR information; the high dynamic range image can be reconstructed from the tone map and the HDR information.
Abstract: A data structure defining a high dynamic range image comprises a tone map having a reduced dynamic range and HDR information. The high dynamic range image can be reconstructed from the tone map and the HDR information. The data structure can be backwards compatible with legacy hardware or software viewers. The data structure may comprise a JFIF file having the tone map encoded as a JPEG image with the HDR information in an application extension or comment field of the JFIF file, or an MPEG file having the tone map encoded as an MPEG image with the HDR information in a video or audio channel of the MPEG file. Apparatus and methods for encoding or decoding the data structure may apply pre- or post-correction to compensate for lossy encoding of the high dynamic range information.

97 citations


Patent•DOI•
TL;DR: An apparatus uses a pair of opposed headphone speakers to create the sensation of a sound source spatially distant from the area between the headphones, combining a first mixing matrix, a filter system with feedback, and a second mixing matrix that produces left and right stereo outputs.
Abstract: An apparatus for creating, utilizing a pair of oppositely opposed headphone speakers, the sensation of a sound source being spatially distant from the area between the pair of headphones, the apparatus comprising: (a) a series of audio inputs representing audio signals being projected from an idealised sound source located at a spatial location relative to the idealised listener; (b) a first mixing matrix means interconnected to the audio inputs and a series of feedback inputs for outputting a predetermined combination of the audio inputs as intermediate output signals; (c) a filter system for filtering the intermediate output signals and outputting filtered intermediate output signals and the series of feedback inputs, the filter system including separate filters for filtering the direct response and short time response and an approximation to the reverberant response, in addition to the feedback response filtering for producing the feedback inputs; and (d) a second matrix mixing means combining the filtered intermediate output signals to produce left and right channel stereo outputs.

76 citations


Patent•
18 May 2007
TL;DR: Spectral separation filters for channels of a 3D image projection incorporate passbands of primary colors; in at least one of the filters, passbands are present in more than 3 primary colors.
Abstract: Spectral separation filters for channels of a 3D image projection incorporate passbands of primary colors. In at least one of the filters, passbands are present in more than 3 primary colors. A set of two filters includes a first filter having passbands of low blue, high blue, low green, high green, and red, and a second filter having passbands of blue, green, and red. The additional primary passbands of the first filter allow for an increased color space in projections through the filters compared to filters only having red, green, and blue primaries. The added flexibility of the increased color space is utilized to more closely match a color space and white point of a projector in which the filters are used.

61 citations


Patent•
14 Mar 2007
TL;DR: In this article, amplitude, fractional-sample delay and phase-correction filters are arranged in cascade with one another and applied to subband signals that represent spectral content of an audio signal in frequency subbands.
Abstract: Transfer functions like Head Related Transfer Functions (HRTF) needed for binaural rendering are implemented efficiently by a subband-domain filter structure. In one implementation, amplitude, fractional-sample delay and phase-correction filters are arranged in cascade with one another and applied to subband signals that represent spectral content of an audio signal in frequency subbands. Other filter structures are also disclosed. These filter structures may be used advantageously in a variety of signal processing applications. A few examples of audio applications include signal bandwidth compression, loudness equalization, room acoustics correction and assisted listening for individuals with hearing impairments.

53 citations


Patent•
03 Dec 2007
TL;DR: During production, at least one audio signal is processed to derive instructions for channel reconfiguring it, and the signal and instructions are stored or transmitted; during consumption, the signal is channel reconfigured in accordance with the instructions, which covers upmixing, downmixing, and spatial reconfiguration.
Abstract: During production, at least one audio signal is processed in order to derive instructions for channel reconfiguring it. The at least one audio signal and the instructions are stored or transmitted. During consumption, the at least one audio signal is channel reconfigured in accordance with the instructions. Channel reconfiguring includes upmixing, downmixing, and spatial reconfiguration. By determining the channel reconfiguration instructions during production, processing resources during consumption are reduced.

53 citations


Patent•
29 Nov 2007
TL;DR: Signatures that identify video and audio content are generated by applying a hash function to intermediate values derived from measures of dissimilarity between features of groups of pixels in video frames and to low-resolution time-frequency representations of audio segments.
Abstract: Signatures that can be used to identify video and audio content are generated from the content by generating measures of dissimilarity between features of corresponding groups of pixels in frames of video content and by generating low-resolution time-frequency representations of audio segments. The signatures are generated by applying a hash function to intermediate values derived from the measures of dissimilarity and to the low-resolution time-frequency representations. The generated signatures may be used in a variety of applications such as restoring synchronization between video and audio content streams and identifying copies of original video and audio content. The generated signatures can provide reliable identifications despite intentional and unintentional modifications to the content.

51 citations


Journal Article•DOI•
TL;DR: An energy-based approach measures the motion-compensated edge artifact using both compressed bitstream information and decoded pixels; experimental results show that it accurately estimates the percentage of added HF energy in compressed video.
Abstract: Little attention has been paid to an impairment common in motion-compensated video compression: the addition of high-frequency (HF) energy as motion compensation displaces blocking artifacts off block boundaries. In this paper, we employ an energy-based approach to measure this motion-compensated edge artifact, using both compressed bitstream information and decoded pixels. We evaluate the performance of our proposed metric, along with several blocking and blurring metrics, on compressed video in two ways. First, ordinal scales are evaluated through a series of expectations that a good quality metric should satisfy: the objective evaluation. Then, the best performing metrics are subjectively evaluated. The same subjective data set is finally used to obtain interval scales to gain more insight. Experimental results show that we accurately estimate the percentage of the added HF energy in compressed video.

44 citations


Patent•
Alan J. Seefeldt1•
17 Dec 2007
TL;DR: In this paper, a loudness-compensating volume control method imposes a desired loudness scaling on an audio signal by processing the audio signal in both the digital and analog domains.
Abstract: A loudness-compensating volume control method imposes a desired loudness scaling on an audio signal by processing the audio signal in both the digital and analog domains by receiving a desired loudness scaling, deriving a wideband gain component and one or more other gain components from the desired loudness scaling, applying in the digital domain modifications to the audio signal based on the one or more other gain components to produce a partly-modified audio signal, and applying in the analog domain modifications to the partly-modified audio signal based on the wideband gain component. Additional loudness modifications other than volume control loudness modifications on the audio signal may also be imposed.

40 citations


Patent•
Jon C. Taenzer1•
19 Dec 2007
TL;DR: In this paper, the authors proposed two approaches for near-field sensing of wave signals in headsets and earsets, where two or more spaced-apart microphones are placed along a line generally between the headset and the user's mouth.
Abstract: Near-field sensing of wave signals, for example for application in headsets and earsets, is accomplished by placing two or more spaced-apart microphones along a line generally between the headset and the user's mouth. The signals produced at the output of the microphones will disagree in amplitude and time delay for the desired signal - the wearer's voice - but will disagree in a different manner for the ambient noises. Utilization of this difference enables recognizing, and subsequently ignoring, the noise portion of the signals and passing a clean voice signal. A first approach involves a complex vector difference equation applied in the frequency domain that creates a noise-reduced result. A second approach creates an attenuation value that is proportional to the complex vector difference, and applies this attenuation value to the original signal in order to effect a reduction of the noise. The two approaches can be applied separately or combined.

38 citations


Patent•
14 Mar 2007
TL;DR: The perceived loudness of each individual channel of a multichannel audio signal may be scaled by changing the gain of that channel, wherein gain is a scaling of a channel's power.
Abstract: Scaling, by a desired amount sm, the overall perceived loudness Lm of a multichannel audio signal, wherein perceived loudness is a nonlinear function of signal power P, by scaling the perceived loudness of each individual channel Lc by an amount substantially equal to the desired amount of scaling of the overall perceived loudness of all channels sm, subject to accuracy in calculations and the desired accuracy of the overall perceived loudness scaling sm. The perceived loudness of each individual channel may be scaled by changing the gain of each individual channel, wherein gain is a scaling of a channel's power. Optionally, in addition, the loudness scaling applied to each channel may be modified so as to reduce the difference between the actual overall loudness scaling and the desired amount of overall loudness scaling.
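
As a rough illustration of the per-channel scaling described above, the sketch below assumes a simple power-law loudness model, L = P ** 0.3. That model, the exponent, and all function names are assumptions made for illustration; the abstract only states that loudness is a nonlinear function of power.

```python
EXPONENT = 0.3  # assumed power-law exponent, illustrative only


def loudness(power):
    """Assumed loudness model: a power law of signal power."""
    return power ** EXPONENT


def power_gain_for_loudness_scale(scale):
    """Power gain that scales loudness by `scale` under the assumed model."""
    return scale ** (1.0 / EXPONENT)


def scale_channels(channel_powers, scale):
    """Apply the same loudness scaling to every channel via its power gain."""
    gain = power_gain_for_loudness_scale(scale)
    return [gain * p for p in channel_powers]
```

Under a pure power law the overall scaling matches the per-channel scaling exactly. With a more detailed loudness model the match may be only approximate, which is presumably where the optional correction mentioned in the abstract would apply.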

Proceedings Article•DOI•
R. Radhakrishnan1, C. Bauer1•
01 Oct 2007
TL;DR: The experimental results show that the proposed video signature is robust to most common signal processing operations on video content such as compression, resolution scaling, brightness scaling, and so on.
Abstract: We propose a novel video signature extraction method based on projections of difference images between consecutive video frames. The difference images are projected onto random basis vectors to create a low dimensional bitstream representation of the active content (moving regions) between two video frames. A sequence of these signatures serves to identify the underlying video content in a robust manner. Our experimental results show that the proposed video signature is robust to most common signal processing operations on video content such as compression, resolution scaling, and brightness scaling.
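
The projection step can be sketched as follows. This is a minimal illustration, not the authors' implementation: frames are flattened lists of grayscale pixel values, the basis is drawn from a seeded Gaussian generator, and each projection is reduced to one bit by its sign. The sign threshold and all names are assumptions.

```python
import random


def make_basis(num_bits, num_pixels, seed=0):
    """Random basis vectors, one per signature bit (seeded for repeatability)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(num_pixels)]
            for _ in range(num_bits)]


def frame_signature(frame_a, frame_b, basis):
    """Project the difference image of two frames onto each basis vector."""
    diff = [b - a for a, b in zip(frame_a, frame_b)]
    projections = [sum(w * d for w, d in zip(vec, diff)) for vec in basis]
    # Assumed quantization: one bit per projection, taken from its sign.
    return [1 if p > 0 else 0 for p in projections]
```

With a sign threshold, multiplying both frames by the same positive factor leaves the bits unchanged, which gives some intuition for the robustness to brightness scaling that the abstract reports.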

Patent•
30 Mar 2007
TL;DR: The loudness of an audio signal represented by the MDCT of a time-sampled real signal is measured and, at least in part in response to the measurement, modified; the loudness measurement employs a smoothing time constant commensurate with the integration time of human loudness perception or slower.
Abstract: Processing an audio signal represented by the Modified Discrete Cosine Transform (MDCT) of a time-sampled real signal is disclosed in which the loudness of the transformed audio signal is measured, and at least in part in response to the measuring, the loudness of the transformed audio signal is modified. When gain modifying more than one frequency band, the variation or variations in gain from frequency band to frequency band is smooth. The loudness measurement employs a smoothing time constant commensurate with the integration time of human loudness perception or slower.


Patent•
08 Nov 2007
TL;DR: In this article, an audio scene is created for an avatar in a virtual environment of multiple avatars, based on an avatar's associations with other linked avatars and a link structure between the avatars.
Abstract: An audio scene is created for an avatar in a virtual environment of multiple avatars. A link structure is created between the avatars. An audio scene is created for each avatar, based on an avatar's associations with other linked avatars.

Patent•
25 Sep 2007
TL;DR: An audio dynamics processor or processing method uses a reset mechanism to adapt quickly to content changes in the audio signal. With an external trigger, the processor state for the current audio source may be saved, and if the system later switches back to that source, the dynamics processor may be reset to the stored state or an approximation thereof.
Abstract: An audio dynamics processor or processing method that uses a reset mechanism or process in order to adapt quickly to content changes in the audio signal. A reset signal may be generated by analyzing the audio signal itself or the reset may be triggered from an external event such as a channel change on a television set or an input selection change on an audio/visual receiver. In the case of an external trigger, one or more indicators of the state of the dynamics processor for a current audio source may be saved and associated with that audio source before switching to a new audio source. Then, if the system switches back to the first audio source, the dynamics processor may be reset to the state previously stored or an approximation thereof.

Patent•
Wenyu Jiang1•
09 Apr 2007
TL;DR: In this article, the level of occupancy of a FIFO buffer in a processing device such as a router or wireless access point is estimated by monitoring packets transmitted by the processing device.
Abstract: The present invention may be used to estimate operational characteristics of devices that transmit and receive streams of information in a communication system. In one application, the level of occupancy of a FIFO buffer in a processing device such as a router or wireless access point is estimated by monitoring packets transmitted by the processing device. Estimates of the operational characteristics can be used to control communications in the system so that the overall performance is improved. Techniques that can be used to mitigate effects of low signal-to-noise ratio conditions are also disclosed.

Patent•
31 Jul 2007
TL;DR: A controller for a display having a screen which incorporates a light modulator is described in this article, where the modulator may be controlled by the controller to adjust the intensity of light emanating from corresponding areas on the screen.
Abstract: A controller for a display having a screen which incorporates a light modulator. The screen may be a front projection screen or a rear projection screen. Elements of the light modulator may be controlled by the controller to adjust the intensity of light emanating from corresponding areas on the screen. The display may provide a high dynamic range.

Patent•
17 Oct 2007
TL;DR: Information useful for modifying the dynamics of an audio signal is derived from one or more devices or processes operating at one or more respective nodes of each of a plurality of hierarchy levels.
Abstract: Information useful for modifying the dynamics of an audio signal is derived from one or more devices or processes operating at one or more respective nodes of each of a plurality of hierarchy levels, each hierarchical level having one or more nodes, in which the one or more devices or processes operating at each hierarchical level takes a measure of one or more characteristics of the audio signal such that the one or more devices or processes operating at each successively lower hierarchical level takes a measure of one or more characteristics of progressively smaller subdivisions of the audio signal.

Journal Article•DOI•
TL;DR: This work derives an efficient rate allocation for hierarchical B-pictures using the power spectral density of a wide-sense stationary process and investigates experimentally the tradeoff between delay and compression efficiency.
Abstract: Real-time video applications require tight bounds on end-to-end delay. Hierarchical bidirectional prediction requires buffering frames in the encoder input buffer, thereby contributing to encoder input delay. Long-term frame prediction with pulsed quality requires buffering at the encoder output, increasing the output buffer delay. Both hierarchical B-pictures and pulsed-quality coders involve uneven bit-rate allocation. Both the encoder and decoder buffering requirements depend on the rate allocation. We derive an efficient rate allocation for hierarchical B-pictures using the power spectral density of a wide-sense stationary process. In addition, we discuss important aspects of hierarchical predictive coding, such as the effect of the temporal prediction distance and delay tradeoffs for prediction branch truncation. Finally, we investigate experimentally the tradeoff between delay and compression efficiency.

Patent•DOI•
Mark Stuart Vinton1•
TL;DR: In this article, a two-channel to three-channel upmixer employs a difference in a measure of sound at the ears of a listener in accordance with first and second models, one based on a reproduction of the original channels and the other based on the upmixed channels.
Abstract: An audio upmixer, such as a two-channel to three-channel upmixer, employs a difference in a measure of sound at the ears of a listener in accordance with first and second models, one based on a reproduction of the original channels and the other based on a reproduction of the upmixed channels. The difference is minimized while simultaneously causing a portion of one or more of the stereophonic channels to be applied to the center loudspeaker under some conditions of the signals in the stereophonic channels, the portion being commensurate with the value of a weighting factor, such that the weighting factor controls a balance between two opposing conditions, one in which no signals are applied to the center loudspeaker and another in which no signals are applied to the left and right loudspeakers.

Patent•DOI•
David S. McGrath1•
TL;DR: In this paper, the authors derived signals that represent a sound field with high-order angular terms by analyzing input audio signals representing the sound field using zero-order and first-order terms to derive statistical characteristics of one or more angular directions of acoustic energy.
Abstract: Audio signals that represent a sound field with increased spatial resolution are obtained by deriving signals that represent the sound field with high-order angular terms. This is accomplished by analyzing input audio signals representing the sound field with zero-order and first-order angular terms to derive statistical characteristics of one or more angular directions of acoustic energy in the sound field. Processed signals are derived from weighted combinations of the input audio signals in which the input audio signals are weighted according to the statistical characteristics. The input audio signals and the processed signals represent the sound field as a function of angular direction with angular terms of one or more orders greater than one.

Patent•
Wenyu Jiang1•
13 Jul 2007
TL;DR: A first encryption process, responsive to a key, encrypts control data that includes selected data from the data frames; a second encryption process, responsive to a key obtained or derived from the control data, encrypts the non-selected data.
Abstract: Processors that encrypt frames of data representing images and sounds, for example, use a first encryption process to encrypt control data that includes selected data from the data frames and use a second encryption process to encrypt non-selected data from the data frames. The first encryption process is responsive to a key, which may be associated with an intended recipient of the data frames. The second encryption process is responsive to a key that is obtained or derived from the control data. The encrypted control data and the encrypted non-selected data may be delivered to a receiver using separate media. The receiver recovers the data frames using decryption processes that are inverse to the first and second encryption processes. Efficient implementations of the second encryption process are disclosed.

Patent•
14 Mar 2007
TL;DR: In this article, a method for reducing phase differences varying with frequency occurring at certain listening positions with respect to loudspeakers reproducing respective ones of multiple sound channels in a listening space, the phase difference occurring in a sequence of frequency bands in which the phase differences alternate between being predominantly in-phase and predominantly out-of-phase, is proposed.
Abstract: A method for reducing phase differences varying with frequency occurring at certain listening positions with respect to loudspeakers reproducing respective ones of multiple sound channels in a listening space, the phase differences occurring in a sequence of frequency bands in which the phase differences alternate between being predominantly in-phase and predominantly out-of-phase, comprises adjusting the phase in multiple frequency bands in which the multiple sound channels are out-of-phase at such listening positions. Such adjustment of phase includes the frequency bands in which the width of comb filtering pass bands and notches resulting from phase differences at such listening positions would be greater than or commensurate with the critical band width if the phase adjustment were not applied. The listening space may be the interior of a vehicle.

Patent•
27 Nov 2007
TL;DR: An expansion function is applied to original image data to generate expanded data with a greater dynamic range, and the original and expanded data are combined according to an expand map indicative of the degree of luminance of regions associated with pixels in the image.
Abstract: A method for increasing dynamic range of original image data representing an image comprises applying an expansion function to generate from the original image data expanded data having a dynamic range greater than that of the original image data and obtaining an expand map comprising data indicative of a degree of luminance of regions associated with pixels in the image. The method then combines the original image data and the expanded data according to the expand map to yield enhanced image data. Apparatus for boosting the dynamic range of image data comprises a dynamic range expander that produces expanded data, a luminance analyzer that produces an expand map and a combiner that combines the original and expanded data according to a variable weighting provided by the expand map.
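
The expander, luminance analyzer, and combiner described above can be sketched on a single row of 8-bit pixels. The specific expansion curve (a gain applied to a gamma curve), the local-mean expand map, and the linear blend are all assumptions chosen for illustration; the abstract does not specify them.

```python
MAX_IN = 255.0  # assumed 8-bit input range


def expand(pixels, gain=4.0, gamma=2.0):
    """Hypothetical expansion function: stretch the range with a gamma curve."""
    return [gain * MAX_IN * (p / MAX_IN) ** gamma for p in pixels]


def expand_map(pixels, radius=1):
    """Weight in [0, 1] per pixel from the mean luminance of its neighborhood."""
    weights = []
    for i in range(len(pixels)):
        region = pixels[max(0, i - radius):min(len(pixels), i + radius + 1)]
        weights.append(sum(region) / (len(region) * MAX_IN))
    return weights


def enhance(pixels):
    """Blend original and expanded data using the expand map as the weighting."""
    expanded = expand(pixels)
    weights = expand_map(pixels)
    return [(1.0 - w) * p + w * e
            for p, e, w in zip(pixels, expanded, weights)]
```

In this sketch, dark regions keep their original values while bright regions move toward the expanded values, so the boost is concentrated where the expand map indicates high luminance.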

Patent•
08 Aug 2007
TL;DR: A method and apparatus for limiting the absolute magnitude of an audio signal to a threshold may apply a first variable-gain reduction in a first stage, followed by a second variable-gain reduction in a second stage that reduces the gain faster than the first stage.
Abstract: A method and apparatus for limiting the absolute magnitude of an audio signal. The method may include firstly variable-gain reducing the gain of an audio signal, and then secondly variable-gain reducing the gain of the audio signal faster than the first variable-gain reduction, thereby limiting the absolute magnitude of the audio signal to a threshold. The first variable-gain reduction may include variable-gain reducing the gain of the audio signal in a first stage, and the second variable-gain reduction may include variable-gain reducing the gain of the audio signal in a second stage that reduces the gain faster than the first stage. The second variable-gain reduction may include delaying the audio signal, finding a peak among the delayed audio signal, calculating a fast gain from a found peak, and modifying the delayed audio signal with the calculated fast gain.
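
The second, faster stage (delay the signal, find a peak among the delayed samples, calculate a fast gain, apply it) can be sketched as a simple look-ahead peak limiter. The slower first stage is omitted, and the window length, the instantaneous gain rule, and the function name are illustrative assumptions rather than details from the patent.

```python
from collections import deque


def fast_limit(samples, threshold=1.0, lookahead=4):
    """Limit |output| to `threshold` using a short delay line."""
    window = deque(maxlen=lookahead + 1)
    out = []
    # Pad with zeros so every input sample eventually leaves the delay line.
    for x in list(samples) + [0.0] * lookahead:
        window.append(x)
        if len(window) <= lookahead:
            continue  # delay line not yet full
        # Peak over the delayed sample and the samples that follow it.
        peak = max(abs(v) for v in window)
        gain = 1.0 if peak <= threshold else threshold / peak
        out.append(window[0] * gain)
    return out
```

Because the peak always includes the delayed sample itself, the output stays within the threshold up to floating-point rounding. A practical limiter would typically smooth the gain over time, which this sketch does not attempt.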

Proceedings Article•DOI•
15 Apr 2007
TL;DR: This work evaluates existing flicker metrics, investigates the causes of flicker, and proposes a new rate-distortion optimal algorithm that suppresses flicker at a negligible cost in spatial image quality.
Abstract: Flickering is a temporal visual artifact that affects compressed video. It is prominent in intra-frame video coders and is largely the result of content variations and quantization. We concentrate on flickering due to quantization. JPEG2000 uses post-compression quantization which is applied through the EBCOT algorithm. EBCOT has been found, however, to cause significant flickering in the reconstructed video. In this work, we evaluate existing flicker metrics, investigate the causes of flicker, and propose a new rate-distortion optimal algorithm that suppresses flicker. The proposed algorithm suppresses temporal flicker at a negligible cost in spatial image quality.

Patent•
31 Jan 2007
TL;DR: A method for determining control values for pixels of the first array of a dual modulator display begins with an initial set of control values and refines them, possibly one at a time.
Abstract: A dual modulator display has a first array of pixels that illuminates a second array of pixels with a pattern of light. The second array of pixels modulates the pattern of light to yield an image. A method for determining control values for pixels of the first array of pixels begins with an initial set of control values and refines the control values. The control values may be refined one at a time. Images may be displayed in real time.

Proceedings Article•DOI•
02 Jul 2007
TL;DR: The experimental results show the robustness and sensitivity of the proposed content-based audio signature extraction method for various signal processing operations on audio content.
Abstract: Content-based signatures are designed to be a robust bit-stream representation of the content so as to enable content identification even though the original content may go through various signal processing operations. In this paper, we propose a novel content-based audio signature extraction method that captures temporal evolution of the audio spectrum. The proposed method, first, divides the input audio into overlapping chunks and computes a spectrogram for each chunk. Then, it projects each of the spectrograms onto random basis vectors to create a signature that is a low-dimensional bit-stream representation of the corresponding spectrogram. Our experimental results show the robustness and sensitivity of the proposed content-based audio signature extraction method for various signal processing operations on audio content.

Book Chapter•DOI•
Mark Franklin Davis1•
01 Jan 2007
TL;DR: In this article, the authors present a survey of devices and systems associated with audio and electroacoustics, focusing on the acquisition, transmission, storage, and reproduction of audio.
Abstract: This chapter surveys devices and systems associated with audio and electroacoustics: the acquisition, transmission, storage, and reproduction of audio. The chapter provides an historical overview of the field from before the days of Edison and Bell to the present day, and analyzes performance of audio transducers, components and systems from basic psychoacoustic principles, to arrive at an assessment of the perceptual performance of such elements and an indication of possible directions for future progress.