
Showing papers from "Dolby Laboratories" published in 2012


Patent
27 Jun 2012
TL;DR: In this article, the authors describe an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams, which are associated with metadata that specifies whether the stream is a channel-based or object-based stream.
Abstract: Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name, while object-based streams have location information encoded through location expressions in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) so as to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.

231 citations
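The channel/object split described in this abstract is easy to picture as a data structure. Below is a minimal sketch: the `MonoStream` fields, the speaker list, and the nearest-speaker fallback are all hypothetical stand-ins for the patent's metadata format and renderer, not its actual design.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class MonoStream:
    """One independent monophonic stream plus its descriptive metadata."""
    samples: list                       # audio samples (placeholder)
    kind: str                           # "channel" or "object"
    channel_name: Optional[str] = None  # e.g. "L", "R", "C" for channel-based streams
    position: Optional[Tuple[float, float, float]] = None  # allocentric (x, y, z) in [0, 1]

def render_target(stream, room_speakers):
    """Pick speaker feed(s) for a stream: by name for channel-based streams,
    by nearest available speaker position for objects (toy policy)."""
    if stream.kind == "channel":
        return [stream.channel_name]
    # nearest-speaker snap as a crude stand-in for a real panner
    def dist(spk):
        return sum((a - b) ** 2 for a, b in zip(spk[1], stream.position))
    return [min(room_speakers, key=dist)[0]]

speakers = [("L", (0.0, 0.0, 0.0)), ("R", (1.0, 0.0, 0.0)), ("Ts", (0.5, 0.5, 1.0))]
bed = MonoStream(samples=[], kind="channel", channel_name="L")
obj = MonoStream(samples=[], kind="object", position=(0.9, 0.1, 0.0))
print(render_target(bed, speakers))  # ['L']
print(render_target(obj, speakers))  # ['R']
```

The point of the allocentric metadata is visible here: the object carries a room-relative position, and the speaker actually chosen depends on the playback environment's speaker list, not on a fixed channel assignment.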


Patent
06 Nov 2012
TL;DR: In this article, the authors present a method for applying corrective filters directly in a portable media device to correct, e.g., equalize for, the overall system comprising the portable media device and the playback system to which it is attached.
Abstract: A method, an apparatus, a system, and instructions stored in a non-transitory computer-readable medium to instruct a processing system to carry out the method are described. The method includes applying corrective filters directly in a portable media device to correct, e.g., equalize for the overall system comprising the portable media device and the playback system to which it is attached. Also described is a method of determining the corrective filters by playing back one or more calibration signals on the playback system while recording the resulting sound field on the portable media device.

132 citations


Patent
27 Jun 2012
TL;DR: Improved tools for authoring and rendering audio reproduction data are provided in this article, which allow audio reproduction data to be generalized for a wide variety of reproduction environments by creating metadata for audio objects with reference to speaker zones.
Abstract: Improved tools for authoring and rendering audio reproduction data are provided. Some such authoring tools allow audio reproduction data to be generalized for a wide variety of reproduction environments. Audio reproduction data may be authored by creating metadata for audio objects. The metadata may be created with reference to speaker zones. During the rendering process, the audio reproduction data may be reproduced according to the reproduction speaker layout of a particular reproduction environment.

116 citations


Patent
13 Aug 2012
TL;DR: In this paper, the authors present a controller for the shutter and readout circuitry of an electronic camera. The controller comprises a processor and a memory having computer-readable code embodied therein which, when executed by the processor, causes the controller to open the shutter for an image capture period to allow two or more image sensor arrays to capture pixel data.
Abstract: An electronic camera comprises two or more image sensor arrays. At least one of the image sensor arrays has a high dynamic range. The camera also comprises a shutter for selectively allowing light to reach the two or more image sensor arrays, readout circuitry for selectively reading out pixel data from the image sensor arrays, and, a controller configured to control the shutter and the readout circuitry. The controller comprises a processor and a memory having computer-readable code embodied therein which, when executed by the processor, causes the controller to open the shutter for an image capture period to allow the two or more image sensor arrays to capture pixel data, and, read out pixel data from the two or more image sensor arrays.

104 citations


Patent
15 Mar 2012
TL;DR: In this article, a sigmoidal transfer function is used to control mid-tone contrast in the image data for display on a target display in a way that substantially preserves the creative intent embodied in the original image data.
Abstract: Image data is transformed for display on a target display. A sigmoidal transfer function provides a free parameter controlling mid-tone contrast. The transfer function may be dynamically adjusted to accommodate changing ambient lighting conditions. The transformation may be selected so as to automatically adapt image data for display on a target display in a way that substantially preserves creative intent embodied in the image data. The image data may be video data.

99 citations
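A sigmoidal transfer function with a free parameter controlling mid-tone contrast can be sketched in a few lines. The normalized-logistic form and the parameter names below are illustrative assumptions, not the patent's actual curve:

```python
import numpy as np

def sigmoid_tone_curve(x, contrast=2.0, mid=0.5):
    """Map normalized luminance x in [0, 1] through a sigmoidal curve.
    'contrast' is the free parameter controlling mid-tone slope;
    'mid' is the input level mapped to 0.5 output (when mid=0.5)."""
    s = lambda v: 1.0 / (1.0 + np.exp(-contrast * (v - mid)))
    # rescale the logistic so that 0 -> 0 and 1 -> 1 exactly
    lo, hi = s(0.0), s(1.0)
    return (s(x) - lo) / (hi - lo)
```

Raising `contrast` steepens the curve around `mid` (more mid-tone contrast, more compression of shadows and highlights), which is the kind of single-knob adjustment the abstract describes for adapting to ambient lighting.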


Journal ArticleDOI
01 Nov 2012
TL;DR: This work reduces computational complexity with respect to the state of the art, and adds a spatially varying model of lightness perception to scene reproduction.
Abstract: Managing the appearance of images across different display environments is a difficult problem, exacerbated by the proliferation of high dynamic range imaging technologies. Tone reproduction is often limited to luminance adjustment and is rarely calibrated against psychophysical data, while color appearance modeling addresses color reproduction in a calibrated manner, albeit over a limited luminance range. Only a few image appearance models bridge the gap, borrowing ideas from both areas. Our take on scene reproduction reduces computational complexity with respect to the state-of-the-art, and adds a spatially varying model of lightness perception. The predictive capabilities of the model are validated against all psychophysical data known to us, and visual comparisons show accurate and robust reproduction for challenging high dynamic range scenes.

75 citations


Patent
06 Dec 2012
TL;DR: In this article, a handheld imaging device has a data receiver that is configured to receive reference encoded image data, which includes reference code values, which are encoded by an external coding system, and the device-specific code values are configured to produce gray levels that are specific to the imaging device.
Abstract: A handheld imaging device has a data receiver that is configured to receive reference encoded image data. The data includes reference code values, which are encoded by an external coding system. The reference code values represent reference gray levels, which are being selected using a reference grayscale display function that is based on perceptual non-linearity of human vision adapted at different light levels to spatial frequencies. The imaging device also has a data converter that is configured to access a code mapping between the reference code values and device-specific code values of the imaging device. The device-specific code values are configured to produce gray levels that are specific to the imaging device. Based on the code mapping, the data converter is configured to transcode the reference encoded image data into device-specific image data, which is encoded with the device-specific code values.

73 citations


Patent
27 Jun 2012
TL;DR: In this paper, a method for monitoring speakers within an audio playback system (e.g., movie theater) environment is presented, which assumes that initial characteristics of the speakers have been determined at an initial time, and relies on one or more microphones positioned in the environment to perform a status check on each speaker to identify whether a change to at least one characteristic of any speaker has occurred since the initial time.
Abstract: In some embodiments, a method for monitoring speakers within an audio playback system (e.g., movie theater) environment. In typical embodiments, the monitoring method assumes that initial characteristics of the speakers (e.g., a room response for each of the speakers) have been determined at an initial time, and relies on one or more microphones positioned in the environment to perform a status check on each of the speakers to identify whether a change to at least one characteristic of any of the speakers has occurred since the initial time. In other embodiments, the method processes data indicative of output of a microphone to monitor audience reaction to an audiovisual program. Other aspects include a system configured (e.g., programmed) to perform any embodiment of the inventive method, and a computer readable medium (e.g., a disc) which stores code for implementing any embodiment of the inventive method.

57 citations


Patent
Guan-Ming Su1, Sheng Qu1, Hubert Koepfer1, Yufei Yuan1, Samir N. Hulyalkar1 
13 Apr 2012
TL;DR: In this paper, multi-channel multiple regression (MMR) models are applied to the efficient coding of images and video signals of high dynamic range, and closed form solutions for the prediction parameters are presented for a variety of MMR models.
Abstract: Inter-color image prediction is based on multi-channel multiple regression (MMR) models. Image prediction is applied to the efficient coding of images and video signals of high dynamic range. MMR models may include first order parameters, second order parameters, and cross-pixel parameters. MMR models using extension parameters incorporating neighbor pixel relations are also presented. Using minimum mean-square error criteria, closed form solutions for the prediction parameters are presented for a variety of MMR models.

57 citations
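The closed-form least-squares idea behind MMR prediction can be illustrated with a first-order variant, where each target color channel is predicted from all three source channels plus an offset. This is a simplification for illustration (real MMR models add second-order and cross-pixel terms):

```python
import numpy as np

def fit_mmr_first_order(src_rgb, dst_rgb):
    """First-order multi-channel regression: each target channel is a linear
    combination of [1, R, G, B] of the source. The minimum mean-square-error
    solution is obtained in closed form via least squares."""
    n = src_rgb.shape[0]
    A = np.hstack([np.ones((n, 1)), src_rgb])        # design matrix, n x 4
    M, *_ = np.linalg.lstsq(A, dst_rgb, rcond=None)  # 4 x 3 parameter matrix
    return M

def apply_mmr(M, src_rgb):
    """Predict target pixels from source pixels with fitted parameters M."""
    n = src_rgb.shape[0]
    A = np.hstack([np.ones((n, 1)), src_rgb])
    return A @ M
```

Higher-order MMR variants simply append more columns to the design matrix (squares, cross-channel products, neighbor pixels); the closed-form solve is unchanged.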


Journal ArticleDOI
TL;DR: A comprehensive overview of the major techniques for generating spatialized sound is provided, along with perceptual and cross-modal influences to consider and an in-depth look at the emerging topics in the field.
Abstract: In recent years, research in the three-dimensional sound generation field has been primarily focussed upon new applications of spatialized sound. In the computer graphics community, the use of such techniques is most commonly found being applied to virtual, immersive environments. However, the field is more varied and diverse than this, and other research tackles the problem in a more complete, and computationally expensive, manner. Furthermore, the simulation of light and sound wave propagation is still unachievable at a physically accurate spatio-temporal quality in real time. Although the Human Visual System (HVS) and the Human Auditory System (HAS) are exceptionally sophisticated, they also contain certain perceptual and attentional limitations. Researchers, in fields such as psychology, have been investigating these limitations for several years and have come up with findings which may be exploited in other fields. This paper provides a comprehensive overview of the major techniques for generating spatialized sound and, in addition, discusses perceptual and cross-modal influences to consider. We also describe current limitations and provide an in-depth look at the emerging topics in the field. © 2012 Wiley Periodicals, Inc.

56 citations


Patent
Zhen Li1, Walter Gish1
12 Apr 2012
TL;DR: In this paper, a first video signal is accessed and represented in a first color space with a first color gamut, related to a first dynamic range, and a second video signal is accessed and represented in a second color space with a second color gamut, related to a second dynamic range.
Abstract: A first video signal is accessed, and represented in a first color space with a first color gamut, related to a first dynamic range. A second video signal is accessed, and represented in a second color space of a second color gamut, related to a second dynamic range. The first accessed video signal is converted to a video signal represented in the second color space. At least two color-related components of the converted video signal are mapped over the second dynamic range. The mapped video signal and the second accessed video signal are processed. Based on the processing, a difference is measured between the processed first and second video signals. A visual quality characteristic relates to a magnitude of the measured difference between the processed first and second video signals. The visual quality characteristic is assessed based, at least in part, on the measurement of the difference.

Patent
01 Nov 2012
TL;DR: In this paper, a base layer and one or more enhancement layers may be used to carry video signals, wherein the base layer cannot be decoded and viewed on its own, and the image data in the enhancement layer video signals may comprise residual values, quantization parameters, and mapping parameters based in part on a prediction method corresponding to a specific method used in the advanced quantization.
Abstract: Techniques use multiple lower bit depth codecs to provide higher bit depth, high dynamic range, images from an upstream device to a downstream device. A base layer and one or more enhancement layers may be used to carry video signals, wherein the base layer cannot be decoded and viewed on its own. Lower bit depth input image data to base layer processing may be generated from higher bit depth high dynamic range input image data via advanced quantization to minimize the volume of image data to be carried by enhancement layer video signals. The image data in the enhancement layer video signals may comprise residual values, quantization parameters, and mapping parameters based in part on a prediction method corresponding to a specific method used in the advanced quantization. Adaptive dynamic range adaptation techniques take into consideration special transition effects, such as fade-in and fade-outs, for improved coding performance.

Patent
15 Oct 2012
TL;DR: In this article, the authors propose a video equalization method performed with a common anchor point (e.g., the 20% gray level, or the log mean of luminance) shared by the input video and the equalized video.
Abstract: Video equalization includes performing equalization such that a sequence of images has dynamic range (and optionally other characteristics) that is constant to a predetermined degree, where the input video includes high and standard dynamic range videos and images from both. Equalization is performed with a common anchor point (e.g., 20% gray level, or log mean of luminance) shared by the input video and the equalized video, and such that the images determined by the equalized video have at least substantially the same average luminance as images determined by the input video. Other aspects are systems (e.g., display systems and video delivery systems) configured to perform embodiments of the equalization method.
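The log-mean anchor idea can be sketched in a few lines: rescale each frame's luminance so its geometric (log) mean lands on a value common to the whole sequence. The 0.18 anchor value and the per-frame global scaling are illustrative assumptions, not the patent's exact procedure:

```python
import numpy as np

def equalize_to_anchor(lum, anchor=0.18, eps=1e-6):
    """Scale a frame's luminance so that its log-mean (geometric mean)
    lands on a common anchor value shared across the sequence."""
    log_mean = np.exp(np.mean(np.log(lum + eps)))
    return lum * (anchor / log_mean)
```

Anchoring on the log mean rather than the arithmetic mean makes the normalization robust to a few very bright highlight pixels, which matters when the input mixes high and standard dynamic range material.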

Patent
08 Mar 2012
TL;DR: In this paper, the authors describe methods for scalable video coding, which can be used to deliver video contents in low dynamic range (LDR) and/or one color format and then convert the video contents to High Dynamic Range (HDR) and/or a different color format, respectively, at the block or macroblock level.
Abstract: Methods for scalable video coding are described. Such methods can be used to deliver video contents in Low Dynamic Range (LDR) and/or one color format and then convert the video contents to High Dynamic Range (HDR) and/or a different color format, respectively, at the block or macroblock level.

Patent
01 Mar 2012
TL;DR: In this article, at least two gray scale ratio images, each of a different spatial resolution level, are merged into a local multiscale gray scale ratio image, and an output tone-mapped image is generated based on the high-resolution gray scale ratio image and the local multiscale gray scale ratio image.
Abstract: In a method to generate a tone-mapped image from a high-dynamic range image (HDR), an input HDR image is converted into a logarithmic domain and a global tone-mapping operator generates a high-resolution gray scale ratio image from the input HDR image. Based at least in part on the high-resolution gray scale ratio image, at least two different gray scale ratio images are generated and are merged together to generate a local multiscale gray scale ratio image that represents a weighted combination of the at least two different gray scale ratio images, each being of a different spatial resolution level. An output tone-mapped image is generated based on the high-resolution gray scale image and the local multiscale gray scale ratio image.
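A toy version of this pipeline (global log-domain operator, two ratio images at different scales, weighted merge, apply) might look like the sketch below. The specific global operator, the box blur standing in for a proper multiscale decomposition, and the merge weight are all invented for illustration:

```python
import numpy as np

def box_blur(img, k=5):
    """Simple k x k box blur (a crude stand-in for a lower-resolution level)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def tone_map_multiscale(hdr, alpha=0.5):
    """Toy multiscale tone mapper: work in the log domain, derive a
    full-resolution gray scale ratio image from a global operator, build a
    coarser ratio image by blurring, merge the two, then apply the ratio."""
    log_l = np.log(hdr)
    target = 0.5 * (log_l - log_l.mean())  # global log-domain range compression
    ratio_fine = target - log_l            # per-pixel log-domain ratio
    ratio_coarse = box_blur(ratio_fine)    # second, lower spatial resolution level
    ratio = alpha * ratio_fine + (1 - alpha) * ratio_coarse
    return np.exp(log_l + ratio)
```

Blending the coarse ratio image back in is what gives the operator its "local" character: large-scale contrast is compressed while fine detail, carried by the difference between the two scales, is partially preserved.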

Patent
17 May 2012
TL;DR: In this article, the color management processing of source image data to be displayed on a target display is changed according to varying levels of available metadata.
Abstract: Several embodiments of scalable image processing systems and methods are disclosed herein whereby color management processing of source image data to be displayed on a target display is changed according to varying levels of metadata.

Patent
Sheng Qu1, Peng Yin1, Yan Ye1, Yuwen He1, Gish Walter1, Guan-Ming Su1, Yufei Yuan1, Hulyalkar Samir1 
18 Dec 2012
TL;DR: In this paper, a coding syntax is signaled by upstream coding devices such as VDR encoders to downstream coding devices in a common vehicle in the form of RPU data units.
Abstract: Coding syntaxes in compliance with same or different VDR specifications may be signaled by upstream coding devices such as VDR encoders to downstream coding devices such as VDR decoders in a common vehicle in the form of RPU data units. VDR coding operations and operational parameters may be specified as sequence level, frame level, or partition level syntax elements in a coding syntax. Syntax elements in a coding syntax may be coded directly in one or more current RPU data units under a current RPU ID, predicted from other partitions/segments/ranges previously sent with the same current RPU ID, or predicted from other frame level or sequence level syntax elements previously sent with a previous RPU ID. A downstream device may perform decoding operations on multi-layered input image data based on received coding syntaxes to construct VDR images.

Patent
08 Mar 2012
TL;DR: In this article, the authors describe methods for scalable video coding, which can be used to deliver video contents in low dynamic range (LDR) and/or one color format and then convert the video contents to High Dynamic Range (HDR) or a different color format, respectively, while preprocessing video content.
Abstract: Methods for scalable video coding are described. Such methods can be used to deliver video contents in Low Dynamic Range (LDR) and/or one color format and then converting the video contents to High Dynamic Range (HDR) and/or a different color format, respectively, while pre-processing video content.

Patent
20 Jun 2012
TL;DR: In some embodiments, a method is presented for processing output of at least one microphone of a device (e.g., a headset) to identify at least one touch gesture exerted by a user on the device, including by distinguishing the gesture from input to the microphone other than a touch gesture intended by the user.
Abstract: In some embodiments, a method for processing output of at least one microphone of a device (e.g., a headset) to identify at least one touch gesture exerted by a user on the device, including by distinguishing the gesture from input to the microphone other than a touch gesture intended by the user, and by distinguishing between a tap exerted by the user on the device and at least one dynamic gesture exerted by the user on the device, where the output of the at least one microphone is also indicative of ambient sound (e.g., voice utterances). Other embodiments are systems for detecting ambient sound (e.g., voice utterances) and touch gestures, each including a device including at least one microphone and a processor coupled and configured to process output of each microphone to identify at least one touch gesture exerted by a user on the device.

Journal ArticleDOI
TL;DR: A new model for an ideal operational amplifier that does not include implicit equations and is thus suitable for implementation using wave digital filters (WDFs) is introduced and a novel WDF model for a diode is proposed using the Lambert W function.
Abstract: This brief presents a generic model to emulate distortion circuits using operational amplifiers and diodes. Distortion circuits are widely used for enhancing the sound of guitars and other musical instruments. This brief introduces a new model for an ideal operational amplifier that does not include implicit equations and is thus suitable for implementation using wave digital filters (WDFs). Furthermore, a novel WDF model for a diode is proposed using the Lambert W function. A comparison of output signals of the proposed models to those obtained from a reference simulation using SPICE shows that the distortion characteristics are accurately reproduced over a wide frequency range. Additionally, the proposed model enables real-time emulation of distortion circuits using ten multiplications, 22 additions, and two interpolations from a lookup table per output sample.
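The Lambert W route to an explicit (non-implicit) diode equation can be shown outside the wave digital filter framework with a plain diode-plus-series-resistor branch. Component values and the Newton-based W evaluation below are illustrative, and this is only the static current solve, not the paper's WDF formulation:

```python
import math

def lambert_w(x, tol=1e-12):
    """Principal branch of the Lambert W function via Newton iteration (x >= 0)."""
    w = math.log1p(x)  # reasonable starting guess for x >= 0
    for _ in range(50):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1))
        w -= step
        if abs(step) < tol:
            break
    return w

def diode_current(v, Is=2.52e-9, Vt=25.85e-3, R=2.2e3):
    """Current through a Shockley diode in series with a resistor, driven by
    voltage v >= 0. Combining v = i*R + Vt*ln(i/Is + 1) with the Shockley
    equation yields the explicit closed form:
        i = (Vt/R) * W( (Is*R/Vt) * exp((v + Is*R)/Vt) ) - Is
    """
    a = (Is * R / Vt) * math.exp((v + Is * R) / Vt)
    return (Vt / R) * lambert_w(a) - Is
```

The explicit form is what makes the diode usable inside one-shot per-sample structures such as WDFs: no per-sample Newton loop on the circuit equations is needed (only the cheap scalar W evaluation, which in practice is tabulated).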

Patent
24 Apr 2012
TL;DR: Several non-linear quantizers are presented in this paper, which are based on sigmoid-like transfer functions, controlled by one or more free parameters that control their mid-range slope.
Abstract: In layered VDR coding, inter-layer residuals are quantized by a non-linear quantizer before being coded by a subsequent encoder. Several non-linear quantizers are presented. Such non-linear quantizers may be based on sigmoid-like transfer functions, controlled by one or more free parameters that control their mid-range slope. These functions may also depend on an offset, an output range parameter, and the maximum absolute value of the input data. The quantizer parameters can time-vary and are signaled to a layered decoder. Example non-linear quantizers described herein may be based on the mu-law function, a sigmoid function, and/or a Laplacian distribution.
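A mu-law-style non-linear quantizer for inter-layer residuals might be sketched as below. The normalization by the input's maximum absolute value and the 8-bit level count follow the abstract's description in spirit; the exact parameterization is assumed:

```python
import numpy as np

def mu_law_quantize(residual, mu=255.0, levels=256):
    """Companding quantizer for residuals: mu-law compression (steep around
    zero, where residuals cluster) followed by uniform quantization."""
    peak = float(np.max(np.abs(residual))) or 1.0
    x = residual / peak                                   # normalize to [-1, 1]
    y = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    codes = np.round((y + 1) / 2 * (levels - 1)).astype(int)
    return codes, peak   # 'peak' would be signaled to the decoder as metadata

def mu_law_dequantize(codes, peak, mu=255.0, levels=256):
    """Inverse mapping: uniform de-quantization, then mu-law expansion."""
    y = codes / (levels - 1) * 2 - 1
    x = np.sign(y) * ((1 + mu) ** np.abs(y) - 1) / mu
    return x * peak
```

This shows the abstract's ingredients in miniature: a sigmoid-like transfer function whose parameter (`mu`) controls mid-range slope, a dependence on the maximum absolute value of the input, and a parameter (`peak`) that must travel to the layered decoder.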

Patent
24 May 2012
TL;DR: In this article, a reconstruction method is proposed to generate video from a compressed representation using metadata indicative of at least one reconstruction parameter for spatial regions of the reconstructed video, and trade-offs may be made between temporal resolution and spatial resolution of regions of reconstructed video determined by the compressed representation to optimize perceived video quality while reducing the data rate.
Abstract: Compression transforming video into a compressed representation (which typically can be delivered at a capped pixel rate compatible with conventional video systems), including by generating spatially blended pixels and temporally blended pixels (e.g., temporally and spatially blended pixels) of the video, and determining a subset of the blended pixels for inclusion in the compressed representation including by assessing quality of reconstructed video determined from candidate sets of the blended pixels. Trade-offs may be made between temporal resolution and spatial resolution of regions of reconstructed video determined by the compressed representation to optimize perceived video quality while reducing the data rate. The compressed data may be packed into frames. A reconstruction method generates video from a compressed representation using metadata indicative of at least one reconstruction parameter for spatial regions of the reconstructed video.

Patent
10 May 2012
TL;DR: In this paper, an initial HDR image is coded and distributed, and a data packet is computed that has a first and a second data set; the first data set relates to the baseline image color components, each of which has an application marker that relates to the HDR-enhancement images.
Abstract: HDR images are coded and distributed. An initial HDR image is received. Processing the received HDR image creates a JPEG-2000 DCI-compliant coded baseline image and an HDR-enhancement image. The coded baseline image has one or more color components, each of which provides enhancement information that allows reconstruction of an instance of the initial HDR image using the baseline image and the HDR-enhancement images. A data packet is computed, which has a first and a second data set. The first data set relates to the baseline image color components, each of which has an application marker that relates to the HDR-enhancement images. The second data set relates to the HDR-enhancement image. The data packets are sent in a DCI-compliant bit stream.

Patent
25 Apr 2012
TL;DR: In this paper, a display including an image-generating panel and at least one contrast-enhancing panel, a cross BEF collimator between a backlight and one of the panels, and a polarization-preserving diffuser (e.g., holographic diffuser) between the panels is presented.
Abstract: A display including an image-generating panel and at least one contrast-enhancing panel, a cross BEF collimator between a backlight and one of the panels, and a polarization-preserving diffuser (e.g., holographic diffuser) between the panels. Typically, the contrast panel is upstream of the image panel, and a reflective polarizer is positioned between the cross BEF collimator and contrast panel, with the reflective polarizer oriented relative to an initial polarizer of the contrast panel. Polarization of light transmitted by the reflective polarizer matches that transmitted by the initial polarizer. Collimated light propagating from the cross BEF collimator toward the contrast-enhancing panel is given a polarization bias by the reflective polarizer, which reflects incorrectly polarized light back toward the cross BEF collimator. Alternatively, the reflective polarizer may be positioned between the cross BEF collimator and the image-generating panel when the image-generating panel is upstream of the contrast-enhancing panel.

Patent
01 Aug 2012
TL;DR: In this paper, an up-sampling filter is selected to up-sample the first image to a third image with a spatial resolution same as the second spatial resolution by minimizing an error measurement (e.g., MSE) between pixel values of the second image and the third image.
Abstract: An encoder receives a first image of a first spatial resolution and a second image of a second spatial resolution, wherein both the first image and the second image represent the same scene and the second spatial resolution is higher than the first spatial resolution. A filter is selected to up-sample the first image to a third image with a spatial resolution same as the second spatial resolution. The filtering coefficients for the up-sampling filter are computed by minimizing an error measurement (e.g., MSE) between pixel values of the second image and the third image. The computed set of filtering coefficients is signaled to a receiver (e.g., as metadata). A decoder receives the first image (or its approximation) and the metadata, and may up-sample the first image using the same filter and optimally selected filtering coefficients as those derived by the encoder.
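The MSE-minimizing fit of up-sampling filter coefficients can be shown in a minimal 1-D, two-tap form: the encoder has both the low-resolution signal and the true high-resolution samples, so the taps fall out of an ordinary least-squares solve. Real systems fit longer, per-phase 2-D filters; this reduction is for illustration only:

```python
import numpy as np

def fit_interp_taps(low, high_odd):
    """Closed-form least-squares fit of two interpolation taps.
    high_odd[i] is the true high-resolution sample lying between
    low[i] and low[i + 1]; the fitted taps minimize the MSE between
    the interpolated and true samples."""
    A = np.column_stack([low[:-1], low[1:]])       # neighboring low-res pixels
    taps, *_ = np.linalg.lstsq(A, high_odd, rcond=None)
    return taps
```

The fitted taps are exactly the kind of compact side information the abstract describes signaling to the decoder as metadata, so that encoder and decoder apply the identical filter.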

Patent
31 Aug 2012
TL;DR: In this article, an encoding method for generating an extended dynamic range (EDR) channel in response to an input video channel was proposed, such that the EDR channel's code values consist of code values in a range from a standard black level, X, through a standard white level, Z, and an additional code value set.
Abstract: In some embodiments, an encoding method for generating an extended dynamic range (EDR) channel in response to an input video channel, such that the EDR channel's code values consist of code values in a range from a standard black level, X, through a standard white level, Z, and an additional code value set. The EDR channel is displayable with standard dynamic range and standard precision by a standard dynamic range video system which maps to the level, X, any of the EDR channel's values less than X, and maps to the level, Z, any of the EDR channel's values greater than Z, and is displayable with an extended dynamic range greater than the standard dynamic range and/or a precision greater than the standard precision by an EDR video system. Other aspects are systems configured to perform embodiments of the encoding method, and methods and systems for displaying EDR video.
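The clamping behavior of a standard-dynamic-range system described above reduces to a clip between the standard black and white levels. The 10-bit-style levels X = 64 and Z = 940 below are assumed for illustration (the patent leaves X and Z abstract):

```python
import numpy as np

def legacy_view(edr_codes, black=64, white=940):
    """What a standard-dynamic-range system does with the EDR channel:
    code values below the standard black level X clamp to X, and values
    above the standard white level Z clamp to Z. An EDR-aware system
    would instead use the out-of-range codes for extra range/precision."""
    return np.clip(edr_codes, black, white)
```

The design point is backward compatibility: the same channel plays acceptably on a legacy display (everything outside [X, Z] collapses to black or white) while an EDR video system interprets the additional code value set.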

Patent
19 Dec 2012
TL;DR: In this paper, a dual-panel display system is provided that comprises control modules and algorithms to select codeword pairs (CWs) to drive a first image-generating panel and a second contrast-improving panel.
Abstract: A dual-panel display system is provided that comprises control modules and algorithms to select codeword pairs (CWs) to drive a first image-generating panel and a second contrast-improving panel. The first codeword is selected by considering certain characteristics of the input image data (e.g., peak luminance) and to improve some image rendering metric (e.g., reduced parallax, reduced contouring, improved level precision). The first codeword may be selected to be the minimum first codeword within a set of codeword pairs that preserves the peak luminance required by the input image data. Also, the first codeword may be selected to minimize the number of Just Noticeable Difference (JND) steps in the final image to be rendered. The second codeword may be selected to similarly improve image quality according to a given quality metric.

Patent
27 Jun 2012
TL;DR: In this article, the synchronization and switchover mechanism for an adaptive audio system is described, in which multi-channel (e.g., surround sound) audio is provided along with object-based adaptive audio content.
Abstract: Embodiments are described for a synchronization and switchover mechanism for an adaptive audio system in which multi-channel (e.g., surround sound) audio is provided along with object-based adaptive audio content. A synchronization signal is embedded in the multi-channel audio stream and contains a track identifier and frame count for the adaptive audio stream to play out. The track identifier and frame count of a received adaptive audio frame is compared to the track identifier and frame count contained in the synchronization signal. If either the track identifier or frame count does not match the synchronization signal, a switchover process fades out the adaptive audio track and fades in the multi-channel audio track. The system plays the multi-channel audio track until the synchronization signal track identifier and frame count and adaptive audio track identifier and frame count match, at which point the adaptive audio content will be faded back in.
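The match-and-switchover logic above can be sketched as a tiny state machine. The tuple comparison and the fade placeholders are illustrative; an actual system compares the track identifier and frame count embedded in the multi-channel stream against those of each received adaptive audio frame:

```python
class Switchover:
    """Track whether the adaptive audio stream is in sync with the
    synchronization signal embedded in the multi-channel stream."""

    def __init__(self):
        self.playing_adaptive = True

    def step(self, sync, frame):
        """'sync' and 'frame' are (track_id, frame_count) tuples from the
        embedded synchronization signal and the received adaptive frame."""
        match = sync == frame
        if self.playing_adaptive and not match:
            self.playing_adaptive = False  # fade out adaptive, fade in multi-channel
        elif not self.playing_adaptive and match:
            self.playing_adaptive = True   # streams realigned: fade adaptive back in
        return "adaptive" if self.playing_adaptive else "multichannel"
```

The multi-channel bed thus acts as the always-safe fallback: playback never stops on a sync loss, it merely degrades to channel-based audio until the identifiers and counts line up again.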

Patent
11 Apr 2012
TL;DR: In this paper, a highlight projector is used to boost luminance in highlight areas of a base image projected by the main projector using steerable beams, holographic projectors and spatial light modulators.
Abstract: Projection displays include a highlight projector and a main projector. Highlights projected by the highlight projector boost luminance in highlight areas of a base image projected by the main projector. Various highlight projectors, including steerable beams, holographic projectors, and spatial light modulators, are described.

Patent
Robin Atkins1
26 Nov 2012
TL;DR: An HDR display is a combination of technologies including, for example, a dual modulation architecture incorporating algorithms for artifact reduction, selection of individual components, and a design process for the display and/or pipeline for preserving the visual dynamic range from capture to display of an image or images as discussed by the authors.
Abstract: An HDR display is a combination of technologies including, for example, a dual modulation architecture incorporating algorithms for artifact reduction, selection of individual components, and a design process for the display and/or pipeline for preserving the visual dynamic range from capture to display of an image or images. In one embodiment, the dual modulation architecture includes a backlight with an array of RGB LEDs and a combination of a heat sink and thermally conductive vias for maintaining a desired operating temperature.