
Showing papers by "Alan C. Bovik" published in 1995


Journal ArticleDOI
TL;DR: Unlike earlier non-feature-based, curved surface shape-from-texture approaches, the assumption that the surface texture is isotropic is not required; surface texture homogeneity can be assumed instead.
Abstract: Presents a non-feature-based solution to the problem of computing the shape of curved surfaces from texture information. First, the use of local spatial-frequency spectra and their moments to describe texture is discussed and motivated. A new, more accurate method for measuring the local spatial-frequency moments of an image texture using Gabor elementary functions and their derivatives is presented. Also described is a technique for separating shading from texture information, which makes the shape-from-texture algorithm robust to the shading effects found in real imagery. Second, a detailed model for the projection of local spectra and spectral moments of any surface reflectance patterns (not just textures) is developed. Third, the conditions under which the projection model can be solved for the orientation of the surface at each point are explored. Unlike earlier non-feature-based, curved surface shape-from-texture approaches, the assumption that the surface texture is isotropic is not required; surface texture homogeneity can be assumed instead. The algorithm's ability to operate on anisotropic and nondeterministic textures, and on both smooth- and rough-textured surfaces, is demonstrated.

147 citations


Journal ArticleDOI
TL;DR: An image-demodulation problem is formulated and a solution based on the multidimensional energy operator Φ(f) = ‖∇f‖² − f∇²f is presented, together with a multidimensional energy-separation algorithm that estimates the amplitude envelope and instantaneous frequencies of 2D spatially varying AM–FM signals.
Abstract: Locally narrow-band images can be modeled as two-dimensional (2D) spatial AM–FM signals with several applications in image texture analysis and computer vision. We formulate an image-demodulation problem and present a solution based on the multidimensional energy operator Φ(f) = ‖∇f‖² − f∇²f. This nonlinear operator is a multidimensional extension of the one-dimensional (1D) energy-tracking operator Ψ(f) = (f′)² − f f″, which has been found useful for demodulating 1D AM–FM and speech signals. We discuss some interesting properties of the multidimensional operator and develop a multidimensional energy-separation algorithm to estimate the amplitude envelope and instantaneous frequencies of 2D spatially varying AM–FM signals. Experiments are also presented on applying this 2D energy-demodulation algorithm to estimate the instantaneous amplitude contrast and spatial frequencies of image textures bandpass filtered by means of Gabor filters. The attractive features of the multidimensional energy operator and the 2D energy-separation algorithm are their simplicity, efficiency, and ability to track instantaneously varying spatial-modulation patterns.
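To make the energy operator concrete, here is a minimal Python sketch of one discrete counterpart of Φ and a simple energy-separation step built on it. The discretization (products of neighboring samples, symmetric differences) and the function names `energy_2d`, `central_diff`, and `demodulate` are assumptions chosen for illustration, not necessarily the paper's exact algorithm. For a pure cosine A·cos(u·m + v·n + θ) with 0 < u, v < π/2, this discrete operator evaluates to A²(sin²u + sin²v), which is what allows amplitude and frequencies to be separated.

```python
import math

def energy_2d(f, m, n):
    """Discrete 2D energy operator at interior sample (m, n).

    Assumed discretization: the 1D operator x[n]^2 - x[n-1]*x[n+1]
    applied along each axis and summed.
    """
    return (2.0 * f[m][n] ** 2
            - f[m - 1][n] * f[m + 1][n]
            - f[m][n - 1] * f[m][n + 1])

def central_diff(f, axis):
    """Symmetric difference along axis 0 (rows) or 1 (columns); borders stay 0."""
    rows, cols = len(f), len(f[0])
    g = [[0.0] * cols for _ in range(rows)]
    for m in range(1, rows - 1):
        for n in range(1, cols - 1):
            if axis == 0:
                g[m][n] = 0.5 * (f[m + 1][n] - f[m - 1][n])
            else:
                g[m][n] = 0.5 * (f[m][n + 1] - f[m][n - 1])
    return g

def demodulate(f, m, n):
    """Estimate (amplitude, u, v) at (m, n); needs a 2-sample margin.

    Returns None where the energies are not positive (estimate undefined).
    """
    e_f = energy_2d(f, m, n)
    e_x = energy_2d(central_diff(f, 0), m, n)
    e_y = energy_2d(central_diff(f, 1), m, n)
    if e_f <= 0.0 or e_x + e_y <= 0.0:
        return None
    # Energy ratios give sin^2 of each frequency; clamp against round-off.
    u = math.asin(math.sqrt(min(1.0, max(0.0, e_x / e_f))))
    v = math.asin(math.sqrt(min(1.0, max(0.0, e_y / e_f))))
    amplitude = e_f / math.sqrt(e_x + e_y)
    return amplitude, u, v

# Synthetic test pattern (values chosen arbitrarily for the demo).
A, U, V = 2.0, 0.3, 0.5
image = [[A * math.cos(U * m + V * n + 0.1) for n in range(9)] for m in range(9)]
amp, u_hat, v_hat = demodulate(image, 4, 4)
```

On a real texture the operator would be applied to a bandpass-filtered (for example Gabor-filtered) image, as the abstract describes, because the separation relies on the signal being locally narrow-band.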

121 citations


Journal ArticleDOI
TL;DR: The current model and algorithm are more accurate, yet substantially simpler, than earlier versions of this approach, and are tested on photographs of real-world surfaces.

93 citations


Proceedings ArticleDOI
05 Aug 1995
TL;DR: A new method for actively recovering depth information using image defocus is demonstrated and shown to support active stereo vision depth recovery by providing monocular depth estimates to guide the positioning of cameras for stereo processing.
Abstract: A new method for actively recovering depth information using image defocus is demonstrated and shown to support active stereo vision depth recovery by providing monocular depth estimates to guide the positioning of cameras for stereo processing. This active depth-from-defocus approach employs a spatial frequency model for image defocus which incorporates the optical transfer function of the image acquisition system and a maximum likelihood estimator to determine the amount of defocus present in a sequence of two or more images taken from the same pose. This defocus estimate is translated into a measurement of depth and associated uncertainty that is used to control the positioning of a variable baseline stereo camera system. This cooperative arrangement significantly reduces the matching uncertainty of the stereo correspondence process and increases the depth resolution obtainable with an active stereo vision platform.

30 citations


Proceedings ArticleDOI
23 Oct 1995
TL;DR: This work presents powerful multi-component AM-FM image models capable of efficiently representing complicated nonstationary multi-partite images, and shows how the representation can be computed in a practical implementation.
Abstract: We present powerful multi-component AM-FM image models capable of efficiently representing complicated nonstationary multi-partite images, and show how the representation can be computed in a practical implementation. With images of this type, important structural and perceptual information is often manifest in the nonstationarities. Highly localized nonlinear operators are used to simultaneously estimate the amplitude and frequency modulating functions associated with each of the multiple components on a pixel-by-pixel basis. For the first time, we also demonstrate image reconstruction from the AM-FM representation.

15 citations


Book ChapterDOI
01 Jan 1995
TL;DR: An audiovisual automatic speech recognition system is tested with a 500-word vocabulary, and the combined system is demonstrated to have superior performance to either of its subsystems acting alone.
Abstract: An audiovisual automatic speech recognition (ASR) system is described. It is composed of two independent subsystems, both of which are complete speech recognition systems, plus a final integration stage. One of the subsystems is an audio ASR system which uses well-known audio processing techniques. The other subsystem performs lipreading, using a VQ-related method and hidden Markov models. The integration stage uses a product rule. The system is tested with a 500-word vocabulary. It is demonstrated that the combined system has superior performance to either of the subsystems acting alone.
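A product-rule integration stage can be sketched in a few lines of Python. This is an illustrative assumption about how such a stage might look: each subsystem is taken to output a strictly positive score per candidate word, and the scores are multiplied (summed in the log domain). The function name `fuse_product_rule` and the example scores are hypothetical and not taken from the chapter.

```python
import math

def fuse_product_rule(audio_scores, visual_scores):
    """Combine per-word scores from two recognizers by multiplication.

    Scores are assumed strictly positive. Logs are summed rather than
    multiplying raw scores, to avoid underflow with many small values.
    Returns (best_word, fused_log_scores).
    """
    fused = {}
    for word, a_score in audio_scores.items():
        if word in visual_scores:
            fused[word] = math.log(a_score) + math.log(visual_scores[word])
    best_word = max(fused, key=fused.get)
    return best_word, fused

# Made-up scores: the audio channel slightly prefers "knee",
# while the lipreading channel clearly prefers "me".
audio = {"knee": 0.50, "me": 0.45, "tea": 0.05}
visual = {"knee": 0.20, "me": 0.60, "tea": 0.20}
best, log_scores = fuse_product_rule(audio, visual)
```

In this toy case the fused decision is "me", even though the audio channel alone would pick "knee", which illustrates how a combined system can outperform either subsystem alone.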

14 citations


Proceedings ArticleDOI
21 Nov 1995
TL;DR: A stereo algorithm is presented that employs new modulation (multicomponent AM-FM) models for image representation, a disparity channel model for depth computation, and multichannel processing for multiscale computation of stereo disparity from local image phase.
Abstract: We present a stereo algorithm employing new modulation (multicomponent AM-FM) models for image representation, a disparity channel model for depth computation, and multichannel (Gabor-wavelet-like) processing for multiscale (coarse-to-fine) computation of stereo disparity from local image phase. The algorithm generates a dense, subpixel-accuracy disparity map without sophisticated feature extraction and interpolation.
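The general idea of reading disparity from local phase can be illustrated with a one-channel, one-dimensional Python sketch. It is not the paper's algorithm: it assumes a single complex Gabor filter, a scanline whose local frequency equals the filter's center frequency `omega`, and the sign convention right(x) = left(x + d). All names and parameter values are hypothetical.

```python
import cmath
import math

def gabor_response(signal, x0, omega, sigma, half_width):
    """Complex Gabor filter response of `signal` (a function of x) at x0."""
    total = 0j
    for k in range(-half_width, half_width + 1):
        envelope = math.exp(-(k * k) / (2.0 * sigma * sigma))
        total += signal(x0 + k) * envelope * cmath.exp(-1j * omega * k)
    return total

def phase_disparity(left, right, x0, omega, sigma=8.0, half_width=32):
    """Disparity estimate: wrapped phase difference divided by omega."""
    r_left = gabor_response(left, x0, omega, sigma, half_width)
    r_right = gabor_response(right, x0, omega, sigma, half_width)
    # The phase of r_right * conj(r_left) is the phase difference in (-pi, pi].
    delta_phase = cmath.phase(r_right * r_left.conjugate())
    return delta_phase / omega

# Synthetic scanlines: the right one is the left one shifted by 1.3 samples.
OMEGA, TRUE_SHIFT = 0.5, 1.3

def left_line(x):
    return math.cos(OMEGA * x + 0.2)

def right_line(x):
    return left_line(x + TRUE_SHIFT)

estimate = phase_disparity(left_line, right_line, x0=40, omega=OMEGA)
```

Because the phase difference is only known modulo 2π, a single channel can resolve shifts only up to half a wavelength, which is one reason to combine several scales from coarse to fine as the abstract describes.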

9 citations


Proceedings ArticleDOI
21 Nov 1995
TL;DR: A new scene segmentation scheme is proposed, which first segments a scene at a coarse resolution and proceeds progressively to finer resolutions, and gives a description of the 3-dimensional structure of a scene.
Abstract: A new scene segmentation scheme is proposed. A scene with different depth ranges (e.g., with multiple objects) is segmented into ranges. Focus cues are used to measure depth ranges and segmentation is performed using a multiresolution approach. To use focus cues in a multiresolution framework, we develop a criterion function and pyramid for focus measure. The segmentation algorithm first segments a scene at a coarse resolution and proceeds progressively to finer resolutions. This segmentation into ranges gives a description of the 3-dimensional structure of a scene. This segmentation method does not require any prior knowledge of depth ranges. The segmentation results for a scene with multiple objects with different depth ranges are presented.

5 citations


Proceedings ArticleDOI
17 Apr 1995
TL;DR: This paper presents a real-time, full-motion video conferencing system based on the Visual Pattern Image Sequence Coding (VPISC) software codec to demonstrate the practicality of software-based real-time video codecs.
Abstract: This paper presents a real-time, full-motion video conferencing system based on the Visual Pattern Image Sequence Coding (VPISC) software codec. The prototype system hardware comprises two personal computers, two camcorders, two frame grabbers, and an Ethernet connection. The prototype system software has a simple structure. It runs under the Disk Operating System, and includes a user interface, a video I/O interface, an event-driven network interface, and a free-running or frame-synchronous video codec that also acts as the controller for the video and network interfaces. Two video coders have been tested in this system. Simple implementations of Visual Pattern Image Coding and VPISC have both proven to support full-motion video conferencing with good visual quality. Future work will concentrate on expanding this prototype to support the motion-compensated version of VPISC, as well as encompassing point-to-point modem I/O and multiple network protocols. The application will be ported to multiple hardware platforms and operating systems. The motivation for developing this prototype system is to demonstrate the practicality of software-based real-time video codecs. Furthermore, software video codecs are not only cheaper, but are more flexible system solutions because they enable different computer platforms to exchange encoded video information without requiring on-board, protocol-compatible video codec hardware. Software-based solutions enable true low-cost video conferencing that fits the "open systems" model of interoperability that is so important for building portable hardware and software applications.

Proceedings ArticleDOI
28 Mar 1995
TL;DR: The method of ranked residuals is proposed to restore binary texture corrupted by Gaussian noise; it not only removes the noise but also preserves all details of a texture.
Abstract: Textures are degraded by Gaussian noise in the process of image acquisition. The restoration of a texture is very important for later texture analysis and classification. In this paper, the method of ranked residuals is proposed to restore binary texture which is corrupted by Gaussian noise. This method not only removes the noise but also preserves all details of a texture. In addition, it has the property of preserving any line endings (not necessarily straight) and any boundary (concave or convex) at any orientation, as well as edges and corners. The main idea of the ranked residual method is that it selects the windowed pixels that are closest to the windowed central value as the subset and chooses an estimator (median, mean, LMS, etc.) to estimate the central value. This allows us to adapt our choice of subsets. Therefore, whatever the shape of the texture, the filter can preserve the texture detail and eliminate the noise at the same time. Some synthetic and real textures are used to demonstrate the properties of this filter.
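The selection step described in the abstract can be sketched as follows in Python. This is a rough reading of the idea, not the authors' implementation: the 3×3 window, the subset size `k`, the default mean estimator, and the untouched image borders are all assumptions made for the example, and `ranked_residual_filter` is a hypothetical name.

```python
from statistics import mean

def ranked_residual_filter(image, k=5, estimator=mean):
    """Filter a 2D list of gray levels with a 3x3 ranked-residual window.

    For each interior pixel, the k window values closest to the center
    value (smallest absolute residuals) are kept, and `estimator` is
    applied to that subset. Border pixels are copied unchanged.
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            window = [image[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            # Rank window values by their residual from the center value.
            window.sort(key=lambda value: abs(value - center))
            out[r][c] = estimator(window[:k])
    return out

# A clean vertical step edge: a plain 3x3 mean would blur it, whereas the
# ranked subset only draws on pixels from the same side of the edge.
edge = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
restored = ranked_residual_filter(edge)
```

Passing `estimator=statistics.median` instead of the mean gives the median variant mentioned in the abstract; the subset selection is unchanged.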