
Showing papers on "Object detection published in 1997"


Journal ArticleDOI
TL;DR: An unsupervised technique for visual learning is presented, which is based on density estimation in high-dimensional spaces using an eigenspace decomposition and is applied to the probabilistic visual modeling, detection, recognition, and coding of human faces and nonrigid objects.
Abstract: We present an unsupervised technique for visual learning, which is based on density estimation in high-dimensional spaces using an eigenspace decomposition. Two types of density estimates are derived for modeling the training data: a multivariate Gaussian (for unimodal distributions) and a mixture-of-Gaussians model (for multimodal distributions). Those probability densities are then used to formulate a maximum-likelihood estimation framework for visual search and target detection for automatic object recognition and coding. Our learning technique is applied to the probabilistic visual modeling, detection, recognition, and coding of human faces and nonrigid objects, such as hands.

1,624 citations
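The maximum-likelihood detection framework described above can be sketched in miniature: fit a Gaussian density to training examples of the target class and accept a candidate when its log-likelihood clears a threshold. This is a 1-D sketch with made-up numbers; the paper works in a high-dimensional eigenspace and also derives a mixture-of-Gaussians variant.

```python
import math

def fit_gaussian(samples):
    """Maximum-likelihood fit of a 1-D Gaussian to training data."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var

def log_likelihood(x, mu, var):
    """Log-density of x under the fitted Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

# Detection: a candidate is accepted as the target when its likelihood
# under the learned density exceeds a threshold (illustrative values).
train = [4.8, 5.1, 5.0, 4.9, 5.2]
mu, var = fit_gaussian(train)

def detect(x, threshold=-5.0):
    return log_likelihood(x, mu, var) > threshold
```

In the paper the same decision rule is applied to image vectors after an eigenspace (PCA) projection, which makes the density estimate tractable in high dimensions.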


Proceedings ArticleDOI
17 Jun 1997
TL;DR: This paper presents a trainable object detection architecture that is applied to detecting people in static images of cluttered scenes and shows how the invariant properties and computational efficiency of the wavelet template make it an effective tool for object detection.
Abstract: This paper presents a trainable object detection architecture that is applied to detecting people in static images of cluttered scenes. This problem poses several challenges. People are highly non-rigid objects with a high degree of variability in size, shape, color, and texture. Unlike previous approaches, this system learns from examples and does not rely on any a priori (hand-crafted) models or on motion. The detection technique is based on the novel idea of the wavelet template that defines the shape of an object in terms of a subset of the wavelet coefficients of the image. It is invariant to changes in color and texture and can be used to robustly define a rich and complex class of objects such as people. We show how the invariant properties and computational efficiency of the wavelet template make it an effective tool for object detection.

811 citations
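The wavelet-template idea — defining object shape by a subset of wavelet coefficients, so the match is invariant to absolute color and texture — can be illustrated with 2x2 Haar-style responses whose signs, rather than raw magnitudes, are compared. Patch values, template entries, and the tolerance below are illustrative, not from the paper.

```python
def haar_coeffs(patch):
    """2x2 Haar-style responses (horizontal, vertical, diagonal) over a patch
    given as a list of rows of intensities."""
    h, w = len(patch), len(patch[0])
    coeffs = []
    for y in range(0, h - 1, 2):
        for x in range(0, w - 1, 2):
            a, b = patch[y][x], patch[y][x + 1]
            c, d = patch[y + 1][x], patch[y + 1][x + 1]
            coeffs.append(((a + c) - (b + d),   # horizontal edge response
                           (a + b) - (c + d),   # vertical edge response
                           (a + d) - (b + c)))  # diagonal response
    return coeffs

def matches_template(patch, template, tol=0):
    """A 'wavelet template' keeps only the signs of selected coefficients,
    making the match invariant to absolute intensity. `template` is a list
    of (coefficient index, component index, expected sign) triples."""
    coeffs = haar_coeffs(patch)
    sign = lambda v: (v > tol) - (v < -tol)
    return all(sign(coeffs[i][k]) == s for i, k, s in template)
```

A bright-to-dark transition and a dark-to-bright transition produce opposite coefficient signs, so a sign template discriminates edge structure while ignoring overall brightness.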


Journal ArticleDOI
TL;DR: The paper demonstrates a successful application of PDBNN to face recognition on two public (FERET and ORL) databases and one in-house (SCR) database, and reports experimental results on all three, including recognition accuracies as well as false rejection and false acceptance rates.
Abstract: This paper proposes a face recognition system based on probabilistic decision-based neural networks (PDBNN). With technological advances in microelectronics and vision systems, high-performance automatic biometric recognition techniques are now becoming economically feasible. Among all biometric identification methods, face recognition has attracted much attention in recent years because it has the potential to be the most nonintrusive and user-friendly. The PDBNN face recognition system consists of three modules. First, a face detector finds the location of a human face in an image. Then an eye localizer determines the positions of both eyes in order to generate meaningful feature vectors; the proposed facial region contains the eyebrows, eyes, and nose, but excludes the mouth (eyeglasses are allowed). Lastly, the third module is a face recognizer. The PDBNN can be effectively applied to all three modules. It adopts a hierarchical network structure with nonlinear basis functions and a competitive credit-assignment scheme. The paper demonstrates a successful application of PDBNN to face recognition on two public (FERET and ORL) databases and one in-house (SCR) database. Experimental results on the three databases, including recognition accuracies as well as false rejection and false acceptance rates, are elaborated. As to processing speed, the whole recognition process (including PDBNN processing for eye localization, feature extraction, and classification) takes approximately one second on a Sparc10, without using a hardware accelerator or co-processor.

637 citations


Journal ArticleDOI
TL;DR: A feature-based segmentation approach to the object detection problem is pursued, where the features are computed over multiple spatial orientations and frequencies, which helps in the detection of objects located in complex backgrounds.

306 citations


Proceedings ArticleDOI
17 Jun 1997
TL;DR: A novel boundary detection scheme based on "edge flow" that utilizes a predictive coding model to identify the direction of change in color and texture at each image location at a given scale, and constructs an edge flow vector.
Abstract: A novel boundary detection scheme based on "edge flow" is proposed in this paper. This scheme utilizes a predictive coding model to identify the direction of change in color and texture at each image location at a given scale, and constructs an edge flow vector. By iteratively propagating the edge flow, the boundaries can be detected at image locations which encounter two opposite directions of flow in the stable state. A user defined image scale is the only significant control parameter that is needed by the algorithm. The scheme facilitates integration of color and texture into a single framework for boundary detection.

258 citations


Journal ArticleDOI
TL;DR: A geometric approach for 3D object segmentation and representation is presented that links classical deformable surfaces obtained via energy minimization with intrinsic ones derived from curvature-based flows.
Abstract: A geometric approach for 3D object segmentation and representation is presented. The segmentation is obtained by deformable surfaces moving towards the objects to be detected in the 3D image. The model is based on curvature motion and the computation of surfaces with minimal areas, better known as minimal surfaces. The space where the surfaces are computed is induced from the 3D image (volumetric data) in which the objects are to be detected. The model links classical deformable surfaces obtained via energy minimization with intrinsic ones derived from curvature-based flows. The new approach is stable, robust, and automatically handles changes in the surface topology during the deformation.

225 citations


Journal ArticleDOI
TL;DR: The proposed adaptive segmentation method uses local color information to estimate the membership probability in the object and background classes, respectively, and is applied to the recognition and localization of human hands in color camera images of complex laboratory scenes.
Abstract: With the availability of more powerful computers it is nowadays possible to perform pixel-based operations on real camera images even in the full color space. New adaptive classification tools like neural networks make it possible to develop special-purpose object detectors that can segment arbitrary objects in real images with a complex distribution in the feature space after training with one or several previously labeled image(s). The paper focuses on a detailed comparison of a neural approach based on local linear maps (LLMs) to a classifier based on normal distributions. The proposed adaptive segmentation method uses local color information to estimate the membership probability in the object and background classes, respectively. The method is applied to the recognition and localization of human hands in color camera images of complex laboratory scenes.

167 citations
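The normal-distribution classifier compared in the paper amounts to a per-pixel Bayes decision between an object class and a background class. A minimal 1-D sketch, assuming a scalar color feature per pixel; all class parameters and priors below are illustrative, not fitted to the paper's data.

```python
import math

def gauss(x, mu, var):
    """1-D Gaussian density."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def segment(pixels, obj=(0.8, 0.01), bg=(0.3, 0.02), prior_obj=0.5):
    """Label each pixel 1 (object) or 0 (background) by the larger
    class-conditional likelihood weighted by the class prior.
    Each class is a (mean, variance) pair over the color feature."""
    labels = []
    for p in pixels:
        p_obj = prior_obj * gauss(p, *obj)
        p_bg = (1 - prior_obj) * gauss(p, *bg)
        labels.append(1 if p_obj > p_bg else 0)
    return labels
```

The LLM network in the paper plays the same role as `segment`, but learns a locally linear decision surface instead of assuming a single Gaussian per class.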


Journal ArticleDOI
TL;DR: A unified framework for these problems, which utilizes a maximum likelihood ratio approach to detection, is presented and certain multiband detectors become special cases in this framework.
Abstract: Multispectral or hyperspectral sensors can facilitate automatic target detection and recognition in clutter since natural clutter from vegetation is characterized by a grey body, and man-made objects, compared with blackbody radiators, emit radiation more strongly at some wavelengths. Various types of data fusion of the spectral-spatial features contained in multiband imagery developed for detecting and recognizing low-contrast targets in clutter appear to have a common framework. A generalized hypothesis test on the observed data is formulated by partitioning the received bands into two groups. In one group, targets exhibit substantial coloring in their signatures but behave either like grey bodies or emit negligible radiant energy in the other group. This general observation about the data generalizes the data models used previously. A unified framework for these problems, which utilizes a maximum likelihood ratio approach to detection, is presented. Within this framework, a performance evaluation and a comparison of the various types of multiband detectors are conducted by finding the gain of the SNR needed for detection as well as the gain required for separability between the target classes used for recognition. Certain multiband detectors become special cases in this framework. The incremental gains in SNR and separability obtained by using what are called target-feature bands plus clutter-reference bands are studied. Certain essential parameters are defined that affect the gains in SNR and target separability.

154 citations


Proceedings ArticleDOI
17 Jun 1997
TL;DR: An algorithm for tracking non-rigid, moving objects in a sequence of color images recorded by a non-stationary camera is presented, which remarkably simplifies the correspondence problem and ensures robust tracking behaviour.
Abstract: In this contribution we present an algorithm for tracking non-rigid, moving objects in a sequence of color images recorded by a non-stationary camera. The application background is vision-based driving assistance in the inner city. In an initial step, object parts are determined by a divisive clustering algorithm, which is applied to all pixels in the first image of the sequence. The feature space is defined by the color and position of a pixel. For each new image the clusters of the previous image are adapted iteratively by a parallel k-means clustering algorithm. Instead of tracking single points, edges, or areas over a sequence of images, only the centroids of the clusters are tracked. The proposed method remarkably simplifies the correspondence problem and also ensures robust tracking behaviour.

135 citations
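One iteration of the parallel k-means adaptation described above — assign each pixel's (color, x, y) feature to the nearest centroid from the previous frame, then recompute the centroids — can be sketched as follows. Feature tuples and centroid values are illustrative.

```python
def kmeans_step(points, centroids):
    """One parallel k-means iteration: assign each feature point to its
    nearest centroid (squared Euclidean distance), then recompute each
    centroid as the mean of its assigned points."""
    k = len(centroids)
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        j = min(range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
        counts[j] += 1
        for d, v in enumerate(p):
            sums[j][d] += v
    new = []
    for j in range(k):
        if counts[j]:
            new.append(tuple(s / counts[j] for s in sums[j]))
        else:
            new.append(tuple(centroids[j]))  # empty cluster keeps its centroid
    return new
```

Tracking then reduces to following these centroids from frame to frame instead of matching individual pixels, which is what simplifies the correspondence problem.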


Patent
26 Mar 1997
TL;DR: In this paper, a compact system for producing high contrast and high resolution images of a topographical surface associated with an object is presented, which utilizes a novel holographic optical element to produce images of topographical surfaces differentiating between ridges and valleys, providing image details of artifacts.
Abstract: A compact system for producing high contrast and high resolution images of a topographical surface associated with an object. The system utilizes a novel holographic optical element to produce images of topographical surfaces differentiating between ridges and valleys, and providing image details of artifacts. The system of the present invention can be realized in the form of a hand-held instrument for use in in vivo imaging of fingerprint, skin tissue and the like.

129 citations


Journal ArticleDOI
TL;DR: It is argued that the processing gain that results from using more than two images justifies the increased computational complexity and storage requirements of the approach over single image object detection and conventional change detection techniques.
Abstract: A new approach to wide area surveillance is described that is based on the detection and analysis of changes across two or more images over time. Methods for modeling and detecting general patterns of change associated with construction and other kinds of activities that can be observed in remotely sensed imagery are presented. They include a new nonlinear prediction technique for measuring changes between images and temporal segmentation and filtering techniques for analyzing patterns of change over time. These methods are applied to the problem of detecting facility construction using Landsat Thematic Mapper imagery. Full scene results show the methods to be capable of detecting specific patterns of change with very few false alarms. Under all conditions explored, as the number of images used increases, the number of false alarms decreases dramatically without affecting the detection performance. It is argued that the processing gain that results from using more than two images justifies the increased computational complexity and storage requirements of our approach over single image object detection and conventional change detection techniques.

Proceedings ArticleDOI
09 Nov 1997
TL;DR: An optical flow based obstacle detection system is described for detecting vehicles approaching the blind spot of a car on highways and city streets; notably, the approach never explicitly calculates optical flow.
Abstract: We describe an optical flow based obstacle detection system for use in detecting vehicles approaching the blind spot of a car on highways and city streets. The system runs at near frame rate (8-15 frames/second) on PC hardware. We discuss the prediction of a camera image given an implicit optical flow field and its comparison with the actual camera image. The advantage of this approach is that we never explicitly calculate optical flow. We also present results on digitized highway images, and on video taken from Navlab 5 while driving on a Pittsburgh highway.
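The predict-and-compare step can be sketched in one dimension: warp the previous scanline by the flow the static background would induce, and flag an obstacle when the actual frame disagrees with the prediction. A pure-Python sketch; the function names, the uniform-flow assumption, and the threshold are illustrative, not the paper's.

```python
def shift_row(row, d):
    """Warp a 1-D scanline by an assumed (implicit) flow of d pixels,
    replicating the border pixel."""
    if d == 0:
        return list(row)
    if d > 0:
        return [row[0]] * d + row[:-d]
    return row[-d:] + [row[-1]] * (-d)

def residual(predicted, actual):
    """Sum of squared differences between predicted and observed scanlines."""
    return sum((a - b) ** 2 for a, b in zip(predicted, actual))

def obstacle(prev_row, cur_row, expected_flow, threshold):
    """Flag an obstacle when the observed frame deviates from the frame
    predicted under the expected (background) flow."""
    return residual(shift_row(prev_row, expected_flow), cur_row) > threshold
```

Because only the predicted image is compared against the observed one, no per-pixel flow vectors ever need to be estimated — the flow field stays implicit.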

Journal ArticleDOI
TL;DR: New detection algorithms and the fusion of their outputs are considered to reduce the probability of false alarm P(FA) while maintaining high probability of detection P(D) in infrared imagery.
Abstract: Detection involves locating all candidate regions of interest (objects) in a scene independent of the object class, with object distortions, contrast differences, etc., present. It is one of the most formidable problems in automatic target recognition, since it involves analysis of every local scene region. We consider new detection algorithms and the fusion of their outputs to reduce the probability of false alarm P(FA) while maintaining a high probability of detection P(D). Emphasis is given to detecting obscured targets in infrared imagery.

Proceedings ArticleDOI
H. Buxton1
10 Mar 1997
TL;DR: A knowledge-based approach is adopted in which domain specific models of the dynamic objects, events and behaviour are used to meet the requirement for sensitive and accurate performance.
Abstract: Visual surveillance primarily involves the interpretation of image sequences. Advanced visual surveillance goes further and automates the detection of predefined alarm events in a given context. However, it is the intelligent, dynamic scene and event discrimination which lies at the heart of advanced vision systems. Developing a systematic methodology for the design, implementation and integration of such systems is currently a very important research problem. One way of overcoming some of these problems is to build in more knowledge of the scene and tasks. For example, in object detection and tracking in the image, we have demonstrated that it is beneficial to bring scene-based knowledge of expected object trajectories, size and speed into the interpretation process. We have also shown that both scene and task-based knowledge allows for selective processing under attentional control for behavioural evaluation. In addition to this general requirement for integration of information in advanced visual surveillance, we have adopted more specific requirements. A fixed, precalibrated camera model and precomputed ground-plane geometry is used to simplify the interpretation of the scene data in the on-line system. We also adopt a knowledge-based approach in which domain specific models of the dynamic objects, events and behaviour are used to meet the requirement for sensitive and accurate performance.

Journal ArticleDOI
TL;DR: The development of an improved 2-D adaptive lattice algorithm (2-D AL) and its application to the removal of correlated clutter to enhance the detectability of small objects in images is focused on.
Abstract: Two-dimensional (2-D) adaptive filtering is a technique that can be applied to many image processing applications. This paper will focus on the development of an improved 2-D adaptive lattice algorithm (2-D AL) and its application to the removal of correlated clutter to enhance the detectability of small objects in images. The two improvements proposed here are increased flexibility in the calculation of the reflection coefficients and a 2-D method to update the correlations used in the 2-D AL algorithm. The 2-D AL algorithm is shown to predict correlated clutter in image data and the resulting filter is compared with an ideal Wiener-Hopf filter. The results of the clutter removal will be compared to previously published ones for a 2-D least mean square (LMS) algorithm. 2-D AL is better able to predict spatially varying clutter than the 2-D LMS algorithm, since it converges faster to new image properties. Examples of these improvements are shown for a spatially varying 2-D sinusoid in white noise and simulated clouds. The 2-D LMS and 2-D AL algorithms are also shown to enhance a mammogram image for the detection of small microcalcifications and stellate lesions.
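As a simplified analogue of the adaptive clutter prediction described above (the 2-D AL algorithm itself is more involved), here is a 1-D LMS predictor of the kind the paper uses as its baseline: predict each sample from its neighbors, adapt the weights, and keep the prediction error as the clutter-suppressed output. Filter order and step size are illustrative.

```python
def lms_predict(signal, order=2, mu=0.05):
    """1-D LMS linear predictor: predict each sample from the previous
    `order` samples and adapt the weights against the prediction error.
    The residual errors are the clutter-suppressed output; for correlated
    clutter they shrink as the filter converges."""
    w = [0.0] * order
    errors = []
    for n in range(order, len(signal)):
        x = signal[n - order:n]
        y = sum(wi * xi for wi, xi in zip(w, x))   # prediction
        e = signal[n] - y                          # residual
        w = [wi + mu * e * xi for wi, xi in zip(w, x)]
        errors.append(e)
    return errors
```

On a perfectly predictable (constant) signal the residual decays toward zero; the lattice structure in the paper converges faster than this LMS form when the clutter statistics vary spatially.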

Journal ArticleDOI
TL;DR: A novel view representation based on "shape spectrum" features is introduced, and a general and powerful technique for organizing multiple views of objects of complex shape and geometry into compact and homogeneous clusters is proposed.
Abstract: We address the problem of constructing view aspects of 3D free-form objects for efficient matching during recognition. We introduce a novel view representation based on "shape spectrum" features, and propose a general and powerful technique for organizing multiple views of objects of complex shape and geometry into compact and homogeneous clusters. Our view grouping technique obviates the need for surface segmentation and edge detection. Experiments on 6,400 synthetically generated views of 20 free-form objects and 100 real range images of 10 sculpted objects demonstrate the good performance of our shape spectrum based model view selection technique.

Journal ArticleDOI
TL;DR: In this article, a novel geometric approach for 3D object segmentation is presented, which is based on geometric deformable surfaces moving towards the objects to be detected, and the model is related to the computation of surfaces of minimal area (local minimal surfaces).
Abstract: A novel geometric approach for three dimensional object segmentation is presented. The scheme is based on geometric deformable surfaces moving towards the objects to be detected. We show that this model is related to the computation of surfaces of minimal area (local minimal surfaces). The space where these surfaces are computed is induced from the three dimensional image in which the objects are to be detected. The general approach also shows the relation between classical deformable surfaces obtained via energy minimization and geometric ones derived from curvature flows in the surface evolution framework. The scheme is stable, robust, and automatically handles changes in the surface topology during the deformation. Results related to existence, uniqueness, stability, and correctness of the solution to this geometric deformable model are presented as well. Based on an efficient numerical algorithm for surface evolution, we present a number of examples of object detection in real and synthetic images.

Proceedings ArticleDOI
09 Jun 1997
TL;DR: The schema provides a general framework for video object extraction, indexing, and classification, and new video segmentation and tracking algorithms based on salient color and affine motion features are presented.
Abstract: Object segmentation and tracking is a key component of a new generation of digital video representation, transmission, and manipulation. Example applications include content-based video databases and video editing. We present a general schema for video object modeling, which incorporates low-level visual features and hierarchical grouping. The schema provides a general framework for video object extraction, indexing, and classification. In addition, we present new video segmentation and tracking algorithms based on salient color and affine motion features. The color feature is used for intra-frame segmentation; affine motion is used for tracking image segments over time. Experimental evaluation results using several test video streams are included.

Journal ArticleDOI
TL;DR: It is demonstrated that image discrimination models can predict the relative detectability of objects in natural scenes.

Book ChapterDOI
TL;DR: The topic of representation, recovery and manipulation of three-dimensional scenes from two-dimensional images thereof provides fertile ground both for theoretically inclined questions related to the algebra and geometry of the problem and for practical applications such as visual recognition, animation and view synthesis.
Abstract: The topic of representation, recovery and manipulation of three-dimensional (3D) scenes from two-dimensional (2D) images thereof provides fertile ground both for theoretically inclined questions related to the algebra and geometry of the problem and for practical applications such as visual recognition, animation and view synthesis, recovery of scene structure and camera ego-motion, object detection and tracking, multi-sensor alignment, etc.

Patent
19 Nov 1997
TL;DR: In this article, a triggering control device for occupant restraint systems in a vehicle whose triggering readiness can be influenced as a function of collision-relevant parameters is presented, which can detect a collision object before a collision occurs in an area near the vehicle and determine at least a relative speed.
Abstract: The invention relates to a triggering control device for occupant restraint systems in a vehicle whose triggering readiness can be influenced as a function of collision-relevant parameters. According to the invention, collision parameter detection comprises object detection, which can detect a collision object before a collision occurs in an area near the vehicle, and can determine at least a relative speed. Whenever a collision object is detected, a signal generator delivers a collision parameter signal that depends on the relative speed determined. In a preferred embodiment the collision parameter detection unit also determines the intrinsic speed of the vehicle and whenever no collision object is detected, the signal generator supplies a collision parameter signal influenced by the intrinsic speed.

Journal ArticleDOI
TL;DR: A Bayesian approach is adopted in which prior distributions on target scenarios are constructed via dynamical models of the targets of interest, combined with physics-based sensor models which define conditional likelihoods for the coarse/fine scale sensor data given the underlying scene.
Abstract: Proposes a framework for simultaneous detection, tracking, and recognition of objects via data fused from multiple sensors. Complex dynamic scenes are represented via the concatenation of simple rigid templates. The variability of the infinity of pose is accommodated via the actions of matrix Lie groups extending the templates to individual instances. The variability of target number and target identity is accommodated via the representation of scenes as unions of templates of varying types, with the associated group transformations of varying dimension. We focus on recognition in the air-to-ground and ground-to-air scenarios. The remote sensing data is organized around both the coarse scale associated with detection as provided by tracking and range radars, along with the fine scale associated with pose and identity supported by high-resolution optical, forward looking infrared and delay-Doppler radar imagers. A Bayesian approach is adopted in which prior distributions on target scenarios are constructed via dynamical models of the targets of interest. These are combined with physics-based sensor models which define conditional likelihoods for the coarse/fine scale sensor data given the underlying scene. Inference via the Bayes posterior is organized around a random sampling algorithm based on jump-diffusion processes. New objects are detected and object identities are recognized through discrete jump moves through parameter space, the algorithm exploring scenes of varying complexity as it proceeds. Between jumps, the scale and rotation group transformations are generated via continuous diffusions in order to smoothly deform templates into individual instances of objects.

Patent
Yasuhiro Taniguchi1
15 Aug 1997
TL;DR: A moving object detection apparatus includes a movable input section to input a plurality of images in a time series, in which a background area and a moving object are included.
Abstract: A moving object detection apparatus includes a movable input section to input a plurality of images in a time series, in which a background area and a moving object are included. A calculation section divides each input image by unit of predetermined area, and calculates the moving vector between two images in a time series and a corresponding confidence value of the moving vector by unit of the predetermined area. A background area detection section detects a group of the predetermined areas, each of which moves almost equally as the background area from the input image according to the moving vector and the confidence value by unit of the predetermined area. A moving area detection section detects the area other than the background area as the moving area from the input image according to the moving vector of the background area.

Journal ArticleDOI
TL;DR: A novel method to detect three-dimensional objects in arbitrary poses and sizes from a complex image and to simultaneously measure their poses and size using appearance matching is proposed.

Patent
27 Aug 1997
TL;DR: In this article, a method of operation for a motor vehicle object detection system is described, in which the extent angle of an identified target is accurately determined by applying a point source scatterer identification technique to data at the periphery of a composite return amplitude data from one or more complete scans of the sensor beam.
Abstract: A method of operation for a motor vehicle object detection system is described, in which the extent angle of an identified target is accurately determined by applying a point source scatterer identification technique to data at the periphery of a composite return Return amplitude data from one or more complete scans of the sensor beam are collected and compared with a target threshold to identify objects in the viewing angle, thereby forming an array of amplitude data associated with successive beam positions for each identified object In each array, the left-most and right-most pair of amplitude data points associated with successive beam positions are selected and individually used to compute the angle of a point source scatterer which would be responsible for that data pair The computed scatterer angles are taken as the left and right edges of the target and used to determine the angle extent of the identified object, which in turn, enables reliable determination as to whether the identified object is in or out of the vehicle travel path, and what, if any, vehicle response is appropriate to maintain a given headway or avoid a collision with the object

Proceedings ArticleDOI
09 Nov 1997
TL;DR: ARGO integrates the GOLD (generic obstacle and lane detection) system, a stereo vision-based hardware and software architecture that allows detection of both generic obstacles on flat roads and the lane position in structured environments (with painted lane markings).
Abstract: Presents ARGO, the experimental land vehicle developed at the Dipartimento di Ingegneria dell'Informazione of the University of Parma, Italy. ARGO integrates the GOLD (generic obstacle and lane detection) system, a stereo vision-based hardware and software architecture that allows detection of both generic obstacles (without constraints on shape, color or symmetry) on flat roads and the lane position in structured environments (with painted lane markings). In addition, the paper presents an approach that also allows non-flat roads to be handled.

Journal ArticleDOI
TL;DR: In this paper, three different algorithms for mobile robot vision-based obstacle detection are presented, each based on different assumptions, and the performance of the three algorithms under different noise levels is compared in simulation.
Abstract: Three different algorithms for mobile robot vision-based obstacle detection are presented in this paper, each based on different assumptions. The first two algorithms are qualitative in that they return only yes/no answers regarding the presence of obstacles in the field of view; no 3D reconstruction is performed. They have the advantage of fast determination of the existence of obstacles in a scene based on the solvability of a linear system. The first algorithm uses information about the ground plane, while the second only assumes that the ground is planar. The third algorithm is quantitative in that it continuously estimates the ground plane and reconstructs partial 3D structures by determining the height above the ground plane of each point in the scene. Experimental results are presented for real and simulated data, and the performance of the three algorithms under different noise levels is compared in simulation. We conclude that in terms of the robustness of performance, the third algorithm is superior to the other two.
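At its core, the quantitative third algorithm reduces to measuring each reconstructed point's height above the estimated ground plane. A minimal sketch, assuming the plane has already been estimated and is given by unit-normal coefficients (a, b, c, d); the function names and the threshold are illustrative.

```python
def plane_height(point, plane):
    """Signed height of a 3-D point above the plane ax + by + cz + d = 0
    (normal (a, b, c) assumed to have unit length)."""
    a, b, c, d = plane
    x, y, z = point
    return a * x + b * y + c * z + d

def obstacles(points, plane, min_height=0.1):
    """Any reconstructed point whose height above the ground plane
    exceeds a threshold is treated as part of an obstacle."""
    return [p for p in points if plane_height(p, plane) > min_height]
```

With the plane z = 0 expressed as (0, 0, 1, 0), a point's height is simply its z coordinate, which makes the thresholding step easy to verify.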

Journal ArticleDOI
TL;DR: The key idea is to use magnitude, phase, and frequency measures of the Gabor wavelet representation in an innovative flexible matching approach that can provide robust recognition.
Abstract: This paper presents a model-based object recognition approach that uses a Gabor wavelet representation. The key idea is to use magnitude, phase, and frequency measures of the Gabor wavelet representation in an innovative flexible matching approach that can provide robust recognition. The Gabor grid, a topology-preserving map, efficiently encodes both signal energy and structural information of an object in a sparse multiresolution representation. The Gabor grid subsamples the Gabor wavelet decomposition of an object model and is deformed to allow the indexed object model match with similar representation obtained using image data. Flexible matching between the model and the image minimizes a cost function based on local similarity and geometric distortion of the Gabor grid. Grid erosion and repairing is performed whenever a collapsed grid, due to object occlusion, is detected. The results on infrared imagery are presented, where objects undergo rotation, translation, scale, occlusion, and aspect variations under changing environmental conditions.

Patent
Uwe Franke1
05 Sep 1997
TL;DR: A method for detecting and tracking objects by stereo image evaluation is presented, in which structure classes generated from a stereo image pair and a disparity histogram are used to group pixels into detected objects.
Abstract: In a method for detecting and tracking objects by stereo image evaluation a part of structure class images is initially generated from a recorded stereo image pair. Differences in brightness of selected pixels in the environment are determined for each pixel as digital values, which are combined to form a digital value group, with identical groups defining their own structure classes. Structure classes which lack a brightness change along the epipolar line are discarded. Corresponding disparity values are then determined for the pixels in the other structure classes and are collected in a disparity histogram with a given frequency increment. The pixel group that belongs to a given grouping point area of the histogram is then interpreted as an object to be detected.
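The disparity-histogram grouping step can be sketched as follows: bin the disparity values with a fixed increment and treat the pixels that fall in the dominant bin as one object. A minimal sketch with illustrative values; the structure-class filtering that precedes this step in the patent is omitted.

```python
def disparity_histogram(disparities, bin_width=1.0):
    """Accumulate disparity values into bins of the given width."""
    hist = {}
    for d in disparities:
        b = int(d // bin_width)
        hist[b] = hist.get(b, 0) + 1
    return hist

def dominant_object(disparities, bin_width=1.0):
    """Disparities falling in the most frequent bin are interpreted as
    belonging to a single detected object."""
    hist = disparity_histogram(disparities, bin_width)
    peak = max(hist, key=hist.get)
    return [d for d in disparities if int(d // bin_width) == peak]
```

Because pixels on one object sit at roughly the same depth, their disparities cluster in one histogram bin, which is what makes the peak a usable object hypothesis.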

Patent
28 Mar 1997
TL;DR: In this paper, an object detection system including passive sensors for receiving electromagnetic radiation from a moving object and generating intensity signals representative of the received radiation, and a processing system for subtracting the intensity signals to obtain a differential signature representing the position of the moving object.
Abstract: An object detection system including passive sensors for receiving electromagnetic radiation from a moving object and generating intensity signals representative of the received radiation, and a processing system for subtracting the intensity signals to obtain a differential signature representative of the position of the moving object. An image acquisition system including at least one camera for acquiring an image of at least part of a moving object, in response to a trigger signal, and an analysis system for processing the image to locate a region in the image including markings identifying the object and processing the region to extract the markings for optical recognition.