
Showing papers on "Object detection" published in 2002


Journal ArticleDOI
TL;DR: In this article, the authors categorize and evaluate face detection algorithms and discuss relevant issues such as data collection, evaluation metrics and benchmarking, and conclude with several promising directions for future research.
Abstract: Images containing faces are essential to intelligent vision-based human-computer interaction, and research efforts in face processing include face recognition, face tracking, pose estimation and expression recognition. However, many reported methods assume that the faces in an image or an image sequence have been identified and localized. To build fully automated systems that analyze the information contained in face images, robust and efficient face detection algorithms are required. Given a single image, the goal of face detection is to identify all image regions which contain a face, regardless of its 3D position, orientation and lighting conditions. Such a problem is challenging because faces are non-rigid and have a high degree of variability in size, shape, color and texture. Numerous techniques have been developed to detect faces in a single image, and the purpose of this paper is to categorize and evaluate these algorithms. We also discuss relevant issues such as data collection, evaluation metrics and benchmarking. After analyzing these algorithms and identifying their limitations, we conclude with several promising directions for future research.

3,894 citations


Proceedings ArticleDOI
Rainer Lienhart1, J. Maydt1
10 Dec 2002
TL;DR: This paper introduces a novel set of rotated Haar-like features that significantly enrich the simple features of the Viola et al. scheme, which is based on a boosted cascade of simple feature classifiers.
Abstract: Recently Viola et al. [2001] have introduced a rapid object detection scheme based on a boosted cascade of simple feature classifiers. In this paper we introduce a novel set of rotated Haar-like features. These novel features significantly enrich the simple features of Viola et al. and can also be calculated efficiently. With these new rotated features our sample face detector shows on average a 10% lower false alarm rate at a given hit rate. We also present a novel post optimization procedure for a given boosted cascade improving on average the false alarm rate further by 12.5%.

3,133 citations
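
As a rough illustration of why upright Haar-like features of the kind mentioned in the entry above are cheap to compute, the sketch below builds an integral image (summed-area table) and evaluates a two-rectangle feature with a fixed number of table lookups. It covers only the upright case; the rotated features the paper introduces need a rotated summed-area table that is not shown here. The function names and the toy image are assumptions made for illustration, not code from the paper.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for a 2D list of numbers."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half (hypothetical edge feature); w must be even."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Toy image (an assumption): bright left half, dark right half.
toy = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(toy)
feature = two_rect_feature(ii, 0, 0, 4, 3)
```

Each rectangle sum costs four lookups regardless of rectangle size, which is what makes evaluating many such features per window practical.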


Journal ArticleDOI
07 Nov 2002
TL;DR: This paper constructs a statistical representation of the scene background that supports sensitive detection of moving objects in the scene, but is robust to clutter arising out of natural scene variations.
Abstract: Automatic understanding of events happening at a site is the ultimate goal for many visual surveillance systems. Higher level understanding of events requires that certain lower level computer vision tasks be performed. These may include detection of unusual motion, tracking targets, labeling body parts, and understanding the interactions between people. To achieve many of these tasks, it is necessary to build representations of the appearance of objects in the scene. This paper focuses on two issues related to this problem. First, we construct a statistical representation of the scene background that supports sensitive detection of moving objects in the scene, but is robust to clutter arising out of natural scene variations. Second, we build statistical representations of the foreground regions (moving objects) that support their tracking and support occlusion reasoning. The probability density functions (pdfs) associated with the background and foreground are likely to vary from image to image and will not in general have a known parametric form. We accordingly utilize general nonparametric kernel density estimation techniques for building these statistical representations of the background and the foreground. These techniques estimate the pdf directly from the data without any assumptions about the underlying distributions. Example results from applications are presented.

1,539 citations
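
The nonparametric background model described in the entry above can be sketched, under simplifying assumptions, as a kernel density estimate over recent values of a single pixel. The Gaussian kernel, the fixed bandwidth, the threshold and the single grayscale channel are all assumptions chosen for illustration; the paper's actual kernel and bandwidth selection are not given in the abstract.

```python
import math


def background_density(value, samples, bandwidth):
    """Gaussian kernel density estimate of `value` given recent background samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    total = sum(math.exp(-0.5 * ((value - s) / bandwidth) ** 2) for s in samples)
    return norm * total


def is_foreground(value, samples, bandwidth=5.0, threshold=1e-3):
    """Flag the pixel as foreground when its estimated background density is low."""
    return background_density(value, samples, bandwidth) < threshold


# Hypothetical past intensities of one pixel.
history = [100, 102, 98, 101, 99, 103, 100]
near = is_foreground(101, history)   # value close to the history
far = is_foreground(200, history)    # value far from the history
```

Because the density is estimated directly from the samples, no parametric form for the background distribution has to be assumed.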


Journal ArticleDOI
TL;DR: This work focuses on detection algorithms that assume multivariate normal distribution models for HSI data and presents some results which illustrate the performance of some detection algorithms using real hyperspectral imaging (HSI) data.
Abstract: We introduce key concepts and issues including the effects of atmospheric propagation upon the data, spectral variability, mixed pixels, and the distinction between classification and detection algorithms. Detection algorithms for full pixel targets are developed using the likelihood ratio approach. Subpixel target detection, which is more challenging due to background interference, is pursued using both statistical and subspace models for the description of spectral variability. Finally, we provide some results which illustrate the performance of some detection algorithms using real hyperspectral imaging (HSI) data. Furthermore, we illustrate the potential deviation of HSI data from normality and point to some distributions that may serve in the development of algorithms with better or more robust performance. We therefore focus on detection algorithms that assume multivariate normal distribution models for HSI data.

1,170 citations


Journal ArticleDOI
TL;DR: Algorithms are presented for vision-based detection and classification of vehicles in monocular image sequences of traffic scenes recorded by a stationary camera; the method is based on establishing correspondences between regions and vehicles as the vehicles move through the image sequence.
Abstract: This paper presents algorithms for vision-based detection and classification of vehicles in monocular image sequences of traffic scenes recorded by a stationary camera. Processing is done at three levels: raw images, region level, and vehicle level. Vehicles are modeled as rectangular patches with certain dynamic behavior. The proposed method is based on the establishment of correspondences between regions and vehicles, as the vehicles move through the image sequence. Experimental results from highway scenes are provided which demonstrate the effectiveness of the method. We also briefly describe an interactive camera calibration tool that we have developed for recovering the camera parameters using features in the image selected by the user.

833 citations


Proceedings ArticleDOI
TL;DR: The construction of the "v-disparity" image, its main properties, and the obstacle detection method, which is able to cope with uphill and downhill gradients and dynamic pitching of the vehicle, are explained.
Abstract: Presents a road obstacle detection method able to cope with uphill and downhill gradients and dynamic pitching of the vehicle. Our approach is based on the construction and investigation of the "v-disparity" image which provides a good representation of the geometric content of the road scene. The advantage of this image is that it provides semi-global matching and is able to perform robust obstacle detection even in the case of partial occlusion or errors committed during the matching process. Furthermore, this detection is performed without any explicit extraction of coherent structures. This paper explains the construction of the "v-disparity" image, its main properties, and the obstacle detection method. The longitudinal profile of the road is estimated and the objects located above the road surface are then extracted as potential obstacles; subsequently, the accurate detection of road obstacles, in particular the position of tyre-road contact points is computed in a precise manner. The whole process is performed at frame rate with a current-day PC. Our experimental findings and comparisons with the results obtained using a flat geometry hypothesis show the benefits of our approach.

734 citations
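
To make the "v-disparity" construction in the entry above more concrete, the sketch below accumulates, for each image row v, a histogram of the disparity values in that row. The integer disparity map, the function name and the toy scene are assumptions for illustration; the road-profile estimation and obstacle extraction steps from the paper are not shown.

```python
def v_disparity(disparity_map, max_disparity):
    """For each image row v, count how many pixels take each disparity value."""
    image = []
    for row in disparity_map:
        hist = [0] * (max_disparity + 1)
        for d in row:
            if 0 <= d <= max_disparity:
                hist[d] += 1
        image.append(hist)
    return image


# Hypothetical integer disparity map: a road whose disparity grows toward
# the bottom rows, plus an obstacle of constant disparity 3 in rows 0-1.
disparity_map = [
    [0, 3, 3, 0],
    [1, 3, 3, 1],
    [2, 2, 2, 2],
    [3, 3, 3, 3],
]
vd = v_disparity(disparity_map, max_disparity=3)
```

In this toy case the road shows up as a slanted line (disparity increasing with the row index) and the obstacle as a vertical segment at a single disparity, which is the kind of geometric structure the abstract refers to.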


Book ChapterDOI
28 May 2002
TL;DR: An approach for learning to detect objects in still gray images, that is based on a sparse, part-based representation of objects, that achieves high detection accuracy on a difficult test set of real-world images, and is highly robust to partial occlusion and background variation.
Abstract: We present an approach for learning to detect objects in still gray images, that is based on a sparse, part-based representation of objects. A vocabulary of information-rich object parts is automatically constructed from a set of sample images of the object class of interest. Images are then represented using parts from this vocabulary, along with spatial relations observed among them. Based on this representation, a feature-efficient learning algorithm is used to learn to detect instances of the object class. The framework developed can be applied to any object with distinguishable parts in a relatively fixed spatial configuration. We report experiments on images of side views of cars. Our experiments show that the method achieves high detection accuracy on a difficult test set of real-world images, and is highly robust to partial occlusion and background variation. In addition, we discuss and offer solutions to several methodological issues that are significant for the research community to be able to evaluate object detection approaches.

605 citations


Journal ArticleDOI
TL;DR: The ability of SVM to outperform several well-known methods developed for the widely studied problem of MC detection suggests that SVM is a promising technique for object detection in a medical imaging application.
Abstract: We investigate an approach based on support vector machines (SVMs) for detection of microcalcification (MC) clusters in digital mammograms, and propose a successive enhancement learning scheme for improved performance. SVM is a machine-learning method, based on the principle of structural risk minimization, which performs well when applied to data outside the training set. We formulate MC detection as a supervised-learning problem and apply SVM to develop the detection algorithm. We use the SVM to detect at each location in the image whether an MC is present or not. We tested the proposed method using a database of 76 clinical mammograms containing 1120 MCs. We use free-response receiver operating characteristic curves to evaluate detection performance, and compare the proposed algorithm with several existing methods. In our experiments, the proposed SVM framework outperformed all the other methods tested. In particular, a sensitivity as high as 94% was achieved by the SVM method at an error rate of one false-positive cluster per image. The ability of SVM to outperform several well-known methods developed for the widely studied problem of MC detection suggests that SVM is a promising technique for object detection in a medical imaging application.

574 citations


Journal ArticleDOI
Rainer Lienhart1, A. Wernicke
TL;DR: This work proposes a novel method for localizing and segmenting text in complex images and videos that is not only able to locate and segment text occurrences into large binary images, but is also able to track each text line with sub-pixel accuracy over the entire occurrence in a video.
Abstract: Many images, especially those used for page design on Web pages, as well as videos contain visible text. If these text occurrences could be detected, segmented, and recognized automatically, they would be a valuable source of high-level semantics for indexing and retrieval. We propose a novel method for localizing and segmenting text in complex images and videos. Text lines are identified by using a complex-valued multilayer feed-forward network trained to detect text at a fixed scale and position. The network's output at all scales and positions is integrated into a single text-saliency map, serving as a starting point for candidate text lines. In the case of video, these candidate text lines are refined by exploiting the temporal redundancy of text in video. Localized text lines are then scaled to a fixed height of 100 pixels and segmented into a binary image with black characters on white background. For videos, temporal redundancy is exploited to improve segmentation performance. Input images and videos can be of any size due to a true multiresolution approach. Moreover, the system is not only able to locate and segment text occurrences into large binary images, but is also able to track each text line with sub-pixel accuracy over the entire occurrence in a video, so that one text bitmap is created for all instances of that text line. Therefore, our text segmentation results can also be used for object-based video encoding such as that enabled by MPEG-4.

478 citations


Journal ArticleDOI
TL;DR: An efficient moving object segmentation algorithm suitable for real-time content-based multimedia communication systems is proposed and a processing speed of 25 QCIF fps can be achieved on a personal computer with a 450-MHz Pentium III processor.
Abstract: An efficient moving object segmentation algorithm suitable for real-time content-based multimedia communication systems is proposed in this paper. First, a background registration technique is used to construct a reliable background image from the accumulated frame difference information. The moving object region is then separated from the background region by comparing the current frame with the constructed background image. Finally, a post-processing step is applied on the obtained object mask to remove noise regions and to smooth the object boundary. In situations where object shadows appear in the background region, a pre-processing gradient filter is applied on the input image to reduce the shadow effect. In order to meet the real-time requirement, no computationally intensive operation is included in this method. Moreover, the implementation is optimized using parallel processing and a processing speed of 25 QCIF fps can be achieved on a personal computer with a 450-MHz Pentium III processor. Good segmentation performance is demonstrated by the simulation results.

441 citations
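
One plausible reading of the background registration idea in the entry above is sketched below: a pixel that stays nearly unchanged for several consecutive frames is registered into the background image, and the object mask is then obtained by comparing the current frame with that background. The thresholds, the stationarity counter and the 1D toy frames are assumptions for illustration; the paper's post-processing and shadow handling are omitted.

```python
def update_background(prev, curr, still_count, background, diff_th=10, still_frames=3):
    """Register a pixel into the background once it has been stationary long enough."""
    for i, (p, c) in enumerate(zip(prev, curr)):
        still_count[i] = still_count[i] + 1 if abs(c - p) < diff_th else 0
        if still_count[i] >= still_frames:
            background[i] = c


def object_mask(curr, background, diff_th=10):
    """Mark pixels that differ from a registered background value."""
    return [b is not None and abs(c - b) >= diff_th for c, b in zip(curr, background)]


# Hypothetical 1D "frames" of four pixels; the last frame has an object at index 2.
frames = [[50, 50, 50, 50]] * 4 + [[50, 50, 200, 50]]
still_count = [0] * 4
background = [None] * 4
for prev, curr in zip(frames, frames[1:]):
    update_background(prev, curr, still_count, background)
mask = object_mask(frames[-1], background)
```

Only differences, comparisons and counters are used per pixel, in keeping with the abstract's point that no computationally intensive operation is needed.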


Proceedings ArticleDOI
17 May 2002
TL;DR: This work presents a novel approach to dynamic datarace detection for multithreaded object-oriented programs that results in very few false positives and runtime overhead in the 13% to 42% range, making it both efficient and precise.
Abstract: We present a novel approach to dynamic datarace detection for multithreaded object-oriented programs. Past techniques for on-the-fly datarace detection either sacrificed precision for performance, leading to many false positive datarace reports, or maintained precision but incurred significant overheads in the range of 3x to 30x. In contrast, our approach results in very few false positives and runtime overhead in the 13% to 42% range, making it both efficient and precise. This performance improvement is the result of a unique combination of complementary static and dynamic optimization techniques.

Journal ArticleDOI
TL;DR: It is demonstrated that, by recognizing the properties of the structures present in the image, one can infer the scale of the scene and, therefore, its absolute mean depth.
Abstract: In the absence of cues for absolute depth measurements as binocular disparity, motion, or defocus, the absolute distance between the observer and a scene cannot be measured. The interpretation of shading, edges, and junctions may provide a 3D model of the scene but it will not provide information about the actual "scale" of the space. One possible source of information for absolute depth estimation is the image size of known objects. However, object recognition, under unconstrained conditions, remains difficult and unreliable for current computational approaches. We propose a source of information for absolute depth estimation based on the whole scene structure that does not rely on specific objects. We demonstrate that, by recognizing the properties of the structures present in the image, we can infer the scale of the scene and, therefore, its absolute mean depth. We illustrate the interest in computing the mean depth of the scene with application to scene recognition and object detection.

Journal ArticleDOI
TL;DR: A novel algorithm for segmentation of moving objects in video sequences and extraction of video object planes (VOPs) based on connected components analysis and smoothness of VO displacement in successive frames is proposed.
Abstract: The new video-coding standard MPEG-4 enables content-based functionality, as well as high coding efficiency, by taking into account shape information of moving objects. A novel algorithm for segmentation of moving objects in video sequences and extraction of video object planes (VOPs) is proposed. For the case of multiple video objects in a scene, the extraction of a specific single video object (VO) based on connected components analysis and smoothness of VO displacement in successive frames is also discussed. Our algorithm begins with a robust double-edge map derived from the difference between two successive frames. After removing edge points which belong to the previous frame, the remaining edge map, moving edge (ME), is used to extract the VOP. The proposed algorithm is evaluated on an indoor sequence captured by a low-end camera as well as MPEG-4 test sequences and produces promising results.

Proceedings ArticleDOI
08 Apr 2002
TL;DR: A new classification of the most important and commonly used edge detection algorithms, namely ISEF, Canny, Marr-Hildreth, Sobel, Kirsch, Lapla1 and Lapla2, is introduced.
Abstract: Since edge detection is in the forefront of image processing for object detection, it is crucial to have a good understanding of edge detection algorithms. This paper introduces a new classification of the most important and commonly used edge detection algorithms, namely ISEF, Canny, Marr-Hildreth, Sobel, Kirsch, Lapla1 and Lapla2. Five categories are included in our classification, and the advantages and disadvantages of some available algorithms within each category are discussed. A representative group containing the above seven algorithms is then implemented in C++ and compared subjectively, using 30 images out of 100 images. Two sets of images resulting from the application of those algorithms are then presented. It is shown that under noisy conditions, ISEF, Canny, Marr-Hildreth, Kirsch, Sobel, Lapla2 and Lapla1 perform best, in that order.

Journal ArticleDOI
07 Nov 2002
TL;DR: This paper surveys the most advanced approaches to (partial) customization of the road following task, using on-board systems based on artificial vision, and describes the functionalities of lane detection, obstacle detection and pedestrian detection.
Abstract: The last few decades have witnessed the birth and growth of a new sensibility to transportation efficiency. In particular the need for efficient and improved people and goods mobility has pushed researchers to address the problem of intelligent transportation systems. This paper surveys the most advanced approaches to (partial) customization of the road following task, using on-board systems based on artificial vision. The functionalities of lane detection, obstacle detection and pedestrian detection are described and classified, and their possible application in future road vehicles is discussed.

Proceedings ArticleDOI
10 Dec 2002
TL;DR: Efficient post-processing techniques namely noise removal, shape criteria, elliptic curve fitting and face/non-face classification are proposed in order to further refine skin segmentation results for the purpose of face detection.
Abstract: This paper presents a new human skin color model in YCbCr color space and its application to human face detection. Skin colors are modeled by a set of three Gaussian clusters, each of which is characterized by a centroid and a covariance matrix. The centroids and covariance matrices are estimated from large set of training samples after a k-means clustering process. Pixels in a color input image can be classified into skin or non-skin based on the Mahalanobis distances to the three clusters. Efficient post-processing techniques namely noise removal, shape criteria, elliptic curve fitting and face/non-face classification are proposed in order to further refine skin segmentation results for the purpose of face detection.
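
The classification step described in this abstract can be sketched as follows, assuming for brevity that only the two chrominance components (Cb, Cr) are used and that a pixel counts as skin when it falls within a fixed Mahalanobis distance of any cluster. The cluster centroids, covariances and the threshold are hypothetical values, not parameters from the paper.

```python
def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance for 2D data with a 2x2 covariance matrix."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def is_skin(point, clusters, max_dist_sq=9.0):
    """Skin if the pixel lies within the distance threshold of any cluster."""
    return any(mahalanobis_sq(point, m, c) <= max_dist_sq for m, c in clusters)


# Hypothetical (Cb, Cr) centroids and covariances; not values from the paper.
clusters = [
    ((110.0, 150.0), ((40.0, 0.0), (0.0, 30.0))),
    ((105.0, 160.0), ((25.0, 5.0), (5.0, 20.0))),
    ((115.0, 140.0), ((30.0, -4.0), (-4.0, 35.0))),
]
at_centroid = is_skin((110.0, 150.0), clusters)
far_away = is_skin((200.0, 50.0), clusters)
```

In practice the centroids and covariance matrices would come from k-means clustering of training samples, as the abstract states.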

Proceedings ArticleDOI
TL;DR: A real time pedestrian detection system that works on low quality infrared videos and introduces probabilistic templates to capture the variations in human shape, especially for the case where contrast is low and body parts are missing.
Abstract: In this paper we present a real time pedestrian detection system that works on low quality infrared videos. We introduce probabilistic templates to capture the variations in human shape, especially for the case where contrast is low and body parts are missing. We present experimental results on infrared videos taken from a moving vehicle in various urban street scenarios to demonstrate the feasibility of the approach.

Book ChapterDOI
28 May 2002
TL;DR: A system that is capable of segmenting, detecting and tracking multiple people in a cluttered scene using multiple synchronized cameras located far from each other and a scheme for combining evidences gathered from different camera pairs using occlusion analysis so as to obtain a globally optimum detection and tracking of objects.
Abstract: We present a system that is capable of segmenting, detecting and tracking multiple people in a cluttered scene using multiple synchronized cameras located far from each other. The system improves upon existing systems in many ways including: (1) We do not assume that a foreground connected component belongs to only one object; rather, we segment the views taking into account color models for the objects and the background. This helps us to not only separate foreground regions belonging to different objects, but to also obtain better background regions than traditional background subtraction methods (as it uses foreground color models in the algorithm). (2) It is fully automatic and does not require any manual input or initializations of any kind. (3) Instead of taking decisions about object detection and tracking from a single view or camera pair, we collect evidence from each pair and combine the evidence to obtain a decision in the end. This helps us to obtain much better detection and tracking as opposed to traditional systems. Several innovations help us tackle the problem. The first is the introduction of a region-based stereo algorithm that is capable of finding 3D points inside an object if we know the regions belonging to the object in two views. No exact point matching is required. This is especially useful in wide baseline camera systems where exact point matching is very difficult due to self-occlusion and a substantial change in viewpoint. The second contribution is the development of a scheme for setting priors for use in segmentation of a view using Bayesian classification. The scheme, which assumes knowledge of approximate shape and location of objects, dynamically assigns priors for different objects at each pixel so that occlusion information is encoded in the priors. The third contribution is a scheme for combining evidence gathered from different camera pairs using occlusion analysis so as to obtain a globally optimum detection and tracking of objects. The system has been tested using different densities of people in the scene, which helps us to determine the number of cameras required for a particular density of people.

Proceedings ArticleDOI
05 Dec 2002
TL;DR: A set of methods for multi-view image tracking using a set of calibrated cameras is presented; the system is shown to be robust in resolving dynamic and static object occlusions and in tracking objects between overlapping and non-overlapping camera views.
Abstract: The paper presents a set of methods for multi view image tracking using a set of calibrated cameras. We demonstrate how effective the approach is for resolving occlusions and tracking objects between overlapping and non-overlapping camera views. Moving objects are initially detected using background subtraction. Temporal alignment is then performed between each video sequence in order to compensate for the different processing rates of each camera. A Kalman filter is used to track each object in 3D world coordinates and 2D image coordinates. Information is shared between the 2D/3D trackers of each camera view in order to improve the performance of object tracking and trajectory prediction. The system is shown to be robust in resolving dynamic and static object occlusions. Results are presented from a variety of outdoor surveillance video sequences.
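
For readers unfamiliar with the Kalman filter mentioned in this abstract, the following is a minimal one-dimensional sketch with a constant-velocity state and a position-only measurement. The paper tracks objects in 3D world and 2D image coordinates; the 1D state, the noise values `q` and `r`, and the function name are assumptions made to keep the example short.

```python
def kalman_step(x, P, z, dt=1.0, q=1e-2, r=1.0):
    """One predict/update cycle for state [position, velocity] and a position measurement."""
    pos, vel = x
    (p00, p01), (p10, p11) = P
    # Predict with F = [[1, dt], [0, 1]] and process noise q * I (an assumption).
    pos_p = pos + dt * vel
    a00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
    a01 = p01 + dt * p11
    a10 = p10 + dt * p11
    a11 = p11 + q
    # Update with H = [1, 0] and measurement noise variance r.
    s = a00 + r
    k0, k1 = a00 / s, a10 / s
    resid = z - pos_p
    x_new = (pos_p + k0 * resid, vel + k1 * resid)
    P_new = (
        ((1 - k0) * a00, (1 - k0) * a01),
        (a10 - k1 * a00, a11 - k1 * a01),
    )
    return x_new, P_new


# Hypothetical target moving one unit per frame, observed without noise.
x, P = (0.0, 0.0), ((10.0, 0.0), (0.0, 10.0))
for z in range(1, 21):
    x, P = kalman_step(x, P, float(z))
```

The predicted state from the first half of each step is also what a tracker would use for trajectory prediction when a measurement is missing, for example during an occlusion.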

Journal ArticleDOI
TL;DR: A new method for automatic segmentation of moving objects in image sequences for VOP extraction using a Markov random field, based on motion information, spatial information and the memory is presented.
Abstract: The emerging video coding standard MPEG-4 enables various content-based functionalities for multimedia applications. To support such functionalities, as well as to improve coding efficiency, MPEG-4 relies on a decomposition of each frame of an image sequence into video object planes (VOP). Each VOP corresponds to a single moving object in the scene. This paper presents a new method for automatic segmentation of moving objects in image sequences for VOP extraction. We formulate the problem as graph labeling over a region adjacency graph (RAG), based on motion information. The label field is modeled as a Markov random field (MRF). An initial spatial partition of each frame is obtained by a fast, floating-point based implementation of the watershed algorithm. The motion of each region is estimated by hierarchical region matching. To avoid inaccuracies in occlusion areas, a novel motion validation scheme is presented. A dynamic memory, based on object tracking, is incorporated into the segmentation process to maintain temporal coherence of the segmentation. Finally, a labeling is obtained by maximization of the a posteriori probability of the MRF using motion information, spatial information and the memory. The optimization is carried out by highest confidence first (HCF). Experimental results for several video sequences demonstrate the effectiveness of the proposed approach.

Journal ArticleDOI
TL;DR: To solve the problem whereby weak targets are shadowed by the sidelobes of strong ones, a new implementation of the CLEAN technique is proposed based on filtering in the fractional Fourier domain, and strong moving targets and weak ones can be detected iteratively.
Abstract: As a useful signal processing technique, the fractional Fourier transform (FrFT) is largely unknown to the radar signal processing community. In this correspondence, the FrFT is applied to airborne synthetic aperture radar (SAR) slow-moving target detection. For airborne SAR, the echo from a ground moving target can be regarded approximately as a chirp signal, and the FrFT is a way to concentrate the energy of a chirp signal. Therefore, the FrFT presents a potentially effective technique for ground moving target detection in airborne SAR. Compared with the common Wigner-Ville distribution (WVD) algorithm, the FrFT is a linear operator, and will not be influenced by cross-terms even if multiple moving targets exist. Moreover, to solve the problem whereby weak targets are shadowed by the sidelobes of strong ones, a new implementation of the CLEAN technique is proposed based on filtering in the fractional Fourier domain. In this way strong moving targets and weak ones can be detected iteratively. This combined method is demonstrated by using raw clutter data combined with simulated moving targets.

Journal ArticleDOI
TL;DR: These interactions differed from those found during object identification in sensory-specific areas and possibly in the superior colliculus, indicating that the neural operations governing multisensory integration depend crucially on the nature of the perceptual processes involved.
Abstract: Very recently, a number of neuroimaging studies in humans have begun to investigate the question of how the brain integrates information from different sensory modalities to form unified percepts. Already, intermodal neural processing appears to depend on the modalities of inputs or the nature (speech/non-speech) of information to be combined. Yet, the variety of paradigms, stimuli and techniques used make it difficult to understand the relationships between the factors operating at the perceptual level and the underlying physiological processes. In a previous experiment, we used event-related potentials to describe the spatio-temporal organization of audio-visual interactions during a bimodal object recognition task. Here we examined the network of cross-modal interactions involved in simple detection of the same objects. The objects were defined either by unimodal auditory or visual features alone, or by the combination of the two features. As expected, subjects detected bimodal stimuli more rapidly than either unimodal stimuli. Combined analysis of potentials, scalp current densities and dipole modeling revealed several interaction patterns within the first 200 ms post-stimulus: in occipito-parietal visual areas (45-85 ms), in deep brain structures, possibly the superior colliculus (105-140 ms), and in right temporo-frontal regions (170-185 ms). These interactions differed from those found during object identification in sensory-specific areas and possibly in the superior colliculus, indicating that the neural operations governing multisensory integration depend crucially on the nature of the perceptual processes involved.

Patent
15 Mar 2002
TL;DR: An interruption free navigator includes an inertial measurement unit, a north finder, a velocity producer, a positioning assistant, a navigation processor, an altitude measurement, an object detection system, a wireless communication device, and a display device and map database.
Abstract: An interruption free navigator includes an inertial measurement unit, a north finder, a velocity producer, a positioning assistant, a navigation processor, an altitude measurement, an object detection system, a wireless communication device, and a display device and map database. Output signals of the inertial measurement unit, the velocity producer, the positioning assistant, the altitude measurement, the object detection system, and the north finder are processed to obtain highly accurate position measurements of the person. The user's position information can be exchanged with other users through the wireless communication device, and the location and surrounding information can be displayed on the display device by accessing a map database with the person position information.

Proceedings ArticleDOI
11 Aug 2002
TL;DR: A set of seven metrics is proposed for quantifying different aspects of a detection algorithm's performance; the metrics will be used to evaluate algorithms for detecting text, faces, moving people and vehicles.
Abstract: The continuous development of object detection algorithms is ushering in the need for evaluation tools to quantify algorithm performance. In this paper a set of seven metrics is proposed for quantifying different aspects of a detection algorithm's performance. The strengths and weaknesses of these metrics are described. They are implemented in the Video Performance Evaluation Resource (ViPER) system and will be used to evaluate algorithms for detecting text, faces, moving people and vehicles. Results for running two previous text-detection algorithms on a common data set are presented.

Journal ArticleDOI
TL;DR: An object detection method achieved by the fusion of millimeter-wave radar and a single video camera is proposed, considered as the least expensive solution because at least one camera is necessary for lane marking detection.
Abstract: In order to avoid collision with an object that blocks the course of a vehicle, measuring the distance to it and detecting the positions of its side boundaries are necessary. In this paper, an object detection method achieved by the fusion of millimeter-wave radar and a single video camera is proposed. We consider the method the least expensive solution because at least one camera is necessary for lane marking detection. In the method, the distance is measured by the radar, and the boundaries are found from an image sequence, based on a motion stereo technique with the help of the distance measured by the radar. Since the method does not depend on the appearance of objects, it is capable of detecting not only an automobile but also other objects. Object detection by the method was confirmed through an experiment in which both a stationary and a moving object were detected, and a pedestrian as well as a vehicle was detected.

Patent
07 Oct 2002
TL;DR: Systems and methods are presented for determining a set of sub-classifiers for a detector of an object detection program, where each sub-classifier is trained on a candidate subset of coefficients that result from a transform operation performed on a 2D digitized image.
Abstract: Systems and methods for determining a set of sub-classifiers for a detector of an object detection program are presented. According to one embodiment, the system may include a candidate coefficient-subset creation module, a training module in communication with the candidate coefficient-subset creation module, and a sub-classifier selection module in communication with the training module. The candidate coefficient-subset creation module may create a plurality of candidate subsets of coefficients. The coefficients are the result of a transform operation performed on a two-dimensional (2D) digitized image, and represent corresponding visual information from the 2D image that is localized in space, frequency, and orientation. The training module may train a sub-classifier for each of the plurality of candidate subsets of coefficients. The sub-classifier selection module may select certain of the plurality of sub-classifiers. The selected sub-classifiers may comprise the components of the detector. Also presented are systems and methods for detecting instances of an object in a 2D (two-dimensional) image.

Patent
08 Aug 2002
TL;DR: A system is provided for detecting blockage of an automotive side object detection system (SODS). It includes a blockage detection processor, which is operative to determine whether an RF leakage signal level sensed between transmit and receive antennas of the system substantially matches one or more of a plurality of pattern recognition information curves.
Abstract: A system is provided for detecting blockage of an automotive side object detection system (“SODS”). The system includes a blockage detection processor, which is operative to determine whether an RF leakage signal level sensed between transmit and receive antennas of the system substantially matches one or more of a plurality of pattern recognition information curves. If it is determined that the leakage signal level substantially matches one or more of a plurality of pattern recognition information curves, a blocked condition of the SODS is declared, as may be caused by mud, salt, ice, etc. The blockage detection processor is further operative to determine whether the leakage signal exceeds a predetermined blockage threshold level. If the leakage exceeds the predetermined blockage threshold level, a blocked condition of the SODS is also declared.
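
The two blockage conditions described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the use of an RMS difference to decide whether a signal "substantially matches" a curve, and the per-sample threshold test are all assumptions made for the example.

```python
from math import sqrt


def matches_pattern(leakage, pattern, tolerance):
    """Return True if the sampled leakage curve is close to a stored pattern.

    Assumption: "substantially matches" is modelled as an RMS difference
    no larger than `tolerance`; the abstract does not define the criterion.
    """
    if not leakage or len(leakage) != len(pattern):
        return False
    squared = sum((a - b) ** 2 for a, b in zip(leakage, pattern))
    return sqrt(squared / len(leakage)) <= tolerance


def is_blocked(leakage, patterns, tolerance, threshold):
    """Declare a blocked condition if either test from the abstract fires."""
    # Test 1 (assumed per-sample): leakage exceeds the blockage threshold.
    if any(level > threshold for level in leakage):
        return True
    # Test 2: leakage matches at least one stored pattern curve.
    return any(matches_pattern(leakage, p, tolerance) for p in patterns)
```

For instance, `is_blocked([1.0, 1.0], [[1.0, 1.0]], tolerance=0.1, threshold=5.0)` reports a blockage through the pattern test, while a reading of `[6.0, 0.0]` reports one through the threshold test even with no stored patterns.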

Proceedings ArticleDOI
03 Dec 2002
TL;DR: An in-vehicle real-time monocular precrash vehicle detection system that uses multi-scale techniques both to speed up detection and to improve system robustness by making system performance less sensitive to the choice of certain parameters.
Abstract: This paper presents an in-vehicle real-time monocular precrash vehicle detection system. The system acquires grey level images through a forward facing low light camera and achieves an average detection rate of 10 Hz. The vehicle detection algorithm consists of two main steps: multi-scale driven hypothesis generation and appearance-based hypothesis verification. In the multi-scale hypothesis generation step, possible image locations where vehicles might be present are hypothesized. This step uses multi-scale techniques not only to speed up detection but also to improve system robustness by making system performance less sensitive to the choice of certain parameters. Appearance-based hypothesis verification verifies those hypotheses using Haar Wavelet decomposition for feature extraction and Support Vector Machines (SVMs) for classification. The monocular system was tested under different traffic scenarios (e.g., simply structured highway, complex urban street, varying weather conditions), illustrating good performance.
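
As a rough illustration of the feature-extraction step named above, the following sketch computes one level of a 2D Haar wavelet decomposition of a small grey level patch. It is only an assumed, simplified form (averaging normalization, a single level, even patch dimensions); the abstract does not specify the normalization, number of levels, or which coefficients are passed to the SVM.

```python
def haar_1d(values):
    """One level of an averaging Haar transform on an even-length sequence.

    Returns the pairwise averages followed by the pairwise half-differences.
    """
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages + details


def haar_2d(image):
    """Apply haar_1d to every row, then to every column of the result."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    # Transpose back so the output has the same layout as the input.
    return [list(row) for row in zip(*cols)]


def haar_features(image):
    """Flatten the coefficients into one vector (hypothetical SVM input)."""
    return [c for row in haar_2d(image) for c in row]
```

The top-left quadrant of the output holds the local averages and the other quadrants hold horizontal, vertical and diagonal detail coefficients, so the flattened vector describes the patch at a coarser scale plus its edge structure.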

Proceedings ArticleDOI
Dariu M. Gavrila1, Jan Giebel1
TL;DR: This paper presents a large scale experimental study on pedestrian detection using the Chamfer System, a generic system for shape-based object recognition that is shown to be quite promising.
Abstract: This paper presents a large scale experimental study on pedestrian detection. The focus of the study is the Chamfer System, a generic system for shape-based object recognition. Matching involves a simultaneous coarse-to-fine approach over a template hierarchy and over the transformation parameters based on correlation with (chamfer) distance-transformed images. Candidate solutions are verified by a neural network with local receptive fields, using a richer set of texture features. Detection is supplemented by an alpha-beta tracker which integrates results over time; the tracker compensates for momentarily missing detections due to image noise or occlusions. For this study, an extensive database of 4762 pedestrian images was compiled with precise ground-truth data. System performance was analyzed by several ROC curves. Although not viable for real-world deployment yet, system performance is shown to be quite promising.
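
The alpha-beta tracker mentioned in this abstract is a standard fixed-gain filter, and its ability to bridge missing detections can be sketched in one dimension as below. The function name, the gain values, and the use of `None` to mark a frame without a detection are assumptions for illustration; the paper's actual state vector and gains are not given in the abstract.

```python
def alpha_beta_track(measurements, dt=1.0, alpha=0.5, beta=0.5):
    """Run a 1D alpha-beta filter over a sequence of position measurements.

    A measurement of None marks a frame with no detection; in that case
    the filter coasts on its constant-velocity prediction.
    """
    estimates = []
    position = None  # no track until the first detection arrives
    velocity = 0.0
    for z in measurements:
        if position is None:
            if z is not None:
                position = float(z)  # initialize the track
            estimates.append(position)
            continue
        predicted = position + velocity * dt
        if z is None:
            position = predicted  # missing detection: keep the prediction
        else:
            residual = z - predicted
            position = predicted + alpha * residual
            velocity = velocity + (beta / dt) * residual
        estimates.append(position)
    return estimates
```

With the measurements `[0.0, 1.0, 2.0, None, 4.0]`, the fourth estimate is produced purely from the predicted motion, which is the sense in which such a tracker compensates for a momentarily missing detection.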

Book ChapterDOI
TL;DR: A novel color texture-based method for object detection in images, demonstrated on vehicle license plate (LP) localization, that produces robust and efficient LP detection because time-consuming color texture analyses for less relevant pixels are restricted, leaving only a small part of the input image to be analyzed.
Abstract: This paper presents a novel color texture-based method for object detection in images. To demonstrate our technique, a vehicle license plate (LP) localization system is developed. A support vector machine (SVM) is used to analyze the color textural properties of LPs. No external feature extraction module is used; rather, the color values of the raw pixels that make up the color textural pattern are fed directly to the SVM, which works well even in high-dimensional spaces. Next, LP regions are identified by applying a continuously adaptive meanshift algorithm (CAMShift) to the results of the color texture analysis. The combination of CAMShift and SVMs produces not only robust but also efficient LP detection, as time-consuming color texture analyses for less relevant pixels are restricted, leaving only a small part of the input image to be analyzed.