Showing papers on "Object detection" published in 2008

Journal Article•DOI•

LabelMe: A Database and Web-Based Tool for Image Annotation

Bryan Russell¹, Antonio Torralba¹, Kevin Murphy², William T. Freeman¹•Institutions (2)

Massachusetts Institute of Technology¹, University of British Columbia²

01 May 2008-International Journal of Computer Vision

TL;DR: This article builds a large collection of images with ground truth labels for object detection and recognition research; such data is useful for supervised learning and quantitative evaluation.

Abstract: We seek to build a large collection of images with ground truth labels to be used for object detection and recognition research. Such data is useful for supervised learning and quantitative evaluation. To achieve this, we developed a web-based tool that allows easy image annotation and instant sharing of such annotations. Using this annotation tool, we have collected a large dataset that spans many object categories, often containing multiple instances over a wide variety of images. We quantify the contents of the dataset and compare against existing state of the art datasets used for object recognition and detection. Also, we show how to extend the dataset to automatically enhance object labels with WordNet, discover object parts, recover a depth ordering of objects in a scene, and increase the number of labels using minimal user supervision and images from the web.

3,501 citations

Journal Article•DOI•

80 Million Tiny Images: A Large Data Set for Nonparametric Object and Scene Recognition

Antonio Torralba¹, Rob Fergus², William T. Freeman¹•Institutions (2)

Massachusetts Institute of Technology¹, New York University²

01 Nov 2008-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: For certain classes that are particularly prevalent in the dataset, such as people, this work is able to demonstrate a recognition performance comparable to class-specific Viola-Jones style detectors.

Abstract: With the advent of the Internet, billions of images are now freely available online and constitute a dense sampling of the visual world. Using a variety of non-parametric methods, we explore this world with the aid of a large dataset of 79,302,017 images collected from the Internet. Motivated by psychophysical results showing the remarkable tolerance of the human visual system to degradations in image resolution, the images in the dataset are stored as 32 x 32 color images. Each image is loosely labeled with one of the 75,062 non-abstract nouns in English, as listed in the Wordnet lexical database. Hence the image database gives a comprehensive coverage of all object categories and scenes. The semantic information from Wordnet can be used in conjunction with nearest-neighbor methods to perform object classification over a range of semantic levels minimizing the effects of labeling noise. For certain classes that are particularly prevalent in the dataset, such as people, we are able to demonstrate a recognition performance comparable to class-specific Viola-Jones style detectors.
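
The idea of combining nearest-neighbor search with a semantic hierarchy can be sketched as follows. This is only an illustration under stated assumptions: the `PARENT` table is a hypothetical toy hierarchy standing in for WordNet, the 2-D tuples stand in for 32 x 32 color images, and the function names are invented for this example. Each of the k nearest neighbors votes for its own (possibly noisy) label and for every ancestor of that label, so coarse categories collect votes even when the fine-grained labels disagree.

```python
from collections import Counter

# Hypothetical toy hierarchy (child -> parent); a stand-in for WordNet.
PARENT = {
    "poodle": "dog", "beagle": "dog", "dog": "animal",
    "tabby": "cat", "cat": "animal", "animal": None,
    "sedan": "car", "car": "vehicle", "vehicle": None,
}

def ancestors(label):
    """Return the label followed by all of its ancestors in the toy hierarchy."""
    chain = []
    while label is not None:
        chain.append(label)
        label = PARENT[label]
    return chain

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def vote_over_hierarchy(query, dataset, k=5):
    """Let the k nearest neighbors vote for their label and every ancestor."""
    nearest = sorted(dataset, key=lambda item: squared_distance(query, item[0]))[:k]
    votes = Counter()
    for _, label in nearest:
        votes.update(ancestors(label))
    return votes

# Toy data: (feature vector, loose label).
dataset = [
    ((0.0, 0.0), "poodle"),
    ((0.1, 0.0), "beagle"),
    ((0.0, 0.1), "tabby"),
    ((0.1, 0.1), "poodle"),
    ((5.0, 5.0), "sedan"),
    ((5.1, 5.0), "sedan"),
]
votes = vote_over_hierarchy((0.05, 0.05), dataset, k=4)
```

In this toy run the fine-grained labels split between "poodle", "beagle", and "tabby", while "animal" receives a vote from all four neighbors, which illustrates why classifying at a coarser semantic level can be more tolerant of labeling noise.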

1,871 citations

Proceedings Article•DOI•

Privacy preserving crowd monitoring: Counting people without people models or tracking

Antoni B. Chan¹, Z.-S.J. Liang¹, Nuno Vasconcelos¹•Institutions (1)

University of California, San Diego¹

23 Jun 2008

TL;DR: A privacy-preserving system for estimating the size of inhomogeneous crowds, composed of pedestrians that travel in different directions, without using explicit object segmentation or tracking is presented.

Abstract: We present a privacy-preserving system for estimating the size of inhomogeneous crowds, composed of pedestrians that travel in different directions, without using explicit object segmentation or tracking. First, the crowd is segmented into components of homogeneous motion, using the mixture of dynamic textures motion model. Second, a set of simple holistic features is extracted from each segmented region, and the correspondence between features and the number of people per segment is learned with Gaussian process regression. We validate both the crowd segmentation algorithm, and the crowd counting system, on a large pedestrian dataset (2000 frames of video, containing 49,885 total pedestrian instances). Finally, we present results of the system running on a full hour of video.

1,164 citations

Journal Article•DOI•

Robust Object Detection with Interleaved Categorization and Segmentation

Bastian Leibe¹, Ales Leonardis², Bernt Schiele³•Institutions (3)

ETH Zurich¹, University of Ljubljana², Technische Universität Darmstadt³

01 May 2008-International Journal of Computer Vision

TL;DR: A novel method for detecting and localizing objects of a visual category in cluttered real-world scenes that is applicable to a range of different object categories, including both rigid and articulated objects and able to achieve competitive object detection performance from training sets that are between one and two orders of magnitude smaller than those used in comparable systems.

Abstract: This paper presents a novel method for detecting and localizing objects of a visual category in cluttered real-world scenes. Our approach considers object categorization and figure-ground segmentation as two interleaved processes that closely collaborate towards a common goal. As shown in our work, the tight coupling between those two processes allows them to benefit from each other and improve the combined performance. The core part of our approach is a highly flexible learned representation for object shape that can combine the information observed on different training examples in a probabilistic extension of the Generalized Hough Transform. The resulting approach can detect categorical objects in novel images and automatically infer a probabilistic segmentation from the recognition result. This segmentation is then in turn used to again improve recognition by allowing the system to focus its efforts on object pixels and to discard misleading influences from the background. Moreover, the information from where in the image a hypothesis draws its support is employed in an MDL based hypothesis verification stage to resolve ambiguities between overlapping hypotheses and factor out the effects of partial occlusion. An extensive evaluation on several large data sets shows that the proposed system is applicable to a range of different object categories, including both rigid and articulated objects. In addition, its flexible representation allows it to achieve competitive object detection performance already from training sets that are between one and two orders of magnitude smaller than those used in comparable systems.

1,084 citations

Proceedings Article•DOI•

Global data association for multi-object tracking using network flows

Li Zhang¹, Yuan Li¹, Ramakant Nevatia¹•Institutions (1)

University of Southern California¹

23 Jun 2008

TL;DR: A network flow based optimization method for data association in multiple object tracking that is efficient and does not require hypotheses pruning; its performance is compared with previous results on two public pedestrian datasets to show its improvement.

Abstract: We propose a network flow based optimization method for data association needed for multiple object tracking. The maximum-a-posteriori (MAP) data association problem is mapped into a cost-flow network with a non-overlap constraint on trajectories. The optimal data association is found by a min-cost flow algorithm in the network. The network is augmented to include an explicit occlusion model (EOM) to track with long-term inter-object occlusions. A solution to the EOM-based network is found by an iterative approach built upon the original algorithm. Initialization and termination of trajectories and potential false observations are modeled by the formulation intrinsically. The method is efficient and does not require hypotheses pruning. Performance is compared with previous results on two public pedestrian datasets to show its improvement.

1,046 citations

Journal Article•DOI•

Pedestrian Detection via Classification on Riemannian Manifolds

Oncel Tuzel¹, Fatih Porikli², Peter Meer¹•Institutions (2)

Rutgers University¹, Mitsubishi Electric Research Laboratories²

01 Oct 2008-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A novel approach for classifying points lying on a connected Riemannian manifold using the geometry of the space, applied to pedestrian detection with d-dimensional nonsingular covariance matrices as object descriptors.

Abstract: We present a new algorithm to detect pedestrians in still images utilizing covariance matrices as object descriptors. Since the descriptors do not form a vector space, well known machine learning techniques are not well suited to learn the classifiers. The space of d-dimensional nonsingular covariance matrices can be represented as a connected Riemannian manifold. The main contribution of the paper is a novel approach for classifying points lying on a connected Riemannian manifold using the geometry of the space. The algorithm is tested on INRIA and DaimlerChrysler pedestrian datasets where superior detection rates are observed over the previous approaches.

1,044 citations

Journal Article•DOI•

Robust Real-Time Unusual Event Detection using Multiple Fixed-Location Monitors

Amit Adam¹, Ehud Rivlin¹, Ilan Shimshoni², David Reinitz³•Institutions (3)

Technion – Israel Institute of Technology¹, University of Haifa², Rafael Advanced Defense Systems³

01 Mar 2008-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A novel algorithm for detection of certain types of unusual events based on multiple local monitors which collect low-level statistics that is robust and works well in crowded scenes where tracking-based algorithms are likely to fail.

Abstract: We present a novel algorithm for detection of certain types of unusual events. The algorithm is based on multiple local monitors which collect low-level statistics. Each local monitor produces an alert if its current measurement is unusual and these alerts are integrated to a final decision regarding the existence of an unusual event. Our algorithm satisfies a set of requirements that are critical for successful deployment of any large-scale surveillance system. In particular, it requires a minimal setup (taking only a few minutes) and is fully automatic afterwards. Since it is not based on objects' tracks, it is robust and works well in crowded scenes where tracking-based algorithms are likely to fail. The algorithm is effective as soon as sufficient low-level observations representing the routine activity have been collected, which usually happens after a few minutes. Our algorithm runs in real-time. It was tested on a variety of real-life crowded scenes. A ground-truth was extracted for these scenes, with respect to which detection and false-alarm rates are reported.

822 citations

Proceedings Article•DOI•

Spatio-temporal Saliency detection using phase spectrum of quaternion fourier transform

Chenlei Guo¹, Qi Ma¹, Liming Zhang¹•Institutions (1)

Fudan University¹

23 Jun 2008

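TL;DR: Starting from the spectral residual (SR) approach, a simple and fast Fourier-transform method that derives the saliency map from the SR of the amplitude spectrum and gives good results for questionable reasons, this paper, as its title indicates, detects spatio-temporal saliency using the phase spectrum of the quaternion Fourier transform.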

Abstract: Salient areas in natural scenes are generally regarded as the candidates of attention focus in human eyes, which is the key stage in object detection. In computer vision, many models have been proposed to simulate the behavior of eyes, such as SaliencyToolBox (STB) and the neuromorphic vision toolkit (NVT), but they demand high computational cost and their remarkable results mostly rely on the choice of parameters. Recently a simple and fast approach based on Fourier transform called spectral residual (SR) was proposed, which used SR of the amplitude spectrum to obtain the saliency map. The results are good, but the reason is questionable.
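
To make the amplitude-versus-phase distinction concrete, the following sketch computes a saliency map from the phase spectrum alone. It is a deliberately simplified illustration and not the authors' method: it uses an ordinary 2-D discrete Fourier transform on a single grayscale channel rather than a quaternion transform, it has no temporal component and no smoothing step, and the function names `dft2` and `phase_saliency` are hypothetical. The naive transform is only practical for very small images.

```python
import cmath

def dft2(image, inverse=False):
    """Naive 2-D discrete Fourier transform of a list-of-lists image."""
    n, m = len(image), len(image[0])
    sign = 1 if inverse else -1
    out = [[0j] * m for _ in range(n)]
    for u in range(n):
        for v in range(m):
            total = 0j
            for y in range(n):
                for x in range(m):
                    angle = sign * 2 * cmath.pi * (u * y / n + v * x / m)
                    total += image[y][x] * cmath.exp(1j * angle)
            # The inverse transform carries the 1/(n*m) normalization.
            out[u][v] = total / (n * m) if inverse else total
    return out

def phase_saliency(image):
    """Discard the amplitude spectrum, keep the phase, and reconstruct."""
    spectrum = dft2(image)
    phase_only = [[z / abs(z) if abs(z) > 1e-12 else 0j for z in row]
                  for row in spectrum]
    recon = dft2(phase_only, inverse=True)
    # Squared magnitude of the phase-only reconstruction is the saliency value.
    return [[abs(z) ** 2 for z in row] for row in recon]

# Toy image: flat background with one brighter pixel at row 2, column 4.
image = [[0.5] * 6 for _ in range(6)]
image[2][4] = 1.5
saliency = phase_saliency(image)
```

On this toy input the flat background contributes only to the zero-frequency term, so the phase-only reconstruction concentrates its energy at the single pixel that differs from its surroundings.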

805 citations

Proceedings Article•DOI•

Beyond sliding windows: Object localization by efficient subwindow search

Christoph H. Lampert¹, Matthew B. Blaschko¹, Thomas Hofmann²•Institutions (2)

Max Planck Society¹, Google²

23 Jun 2008

TL;DR: A simple yet powerful branch-and-bound scheme that allows efficient maximization of a large class of classifier functions over all possible subimages and converges to a globally optimal solution typically in sublinear time is proposed.

Abstract: Most successful object recognition systems rely on binary classification, deciding only if an object is present or not, but not providing information on the actual object location. To perform localization, one can take a sliding window approach, but this strongly increases the computational cost, because the classifier function has to be evaluated over a large set of candidate subwindows. In this paper, we propose a simple yet powerful branch-and-bound scheme that allows efficient maximization of a large class of classifier functions over all possible subimages. It converges to a globally optimal solution typically in sublinear time. We show how our method is applicable to different object detection and retrieval scenarios. The achieved speedup allows the use of classifiers for localization that formerly were considered too slow for this task, such as SVMs with a spatial pyramid kernel or nearest neighbor classifiers based on the chi2-distance. We demonstrate state-of-the-art performance of the resulting systems on the UIUC Cars dataset, the PASCAL VOC 2006 dataset and in the PASCAL VOC 2007 competition.
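
A branch-and-bound search over subwindows can be sketched as below. The details are assumptions made for illustration, since the abstract does not specify them: the score of a rectangle is taken to be the sum of per-pixel weights inside it, a set of rectangles is described by one interval per coordinate (top, bottom, left, right), and the upper bound adds the positive weights of the largest rectangle in the set to the negative weights of the smallest one. The function names are hypothetical, and the box sums are computed naively where a practical version would precompute them.

```python
import heapq

def box_sum(grid, top, bottom, left, right):
    """Sum grid[r][c] over the inclusive box; an empty box sums to 0."""
    return sum(grid[r][c]
               for r in range(top, bottom + 1)
               for c in range(left, right + 1))

def upper_bound(pos, neg, s):
    """Bound the score of every rectangle in the set s.

    s = (t_lo, t_hi, b_lo, b_hi, l_lo, l_hi, r_lo, r_hi): one interval per
    rectangle coordinate (top, bottom, left, right).
    """
    t_lo, t_hi, b_lo, b_hi, l_lo, l_hi, r_lo, r_hi = s
    # Positive weight that any rectangle in the set could cover at most.
    largest = box_sum(pos, t_lo, b_hi, l_lo, r_hi)
    # Negative weight that every rectangle in the set is forced to cover.
    smallest = box_sum(neg, t_hi, b_lo, l_hi, r_lo)
    return largest + smallest

def subwindow_search(weights):
    """Best-first branch and bound; returns (score, (top, bottom, left, right))."""
    rows, cols = len(weights), len(weights[0])
    pos = [[max(w, 0.0) for w in row] for row in weights]
    neg = [[min(w, 0.0) for w in row] for row in weights]
    start = (0, rows - 1, 0, rows - 1, 0, cols - 1, 0, cols - 1)
    heap = [(-upper_bound(pos, neg, start), start)]
    while heap:
        neg_bound, s = heapq.heappop(heap)
        widths = [s[i + 1] - s[i] for i in range(0, 8, 2)]
        if max(widths) == 0:
            # A single rectangle: its bound is its exact score, and no other
            # candidate in the queue can beat it.
            return -neg_bound, (s[0], s[2], s[4], s[6])
        # Split the widest coordinate interval into two halves.
        i = 2 * widths.index(max(widths))
        mid = (s[i] + s[i + 1]) // 2
        for lo, hi in ((s[i], mid), (mid + 1, s[i + 1])):
            child = s[:i] + (lo, hi) + s[i + 2:]
            heapq.heappush(heap, (-upper_bound(pos, neg, child), child))

weights = [
    [-1, -1, -1, -1],
    [-1,  2,  3, -1],
    [-1,  1,  2, -1],
    [-1, -1, -1, -1],
]
best_score, best_box = subwindow_search(weights)
```

Because the bound is exact for a single rectangle and never underestimates a set, the first single rectangle popped from the priority queue is a global maximizer. Rectangles whose top lies below their bottom are treated as empty with score 0 in this sketch.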

801 citations

Journal Article•DOI•

A Self-Organizing Approach to Background Subtraction for Visual Surveillance Applications

Lucia Maddalena¹, Alfredo Petrosino²•Institutions (2)

National Research Council¹, Applied Science Private University²

01 Jul 2008-IEEE Transactions on Image Processing

TL;DR: This work proposes an approach based on self organization through artificial neural networks, widely applied in human image processing systems and more generally in cognitive science, that can handle scenes containing moving backgrounds, gradual illumination variations and camouflage, and achieves robust detection for different types of videos taken with stationary cameras.

Abstract: Detection of moving objects in video streams is the first relevant step of information extraction in many computer vision applications. Aside from the intrinsic usefulness of being able to segment video streams into moving and background components, detecting moving objects provides a focus of attention for recognition, classification, and activity analysis, making these later steps more efficient. We propose an approach based on self organization through artificial neural networks, widely applied in human image processing systems and more generally in cognitive science. The proposed approach can handle scenes containing moving backgrounds, gradual illumination variations and camouflage, has no bootstrapping limitations, can include into the background model shadows cast by moving objects, and achieves robust detection for different types of videos taken with stationary cameras. We compare our method with other modeling techniques and report experimental results, both in terms of detection accuracy and in terms of processing speed, for color video sequences that represent typical situations critical for video surveillance systems.

792 citations

Proceedings Article•DOI•

A mobile vision system for robust multi-person tracking

Andreas Ess¹, Bastian Leibe¹, Konrad Schindler¹, L. Van Gool¹•Institutions (1)

ETH Zurich¹

23 Jun 2008

TL;DR: A mobile vision system for multi-person tracking in busy environments that integrates continuous visual odometry computation with tracking-by-detection in order to track pedestrians in spite of frequent occlusions and egomotion of the camera rig is presented.

Abstract: We present a mobile vision system for multi-person tracking in busy environments. Specifically, the system integrates continuous visual odometry computation with tracking-by-detection in order to track pedestrians in spite of frequent occlusions and egomotion of the camera rig. To achieve reliable performance under real-world conditions, it has long been advocated to extract and combine as much visual information as possible. We propose a way to closely integrate the vision modules for visual odometry, pedestrian detection, depth estimation, and tracking. The integration naturally leads to several cognitive feedback loops between the modules. Among others, we propose a novel feedback connection from the object detector to visual odometry which utilizes the semantic knowledge of detection to stabilize localization. Feedback loops always carry the danger that erroneous feedback from one module is amplified and causes the entire system to become instable. We therefore incorporate automatic failure detection and recovery, allowing the system to continue when a module becomes unreliable. The approach is experimentally evaluated on several long and difficult video sequences from busy inner-city locations. Our results show that the proposed integration makes it possible to deliver stable tracking performance in scenes of previously infeasible complexity.

Proceedings Article•DOI•

Real time detection of lane markers in urban streets

Mohamed Aly¹•Institutions (1)

California Institute of Technology¹

04 Jun 2008

TL;DR: A robust, real-time approach to lane marker detection in urban streets that generates a top view of the road, filters it with selective oriented Gaussian filters, and uses RANSAC line fitting to give initial guesses to a new and fast RANSAC algorithm for fitting Bezier splines, followed by a post-processing step.

Abstract: We present a robust and real time approach to lane marker detection in urban streets. It is based on generating a top view of the road, filtering using selective oriented Gaussian filters, using RANSAC line fitting to give initial guesses to a new and fast RANSAC algorithm for fitting Bezier Splines, which is then followed by a post-processing step. Our algorithm can detect all lanes in still images of the street in various conditions, while operating at a rate of 50 Hz and achieving comparable results to previous techniques.
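
The RANSAC line-fitting step mentioned above can be illustrated with a generic sketch. It covers only plain RANSAC on 2-D points, not the paper's top-view generation, Gaussian filtering, or spline-fitting variant; the function names, the iteration count, the inlier threshold, and the fixed seed are all assumptions chosen for this example.

```python
import math
import random

def line_through(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def ransac_line(points, iterations=200, threshold=0.1, seed=0):
    """Fit a line by repeatedly sampling two points and counting inliers."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)
        if p == q:
            continue  # coincident points do not define a line
        a, b, c = line_through(p, q)
        # With a unit normal (a, b), |a*x + b*y + c| is the point-line distance.
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers

# Ten points on y = 2x + 1 plus four outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points += [(1.0, 9.0), (4.0, -3.0), (7.0, 2.0), (8.5, 30.0)]
line, inliers = ransac_line(points)
```

Representing the line as a*x + b*y + c = 0 with a unit normal avoids special handling of vertical lines, which matters when candidate lane markers run roughly along the image's vertical axis.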

Journal Article•DOI•

Groups of Adjacent Contour Segments for Object Detection

Vittorio Ferrari¹, L. Fevrier, Frédéric Jurie, Cordelia Schmid•Institutions (1)

University of Oxford¹

01 Jan 2008-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: It is shown that kAS (chains of k connected, roughly straight contour segments) substantially outperform interest points (IPs) for detecting shape-based classes, and the object detector is compared to the recent state-of-the-art system by Dalal and Triggs (2005).

Abstract: We present a family of scale-invariant local shape features formed by chains of k connected roughly straight contour segments (kAS), and their use for object class detection. kAS are able to cleanly encode pure fragments of an object boundary without including nearby clutter. Moreover, they offer an attractive compromise between information content and repeatability and encompass a wide variety of local shape structures. We also define a translation and scale invariant descriptor encoding the geometric configuration of the segments within a kAS, making kAS easy to reuse in other frameworks, for example, as a replacement or addition to interest points (IPs). Software for detecting and describing kAS is released at http://lear.inrialpes.fr/software. We demonstrate the high performance of kAS within a simple but powerful sliding-window object detection scheme. Through extensive evaluations, involving eight diverse object classes and more than 1,400 images, we (1) study the evolution of performance as the degree of feature complexity k varies and determine the best degree, (2) show that kAS substantially outperform IPs for detecting shape-based classes, and (3) compare our object detector to the recent state-of-the-art system by Dalal and Triggs (2005).

Journal Article•DOI•

Trajectory-Based Anomalous Event Detection

Claudio Piciarelli¹, Christian Micheloni¹, Gian Luca Foresti¹•Institutions (1)

University of Udine¹

01 Nov 2008-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: The proposed work addresses anomaly detection by means of trajectory analysis, an approach with several application fields, most notably video surveillance and traffic monitoring, based on single-class support vector machine (SVM) clustering, where the novelty detection SVM capabilities are used for the identification of anomalous trajectories.

Abstract: In recent years, the task of automatic event analysis in video sequences has gained increasing attention in the research community. The application domains are disparate, ranging from video surveillance to automatic video annotation for sport videos or TV shots. Whatever the application field, most work in event analysis follows one of two main approaches: the former based on explicit event recognition, focused on finding high-level, semantic interpretations of video sequences, and the latter based on anomaly detection. This paper deals with the second approach, where the final goal is not the explicit labeling of recognized events, but the detection of anomalous events differing from typical patterns. In particular, the proposed work addresses anomaly detection by means of trajectory analysis, an approach with several application fields, most notably video surveillance and traffic monitoring. The proposed approach is based on single-class support vector machine (SVM) clustering, where the novelty detection SVM capabilities are used for the identification of anomalous trajectories. Particular attention is given to trajectory classification in absence of a priori information on the distribution of outliers. Experimental results prove the validity of the proposed approach.

Book Chapter•DOI•

Learning Spatial Context: Using Stuff to Find Things

Geremy Heitz¹, Daphne Koller¹•Institutions (1)

Stanford University¹

20 Oct 2008

TL;DR: This paper clusters image regions based on their ability to serve as context for the detection of objects and shows that the things and stuff (TAS) context model produces meaningful clusters that are readily interpretable, and helps improve detection ability over state-of-the-art detectors.

Abstract: The sliding window approach of detecting rigid objects (such as cars) is predicated on the belief that the object can be identified from the appearance in a small region around the object. Other types of objects of amorphous spatial extent (e.g., trees, sky), however, are more naturally classified based on texture or color. In this paper, we seek to combine recognition of these two types of objects into a system that leverages "context" toward improving detection. In particular, we cluster image regions based on their ability to serve as context for the detection of objects. Rather than providing an explicit training set with region labels, our method automatically groups regions based on both their appearance and their relationships to the detections in the image. We show that our things and stuff (TAS) context model produces meaningful clusters that are readily interpretable, and helps improve our detection ability over state-of-the-art detectors. We also present a method for learning the active set of relationships for a particular dataset. We present results on object detection in images from the PASCAL VOC 2005/2006 datasets and on the task of overhead car detection in satellite images, demonstrating significant improvements over state-of-the-art detectors.

Proceedings Article•DOI•

Trajectory Outlier Detection: A Partition-and-Detect Framework

Jae-Gil Lee¹, Jiawei Han¹, Xiaolei Li¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

07 Apr 2008

TL;DR: A novel partition-and-detect framework for trajectory outlier detection is proposed, which partitions a trajectory into a set of line segments, and then, detects outlying line segments for trajectory outliers.

Abstract: Outlier detection has been a popular data mining task. However, there is a lack of serious study on outlier detection for trajectory data. Even worse, an existing trajectory outlier detection algorithm has limited capability to detect outlying sub-trajectories. In this paper, we propose a novel partition-and-detect framework for trajectory outlier detection, which partitions a trajectory into a set of line segments, and then, detects outlying line segments for trajectory outliers. The primary advantage of this framework is to detect outlying sub-trajectories from a trajectory database. Based on this partition-and-detect framework, we develop a trajectory outlier detection algorithm TRAOD. Our algorithm consists of two phases: partitioning and detection. For the first phase, we propose a two-level trajectory partitioning strategy that ensures both high quality and high efficiency. For the second phase, we present a hybrid of the distance-based and density-based approaches. Experimental results demonstrate that TRAOD correctly detects outlying sub-trajectories from real trajectory data.

Proceedings Article•DOI•

Estimating the number of people in crowded scenes by MID based foreground segmentation and head-shoulder detection

Min Li¹, Zhaoxiang Zhang¹, Kaiqi Huang¹, Tieniu Tan¹•Institutions (1)

Chinese Academy of Sciences¹

01 Dec 2008

TL;DR: A novel method to address the problem of estimating the number of people in surveillance scenes with people gathering and waiting by combining a MID based foreground segmentation algorithm and a HOG based head-shoulder detection algorithm to provide an accurate estimation of people counts in the observed area.

Abstract: This paper proposes a novel method to address the problem of estimating the number of people in surveillance scenes with people gathering and waiting. The proposed method combines a MID (mosaic image difference) based foreground segmentation algorithm and a HOG (histograms of oriented gradients) based head-shoulder detection algorithm to provide an accurate estimation of people counts in the observed area. In our framework, the MID-based foreground segmentation module provides active areas for the head-shoulder detection module to detect heads and count the number of people. Numerous experiments are conducted and convincing results demonstrate the effectiveness of our method.

Journal Article•DOI•

Moving Object Detection in Spatial Domain using Background Removal Techniques - State-of-Art

Shireen Y. Elhabian, Khaled M. F. Elsayed¹, Sumaya H. Ahmed•Institutions (1)

University of Louisville¹

01 Jan 2008-Recent Patents on Computer Science

TL;DR: This paper surveys many existing schemes in the literature of background removal, surveying the common pre-processing algorithms used in different situations, presenting different background models, and the most commonly used ways to update such models and how they can be initialized.

Abstract: Identifying moving objects is a critical task for many computer vision applications; it provides a classification of the pixels into either foreground or background. A common approach used to achieve such classification is background removal. Even though there exist numerous background removal algorithms in the literature, most of them follow a simple flow diagram, passing through four major steps, which are pre-processing, background modelling, foreground detection and data validation. In this paper, we survey many existing schemes in the literature of background removal, surveying the common pre-processing algorithms used in different situations, presenting different background models, and the most commonly used ways to update such models and how they can be initialized. We also survey how to measure the performance of any moving object detection algorithm, whether the ground truth data is available or not, presenting performance metrics commonly used in both cases.

Journal Article•DOI•

Segmentation and Tracking of Multiple Humans in Crowded Environments

Tao Zhao, Ramakant Nevatia¹, Bo Wu¹•Institutions (1)

University of Southern California¹

01 Jul 2008-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A model-based approach to interpret the image observations by multiple partially occluded human hypotheses in a Bayesian framework is proposed, which defines a joint image likelihood for multiple humans based on the appearance of the humans, the visibility of the body obtained by occlusion reasoning, and foreground/background separation.

Abstract: Segmentation and tracking of multiple humans in crowded situations is made difficult by interobject occlusion. We propose a model-based approach to interpret the image observations by multiple partially occluded human hypotheses in a Bayesian framework. We define a joint image likelihood for multiple humans based on the appearance of the humans, the visibility of the body obtained by occlusion reasoning, and foreground/background separation. The optimal solution is obtained by using an efficient sampling method, data-driven Markov chain Monte Carlo (DDMCMC), which uses image observations for proposal probabilities. Knowledge of various aspects, including human shape, camera model, and image cues, is integrated in one theoretically sound framework. We present experimental results and quantitative evaluation, demonstrating that the resulting approach is effective for very challenging data.

Journal Article•DOI•

Multiscale Categorical Object Recognition Using Contour Fragments

Jamie Shotton¹, Andrew Blake¹, Roberto Cipolla¹•Institutions (1)

Toshiba¹

01 Jul 2008-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A new automatic visual recognition system based only on local contour features, capable of localizing objects in space and scale, is proposed and compared with other methods based on contour and local descriptors in a detailed evaluation over 17 challenging categories.

Abstract: Psychophysical studies show that we can recognize objects using fragments of outline contour alone. This paper proposes a new automatic visual recognition system based only on local contour features, capable of localizing objects in space and scale. The system first builds a class-specific codebook of local fragments of contour using a novel formulation of chamfer matching. These local fragments allow recognition that is robust to within-class variation, pose changes, and articulation. Boosting combines these fragments into a cascaded sliding-window classifier, and mean shift is used to select strong responses as a final set of detections. We show how learning can be performed iteratively on both training and test sets to bootstrap an improved classifier. We compare with other methods based on contour and local descriptors in our detailed evaluation over 17 challenging categories and obtain highly competitive results. The results confirm that contour is indeed a powerful cue for multiscale and multiclass visual object recognition.

Proceedings Article•DOI•

Learning object motion patterns for anomaly detection and improved object detection

Arslan Basharat¹, A. Gritai¹, Mubarak Shah¹•Institutions (1)

University of Central Florida¹

23 Jun 2008

TL;DR: The proposed method adds a new higher-level layer to the traditional surveillance pipeline for anomalous event detection and scene model feedback, and the learned scene model is used to detect local as well as global anomalies in object tracks.

Abstract: We present a novel framework for learning patterns of motion and sizes of objects in static camera surveillance. The proposed method provides a new higher-level layer to the traditional surveillance pipeline for anomalous event detection and scene model feedback. Pixel level probability density functions (pdfs) of appearance have been used for background modelling in the past, but modelling pixel level pdfs of object speed and size from the tracks is novel. Each pdf is modelled as a multivariate Gaussian mixture model (GMM) of the motion (destination location & transition time) and the size (width & height) parameters of the objects at that location. Output of the tracking module is used to perform unsupervised EM-based learning of every GMM. We have successfully used the proposed scene model to detect local as well as global anomalies in object tracks. We also show the use of this scene model to improve object detection through pixel-level parameter feedback of the minimum object size and background learning rate. Most object path modelling approaches first cluster the tracks into major paths in the scene, which can be a source of error. We avoid this by building local pdfs that capture a variety of tracks which are passing through them. Qualitative and quantitative analysis of actual surveillance videos proved the effectiveness of the proposed approach.
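
To make the per-location mixture model more concrete, here is a minimal sketch of how an observed object size could be scored against a Gaussian mixture and flagged when its density is low. It assumes the mixture parameters are already known (the EM-based learning described in the abstract is omitted), and the diagonal covariance, the threshold, and all names and numbers are hypothetical.

```python
import math


def diag_gaussian_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance (simplifying assumption)."""
    p = 1.0
    for xi, mi, vi in zip(x, mean, var):
        p *= math.exp(-0.5 * (xi - mi) ** 2 / vi) / math.sqrt(2 * math.pi * vi)
    return p


def gmm_pdf(x, components):
    """Mixture density; components is a list of (weight, mean, variance)."""
    return sum(w * diag_gaussian_pdf(x, m, v) for w, m, v in components)


def is_anomalous(x, components, threshold):
    """Flag an observation whose mixture density falls below the threshold."""
    return gmm_pdf(x, components) < threshold


# Hypothetical (width, height) model for one image location.
size_model = [
    (0.7, (20.0, 50.0), (9.0, 25.0)),   # e.g. pedestrians
    (0.3, (60.0, 40.0), (36.0, 16.0)),  # e.g. vehicles
]
typical = is_anomalous((21.0, 52.0), size_model, threshold=1e-6)
unusual = is_anomalous((200.0, 200.0), size_model, threshold=1e-6)
```

The same scoring could be applied to the motion parameters (destination location and transition time) by extending the vectors.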

Journal Article•DOI•

Automatic feature localisation with constrained local models

David Cristinacce¹, Timothy F. Cootes¹•Institutions (1)

University of Manchester¹

01 Oct 2008-Pattern Recognition

TL;DR: The Constrained Local Model (CLM) algorithm is more robust and more accurate than the AAM search method, which relies on the image reconstruction error to update the model parameters, and improves localisation accuracy on photographs of human faces, magnetic resonance images of the brain and a set of dental panoramic tomograms.

Journal Article•DOI•

Coupled Object Detection and Tracking from Static Cameras and Moving Vehicles

Bastian Leibe¹, Konrad Schindler¹, Nico Cornelis², L. Van Gool¹•Institutions (2)

ETH Zurich¹, Katholieke Universiteit Leuven²

01 Oct 2008-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A novel approach for multi-object tracking which considers object detection and spacetime trajectory estimation as a coupled optimization problem is presented, formulated in a minimum description length hypothesis selection framework, which allows the system to recover from mismatches and temporarily lost tracks.

Abstract: We present a novel approach for multi-object tracking which considers object detection and spacetime trajectory estimation as a coupled optimization problem. Our approach is formulated in a minimum description length hypothesis selection framework, which allows our system to recover from mismatches and temporarily lost tracks. Building upon a state-of-the-art object detector, it performs multiview/multicategory object recognition to detect cars and pedestrians in the input images. The 2D object detections are checked for their consistency with (automatically estimated) scene geometry and are converted to 3D observations which are accumulated in a world coordinate frame. A subsequent trajectory estimation module analyzes the resulting 3D observations to find physically plausible spacetime trajectories. Tracking is achieved by performing model selection after every frame. At each time instant, our approach searches for the globally optimal set of spacetime trajectories which provides the best explanation for the current image and for all evidence collected so far while satisfying the constraints that no two objects may occupy the same physical space nor explain the same image pixels at any point in time. Successful trajectory hypotheses are then fed back to guide object detection in future frames. The optimization procedure is kept efficient through incremental computation and conservative hypothesis pruning. We evaluate our approach on several challenging video sequences and demonstrate its performance on both a surveillance-type scenario and a scenario where the input videos are taken from inside a moving vehicle passing through crowded city areas.

Journal Article•DOI•

Qian Du¹, He Yang¹•Institutions (1)

Mississippi State University¹

05 Nov 2008-IEEE Geoscience and Remote Sensing Letters

TL;DR: The experimental result shows that the proposed unsupervised band selection algorithms based on band similarity measurement can yield a better result in terms of information conservation and class separability than other widely used techniques.

Abstract: Band selection is a common approach to reduce the data dimensionality of hyperspectral imagery. It extracts several bands of importance in some sense by taking advantage of high spectral correlation. Driven by detection or classification accuracy, one would expect that, using a subset of original bands, the accuracy is unchanged or tolerably degraded, whereas computational burden is significantly relaxed. When the desired object information is known, this task can be achieved by finding the bands that contain the most information about these objects. When the desired object information is unknown, i.e., unsupervised band selection, the objective is to select the most distinctive and informative bands. It is expected that these bands can provide an overall satisfactory detection and classification performance. In this letter, we propose unsupervised band selection algorithms based on band similarity measurement. The experimental result shows that our approach can yield a better result in terms of information conservation and class separability than other widely used techniques.
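
The idea of picking distinctive bands by similarity can be sketched as a greedy search that repeatedly adds the band least correlated with those already chosen. This is only an illustration under stated assumptions, not the authors' algorithm: absolute Pearson correlation as the similarity measure, the fixed starting band, and the names `correlation` and `select_bands` are all choices made for this example.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equally long, non-constant bands."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def select_bands(bands, k, first=0):
    """Greedily pick k band indices, each as dissimilar as possible
    to the bands already selected (assumed criterion)."""
    selected = [first]
    while len(selected) < k:
        best, best_score = None, None
        for i in range(len(bands)):
            if i in selected:
                continue
            # Similarity to the closest already-selected band.
            score = max(abs(correlation(bands[i], bands[j])) for j in selected)
            if best_score is None or score < best_score:
                best, best_score = i, score
        selected.append(best)
    return selected


# Hypothetical data: each band is a flattened list of pixel values.
bands = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.1],  # nearly a scaled copy of band 0
    [4.0, 1.0, 3.0, 2.0],  # weakly correlated with band 0
    [1.0, 2.0, 3.0, 4.2],
]
chosen = select_bands(bands, k=2)
```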

Book Chapter•DOI•

Learning to Localize Objects with Structured Output Regression

Matthew B. Blaschko¹, Christoph H. Lampert¹•Institutions (1)

Max Planck Society¹

20 Oct 2008

TL;DR: This work proposes to treat object localization in a principled way by posing it as a problem of predicting structured data: it models the problem not as binary classification, but as the prediction of the bounding box of objects located in images.

Abstract: Sliding window classifiers are among the most successful and widely applied techniques for object localization. However, training is typically done in a way that is not specific to the localization task. First, a binary classifier is trained using a sample of positive and negative examples, and this classifier is subsequently applied to multiple regions within test images. We propose instead to treat object localization in a principled way by posing it as a problem of predicting structured data: we model the problem not as binary classification, but as the prediction of the bounding box of objects located in images. The use of a joint-kernel framework allows us to formulate the training procedure as a generalization of an SVM, which can be solved efficiently. We further improve computational efficiency by using a branch-and-bound strategy for localization during both training and testing. Experimental evaluation on the PASCAL VOC and TU Darmstadt datasets shows that the structured training procedure improves performance over binary training as well as the best previously published scores.
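
Treating localization as bounding-box prediction requires some way to measure how far a predicted box is from the ground truth. The sketch below uses an area-overlap loss, a common choice for this purpose; the abstract does not state which loss is used, so the loss itself, the `(left, top, right, bottom)` box layout, and the function names are assumptions.

```python
def box_area(box):
    """Area of a (left, top, right, bottom) box; empty boxes have area 0."""
    left, top, right, bottom = box
    return max(0, right - left) * max(0, bottom - top)


def overlap_loss(box_a, box_b):
    """1 - intersection/union: 0 for identical boxes, 1 for disjoint ones."""
    intersection = (
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    )
    inter_area = box_area(intersection)
    union_area = box_area(box_a) + box_area(box_b) - inter_area
    if union_area == 0:
        return 1.0  # both boxes are empty; treat as a complete miss
    return 1.0 - inter_area / union_area


# Hypothetical boxes: the prediction is shifted one unit to the right.
truth = (0, 0, 2, 2)
prediction = (1, 0, 3, 2)
loss = overlap_loss(truth, prediction)  # intersection 2, union 6
```

Unlike a binary correct/incorrect label, this loss grades partial overlaps, which is what lets training distinguish a near miss from a complete miss.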

Journal Article•DOI•

AdaBoost-Based Algorithm for Network Intrusion Detection

Weiming Hu¹, Wei Hu¹, Stephen J. Maybank²•Institutions (2)

Chinese Academy of Sciences¹, Birkbeck, University of London²

01 Apr 2008

TL;DR: An intrusion detection algorithm based on the AdaBoost algorithm is proposed, which has low computational complexity and error rates, as compared with algorithms of higher computational complexity, as tested on the benchmark sample data.

Abstract: Network intrusion detection aims at distinguishing the attacks on the Internet from normal use of the Internet. It is an indispensable part of the information security system. Due to the variety of network behaviors and the rapid development of attack fashions, it is necessary to develop fast machine-learning-based intrusion detection algorithms with high detection rates and low false-alarm rates. In this correspondence, we propose an intrusion detection algorithm based on the AdaBoost algorithm. In the algorithm, decision stumps are used as weak classifiers. The decision rules are provided for both categorical and continuous features. By combining the weak classifiers for continuous features and the weak classifiers for categorical features into a strong classifier, the relations between these two different types of features are handled naturally, without any forced conversions between continuous and categorical features. Adaptable initial weights and a simple strategy for avoiding overfitting are adopted to improve the performance of the algorithm. Experimental results show that our algorithm has low computational complexity and error rates, as compared with algorithms of higher computational complexity, as tested on the benchmark sample data.
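
The combination of decision stumps into a strong classifier can be sketched with standard discrete AdaBoost. This is a generic illustration rather than the authors' algorithm: it handles continuous features only, starts from uniform weights (the abstract mentions adaptable initial weights and categorical decision rules, both omitted), and all names and data are hypothetical.

```python
import math


def stump_predict(stump, row):
    """A stump votes `polarity` when the feature exceeds the threshold."""
    feature, threshold, polarity = stump
    return polarity if row[feature] > threshold else -polarity


def train_stump(X, y, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                error = sum(w for row, label, w in zip(X, y, weights)
                            if stump_predict(stump, row) != label)
                if best is None or error < best[0]:
                    best = (error, stump)
    return best


def adaboost(X, y, rounds):
    """Labels must be +1 or -1. Returns a list of (alpha, stump) pairs."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, stump = train_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [w * math.exp(-alpha * label * stump_predict(stump, row))
                   for row, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical one-feature data: +1 marks an attack, -1 normal traffic.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [-1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
```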

Proceedings Article•DOI•

Automated gland and nuclei segmentation for grading of prostate and breast cancer histopathology

S. Naik¹, Scott Doyle¹, Shannon Agner¹, Anant Madabhushi¹, Michael Feldman², John E. Tomaszewski²•Institutions (2)

Rutgers University¹, University of Pennsylvania²

14 May 2008

TL;DR: The utility of the glandular and nuclear segmentation algorithm in accurately extracting morphological and nuclear features is demonstrated for automated grading of prostate cancer and breast cancer, and for distinguishing between cancerous and benign breast histology specimens.

Abstract: Automated detection and segmentation of nuclear and glandular structures is critical for classification and grading of prostate and breast cancer histopathology. In this paper, we present a methodology for automated detection and segmentation of structures of interest in digitized histopathology images. The scheme integrates image information from across three different scales: (1) low-level information based on pixel values, (2) high-level information based on relationships between pixels for object detection, and (3) domain-specific information based on relationships between histological structures. Low-level information is utilized by a Bayesian classifier to generate a likelihood that each pixel belongs to an object of interest. High-level information is extracted in two ways: (i) by a level-set algorithm, where a contour is evolved in the likelihood scenes generated by the Bayesian classifier to identify object boundaries, and (ii) by a template matching algorithm, where shape models are used to identify glands and nuclei from the low-level likelihood scenes. Structural constraints are imposed via domain-specific knowledge in order to verify whether the detected objects do indeed belong to structures of interest. In this paper we demonstrate the utility of our glandular and nuclear segmentation algorithm in accurate extraction of various morphological and nuclear features for (a) automated grading of prostate cancer, (b) automated grading of breast cancer, and (c) distinguishing between cancerous and benign breast histology specimens. The efficacy of our segmentation algorithm is evaluated by comparing breast and prostate cancer grading and benign vs. cancer discrimination accuracies with corresponding accuracies obtained via manual detection and segmentation of glands and nuclei.

Proceedings Article•DOI•

Discriminative local binary patterns for human detection in personal album

Yadong Mu¹, Shuicheng Yan², Yi Liu¹, Thomas S. Huang³, Bingfeng Zhou¹•Institutions (3)

Peking University¹, National University of Singapore², University of Illinois at Urbana–Champaign³

23 Jun 2008

TL;DR: A novel human detection system in personal albums based on LBP (local binary pattern) descriptor is developed and carefully designed experiments demonstrate the superiority of LBP over other traditional features for human detection.

Abstract: In recent years, local-pattern-based object detection and recognition have attracted increasing interest in the computer vision research community. However, to the best of our knowledge, no previous work has focused on utilizing local patterns for the task of human detection. In this paper, we develop a novel human detection system for personal albums based on the LBP (local binary pattern) descriptor. First, we review the existing gradient-based local features widely used in human detection, analyze their limitations, and argue that LBP is more discriminative. Second, the original LBP descriptor does not suit the human detection problem well due to its high complexity and lack of semantic consistency, so we propose two variants of LBP: Semantic-LBP and Fourier-LBP. Carefully designed experiments demonstrate the superiority of LBP over other traditional features for human detection. In particular, we adopt a random ensemble algorithm for better comparison between different descriptors. All experiments are conducted on the INRIA human database.
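
For readers unfamiliar with the descriptor, the following sketch computes the basic 8-neighbour LBP code and a histogram of codes over an image. It shows only the original LBP operator in its simplest form; the Semantic-LBP and Fourier-LBP variants proposed in the paper are not implemented, and the neighbour ordering and function names are choices made for this example.

```python
# Neighbour offsets (row, col), clockwise starting at the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit code: bit k is set when neighbour k is >= the centre pixel."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Hypothetical 3x3 patch: only the top-left neighbour is brighter
# than the centre, so only bit 0 is set.
patch = [
    [9, 1, 1],
    [1, 5, 1],
    [1, 1, 1],
]
code = lbp_code(patch, 1, 1)  # -> 1
```

In a detector, such histograms would usually be computed per image cell and concatenated into a feature vector for the classifier.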

Journal Article•DOI•

Tracking in Low Frame Rate Video: A Cascade Particle Filter with Discriminative Observers of Different Life Spans

Yuan Li¹, Haizhou Ai², Takayoshi Yamashita³, Shihong Lao³, Masato Kawade³•Institutions (3)

University of Southern California¹, Tsinghua University², Omron³

01 Oct 2008-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Experiments show significantly improved accuracy of the proposed approach in comparison with existing tracking methods, under the condition of low frame rate data and abrupt motion of both target and camera.

Abstract: Tracking objects in low frame rate video or with abrupt motion poses two main difficulties that most conventional tracking methods can hardly handle: 1) poor motion continuity and increased search space; 2) fast appearance variation of the target and more background clutter due to the increased search space. In this paper, we address the problem from a view which integrates conventional tracking and detection, and present a temporal probabilistic combination of discriminative observers of different lifespans. Each observer is learned from different ranges of samples, with different subsets of features, to achieve varying levels of discriminative power at varying cost. Efficient fusion and temporal inference are then performed by a cascade particle filter which consists of multiple stages of importance sampling. Experiments show significantly improved accuracy of the proposed approach in comparison with existing tracking methods, under the condition of low frame rate data and abrupt motion of both target and camera.
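
The notion of multiple stages of importance sampling can be illustrated with a heavily simplified sketch in which each stage weights the particles with one observer and then resamples, so that later (sharper, costlier) observers only see the survivors. This is not the authors' filter: the one-dimensional state, the Gaussian-shaped observers, and the deterministic systematic resampling with a fixed offset are all assumptions made to keep the example short and reproducible.

```python
import math


def systematic_resample(particles, weights):
    """Resample proportionally to weight using evenly spaced positions.

    A fixed half-step offset replaces the usual random offset so that
    the example is deterministic (an assumption of this sketch).
    """
    n = len(particles)
    total = sum(weights)
    cumulative, acc = [], 0.0
    for w in weights:
        acc += w / total
        cumulative.append(acc)
    resampled, i = [], 0
    for k in range(n):
        u = (k + 0.5) / n
        while i < n - 1 and cumulative[i] < u:
            i += 1
        resampled.append(particles[i])
    return resampled


def cascade_step(particles, observers):
    """Apply the observers in order, resampling after each stage."""
    for likelihood in observers:
        weights = [likelihood(p) for p in particles]
        particles = systematic_resample(particles, weights)
    return particles


# Hypothetical observers for a target at position 7: a coarse, cheap one
# followed by a sharper one.
def coarse(p):
    return math.exp(-(p - 7.0) ** 2 / 8.0)


def fine(p):
    return math.exp(-(p - 7.0) ** 2 / 0.5)


particles = [float(x) for x in range(11)]
survivors = cascade_step(particles, [coarse, fine])
```

After both stages the surviving particles concentrate around the target position, which is the effect the staged design aims for.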

Journal Article•DOI•

3D Urban Scene Modeling Integrating Recognition and Reconstruction

Nico Cornelis¹, Bastian Leibe², Kurt Cornelis¹, Luc Van Gool²•Institutions (2)

Katholieke Universiteit Leuven¹, ETH Zurich²

01 Jul 2008-International Journal of Computer Vision

TL;DR: A novel city modeling framework that relies on simplified geometry assumptions to create 3D content at high speed is presented and extended with an object recognition module that automatically detects cars in the input video streams and localizes them in 3D.

Abstract: Supplying realistically textured 3D city models at ground level promises to be useful for pre-visualizing upcoming traffic situations in car navigation systems. Because this pre-visualization can be rendered from the expected future viewpoints of the driver, the required maneuver will be more easily understandable. 3D city models can be reconstructed from the imagery recorded by surveying vehicles. The vastness of image material gathered by these vehicles, however, puts extreme demands on vision algorithms to ensure their practical usability. Algorithms need to be as fast as possible and should result in compact, memory efficient 3D city models for future ease of distribution and visualization. For the considered application, these are not contradictory demands. Simplified geometry assumptions can speed up vision algorithms while automatically guaranteeing compact geometry models. In this paper, we present a novel city modeling framework which builds upon this philosophy to create 3D content at high speed. Objects in the environment, such as cars and pedestrians, may however disturb the reconstruction, as they violate the simplified geometry assumptions, leading to visually unpleasant artifacts and degrading the visual realism of the resulting 3D city model. Unfortunately, such objects are prevalent in urban scenes. We therefore extend the reconstruction framework by integrating it with an object recognition module that automatically detects cars in the input video streams and localizes them in 3D. The two components of our system are tightly integrated and benefit from each other's continuous input. 3D reconstruction delivers geometric scene context, which greatly helps improve detection precision. The detected car locations, on the other hand, are used to instantiate virtual placeholder models which augment the visual realism of the reconstructed city model.
