Showing papers in "IEEE Transactions on Pattern Analysis and Machine Intelligence in 2000"

Journal Article•DOI•

Normalized cuts and image segmentation

[...]

Jianbo Shi¹, Jitendra Malik²•Institutions (2)

Carnegie Mellon University¹, University of California, Berkeley²

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.

Abstract: We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total similarity within the groups. We show that an efficient computational technique based on a generalized eigenvalue problem can be used to optimize this criterion. We applied this approach to segmenting static images, as well as motion sequences, and found the results to be very encouraging.
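
For reference, the criterion and the eigenvalue problem mentioned in the abstract can be sketched as follows. The notation is an assumption introduced here, since the abstract defines no symbols: $V$ is the set of graph nodes, $A$ and $B$ are the two groups, $W$ is the symmetric matrix of edge weights $w(u,v)$, $D$ is the diagonal degree matrix, and $y$ is a real-valued relaxation of the group indicator vector.

```latex
\[
\mathrm{cut}(A,B) = \sum_{u \in A,\; v \in B} w(u,v), \qquad
\mathrm{assoc}(A,V) = \sum_{u \in A,\; t \in V} w(u,t)
\]
\[
\mathrm{Ncut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)}
                   + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)}
\]
\[
(D - W)\, y = \lambda\, D\, y, \qquad D_{ii} = \sum_{j} W_{ij}
\]
```

In this relaxed form, the eigenvector belonging to the second-smallest eigenvalue is typically thresholded to obtain the two groups.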

13,789 citations

Journal Article•DOI•

A flexible new technique for camera calibration

[...]

Zhengyou Zhang¹•Institutions (1)

Microsoft¹

01 Nov 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Proposes a flexible camera calibration technique that only requires the camera to observe a planar pattern shown at a few (at least two) different orientations, advancing 3D computer vision one more step from laboratory environments to real-world use.

Abstract: We propose a flexible technique to easily calibrate a camera. It only requires the camera to observe a planar pattern shown at a few (at least two) different orientations. Either the camera or the planar pattern can be freely moved. The motion need not be known. Radial lens distortion is modeled. The proposed procedure consists of a closed-form solution, followed by a nonlinear refinement based on the maximum likelihood criterion. Both computer simulation and real data have been used to test the proposed technique and very good results have been obtained. Compared with classical techniques which use expensive equipment such as two or three orthogonal planes, the proposed technique is easy to use and flexible. It advances 3D computer vision one more step from laboratory environments to real world use.
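
Why a few orientations suffice can be sketched with the standard plane-based constraints. The symbols are assumptions introduced here, not taken from the abstract: $K$ is the camera intrinsic matrix and $H = [\,h_1\; h_2\; h_3\,]$ is the homography that maps the pattern plane to the image in one view. Because the first two columns of the rotation are orthonormal, each view contributes two constraints:

```latex
\[
h_1^{\top} K^{-\top} K^{-1} h_2 = 0
\]
\[
h_1^{\top} K^{-\top} K^{-1} h_1 = h_2^{\top} K^{-\top} K^{-1} h_2
\]
```

The symmetric matrix $K^{-\top}K^{-1}$ has five degrees of freedom up to scale, so two views (four constraints) are enough only under an additional assumption such as zero skew, while three or more views determine it in general.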

13,200 citations

Journal Article•DOI•

Statistical pattern recognition: a review

[...]

Anil K. Jain¹, Robert P. W. Duin², Jianchang Mao³•Institutions (3)

Michigan State University¹, Delft University of Technology², IBM³

01 Jan 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.

Abstract: The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques and methods imported from statistical learning theory have been receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved. New and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, and cursive handwriting recognition, require robust and efficient pattern recognition techniques. The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.

6,527 citations

Journal Article•DOI•

Content-based image retrieval at the end of the early years

[...]

Arnold W. M. Smeulders¹, Marcel Worring¹, Simone Santini², Amarnath Gupta², Ramesh Jain•Institutions (2)

University of Amsterdam¹, University of California, San Diego²

01 Dec 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap are discussed, as well as aspects of system engineering: databases, system architecture, and evaluation.

Abstract: Presents a review of 200 references in content-based image retrieval. The paper starts with discussing the working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap. Subsequent sections discuss computational steps for image retrieval systems. Step one of the review is image processing for retrieval sorted by color, texture, and local geometry. Features for retrieval are discussed next, sorted by: accumulative and global features, salient points, object and shape features, signs, and structural combinations thereof. Similarity of pictures and objects in pictures is reviewed for each of the feature types, in close connection to the types and means of feedback the user of the systems is capable of giving by interaction. We briefly discuss aspects of system engineering: databases, system architecture, and evaluation. In the concluding section, we present our view on: the driving force of the field, the heritage from computer vision, the influence on computer vision, the role of similarity and of interaction, the need for databases, the problem of evaluation, and the role of the semantic gap.

6,447 citations

Journal Article•DOI•

The FERET evaluation methodology for face-recognition algorithms

[...]

P.J. Phillips, Hyeonjoon Moon, Syed A. Rizvi¹, Patrick J. Rauss²•Institutions (2)

College of Staten Island¹, United States Army Research Laboratory²

01 Oct 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Two of the most critical requirements in support of producing reliable face-recognition systems are a large database of facial images and a testing procedure to evaluate systems.

Abstract: Two of the most critical requirements in support of producing reliable face-recognition systems are a large database of facial images and a testing procedure to evaluate systems. The Face Recognition Technology (FERET) program has addressed both issues through the FERET database of facial images and the establishment of the FERET tests. To date, 14,126 images from 1,199 individuals are included in the FERET database, which is divided into development and sequestered portions of the database. In September 1996, the FERET program administered the third in a series of FERET face-recognition tests. The primary objectives of the third test were to 1) assess the state of the art, 2) identify future areas of research, and 3) measure algorithm performance.

4,816 citations

Journal Article•DOI•

Statistical Pattern Recognition

[...]

Anil K. Jain, Robert P. W. Duin, Jianchang Mao

01 Jan 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The primary goal of pattern recognition is supervised or unsupervised classification; among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice.

4,307 citations

Journal Article•DOI•

Medical image analysis: progress over two decades and the challenges ahead

[...]

James S. Duncan¹, Nicholas Ayache²•Institutions (2)

Yale University¹, French Institute for Research in Computer Science and Automation²

01 Jan 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper looks at progress in the field over the last 20 years and suggests some of the challenges that remain for the years to come.

Abstract: The analysis of medical images has been woven into the fabric of the pattern analysis and machine intelligence (PAMI) community since the earliest days of these Transactions. Initially, the efforts in this area were seen as applying pattern analysis and computer vision techniques to another interesting dataset. However, over the last two to three decades, the unique nature of the problems presented within this area of study has led to the development of a new discipline in its own right. Examples of these include: the types of image information that are acquired, the fully three-dimensional image data, the nonrigid nature of object motion and deformation, and the statistical variation of both the underlying normal and abnormal ground truth. In this paper, we look at progress in the field over the last 20 years and suggest some of the challenges that remain for the years to come.

4,249 citations

Journal Article•DOI•

Learning patterns of activity using real-time tracking

[...]

Chris Stauffer¹, W.E.L. Grimson¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper focuses on motion tracking and shows how one can use observed motion to learn patterns of activity in a site and create a hierarchical binary-tree classification of the representations within a sequence.

Abstract: Our goal is to develop a visual monitoring system that passively observes moving objects in a site and learns patterns of activity from those observations. For extended sites, the system will require multiple cameras. Thus, key elements of the system are motion tracking, camera coordination, activity classification, and event detection. In this paper, we focus on motion tracking and show how one can use observed motion to learn patterns of activity in a site. Motion segmentation is based on an adaptive background subtraction method that models each pixel as a mixture of Gaussians and uses an online approximation to update the model. The Gaussian distributions are then evaluated to determine which are most likely to result from a background process. This yields a stable, real-time outdoor tracker that reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes. While a tracking system is unaware of the identity of any object it tracks, the identity remains the same for the entire tracking sequence. Our system leverages this information by accumulating joint co-occurrences of the representations within a sequence. These joint co-occurrence statistics are then used to create a hierarchical binary-tree classification of the representations. This method is useful for classifying sequences, as well as individual instances of activities in a site.
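
The adaptive background model described above can be sketched for a single grayscale pixel as follows. This is a minimal illustration, not the authors' implementation: the class name `PixelMixture`, all parameter values, the variance floor `min_var`, and the use of one learning rate `alpha` for weights, means, and variances are assumptions made here for brevity.

```python
import math


class PixelMixture:
    """Online mixture-of-Gaussians model for one grayscale pixel (sketch)."""

    def __init__(self, k=3, alpha=0.05, init_var=225.0, min_var=4.0,
                 init_weight=0.05, bg_threshold=0.7):
        # All defaults are illustrative assumptions, not values from the paper.
        self.alpha = alpha
        self.init_var = init_var
        self.min_var = min_var
        self.init_weight = init_weight
        self.bg_threshold = bg_threshold
        self.weights = [1.0 / k] * k
        self.means = [0.0] * k
        self.variances = [init_var] * k

    def _ranked(self):
        # Order components by weight / standard deviation,
        # so the most background-like component comes first.
        def score(i):
            return self.weights[i] / math.sqrt(self.variances[i])
        return sorted(range(len(self.weights)), key=score, reverse=True)

    def update(self, x):
        """Fold intensity x into the model; return True if x looks like foreground."""
        ranked = self._ranked()
        # A component matches if x lies within 2.5 standard deviations of its mean.
        matched = next(
            (i for i in ranked
             if abs(x - self.means[i]) <= 2.5 * math.sqrt(self.variances[i])),
            None)
        if matched is None:
            # No match: replace the least probable component with a new one at x.
            matched = ranked[-1]
            self.means[matched] = x
            self.variances[matched] = self.init_var
            self.weights[matched] = self.init_weight
        else:
            a = self.alpha
            self.weights = [(1 - a) * w + (a if i == matched else 0.0)
                            for i, w in enumerate(self.weights)]
            self.means[matched] += a * (x - self.means[matched])
            d = x - self.means[matched]
            self.variances[matched] = max(
                self.min_var, (1 - a) * self.variances[matched] + a * d * d)
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]
        # Background = top-ranked components whose cumulative weight exceeds the threshold.
        background, cumulative = set(), 0.0
        for i in self._ranked():
            background.add(i)
            cumulative += self.weights[i]
            if cumulative > self.bg_threshold:
                break
        return matched not in background
```

After a pixel has shown a steady intensity for a while, `update` returns `False` for that intensity and `True` for a sudden, very different value. A full tracker would keep one such model per pixel and per color channel.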

3,631 citations

Journal Article•DOI•

W⁴: real-time surveillance of people and their activities

[...]

Ismail Haritaoglu¹, D. Harwood², Larry S. Davis²•Institutions (2)

IBM¹, University of Maryland, College Park²

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: W⁴ employs a combination of shape analysis and tracking to locate people and their parts and to create models of people's appearance so that they can be tracked through interactions such as occlusions.

Abstract: W⁴ is a real time visual surveillance system for detecting and tracking multiple people and monitoring their activities in an outdoor environment. It operates on monocular gray-scale video imagery, or on video imagery from an infrared camera. W⁴ employs a combination of shape analysis and tracking to locate people and their parts (head, hands, feet, torso) and to create models of people's appearance so that they can be tracked through interactions such as occlusions. It can determine whether a foreground region contains multiple people and can segment the region into its constituent people and track them. W⁴ can also determine whether people are carrying objects, and can segment objects from their silhouettes, and construct appearance models for them so they can be identified in subsequent frames. W⁴ can recognize events between people and objects, such as depositing an object, exchanging bags, or removing an object. It runs at 25 Hz for 320×240 resolution images on a 400 MHz dual-Pentium II PC.

2,870 citations

Journal Article•DOI•

Online and off-line handwriting recognition: a comprehensive survey

[...]

Réjean Plamondon¹, Sargur N. Srihari²•Institutions (2)

École Normale Supérieure¹, University at Buffalo²

01 Jan 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms are described.

Abstract: Handwriting has continued to persist as a means of communication and recording information in day-to-day life even with the introduction of new technologies. Given its ubiquity in human transactions, machine recognition of handwriting has practical significance, as in reading handwritten notes in a PDA, in postal addresses on envelopes, in amounts in bank checks, in handwritten fields in forms, etc. This overview describes the nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms. Both the online case (which pertains to the availability of trajectory data during writing) and the off-line case (which pertains to scanned images) are considered. Algorithms for preprocessing, character and word recognition, and performance with practical systems are indicated. Other fields of application, such as signature verification, writer authentication, and handwriting learning tools, are also considered.

2,653 citations

Journal Article•DOI•

Automatic analysis of facial expressions: the state of the art

[...]

Maja Pantic¹, Léon J. M. Rothkrantz¹•Institutions (1)

Delft University of Technology¹

01 Dec 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The capability of the human visual system with respect to these problems is discussed, and it is meant to serve as an ultimate goal and a guide for determining recommendations for development of an automatic facial expression analyzer.

Abstract: Humans detect and interpret faces and facial expressions in a scene with little or no effort. Still, development of an automated system that accomplishes this task is rather difficult. There are several related problems: detection of an image segment as a face, extraction of the facial expression information, and classification of the expression (e.g., in emotion categories). A system that performs these operations accurately and in real time would form a big step in achieving a human-like interaction between man and machine. The paper surveys the past work in solving these problems. The capability of the human visual system with respect to these problems is discussed, too. It is meant to serve as an ultimate goal and a guide for determining recommendations for development of an automatic facial expression analyzer.

Journal Article•DOI•

A Bayesian computer vision system for modeling human interactions

[...]

Nuria Oliver¹, Barbara Rosario², Alex Pentland³•Institutions (3)

Microsoft¹, University of California, Berkeley², Massachusetts Institute of Technology³

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A real-time computer vision and machine learning system for modeling and recognizing human behaviors in a visual surveillance task and demonstrates the ability to use these a priori models to accurately classify real human behaviors and interactions with no additional tuning or training.

Abstract: We describe a real-time computer vision and machine learning system for modeling and recognizing human behaviors in a visual surveillance task. The system deals in particular with detecting when interactions between people occur and classifying the type of interaction. Examples of interesting interaction behaviors include following another person, altering one's path to meet another, and so forth. Our system combines top-down with bottom-up information in a closed feedback loop, with both components employing a statistical Bayesian approach. We propose and compare two different state-based learning architectures, namely, HMMs and CHMMs for modeling behaviors and interactions. Finally, a synthetic "Alife-style" training system is used to develop flexible prior models for recognizing human interactions. We demonstrate the ability to use these a priori models to accurately classify real human behaviors and interactions with no additional tuning or training.

Journal Article•DOI•

Assessing a mixture model for clustering with the integrated completed likelihood

[...]

Christophe Biernacki, Gilles Celeux¹, Gérard Govaert²•Institutions (2)

French Institute for Research in Computer Science and Automation¹, University of Technology of Compiègne²

01 Jul 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Proposes a method for assessing mixture models in a cluster analysis setting using the integrated completed likelihood (ICL), which appears to be more robust than BIC to violation of some of the mixture model assumptions and can select a number of clusters leading to a sensible partitioning of the data.

Abstract: We propose a method for assessing a mixture model in a cluster analysis setting using the integrated completed likelihood. For this purpose, the observed data are assigned to unknown clusters using a maximum a posteriori operator. Then, the integrated completed likelihood (ICL) is approximated using the Bayesian information criterion (BIC). Numerical experiments on simulated and real data of the resulting ICL criterion show that it performs well both for choosing a mixture model and a relevant number of clusters. In particular, ICL appears to be more robust than BIC to violation of some of the mixture model assumptions and it can select a number of clusters leading to a sensible partitioning of the data.
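
The relationship between the two criteria can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the function names and argument layout are hypothetical, both scores use the "larger is better" log-likelihood scale, and `responsibilities[i][k]` is assumed to hold the fitted posterior probability that observation `i` belongs to cluster `k`. Under the maximum a posteriori assignment, the completed log-likelihood equals the observed log-likelihood plus the sum of the logs of each observation's largest posterior, so the approximation reduces to BIC plus that (non-positive) term.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """BIC on the log-likelihood scale: larger values indicate a better model."""
    return log_likelihood - 0.5 * n_params * math.log(n_samples)


def icl(log_likelihood, n_params, responsibilities):
    """BIC-style approximation of ICL using MAP cluster assignments.

    responsibilities: list of rows, one per observation, each row holding
    the posterior cluster probabilities for that observation (summing to 1).
    """
    n_samples = len(responsibilities)
    # Log of the largest posterior per observation; zero when assignments are certain.
    map_term = sum(math.log(max(row)) for row in responsibilities)
    return bic(log_likelihood, n_params, n_samples) + map_term
```

When clusters are well separated, every row is close to a one-hot vector and the two scores nearly coincide; heavily overlapping components are penalized by ICL but not by BIC.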

Journal Article•DOI•

Geodesic active contours and level sets for the detection and tracking of moving objects

[...]

Nikos Paragios¹, Rachid Deriche•Institutions (1)

Princeton University¹

01 Mar 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A new approach named Hermes is proposed, which exploits aspects from the well-known front propagation algorithms and compares favorably to them, and very promising experimental results are provided using real video sequences.

Abstract: This paper presents a new variational framework for detecting and tracking multiple moving objects in image sequences. Motion detection is performed using a statistical framework for which the observed interframe difference density function is approximated using a mixture model. This model is composed of two components, namely, the static (background) and the mobile (moving objects) one. Both components are zero-mean and obey Laplacian or Gaussian law. This statistical framework is used to provide the motion detection boundaries. Additionally, the original frame is used to provide the moving object boundaries. Then, the detection and the tracking problem are addressed in a common framework that employs a geodesic active contour objective function. This function is minimized using a gradient descent method. A new approach named Hermes is proposed, which exploits aspects from the well-known front propagation algorithms and compares favorably to them. Very promising experimental results are provided using real video sequences.

Journal Article•DOI•

Geometric camera calibration using circular control points

[...]

Janne Heikkilä¹•Institutions (1)

University of Oulu¹

01 Oct 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A calibration procedure for precise 3D computer vision applications is described that introduces bias correction for circular control points and a nonrecursive method for reversing the distortion model and indicates improvements in the calibration results in limited error conditions.

Abstract: Modern CCD cameras are usually capable of a spatial accuracy greater than 1/50 of the pixel size. However, such accuracy is not easily attained due to various error sources that can affect the image formation process. Current calibration methods typically assume that the observations are unbiased, the only error is the zero-mean independent and identically distributed random noise in the observed image coordinates, and the camera model completely explains the mapping between the 3D coordinates and the image coordinates. In general, these conditions are not met, causing the calibration results to be less accurate than expected. In the paper, a calibration procedure for precise 3D computer vision applications is described. It introduces bias correction for circular control points and a nonrecursive method for reversing the distortion model. The accuracy analysis is presented and the error sources that can reduce the theoretical accuracy are discussed. The tests with synthetic images indicate improvements in the calibration results in limited error conditions. In real images, the suppression of external error sources becomes a prerequisite for successful calibration.

Journal Article•DOI•

Fast and globally convergent pose estimation from video images

[...]

C. P. Lu, Gregory D. Hager¹, Eric Mjolsness²•Institutions (2)

Johns Hopkins University¹, Jet Propulsion Laboratory²

01 Jun 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: It is shown that the pose estimation problem can be formulated as that of minimizing an error metric based on collinearity in object (as opposed to image) space, and an iterative algorithm which directly computes orthogonal rotation matrices and which is globally convergent is derived.

Abstract: Determining the rigid transformation relating 2D images to known 3D geometry is a classical problem in photogrammetry and computer vision. Heretofore, the best methods for solving the problem have relied on iterative optimization methods which cannot be proven to converge and/or which do not effectively account for the orthonormal structure of rotation matrices. We show that the pose estimation problem can be formulated as that of minimizing an error metric based on collinearity in object (as opposed to image) space. Using object space collinearity error, we derive an iterative algorithm which directly computes orthogonal rotation matrices and which is globally convergent. Experimentally, we show that the method is computationally efficient, that it is no less accurate than the best currently employed optimization methods, and that it outperforms all tested methods in robustness to outliers.

Journal Article•DOI•

Robust real-time periodic motion detection, analysis, and applications

[...]

Ross Cutler¹, Larry S. Davis¹•Institutions (1)

University of Maryland, College Park¹

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: New techniques to detect and analyze periodic motion as seen from both a static and a moving camera are described and the periodicity is analyzed robustly using the 2D lattice structures inherent in similarity matrices.

Abstract: We describe new techniques to detect and analyze periodic motion as seen from both a static and a moving camera. By tracking objects of interest, we compute an object's self-similarity as it evolves in time. For periodic motion, the self-similarity measure is also periodic and we apply time-frequency analysis to detect and characterize the periodic motion. The periodicity is also analyzed robustly using the 2D lattice structures inherent in similarity matrices. A real-time system has been implemented to track and classify objects using periodicity. Examples of object classification (people, running dogs, vehicles), person counting, and nonstationary periodicity are provided.
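
The idea that periodic motion produces a periodic self-similarity measure can be illustrated with a toy example. This sketch is a simplification and not the paper's method: it assumes each frame has already been reduced to a single number (the hypothetical `features` list), uses absolute difference as the dissimilarity, and picks the lag whose diagonal in the self-similarity matrix has the lowest mean dissimilarity, in place of the time-frequency and lattice analyses the abstract mentions.

```python
def self_similarity(features):
    """Matrix of pairwise dissimilarities between per-frame feature values."""
    return [[abs(a - b) for b in features] for a in features]


def estimate_period(features, max_lag=None):
    """Return the smallest lag (in frames) with minimal mean dissimilarity.

    Each lag corresponds to one off-diagonal of the self-similarity matrix;
    for periodic motion, diagonals at multiples of the period are near zero.
    """
    n = len(features)
    s = self_similarity(features)
    if max_lag is None:
        max_lag = n // 2

    def mean_diagonal(lag):
        return sum(s[i][i + lag] for i in range(n - lag)) / (n - lag)

    # min() returns the first minimum, so the fundamental period wins over its multiples.
    return min(range(1, max_lag + 1), key=mean_diagonal)
```

For a sequence that repeats every eight frames, `estimate_period` returns 8. A constant sequence returns 1, so a real system would also need a test for whether any periodicity is present at all.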

Journal Article•DOI•

Recognition of visual activities and interactions by stochastic parsing

[...]

Yuri A. Ivanov¹, Aaron F. Bobick²•Institutions (2)

Massachusetts Institute of Technology¹, Georgia Institute of Technology²

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents and how the system correctly interprets activities of multiple interacting objects is demonstrated.

Abstract: This paper describes a probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents. The fundamental idea is to divide the recognition problem into two levels. The lower level detections are performed using standard independent probabilistic event detectors to propose candidate detections of low-level features. The outputs of these detectors provide the input stream for a stochastic context-free grammar parsing mechanism. The grammar and parser provide longer range temporal constraints, disambiguate uncertain low-level detections, and allow the inclusion of a priori knowledge about the structure of temporal events in a given domain. We develop a real-time system and demonstrate the approach in several experiments on gesture recognition and in video surveillance. In the surveillance application, we show how the system correctly interprets activities of multiple interacting objects.

Journal Article•DOI•

Fast, reliable head tracking under varying illumination: an approach based on registration of texture-mapped 3D models

[...]

M. La Cascia, Stan Sclaroff¹, Vassilis Athitsos¹•Institutions (1)

Boston University¹

01 Apr 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: In this paper, the head is modeled as a texture mapped cylinder and tracking is formulated as an image registration problem in the cylinder's texture map image, which is solved by regularized weighted least squares error minimization.

Abstract: A technique for 3D head tracking under varying illumination is proposed. The head is modeled as a texture mapped cylinder. Tracking is formulated as an image registration problem in the cylinder's texture map image. The resulting dynamic texture map provides a stabilized view of the face that can be used as input to many existing 2D techniques for face recognition, facial expressions analysis, lip reading, and eye tracking. To solve the registration problem with lighting variation and head motion, the residual registration error is modeled as a linear combination of texture warping templates and orthogonal illumination templates. Fast stable online tracking is achieved via regularized weighted least-squares error minimization. The regularization tends to limit potential ambiguities that arise in the warping and illumination templates. It enables stable tracking over extended sequences. Tracking does not require a precise initial model fit; the system is initialized automatically using a simple 2D face detector. It is assumed that the target is facing the camera in the first frame. The formulation uses texture mapping hardware. The nonoptimized implementation runs at about 15 frames per second on a SGI O2 graphic workstation. Extensive experiments evaluating the effectiveness of the formulation are reported. The sensitivity of the technique to illumination, regularization parameters, errors in the initial positioning, and internal camera parameters are analyzed. Examples and applications of tracking are reported.

Journal Article•DOI•

Algorithms for defining visual regions-of-interest: comparison with eye fixations

[...]

C.M. Privitera¹, L.W. Stark¹•Institutions (1)

University of California, Berkeley¹

01 Sep 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper investigates and develops a methodology that serves to automatically identify a subset of aROIs (algorithmically detected ROIs) using different image processing algorithms (IPAs) and appropriate clustering procedures, and compares aROIs with hROIs (human identified ROIs) as a criterion for evaluating and selecting bottom-up, context-free algorithms.

Abstract: Many machine vision applications, such as compression, pictorial database querying, and image understanding, often need to analyze in detail only a representative subset of the image, which may be arranged into sequences of loci called regions-of-interest (ROIs). We have investigated and developed a methodology that serves to automatically identify such a subset of aROIs (algorithmically detected ROIs) using different image processing algorithms (IPAs), and appropriate clustering procedures. In human perception, an internal representation directs top-down, context-dependent sequences of eye movements to fixate on similar sequences of hROIs (human identified ROIs). In the paper, we introduce our methodology and we compare aROIs with hROIs as a criterion for evaluating and selecting bottom-up, context-free algorithms. An application is finally discussed.

Journal Article•DOI•

A cooperative algorithm for stereo matching and occlusion detection

[...]

C.L. Zitnick¹, Takeo Kanade¹•Institutions (1)

Carnegie Mellon University¹

01 Jul 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Presents a stereo algorithm for obtaining disparity maps with occlusion explicitly detected, and presents the processing results from synthetic and real image pairs, including ones with ground-truth values for quantitative comparison with other methods.

Abstract: Presents a stereo algorithm for obtaining disparity maps with occlusion explicitly detected. To produce smooth and detailed disparity maps, two assumptions that were originally proposed by Marr and Poggio (1976, 1979) are adopted: uniqueness and continuity. That is, the disparity maps have a unique value per pixel and are continuous almost everywhere. These assumptions are enforced within a three-dimensional array of match values in disparity space. Each match value corresponds to a pixel in an image and a disparity relative to another image. An iterative algorithm updates the match values by diffusing support among neighboring values and inhibiting others along similar lines of sight. By applying the uniqueness assumption, occluded regions can be explicitly identified. To demonstrate the effectiveness of the algorithm, we present the processing results from synthetic and real image pairs, including ones with ground-truth values for quantitative comparison with other methods.


Journal Article•DOI•

Twenty years of document image analysis in PAMI

[...]

George Nagy¹•Institutions (1)

Rensselaer Polytechnic Institute¹

01 Jan 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The contributions to document image analysis of 99 papers published in the IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) are clustered, summarized, interpolated, interpreted, and evaluated.


Abstract: The contributions to document image analysis of 99 papers published in the IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) are clustered, summarized, interpolated, interpreted, and evaluated.


Journal Article•DOI•

[...]

Longin Jan Latecki¹, Rolf Lakämper¹•Institutions (1)

University of Hamburg¹

01 Oct 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work applied a cognitively motivated similarity measure to shape matching of object contours in various image databases and compared it to well-known approaches in the literature, justifying that the shape matching procedure gives an intuitive shape correspondence and is stable with respect to noise distortions.


Abstract: A cognitively motivated similarity measure is presented and its properties are analyzed with respect to retrieval of similar objects in image databases of silhouettes of 2D objects. To reduce influence of digitization noise, as well as segmentation errors, the shapes are simplified by a novel process of digital curve evolution. To compute our similarity measure, we first establish the best possible correspondence of visual parts (without explicitly computing the visual parts). Then, the similarity between corresponding parts is computed and aggregated. We applied our similarity measure to shape matching of object contours in various image databases and compared it to well-known approaches in the literature. The experimental results justify that our shape matching procedure gives an intuitive shape correspondence and is stable with respect to noise distortions.


Journal Article•DOI•

Biometric identification through hand geometry measurements

[...]

Raul Sanchez-Reillo¹, Carmen Sanchez-Avila, Ana González-Marcos•Institutions (1)

ETSI¹

01 Oct 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Experimental results, with up to a 97 percent rate of success in classification, show the possibility of using this biometric system in medium/high security environments with full acceptance from all users.


Abstract: This work defines and implements a biometric system based on hand geometry identification. Hand features are extracted from a color photograph taken when the user has placed his hand on a platform designed for such a task. Different pattern recognition techniques, from Euclidean distance to neural networks, have been tested for classification and/or verification. Experimental results, with up to a 97 percent rate of success in classification, show the possibility of using this system in medium/high security environments with full acceptance from all users.


Journal Article•DOI•

Introduction to the special section on video surveillance

[...]

Robert T. Collins¹, Alan J. Lipton, Takeo Kanade¹•Institutions (1)

Carnegie Mellon University¹

01 Jul 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The 11 papers in this special section illustrate topics and techniques at the forefront of video surveillance research, touching on many of the core topics of computer vision, pattern analysis, and artificial intelligence.


Abstract: Automated video surveillance addresses real-time observation of people and vehicles within a busy environment, leading to a description of their actions and interactions. The technical issues include moving object detection and tracking, object classification, human motion analysis, and activity understanding, touching on many of the core topics of computer vision, pattern analysis, and artificial intelligence. Video surveillance has spawned large research projects in the United States, Europe, and Japan, and has been the topic of several international conferences and workshops in recent years. There are immediate needs for automated surveillance systems in commercial, law enforcement, and military applications. Mounting video cameras is cheap, but finding available human resources to observe the output is expensive. Although surveillance cameras are already prevalent in banks, stores, and parking lots, video data currently is used only “after the fact” as a forensic tool, thus losing its primary benefit as an active, real-time medium. What is needed is continuous 24-hour monitoring of surveillance video to alert security officers to a burglary in progress or to a suspicious individual loitering in the parking lot, while there is still time to prevent the crime. In addition to the obvious security applications, video surveillance technology has been proposed to measure traffic flow, detect accidents on highways, monitor pedestrian congestion in public spaces, compile consumer demographics in shopping malls and amusement parks, log routine maintenance tasks at nuclear facilities, and count endangered species. The numerous military applications include patrolling national borders, measuring the flow of refugees in troubled areas, monitoring peace treaties, and providing secure perimeters around bases and embassies. The 11 papers in this special section illustrate topics and techniques at the forefront of video surveillance research.
These papers can be loosely organized into three categories. Detection and tracking involves real-time extraction of moving objects from video and continuous tracking over time to form persistent object trajectories. C. Stauffer and W.E.L. Grimson introduce unsupervised statistical learning techniques to cluster object trajectories produced by adaptive background subtraction into descriptions of normal scene activity. Viewpoint-specific trajectory descriptions from multiple cameras are combined into a common scene coordinate system using a calibration technique described by L. Lee, R. Romano, and G. Stein, who automatically determine the relative exterior orientation of overlapping camera views by observing a sparse set of moving objects on flat terrain. Two papers address the accumulation of noisy motion evidence over time. R. Pless, T. Brodský, and Y. Aloimonos detect and track small objects in aerial video sequences by first compensating for the self-motion of the aircraft, then accumulating residual normal flow to acquire evidence of independent object motion. L. Wixson notes that motion in the image does not always signify purposeful travel by an independently moving object (examples of such “motion clutter” are wind-blown tree branches and sun reflections off rippling water) and devises a flow-based salience measure to highlight objects that tend to move in a consistent direction over time. Human motion analysis is concerned with detecting periodic motion signifying a human gait and acquiring descriptions of human body pose over time. R. Cutler and L.S. Davis plot an object’s self-similarity across all pairs of frames to form distinctive patterns that classify bipedal, quadrupedal, and rigid object motion. Y. Ricquebourg and P. Bouthemy track apparent contours in XT slices of an XYT sequence volume to robustly delineate and track articulated human body structure. I. Haritaoglu, D. Harwood, and L.S. Davis present W4, a surveillance system specialized to the task of looking at people. The W4 system can locate people and segment their body parts, build simple appearance models for tracking, disambiguate between and separately track multiple individuals in a group, and detect carried objects such as boxes and backpacks. Activity analysis deals with parsing temporal sequences of object observations to produce high-level descriptions of agent actions and multiagent interactions. In our opinion, this will be the most important area of future research in video surveillance. N.M. Oliver, B. Rosario, and A.P. Pentland introduce Coupled Hidden Markov Models (CHMMs) to detect and classify interactions consisting of two interleaved agent action streams and present a training method based on synthetic agents to address the problem of parameter estimation from limited real-world training examples. M. Brand and V. Kettnaker present an entropy-minimization approach to estimating HMM topology and


Journal Article•DOI•

Looking at people: sensing for ubiquitous and wearable computing

[...]

Alex Pentland¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Jan 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The paper examines the mathematical tools that have proven successful, provides a taxonomy of the problem domain, and then examines the state of the art: person identification, surveillance/monitoring, 3D methods, and smart rooms/perceptual user interfaces.


Abstract: The research topic of looking at people, that is, giving machines the ability to detect, track, and identify people and more generally, to interpret human behavior, has become a central topic in machine vision research. Initially thought to be the research problem that would be hardest to solve, it has proven remarkably tractable and has even spawned several thriving commercial enterprises. The principal driving application for this technology is "fourth generation" embedded computing: "smart" environments and portable or wearable devices. The key technical goals are to determine the computer's context with respect to nearby humans (e.g., who, what, when, where, and why) so that the computer can act or respond appropriately without detailed instructions. The paper examines the mathematical tools that have proven successful, provides a taxonomy of the problem domain, and then examines the state of the art. Four areas receive particular attention: person identification, surveillance/monitoring, 3D methods, and smart rooms/perceptual user interfaces. Finally, the paper discusses some of the research challenges and opportunities.


Journal Article•DOI•

Monitoring activities from multiple video streams: establishing a common coordinate frame

[...]

L. Lee¹, R. Romano¹, Gideon Stein¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper aligns the scene's ground plane across multiple views by matching and fitting tracked objects to a planar model, then decomposes the planar alignment matrix to recover the 3D relative camera and ground plane positions.


Abstract: Monitoring of large sites requires coordination between multiple cameras, which in turn requires methods for relating events between distributed cameras. This paper tackles the problem of automatic external calibration of multiple cameras in an extended scene, that is, full recovery of their 3D relative positions and orientations. Because the cameras are placed far apart, brightness or proximity constraints cannot be used to match static features, so we instead apply planar geometric constraints to moving objects tracked throughout the scene. By robustly matching and fitting tracked objects to a planar model, we align the scene's ground plane across multiple views and decompose the planar alignment matrix to recover the 3D relative camera and ground plane positions. We demonstrate this technique in both a controlled lab setting where we test the effects of errors in the intrinsic camera parameters, and in an uncontrolled, outdoor setting. In the latter, we do not assume synchronized cameras and we show that enforcing geometric constraints enables us to align the tracking data in time. In spite of noise in the intrinsic camera parameters and in the image data, the system successfully transforms multiple views of the scene's ground plane to an overhead view and recovers the relative 3D camera and ground plane positions.
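As a rough illustration of the ground-plane alignment mentioned in this abstract: a 3x3 planar alignment matrix (a homography) maps a ground-plane point seen in one view to the corresponding point in another view through homogeneous coordinates. The function name `apply_homography` and the matrix values are hypothetical. This is a minimal sketch of applying such a matrix, not the authors' procedure for estimating or decomposing it.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 planar alignment matrix H.

    H is a list of three rows of three numbers (hypothetical values here);
    point is an (x, y) pair in the source view.
    """
    x, y = point
    # Lift (x, y) to homogeneous coordinates (x, y, 1) and multiply by H.
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this matrix")
    # Divide by the third coordinate to return to image coordinates.
    return (u / w, v / w)


# Illustrative matrix only: scale by 2, then translate by (1, -1).
H_example = [
    [2.0, 0.0, 1.0],
    [0.0, 2.0, -1.0],
    [0.0, 0.0, 1.0],
]
mapped = apply_homography(H_example, (3.0, 4.0))
```

In practice the matrix would be estimated from matched object tracks rather than written by hand.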


Journal Article•DOI•

A fingerprint verification system based on triangular matching and dynamic time warping

[...]

Zsolt Miklós Kovács-Vajna¹•Institutions (1)

Brescia University¹

01 Nov 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: An effective fingerprint verification system is presented, which assumes that an existing reference fingerprint image must validate the identity of a person by means of a test fingerprint image acquired online and in real-time using minutiae matching.


Abstract: An effective fingerprint verification system is presented. It assumes that an existing reference fingerprint image must validate the identity of a person by means of a test fingerprint image acquired online and in real-time using minutiae matching. The matching system consists of two main blocks: The first allows for the extraction of essential information from the reference image off-line, the second performs the matching itself online. The information is obtained from the reference image by filtering and careful minutiae extraction procedures. The fingerprint identification is based on triangular matching to cope with the strong deformation of fingerprint images due to static friction or finger rolling. The matching is finally validated by dynamic time warping. Results reported on the NIST Special Database 4 reference set, featuring 85 percent correct verification (15 percent false negative) and 0.05 percent false positive, demonstrate the effectiveness of the verification technique.


Journal Article•DOI•

Discovery and segmentation of activities in video

[...]

M. Brand¹, V. Kettnaker²•Institutions (2)

Mitsubishi¹, Rensselaer Polytechnic Institute²

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: In this article, the authors show that minimizing the entropy of the joint distribution makes a hidden Markov model's (HMM's) internal state machine organize observed activity into meaningful states.


Abstract: Hidden Markov models (HMMs) have become the workhorses of the monitoring and event recognition literature because they bring to time-series analysis the utility of density estimation and the convenience of dynamic time warping. Once trained, the internals of these models are considered opaque; there is no effort to interpret the hidden states. We show that by minimizing the entropy of the joint distribution, an HMM's internal state machine can be made to organize observed activity into meaningful states. This has uses in video monitoring and annotation, low bit-rate coding of scene activity, and detection of anomalous behavior. We demonstrate with models of office activity and outdoor traffic, showing how the framework learns principal modes of activity and patterns of activity change. We then show how this framework can be adapted to infer hidden state from extremely ambiguous images, in particular, inferring 3D body orientation and pose from sequences of low-resolution silhouettes.


Journal Article•DOI•

Learning and design of principal curves

[...]

Balázs Kégl¹, Adam Krzyżak², Tamas Linder¹, Kenneth Zeger³•Institutions (3)

Queen's University¹, Concordia University², University of California, San Diego³

01 Mar 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work defines principal curves as continuous curves of a given length which minimize the expected squared distance between the curve and points of the space randomly chosen according to a given distribution, making it possible to theoretically analyze principal curve learning from training data and it also leads to a new practical construction.


Abstract: Principal curves have been defined as "self-consistent" smooth curves which pass through the "middle" of a d-dimensional probability distribution or data cloud. They give a summary of the data and also serve as an efficient feature extraction tool. We take a new approach by defining principal curves as continuous curves of a given length which minimize the expected squared distance between the curve and points of the space randomly chosen according to a given distribution. The new definition makes it possible to theoretically analyze principal curve learning from training data and it also leads to a new practical construction. Our theoretical learning scheme chooses a curve from a class of polygonal lines with k segments and with a given total length to minimize the average squared distance over n training points drawn independently. Convergence properties of this learning scheme are analyzed and a practical version of this theoretical algorithm is implemented. In each iteration of the algorithm, a new vertex is added to the polygonal line and the positions of the vertices are updated so that they minimize a penalized squared distance criterion. Simulation results demonstrate that the new algorithm compares favorably with previous methods, both in terms of performance and computational complexity, and is more robust to varying data models.
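The quantity this abstract describes, the average squared distance from n training points to a polygonal line with k segments, can be sketched in a few lines. The function names and sample data below are hypothetical, and the sketch only evaluates the distance term for a fixed polygonal line in 2D. It does not implement the length constraint, the penalty term, or the vertex-update iterations of the algorithm.

```python
def sq_dist_point_segment(p, a, b):
    """Squared Euclidean distance from point p to the segment from a to b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Project p onto the line through a and b, then clamp to the segment.
        t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))
    cx, cy = ax + t * dx, ay + t * dy
    return (px - cx) ** 2 + (py - cy) ** 2


def avg_sq_dist_to_polyline(points, vertices):
    """Average squared distance from each point to its nearest segment."""
    if len(vertices) < 2:
        raise ValueError("a polygonal line needs at least two vertices")
    if not points:
        raise ValueError("at least one training point is required")
    total = 0.0
    for p in points:
        total += min(
            sq_dist_point_segment(p, vertices[i], vertices[i + 1])
            for i in range(len(vertices) - 1)
        )
    return total / len(points)


# Illustrative data only: three points around a one-segment line on the x-axis.
sample_points = [(0.0, 1.0), (1.0, -1.0), (3.0, 0.0)]
sample_line = [(0.0, 0.0), (2.0, 0.0)]
score = avg_sq_dist_to_polyline(sample_points, sample_line)
```

A learning scheme of the kind described would compare this score across candidate polygonal lines and keep the one with the smallest value.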
