
Showing papers presented at "British Machine Vision Conference in 1998"


Proceedings ArticleDOI
01 Jan 1998
TL;DR: A new constructive method is described for incrementally adding observations to an eigenspace model, explicitly accounting for a change in origin as well as a change in the number of eigenvectors needed in the basis set.
Abstract: Eigenspace models are a convenient way to represent sets of observations with widespread applications, including classification. In this paper we describe a new constructive method for incrementally adding observations to an eigenspace model. Our contribution is to explicitly account for a change in origin as well as a change in the number of eigenvectors needed in the basis set. No other method we have seen considers change of origin, yet both are needed if an eigenspace model is to be used for classification purposes. We empirically compare our incremental method with two alternatives from the literature and show our method is the more useful for classification because it computes the smaller eigenspace model representing the observations.
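A minimal numpy sketch of an incremental eigenspace update in this spirit, handling both the shift of origin (the mean) and the possible growth of the basis; the coefficients follow a common formulation of incremental eigenanalysis and may differ in detail from the paper's exact method:

import numpy as np

def incremental_eigenspace_update(mu, U, lam, N, x, tol=1e-10):
    """Add one observation x to an eigenspace model (mean mu, eigenvectors U as
    columns, eigenvalues lam) built from N observations.  A sketch of incremental
    eigenanalysis with a change of origin, not necessarily the paper's exact rule."""
    x = np.asarray(x, dtype=float)
    g = U.T @ (x - mu)                      # projection onto the current basis
    h = (x - mu) - U @ g                    # residual orthogonal to the basis
    gamma = np.linalg.norm(h)
    if gamma > tol:                         # the basis must grow by one vector
        U_aug = np.column_stack([U, h / gamma])
        g_aug = np.append(g, gamma)
    else:
        U_aug, g_aug = U, g
    k = U_aug.shape[1]
    # Small k x k eigenproblem combining the old spectrum with the new observation,
    # including the rank-one term caused by the shift of the mean (origin).
    D = np.zeros((k, k))
    D[:len(lam), :len(lam)] = np.diag(lam)
    D *= N / (N + 1.0)
    D += (N / (N + 1.0) ** 2) * np.outer(g_aug, g_aug)
    lam_new, R = np.linalg.eigh(D)
    order = np.argsort(lam_new)[::-1]       # sort eigenvalues, largest first
    lam_new, R = lam_new[order], R[:, order]
    U_new = U_aug @ R                       # rotate the augmented basis
    mu_new = (N * mu + x) / (N + 1.0)       # the origin (mean) also moves
    return mu_new, U_new, lam_new, N + 1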

271 citations


Proceedings ArticleDOI
01 Jan 1998
TL;DR: This study found that a modified version of Lucas and Kanade's algorithm has superior performance but produces sparse flow maps, while Proesmans et al.'s algorithm performs slightly worse, on average, but produces a much denser map.
Abstract: Evaluating the performance of optical flow algorithms has been difficult because of the lack of ground-truth data sets for complex scenes. We describe a simple modification to a ray tracer that allows us to generate ground-truth motion fields for scenes of arbitrary complexity. The resulting flow maps are used to assist in the comparison of eight optical flow algorithms using three complex, synthetic scenes. Our study found that a modified version of Lucas and Kanade's algorithm has superior performance but produces sparse flow maps. Proesmans et al.'s algorithm performs slightly worse, on average, but produces a very dense depth map.
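Comparisons of this kind are normally scored with the angular error measure of Barron et al.; assuming that measure (an assumption, not stated in the abstract), a minimal numpy implementation is:

import numpy as np

def angular_error_deg(u_est, v_est, u_gt, v_gt):
    """Mean angular error between an estimated flow field and the ground-truth
    field, treating each flow vector as the 3D direction (u, v, 1)."""
    est = np.stack([u_est, v_est, np.ones_like(u_est)], axis=-1).astype(float)
    gt = np.stack([u_gt, v_gt, np.ones_like(u_gt)], axis=-1).astype(float)
    est /= np.linalg.norm(est, axis=-1, keepdims=True)
    gt /= np.linalg.norm(gt, axis=-1, keepdims=True)
    cos_angle = np.clip(np.sum(est * gt, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos_angle)).mean()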

254 citations


Proceedings ArticleDOI
01 Jan 1998
TL;DR: Variations of the basic algorithm aimed at improving the speed and robustness of search are described, including subsampling and using image residuals to drive the shape rather than full appearance model.
Abstract: An Active Appearance Model (AAM) allows complex models of shape and appearance to be matched to new images rapidly. An AAM contains a statistical model of the shape and grey-level appearance of an object of interest. The associated search algorithm exploits the locally linear relationship between model parameter displacements and the residual errors between model instance and image. This relationship can be learnt during a training phase. To match to an image we measure the current residuals and use the model to predict changes to the current parameters. The algorithm converges in a few iterations. In this paper we describe variations of the basic algorithm aimed at improving the speed and robustness of search. These include subsampling and using image residuals to drive the shape rather than the full appearance model. We show examples of search and give the results of experiments comparing the performance of the different algorithms.
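The search iteration described above can be sketched as follows, assuming a pre-learnt regression matrix R that maps texture residuals to parameter displacements; the sampling and synthesis functions are placeholders and the damping steps are illustrative, not the paper's exact values:

import numpy as np

def aam_search(params, image_sample_fn, model_texture_fn, R,
               n_iter=10, steps=(1.0, 0.5, 0.25)):
    """Sketch of the AAM matching loop: measure the texture residual, predict a
    parameter update with the pre-learnt linear regression matrix R, and accept
    the largest damping step that reduces the residual."""
    for _ in range(n_iter):
        residual = image_sample_fn(params) - model_texture_fn(params)
        err = np.sum(residual ** 2)
        delta = R @ residual                     # predicted parameter displacement
        for step in steps:                       # damped line search over step sizes
            candidate = params - step * delta
            r = image_sample_fn(candidate) - model_texture_fn(candidate)
            if np.sum(r ** 2) < err:
                params = candidate
                break
        else:
            break                                # no step improved the fit: converged
    return params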

148 citations


Proceedings ArticleDOI
01 Jan 1998
TL;DR: Experiments show PPHT has, in many circumstances, advantages over the Standard Hough Transform, and is ideally suited for real-time applications with a fixed amount of available processing time.
Abstract: This thesis presents the Progressive Probabilistic Hough Transform (PPHT). Unlike the Probabilistic HT [46], where the Standard HT is performed on a pre-selected fraction of input points, the PPHT minimises the amount of computation needed to detect lines by exploiting the difference in the fraction of votes needed to reliably detect lines with different numbers of supporting points. The fraction of points used for voting need not be specified ad hoc or using a priori knowledge, as in the probabilistic HT; it is a function of the inherent complexity of the data. The algorithm is ideally suited for real-time applications with a fixed amount of available processing time, since voting and line detection are interleaved. The most salient features are likely to be detected first. Experiments show that, while retaining its robustness, the PPHT has, in many circumstances, advantages over the Standard HT.
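The PPHT from this line of work underlies OpenCV's cv2.HoughLinesP; a brief usage sketch, with an illustrative file name and parameter values:

import cv2
import numpy as np

# Minimal usage sketch of the progressive probabilistic Hough transform as
# exposed by OpenCV.  "scene.png" and the thresholds below are illustrative.
img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                        minLineLength=30, maxLineGap=5)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(img, (x1, y1), (x2, y2), 255, 1)   # draw detected segments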

112 citations


Proceedings ArticleDOI
01 Jan 1998
TL;DR: Results demonstrate that this method selects views which generate reasonable volumetric models for convex, concave and curved objects.
Abstract: This paper presents a method for solving the Best Next View problem. This problem arises while gathering range data for the purpose of building 3D models of objects. The novelty of our solution is the introduction of a quality criterion in addition to the visibility criterion used by previous researchers. This quality criterion aims at obtaining views that improve the overall range data quality of the imaged surfaces. Results demonstrate that this method selects views which generate reasonable volumetric models for convex, concave and curved objects.

112 citations


Proceedings ArticleDOI
01 Jan 1998
TL;DR: A Kalman tracking algorithm is presented that can track a number of very small, low-contrast objects through an image sequence taken from a static camera, combining wavelet filtering for detection with an interest operator for testing multiple target hypotheses within the framework of a Kalman tracker.
Abstract: We present a Kalman tracking algorithm that can track a number of very small, low contrast objects through an image sequence taken from a static camera. The issues that we have addressed to achieve this are twofold: firstly, the detection of small objects comprising only a few pixels, moving slowly in the image; and secondly, the tracking of multiple small targets even though they may be lost either through occlusion or in a noisy signal. The approach uses a combination of wavelet filtering for detection with an interest operator for testing multiple target hypotheses within the framework of a Kalman tracker. We demonstrate the robustness of the approach to occlusion and for multiple targets.
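At the core of such a tracker is a per-target Kalman filter; a generic constant-velocity sketch is given below, with illustrative noise settings rather than the paper's, and with occlusion handled simply by coasting on the prediction when no detection is accepted:

import numpy as np

class ConstantVelocityKalman:
    """Per-target constant-velocity Kalman filter for 2D image positions:
    a generic sketch, not the paper's exact noise model or gating scheme."""
    def __init__(self, x0, y0, dt=1.0, q=1e-2, r=1.0):
        self.x = np.array([x0, y0, 0.0, 0.0])   # state [x, y, vx, vy]
        self.P = np.eye(4) * 10.0               # state covariance
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * q                  # process noise
        self.R = np.eye(2) * r                  # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.H @ self.x                  # predicted image position

    def update(self, z):
        """z is an accepted (x, y) detection; skip this call when the target is
        occluded, in which case the track coasts on the prediction alone."""
        y = np.asarray(z, dtype=float) - self.H @ self.x    # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)            # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P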

107 citations


Proceedings ArticleDOI
01 Jan 1998
TL;DR: This work addresses the problem of detecting and tracking people with a mobile robot, which arises in many different service robotics applications; the main difficulties are real-time constraints, a changing background, varying illumination conditions and the non-rigid shape of the person to be tracked.
Abstract: We address the problem of detecting and tracking people with a mobile robot. The need for following a person with a mobile robot arises in many different service robotic applications. The main problems of this task are real-time constraints, a changing background, varying illumination conditions and a non-rigid shape of the person to be tracked. The presented system has been tested extensively on a mobile robot in our everyday office environment.

104 citations


Proceedings ArticleDOI
01 Jan 1998
TL;DR: A holistic approach to real-time gaze tracking is investigated by means of a well-defined neural network modelling strategy combined with robust image processing algorithms; the system effectively learns the gaze direction of a human user by implicitly modelling the corresponding eye appearance.
Abstract: We investigate a holistic approach to real-time gaze tracking by means of a well-defined neural network modelling strategy combined with robust image processing algorithms. Based on captured greyscale eye images, the system effectively learns the gaze direction of a human user by implicitly modelling the corresponding eye appearance: the relative positions of the pupil, cornea, and light reflection inside the eye socket. In operation, the gaze tracker provides a fast, cheap, and flexible means of finding the focus of a user's attention on any of the objects displayed on a computer screen. It works in an open-plan office environment under normal illumination without using any specialised hardware. It can be easily customised to a new user and integrated into an application system that demands an intelligent non-command interface.

87 citations


Proceedings ArticleDOI
01 Jan 1998
TL;DR: Results are presented showing that salient features can be relocated more reliably than features chosen using previous methods, including hand picked features.
Abstract: We present a method for locating salient object features. Salient features are those which have a low probability of being mis-classified with any other feature, and are therefore more easily found in a similar image containing an example of the object. The local image structure can be described by vectors extracted using a standard ‘feature extractor’ at a range of scales. We train statistical models for each feature, using vectors taken from a number of training examples. The feature models can then be used to find the probability of misclassifying a feature with all other features. Low probabilities indicate a salient feature. Results are presented showing that salient features can be relocated more reliably than features chosen using previous methods, including hand picked features.

84 citations


Proceedings ArticleDOI
01 Jan 1998
TL;DR: This work presents a method for self-calibration of a camera which is free to rotate and change its intrinsic parameters, but which cannot translate and gives experimental results using real image sequences for which ground truth data was available.
Abstract: We present a method for self-calibration of a camera which is free to rotate and change its intrinsic parameters, but which cannot translate. The method is based on the so-called infinite homography constraint which leads to a non-linear minimisation routine to find the unknown camera intrinsics over an extended sequence of images. We give experimental results using real image sequences for which ground truth data was available.
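For a purely rotating camera the infinite homography constraint states that H_0i K_0 K_0^T H_0i^T equals K_i K_i^T up to scale, and this is what the non-linear minimisation enforces. A scipy sketch under an illustrative two-parameter intrinsic model with image coordinates centred on a known principal point (the paper's parameterisation of the varying intrinsics may differ):

import numpy as np
from scipy.optimize import least_squares

def K_from(params):
    """Intrinsics with zero skew and the principal point at the coordinate origin:
    an illustrative parameterisation (fx, fy), not necessarily the paper's."""
    fx, fy = params
    return np.array([[fx, 0.0, 0.0],
                     [0.0, fy, 0.0],
                     [0.0, 0.0, 1.0]])

def residuals(all_params, homographies):
    """Infinite homography constraint for a rotating camera:
    H_0i K_0 K_0^T H_0i^T ~ K_i K_i^T (up to scale) for every frame i."""
    Ks = [K_from(all_params[2 * i:2 * i + 2]) for i in range(len(homographies) + 1)]
    res = []
    for i, H in enumerate(homographies, start=1):
        A = H @ Ks[0] @ Ks[0].T @ H.T
        B = Ks[i] @ Ks[i].T
        A /= np.linalg.norm(A)                 # remove the unknown projective scale
        B /= np.linalg.norm(B)
        res.append((A - B).ravel())
    return np.concatenate(res)

# homographies: list of 3x3 inter-image homographies H_0i estimated from matches
# x0: initial focal-length guesses, two per frame
# sol = least_squares(residuals, x0, args=(homographies,))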

80 citations


Proceedings ArticleDOI
01 Jan 1998
TL;DR: It is shown that structures that repeat in the world are related by particular parametrized transformations in perspective images, and these image transformations provide powerful grouping constraints, and can be used at the heart of hypothesize and verify grouping algorithms.
Abstract: The objective of this work is the automatic detection and grouping of imaged elements which repeat on a plane in a scene (for example tiled floorings). It is shown that structures that repeat on a scene plane are related by particular parametrized transformations in perspective images. These image transformations provide powerful grouping constraints, and can be used at the heart of hypothesize and verify grouping algorithms. The parametrized transformations are global across the image plane and may be computed without knowledge of the pose of the plane or camera calibration. Parametrized transformations are given for several classes of repeating operation in the world, as well as groupers based on these. These groupers are demonstrated on a number of real images, where both the elements and the grouping are determined automatically. It is shown that the repeating element can be learnt from the image, and hence provides an image descriptor. Also, information on the plane pose, such as its vanishing line, can be recovered from the grouping.

Proceedings ArticleDOI
16 Sep 1998
TL;DR: This paper tackles the problem of obtaining a good initial set of corner matches between two images without resorting to any constraints from motion or structure models and introduces a new technique, the Median Flow Filter, which detects outliers by assuming that the image motion is locally similar.
Abstract: This paper tackles the problem of obtaining a good initial set of corner matches between two images without resorting to any constraints from motion or structure models. Several different matching metrics, both traditional and statistical, are evaluated and the effect of matching using sub-pixel information is studied. It is found that, in most cases, the commonly-used cross-correlation does not perform as well as some other measures, such as the test or the sum of squared differences, and that it is essential to use sub-pixel accuracy if mismatches are to be avoided. Further, a new technique, the Median Flow Filter, is introduced. This detects outliers by assuming that the image motion is locally similar. Any matches which are in gross disagreement with the local “median flow” are discarded. Experiments show this technique to be particularly effective, typically lowering the percentage of outliers from around 35% to less than 5%, permitting direct model fitting rather than random sampling techniques for any further analysis.
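A small numpy sketch of the outlier test the Median Flow Filter performs, assuming matched corner coordinates pts1 and pts2 as N x 2 arrays; the neighbourhood radius and disagreement threshold are illustrative, not the paper's values:

import numpy as np

def median_flow_filter(pts1, pts2, radius=30.0, thresh=5.0):
    """Compare each match's displacement with the median displacement of its
    spatial neighbours and discard matches that disagree grossly."""
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    flow = pts2 - pts1
    keep = np.ones(len(pts1), dtype=bool)
    for i, p in enumerate(pts1):
        near = np.linalg.norm(pts1 - p, axis=1) < radius
        near[i] = False
        if near.sum() < 3:
            continue                                    # too few neighbours to judge
        local_median = np.median(flow[near], axis=0)
        if np.linalg.norm(flow[i] - local_median) > thresh:
            keep[i] = False                             # gross disagreement: outlier
    return keep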

Proceedings ArticleDOI
01 Jan 1998
TL;DR: Different kinds of human gait are recognized from sequences of grey-level images; features derived from the trajectories of tracked body parts are used to train hidden Markov models (HMMs), one HMM for each kind of gait.
Abstract: In this paper we describe a system for automatic gait analysis. Different kinds of human gait are recognized using sequences of grey-level images. No markers are needed to get the trajectories of different body parts. The tracking of body parts and the classification are based on statistical models. We model several body parts and the background as mixture densities. The positions are determined iteratively, beginning with the part that is most stable to find. The anatomy of the human body restricts the area to search for the next one. From the trajectories, features for gait analysis are derived. These are used to train hidden Markov models (HMMs), one HMM for each kind of gait.

Proceedings ArticleDOI
01 Jan 1998
TL;DR: A novel method is described for obtaining superior classification performance over a variable range of classification costs by analysing a set of existing classifiers using a receiver operating characteristic (ROC) curve; the new system is shown to produce the maximum realisable ROC, a powerful technique for improving classification systems in problem domains within which classification costs may not be known.
Abstract: A novel method is described for obtaining superior classification performance over a variable range of classification costs. By analysis of a set of existing classifiers using a receiver operating characteristic (ROC) curve, a set of new realisable classifiers may be obtained by a random combination of two of the existing classifiers. These classifiers lie on the convex hull that contains the original points for the existing classifiers. This hull is the maximum realisable ROC (MRROC). A theorem for this method is derived and proved from an observation about data, and experimental results verify that a superior classification system may be constructed using only the existing classifiers and the information of the original data. This new system is shown to produce the MRROC, and as such provides a powerful technique for improving classification systems in problem domains within which classification costs may not be known a priori. Empirical results are presented for artificial data, and for two real world data sets: an image segmentation task and the diagnosis of abnormal thyroid condition.
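The construction can be sketched as follows: plot each existing classifier as a point (false positive rate, true positive rate) in ROC space and take the upper convex hull; any point on a hull segment is realised by choosing randomly between the two classifiers at its endpoints. A minimal numpy sketch, with illustrative names rather than the paper's implementation:

import numpy as np

def mrroc_hull(fpr, tpr):
    """Upper convex hull of classifier operating points in ROC space.  A point on
    a hull segment is realisable by randomly switching between the two endpoint
    classifiers with the interpolation weight as probability."""
    pts = list(zip(fpr, tpr)) + [(0.0, 0.0), (1.0, 1.0)]   # include trivial classifiers
    pts = np.unique(np.array(pts, dtype=float), axis=0)    # sorted by FPR, then TPR
    hull = []
    for p in pts:                                           # monotone-chain upper hull
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:      # middle point lies on or below the chord: drop it
                hull.pop()
            else:
                break
        hull.append((float(p[0]), float(p[1])))
    return hull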

Proceedings ArticleDOI
01 Jan 1998
TL;DR: This paper attempts to eliminate the assumption that the written text is fixed by presenting a novel algorithm for automatic text-independent writer identification from non-uniformly skewed handwriting images by taking a global approach based on texture analysis.
Abstract: Many techniques have been reported for handwriting-based writer identification. Most such techniques assume that the written text is fixed (e.g., in signature verification). In this paper we attempt to eliminate this assumption by presenting a novel algorithm for automatic text-independent writer identification from non-uniformly skewed handwriting images. Given that the handwriting of different people is often visually distinctive, we take a global approach based on texture analysis, where each writer's handwriting is regarded as a different texture. In principle this allows us to apply any standard texture recognition algorithm for the task (e.g., the multi-channel Gabor filtering technique). Results of 96.0% accuracy on the classification of 150 test documents from 10 writers are very promising. The method is shown to be robust to noise and to variations in content.
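A sketch of the texture route the abstract mentions, using OpenCV's Gabor kernels: each handwriting image is passed through a bank of orientation/wavelength channels and summarised by per-channel statistics, which can then be matched against known writers (e.g. by nearest-neighbour distance). The filter-bank settings are illustrative, not the paper's:

import cv2
import numpy as np

def gabor_texture_features(gray, thetas=(0, 45, 90, 135), wavelengths=(4, 8, 16, 32)):
    """Multi-channel Gabor texture descriptor: mean and standard deviation of the
    filtered image for each orientation/wavelength channel."""
    gray = gray.astype(np.float32) / 255.0
    feats = []
    for lambd in wavelengths:
        for theta in thetas:
            kern = cv2.getGaborKernel(ksize=(31, 31), sigma=lambd * 0.5,
                                      theta=np.deg2rad(theta), lambd=lambd,
                                      gamma=1.0, psi=0.0)
            resp = cv2.filter2D(gray, cv2.CV_32F, kern)
            feats.extend([resp.mean(), resp.std()])
    return np.array(feats)

# Writer identification can then be posed as nearest-neighbour matching of these
# feature vectors against each known writer's reference handwriting samples.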

Proceedings ArticleDOI
01 Jan 1998
TL;DR: This paper presents a novel approach to selecting a minimised number of views that allow each object face to be adequately viewed according to specified constraints on viewpoints and other features.
Abstract: Many machine vision tasks, e.g. object recognition and object inspection, cannot be performed robustly from a single image. For certain tasks (e.g. 3D object recognition and automated inspection) the availability of multiple views of an object is a requirement. This paper presents a novel approach to selecting a minimal number of views that allow each object face to be adequately viewed according to specified constraints on viewpoints and other features. The planner is generic and can be employed for a wide range of multiple view acquisition systems, ranging from camera systems mounted on the end of a robot arm, i.e. an eye-in-hand camera setup, to a turntable with fixed stereo cameras that allows different views of an object to be obtained. The results given (both simulated and real) focus on planning with a fixed camera and turntable.

Proceedings ArticleDOI
01 Jan 1998
TL;DR: An efficient method within an active vision framework for recognizing objects which are ambiguous from certain viewpoints is presented, which uses an appearance based object representation, namely the parametric eigenspace, and augments it by probability distributions.
Abstract: We present an efficient method within an active vision framework for recognizing objects which are ambiguous from certain viewpoints. The system is allowed to reposition the camera to capture additional views and, therefore, to resolve the classification result obtained from a single view. The approach uses an appearance based object representation, namely the parametric eigenspace, and augments it by probability distributions. This captures possible variations in the input images due to errors in the pre-processing chain or the imaging system. Furthermore, the use of probability distributions gives us a gauge for view planning. View planning is shown to be of great use in reducing the number of images to be captured when compared to a random strategy.

Proceedings ArticleDOI
01 Sep 1998
TL;DR: A model based approach to human body tracking is presented in which the 2D silhouette of a moving human and the corresponding 3D skeletal structure are encapsulated within a non-linear Point Distribution Model.
Abstract: This paper presents a model based approach to human body tracking in which the 2D silhouette of a moving human and the corresponding 3D skeletal structure are encapsulated within a non-linear Point Distribution Model. This statistical model allows a direct mapping to be achieved between the external boundary of a human and the anatomical position. It is shown how this information, along with the position of landmark features such as the hands and head, can be used to reconstruct information about the pose and structure of the human body from a monoscopic view of a scene.

Proceedings ArticleDOI
01 Jan 1998
TL;DR: A variational technique is presented for finding low curvature smooth approximations to trajectories in the plane; a hidden Markov model and the Viterbi algorithm are then used to find the sequence of internal states for which the observed behaviour of the vehicle has the highest probability.
Abstract: We present a variational technique for finding low curvature smooth approximations to trajectories in the plane. The method is applied to short segments of a vehicle trajectory in a known ground plane. Estimates of the speed and steering angle are obtained for each segment and the motion during the segment is assigned to one of the four classes: ahead, left, right, stop. A hidden Markov model for the motion of the car is constructed and the Viterbi algorithm is used to find the sequence of internal states for which the observed behaviour of the vehicle has the highest probability.
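A generic Viterbi decoder of the kind used in the final step, recovering the most probable sequence of motion classes (ahead, left, right, stop) from per-segment observation log-likelihoods; array names and shapes are illustrative:

import numpy as np

def viterbi(log_A, log_B, log_pi):
    """Most probable hidden state sequence given log_A: state transition
    log-probabilities (S x S), log_B: per-segment observation log-likelihoods
    (T x S), and log_pi: initial state log-probabilities (S,)."""
    T, S = log_B.shape
    delta = np.full((T, S), -np.inf)
    back = np.zeros((T, S), dtype=int)
    delta[0] = log_pi + log_B[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A     # S x S: previous state -> state
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[t]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):                 # backtrack the best path
        path[t] = back[t + 1, path[t + 1]]
    return path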

Proceedings ArticleDOI
01 Jan 1998
TL;DR: This paper aims to highlight the usefulness of the concepts of controllability and observability during the design stage of the filter and uses a practical vision application to illustrate a useful special case where these methods may be applied to a non-linear system.
Abstract: Kalman's optimum linear filter has proved to be immensely popular in the field of computer vision. A less often quoted contribution of Kalman's to the control theory literature is the concepts of controllability and observability, which may be used to analyse the state transition and observation equations and give insights into the filter's viability. This paper aims to highlight the usefulness of these two ideas during the design stage of the filter and, as well as presenting the standard solutions for linear systems, uses a practical vision application (that of tracking plants for an autonomous crop protection vehicle) to illustrate a useful special case where these methods may be applied to a non-linear system. The application of tests for controllability and observability to the practical non-linear system gives not only confirmation that the filter will be able to produce stable estimates, but also a lower bound on the number of features required from each image for it to do so.
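The two tests amount to rank checks on matrices built from the filter's state transition, input and observation matrices. A minimal numpy sketch of the standard linear-system versions (the names F, G and H are the usual textbook ones, not necessarily the paper's notation):

import numpy as np

def controllability_matrix(F, G):
    """[G, FG, F^2 G, ...]; the pair (F, G) is controllable iff this has full rank."""
    n = F.shape[0]
    blocks = [G]
    for _ in range(n - 1):
        blocks.append(F @ blocks[-1])
    return np.hstack(blocks)

def observability_matrix(F, H):
    """[H; HF; HF^2; ...]; the pair (F, H) is observable iff this has full rank."""
    n = F.shape[0]
    blocks = [H]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ F)
    return np.vstack(blocks)

# Example use: the rank of the observability matrix indicates whether the chosen
# image measurements are sufficient for the filter state to be estimable.
# F, H = ..., ...   # state transition and observation matrices of the filter
# observable = np.linalg.matrix_rank(observability_matrix(F, H)) == F.shape[0]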

Proceedings ArticleDOI
01 Sep 1998
TL;DR: A correlation based image registration method which is able to register images related by a single global affine transformation or by a transformation field which is approximately piecewise affine.
Abstract: This paper describes a correlation based image registration method which is able to register images related by a single global affine transformation or by a transformation field which is approximately piecewise affine. The method has two key elements: an affine estimator, which derives estimates of the six affine parameters relating two image regions by aligning their Fourier spectra prior to correlating; and a multiresolution search process, which determines the global transformation field in terms of a set of local affine estimates at appropriate spatial resolutions. The method is computationally efficient and performs well for a range of different images and transformations.
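The correlating step can be illustrated with standard phase correlation between two equally sized regions; this sketch recovers only the translation, whereas the method described above additionally estimates the six affine parameters by aligning the Fourier spectra beforehand:

import numpy as np

def phase_correlation_shift(img_a, img_b):
    """Return the (dx, dy) translation of img_b relative to img_a, i.e. img_b is
    approximately img_a shifted by (dx, dy), via phase correlation."""
    Fa, Fb = np.fft.fft2(img_a), np.fft.fft2(img_b)
    cross = np.conj(Fa) * Fb
    cross /= np.abs(cross) + 1e-12                 # keep phase only
    corr = np.real(np.fft.ifft2(cross))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    if dy > img_a.shape[0] // 2:                   # wrap to signed offsets
        dy -= img_a.shape[0]
    if dx > img_a.shape[1] // 2:
        dx -= img_a.shape[1]
    return dx, dy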

Proceedings ArticleDOI
01 Jan 1998
TL;DR: A novel approach is presented to learning long-term spatio-temporal patterns of objects in image sequences, using a neural network paradigm to predict future behaviour; it is demonstrated on the problem of predicting animal behaviour in response to a predator.
Abstract: Rule-based systems employed to model complex object behaviours do not necessarily provide a realistic portrayal of true behaviour. To capture the real characteristics in a specific environment, a better model may be learnt from observation. This paper presents a novel approach to learning long-term spatio-temporal patterns of objects in image sequences, using a neural network paradigm to predict future behaviour. The results demonstrate the application of our approach to the problem of predicting animal behaviour in response to a predator.

Proceedings ArticleDOI
01 Jan 1998
TL;DR: The facial surface is captured by a system based on structured light and adapted to faces, delivering a cheap, fast and sufficiently precise solution for automatic face authentication based on facial surface analysis.
Abstract: This paper presents automatic face authentication based on facial surface analysis. The success of a previous profile-based approach, exclusively relying on geometrical features of the external contour, led us to consider the full facial surface. This motivation was further supported by the independence of 3D information from viewpoint and lighting conditions. The geometry also carries information which is complementary to grey-level based approaches, supporting the combination with those techniques. The facial surface is captured by a system based on structured light and adapted to faces to deliver a cheap, fast and sufficiently precise solution. Typical applications concern security in cooperative situations.

Proceedings ArticleDOI
16 Sep 1998
TL;DR: A system for reliably establishing correspondences between printed words and their electronic counterparts, without performing optical character recognition, which might have interesting applications in document database retrieval, since it allows an electronic document to be indexed by a printed version of itself.
Abstract: A common authoring technique involves making annotations on a printed draft and then typing the corrections into a computer at a later date. In this paper, we describe a system that goes some way towards automating this process. The author simply passes the annotated documents through a sheetfeed scanner and then brings up the electronic document in a text editor. The system then works out where the annotated words are and allows the author to skip from one annotation to the next at the touch of a key. At the heart of the system lies a procedure for reliably establishing correspondences between printed words and their electronic counterparts, without performing optical character recognition. This procedure might have interesting applications in document database retrieval, since it allows an electronic document to be indexed by a printed version of itself.

Proceedings ArticleDOI
01 Jan 1998
TL;DR: An approach to gesture recognition is presented in which gestures are modelled probabilistically as sequences of visual events matched to visual input using probabilistic models estimated from motion feature trajectories.
Abstract: An approach to gesture recognition is presented in which gestures are modelled probabilistically as sequences of visual events. These events are matched to visual input using probabilistic models estimated from motion feature trajectories. The features used are motion image moments. The method was applied to a set of gestures defined within the context of an application in visually mediated interaction in which they would be used to control an active teleconferencing camera. The approach is computationally efficient, allowing real-time performance to be obtained.
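The motion image moments used as features can be sketched as low-order moments of a per-frame binary motion image; the exact feature set below is an assumption, for illustration only:

import numpy as np

def motion_image_moments(motion_mask):
    """Low-order moments of a binary motion image: area, centroid and second-order
    central moments, giving the kind of per-frame feature vector whose trajectory
    is then modelled probabilistically."""
    ys, xs = np.nonzero(motion_mask)
    m00 = len(xs)
    if m00 == 0:
        return np.zeros(6)                    # no motion detected in this frame
    cx, cy = xs.mean(), ys.mean()             # centroid
    mu20 = ((xs - cx) ** 2).mean()
    mu02 = ((ys - cy) ** 2).mean()
    mu11 = ((xs - cx) * (ys - cy)).mean()
    return np.array([m00, cx, cy, mu20, mu02, mu11])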

Proceedings ArticleDOI
01 Jan 1998
TL;DR: Improved algorithms which give more accurate and more robust results are described which can be used in a extension of an existing framework for establishing dense correspondences between a set of training examples to build a 3D Point Distribution Model.
Abstract: A previous publication has described a method of pairwise 3D surface correspondence for the automated generation of landmarks on a set of examples from a class of shape [3]. In this paper we describe a set of improved algorithms which give more accurate and more robust results. We show how the pairwise corresponder can be used in an extension of an existing framework for establishing dense correspondences between a set of training examples [4] to build a 3D Point Distribution Model. Examples are given for both synthetic and real data.

Proceedings ArticleDOI
01 Jan 1998
TL;DR: The method is shown to enhance model-based tracking, particularly in the presence of clutter and occlusion, and to provide a basis for identification; the approach is demonstrated on real-world image sequences and compared to previous (edge-based) iconic evaluation techniques.
Abstract: This paper presents an enhanced hypothesis verification strategy for 3D object recognition. A new learning methodology is presented which integrates the traditional dichotomic object-centred and appearance-based representations in computer vision, giving improved hypothesis verification under iconic matching. The "appearance" of a 3D object is learnt using an eigenspace representation obtained as it is tracked through a scene. The feature representation implicitly models the background and the objects observed, enabling the segmentation of the objects from the background. The method is shown to enhance model-based tracking, particularly in the presence of clutter and occlusion, and to provide a basis for identification. The unified approach is discussed in the context of the traffic surveillance domain. The approach is demonstrated on real-world image sequences and compared to previous (edge-based) iconic evaluation techniques.

Proceedings ArticleDOI
01 Jan 1998
TL;DR: An automatic face data acquisition system based on a magnetic sensor and a calibrated camera enabled a database of face images to be obtained systematically, with labelled 3D poses across a view-sphere of yaw and tilt.
Abstract: We present a method for learning appearance models that can be used to recognise and track both 3D head pose and identities of novel subjects with continuous head movement across the view-sphere. We describe an automatic face data acquisition system based on a magnetic sensor and a calibrated camera. The system enabled us to obtain systematically a database of face images with labelled 3D poses across a view-sphere of yaw and tilt at intervals of . The database was used to learn appearance models of unseen faces based on similarity measures to prototype faces. The method is computationally efficient and enables real-time performance.

Proceedings ArticleDOI
01 Jan 1998
TL;DR: This work considers the problem of obtaining the 3D trajectory of a ball from a sequence of images taken with a camera which is possibly rotating and zooming (but not translating), and develops techniques to compute the component of image motion of the ball due to camera rotation and zoom using optic flow.
Abstract: We consider the problem of obtaining the 3D trajectory of a ball from a sequence of images taken with a camera which is possibly rotating and zooming (but not translating). Techniques are developed to compute the component of image motion of the ball due to camera rotation and zoom, using optic flow. The 3D location of the ball in each frame of the sequence is then determined using a novel geometric construction which makes use of shadows on the known ground plane in order to compute the vertical projection of the ball onto the ground, and the height of the ball above the ground.

Proceedings ArticleDOI
01 Sep 1998
TL;DR: Experimental results show that spatial templates, horizontal-flow templates and the combined horizontal-flow and vertical-flow templates are better than vertical-flow templates for gait recognition, and the recognition performance for four kinds of template features has been evaluated.
Abstract: To recognize people by their gait from a sequence of images, we have proposed a statistical approach which combined eigenspace transformation (EST) with canonical space transformation (CST) for feature transformation of spatial templates. This approach is used to reduce data dimensionality and to optimize the class separability of different gait sequences simultaneously. Good recognition rates have been achieved. Here, we incorporate temporal information from optical flows into three kinds of temporal templates and use them as features for gait recognition in addition to the spatial templates. The recognition performance for four kinds of template features has been evaluated in this paper. Experimental results show that spatial templates, horizontal-flow templates and the combined horizontal-flow and vertical-flow templates are better than vertical-flow templates for gait recognition.
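A sketch of the EST+CST pipeline with off-the-shelf components: PCA provides the eigenspace transformation and Fisher discriminant analysis the canonical space transformation; the number of retained components is illustrative, not the paper's setting:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def gait_feature_space(templates, labels, n_pca=50):
    """Eigenspace transformation followed by canonical (Fisher) space
    transformation: PCA reduces dimensionality, LDA then maximises class
    separability between different subjects' gait sequences.
    templates: array of per-sequence template images; labels: subject identities."""
    X = templates.reshape(len(templates), -1).astype(float)   # flatten templates
    pca = PCA(n_components=n_pca).fit(X)
    X_eig = pca.transform(X)                                  # eigenspace (EST) features
    lda = LinearDiscriminantAnalysis().fit(X_eig, labels)
    X_can = lda.transform(X_eig)                              # canonical space (CST) features
    return pca, lda, X_can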