
Showing papers presented at "British Machine Vision Conference in 1997"


Proceedings Article
01 Jan 1997
TL;DR: The shape variation displayed by a class of objects can be represented as a probability density function, allowing plausible and implausible examples of the class to be determined; this distribution can be used in image search to locate examples of the modelled object in new images.
Abstract: The shape variation displayed by a class of objects can be represented as a probability density function, allowing us to determine plausible and implausible examples of the class. Given a training set of example shapes we can align them into a common co-ordinate frame and use kernel-based density estimation techniques to represent this distribution. Such an estimate is complex and expensive, so we generate a simpler approximation using a mixture of Gaussians. We show how to calculate the distribution, and how it can be used in image search to locate examples of the modelled object in new images.

326 citations
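To make the density-estimation idea concrete, here is a minimal Python sketch: fit a mixture of Gaussians to aligned training shape vectors and score new shapes by log-density. The toy data, component count and plausibility threshold are illustrative assumptions, not the paper's.

```python
# Sketch: approximate a shape distribution with a mixture of Gaussians and
# score new shapes as plausible/implausible by their log-density.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
train_shapes = rng.normal(size=(200, 10))   # toy stand-in: 200 aligned shapes,
                                            # 5 landmarks as concatenated (x, y)

gmm = GaussianMixture(n_components=3, covariance_type='full', random_state=0)
gmm.fit(train_shapes)

# Plausibility test: compare a new shape's log-density against the training set.
threshold = np.percentile(gmm.score_samples(train_shapes), 5)
new_shape = rng.normal(size=(1, 10))
print('plausible' if gmm.score_samples(new_shape)[0] > threshold else 'implausible')
```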


Proceedings Article
Paul L. Rosin1
01 Jan 1997
TL;DR: In this paper, the authors describe four different methods for selecting thresholds that work on very different principles: either the noise or the signal is modeled, and the model covers either the spatial or intensity distribution characteristics.
Abstract: Image differencing is used for many applications involving change detection. Although it is usually followed by a thresholding operation to isolate regions of change, there are few methods available in the literature specific to (and appropriate for) change detection. We describe four different methods for selecting thresholds that work on very different principles. Either the noise or the signal is modeled, and the model covers either the spatial or intensity distribution characteristics. The methods are as follows: (1) a Normal model is used for the noise intensity distribution, (2) signal intensities are tested by making local intensity distribution comparisons in the two image frames (i.e., the difference map is not used), (3) the spatial properties of the noise are modeled by a Poisson distribution, and (4) the spatial properties of the signal are modeled as a stable number of regions (or stable Euler number).

313 citations
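As a concrete illustration of method (1), a minimal Python sketch under our reading: model the difference-image noise as zero-mean Normal, estimate its spread robustly, and flag pixels unlikely under that model. The MAD estimator and false-alarm rate are our assumptions, not necessarily the paper's exact formulation.

```python
# Sketch of method (1): Normal noise model on the difference image.
import numpy as np
from scipy.stats import norm

def threshold_difference(diff, p_false_alarm=0.01):
    # Robust sigma via the median absolute deviation (scaled for a Normal).
    sigma = 1.4826 * np.median(np.abs(diff - np.median(diff)))
    # Two-sided threshold at the chosen per-pixel false-alarm rate.
    k = norm.ppf(1.0 - p_false_alarm / 2.0)
    return np.abs(diff) > k * sigma

rng = np.random.default_rng(1)
diff = rng.normal(0.0, 5.0, size=(64, 64))  # pure noise...
diff[20:30, 20:30] += 40.0                  # ...plus an injected change region
print(threshold_difference(diff).sum(), 'pixels flagged as change')
```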


Proceedings Article
01 Jan 1997
TL;DR: In this article, a new system for the automatic determination of the position, size and pose of the head of a human figure in a camera image is presented, which is an extension of the well-known face recognition system [15] to pose estimation.
Abstract: We present a new system for the automatic determination of the position, size and pose of the head of a human figure in a camera image. The system is an extension of the well-known face recognition system [15] to pose estimation. The pose estimation system already offers a certain reliability and speed; we improve both with the help of statistical estimation methods. In order to make these applicable, we reduce the originally very high dimensionality of our system with the help of a number of a priori principles. We discuss a possible extension of the learning algorithm, aiming at an autonomous object recognition system, at the end of the paper.

111 citations


Proceedings Article
01 Jan 1997
TL;DR: In this paper, previous model-based methods for visual tracking of vehicles are simplified for use in a real-time system intended to provide continuous monitoring and classification of traffic from a fixed camera on a busy multi-lane motorway.
Abstract: This paper reports the current state of work to simplify our previous model-based methods for visual tracking of vehicles for use in a real-time system intended to provide continuous monitoring and classification of traffic from a fixed camera on a busy multi-lane motorway. The main constraints of the system design were: (i) all low level processing is to be carried out by low-cost auxiliary hardware; (ii) all 3-D reasoning is to be carried out automatically off-line, at set-up time. The system developed uses three main stages: (i) pose and model hypothesis using 1-D templates, (ii) hypothesis tracking, and (iii) hypothesis verification, using 2-D templates. Stages (i) and (iii) have radically different computing performance and computational costs, and need to be carefully balanced for efficiency. Together, they provide an effective way to locate, track and classify vehicles.

111 citations


Proceedings Article
08 Sep 1997
TL;DR: A novel integrated vision system in which two autonomous visual modules are combined to interpret a dynamic scene and a novel method for handling occlusion between objects within the context of this hybrid tracking system is described.
Abstract: The paper describes a novel integrated vision system in which two autonomous visual modules are combined to interpret a dynamic scene. The first module employs a 3D model-based scheme to track rigid objects such as vehicles. The second module uses a 2D deformable model to track non-rigid objects such as people. The principal contribution is a novel method for handling occlusion between objects within the context of this hybrid tracking system. The practical aim of the work is to derive a scene description that is sufficiently rich to be used in a range of surveillance tasks. The paper describes each of the modules in outline before detailing the method of integration and the handling of occlusion in particular. Experimental results are presented to illustrate the performance of the system in a dynamic outdoor scene involving cars and people.

110 citations


Proceedings Article
01 Jan 1997
TL;DR: A linear rectification algorithm for general, unconstrained stereo rigs that requires the two projection matrices of the original cameras, and enforces explicitly all constraints necessary and sufficient to achieve a unique pair of rectified projection matrices.
Abstract: We present a linear rectification algorithm for general, unconstrained stereo rigs. The algorithm requires the two projection matrices of the original cameras, and enforces explicitly all constraints necessary and sufficient to achieve a unique pair of rectified projection matrices. We report tests proving the correct behaviour of our method, as well as the negligible decrease of the accuracy of 3-D reconstruction if performed from the rectified images directly. To maximise reproducibility and usefulness, we give a working, 22-line Matlab code, and a URL where code, example data and a user guide can be found. Stereo reconstruction systems are very popular in vision research and applications, hence the usefulness of a general and easily accessible rectification algorithm.

87 citations
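The paper points readers to its own 22-line MATLAB implementation; as an unofficial illustration, here is a numpy transliteration of the algorithm as we read it (RQ-based camera factorisation, baseline-aligned rotation, averaged intrinsics). The paper's published code remains the authoritative version.

```python
# Sketch: linear rectification of a general stereo rig from two 3x4
# projection matrices (Python reading of the published MATLAB).
import numpy as np
from scipy.linalg import rq

def art(P):
    # Factor the leading 3x3 into intrinsics (upper triangular) and rotation.
    K, R = rq(P[:, :3])
    s = np.diag(np.sign(np.diag(K)))   # force a positive diagonal on K
    return K @ s, s @ R

def rectify(Po1, Po2):
    A1, R1 = art(Po1)
    A2, _ = art(Po2)
    c1 = -np.linalg.solve(Po1[:, :3], Po1[:, 3])   # optical centres
    c2 = -np.linalg.solve(Po2[:, :3], Po2[:, 3])
    v1 = c2 - c1                    # new x axis: the baseline
    v2 = np.cross(R1[2], v1)        # new y: orthogonal to new x and old z
    v3 = np.cross(v1, v2)           # new z: orthogonal to both
    R = np.vstack([v / np.linalg.norm(v) for v in (v1, v2, v3)])
    A = (A1 + A2) / 2.0             # shared intrinsics (arbitrary choice)
    A[0, 1] = 0.0                   # zero skew
    Pn1 = A @ np.hstack([R, (-R @ c1)[:, None]])
    Pn2 = A @ np.hstack([R, (-R @ c2)[:, None]])
    T1 = Pn1[:, :3] @ np.linalg.inv(Po1[:, :3])    # rectifying homographies
    T2 = Pn2[:, :3] @ np.linalg.inv(Po2[:, :3])
    return T1, T2, Pn1, Pn2

# Sanity check: an already-rectified pair should come back unchanged.
K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-100.0], [0], [0]])])
T1, T2, _, _ = rectify(P1, P2)
assert np.allclose(T1, np.eye(3)) and np.allclose(T2, np.eye(3))
```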


Proceedings Article
01 Jan 1997
TL;DR: In this article, a method of updating a first-order global estimate of identity by learning the class-specific correlation between the estimate and the residual variation during a sequence is proposed, which results in robust tracking and a more stable estimate of facial identity under changing conditions.
Abstract: We address the problem of robust face identification in the presence of pose, lighting, and expression variation. Previous approaches to the problem have assumed similar models of variation for each individual, estimated from pooled training data. We describe a method of updating a first-order global estimate of identity by learning the class-specific correlation between the estimate and the residual variation during a sequence. This is integrated with an optimal tracking scheme, in which identity variation is decoupled from pose, lighting and expression variation. The method results in robust tracking and a more stable estimate of facial identity under changing conditions.

85 citations


Proceedings Article
01 Jan 1997
TL;DR: The combination of simple shape descriptors, tested alongside the chain code histogram and the pairwise geometric histogram for the characterization of irregular objects, is shown to have good recognition capabilities and low computation costs.
Abstract: This paper focuses on the recognition powers and computational efforts of three different shape coding techniques, namely the chain code histogram (CCH), the pairwise geometric histogram (PGH), and the combination of simple shape descriptors, for the characterization of irregular objects. In recognizing irregular objects the essential task is to design efficient measures based on relatively little prior knowledge of the geometrical constraints of possible target objects. Three rather different approaches are evaluated and discussed by means of the self-organizing map (SOM). A database retrieval problem is also assumed to further test their discriminatory powers. As a case study, natural irregular objects have been used. Grouping of these objects based on their visual similarity is the main topic of this paper. The combination of simple shape descriptors is shown to have good recognition capabilities and low computation costs.

82 citations
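Of the three techniques compared, the chain code histogram is the simplest to state; a minimal Python sketch with our own toy boundary (8-connectivity assumed):

```python
# Sketch: chain code histogram (CCH) of a closed 8-connected boundary.
# Each step between successive boundary pixels maps to one of 8 directions;
# the normalised direction histogram is the descriptor.
import numpy as np

DIRS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
        (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code_histogram(boundary):
    h = np.zeros(8)
    for (x0, y0), (x1, y1) in zip(boundary, boundary[1:] + boundary[:1]):
        h[DIRS[(x1 - x0, y1 - y0)]] += 1
    return h / h.sum()

square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
print(chain_code_histogram(square))   # mass 0.25 in each of the 4 axis directions
```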


Proceedings Article
01 Jan 1997
TL;DR: A piecewise linear method for applying constraints within model shape space, whereby principal component analysis is used on training data clusters in shape space to generate lower-dimensional overlapping subspaces, thus improving the specificity of the model.
Abstract: The Point Distribution Model (PDM) has proved useful for many tasks involving the location and tracking of deformable objects. A principal limitation is non-specificity: in constructing a model to include all valid object shapes, the inclusion of some invalid shapes is unavoidable due to the linear nature of the approach. Bregler and Omohundro describe a piecewise linear method for applying constraints within model shape space, whereby principal component analysis is used on training data clusters in shape space to generate lower-dimensional overlapping subspaces. Object shapes are constrained to lie within the union of these subspaces, thus improving the specificity of the model. This is an important development in itself, but its most useful quality is that it lends itself to automated training. Manual annotation of training examples has previously been necessary to ensure good specificity in PDMs, requiring expertise and time, and thus limiting the amount of training data that can feasibly be collected. The use of shape space constraints means that such accurate annotation is unnecessary, and automated training becomes significantly more successful. In this paper we expand on Bregler and Omohundro's work, suggesting an alternative representation for the linear pieces and showing how a two-level hierarchy in shape space can be used to improve efficiency and reduce noise. We perform an evaluation on both synthetic and automatically trained real models.

80 citations
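A minimal Python sketch of the constraint idea, under a simplified reading (flat clustering rather than the paper's two-level hierarchy; cluster and subspace counts are illustrative): cluster the training shapes, fit a local PCA per cluster, and pull a candidate shape onto the nearest local subspace.

```python
# Sketch: piecewise linear shape-space constraint via per-cluster PCA.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
shapes = rng.normal(size=(300, 20))       # toy aligned shape vectors

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(shapes)
local_pcas = [PCA(n_components=3).fit(shapes[km.labels_ == k]) for k in range(4)]

def constrain(shape):
    # Project onto each local subspace; keep the closest projection.
    projections = [p.inverse_transform(p.transform(shape[None]))[0]
                   for p in local_pcas]
    return min(projections, key=lambda proj: np.linalg.norm(proj - shape))

valid_shape = constrain(rng.normal(size=20))   # lies in the union of subspaces
```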


Proceedings Article
01 Jan 1997
TL;DR: A new texture-analysis method for script identification that does not require character segmentation is presented; it is robust to noise and to the presence of foreign characters or numerals, and can be applied to very small amounts of text.
Abstract: In this paper we present a detailed review of current script and language identification techniques. The main criticism of the existing techniques is that most of them rely on either connected component analysis or character segmentation. We go on to present a new method based on texture analysis for script identification which does not require character segmentation. A uniform text block on which texture analysis can be performed is produced from a document image via simple processing. Multiple channel (Gabor) filters and grey level co-occurrence matrices are used in independent experiments in order to extract texture features. Classification of test documents is made based on the features of training documents using the K-NN classifier. Initial results of over 95% accuracy on the classification of 105 test documents from 7 scripts are very promising. The method shows robustness with respect to noise and the presence of foreign characters or numerals, and can be applied to very small amounts of text.

68 citations
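As an unofficial illustration of the co-occurrence branch, a short Python sketch using scikit-image and a K-NN classifier; the feature set, distances and toy blocks are our assumptions, not the paper's exact configuration.

```python
# Sketch: grey-level co-occurrence features from a text block, K-NN classified.
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.neighbors import KNeighborsClassifier

def glcm_features(block):
    glcm = graycomatrix(block, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ['contrast', 'correlation', 'energy', 'homogeneity']
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

rng = np.random.default_rng(0)
blocks = rng.integers(0, 256, size=(40, 64, 64), dtype=np.uint8)  # toy text blocks
scripts = rng.integers(0, 7, size=40)                             # 7 script labels
X = np.array([glcm_features(b) for b in blocks])
knn = KNeighborsClassifier(n_neighbors=3).fit(X, scripts)
print(knn.predict(X[:1]))
```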


Proceedings Article
01 Jan 1997
TL;DR: An operator for the detection of the true location and orientation of corners is introduced and its strength in dealing with junctions as well as corners is demonstrated.


Proceedings Article
01 Jan 1997
TL;DR: An integrated system for the acquisition, normalisation and recognition of moving faces in dynamic scenes is introduced, modelling person-specific probability densities in a generic face space with mixture models and using Gaussian colour mixtures for face detection and tracking.
Abstract: An integrated system for the acquisition, normalisation and recognition of moving faces in dynamic scenes is introduced. Four face recognition tasks are defined and it is argued that modelling person-specific probability densities in a generic face space using mixture models provides a technique applicable to all four tasks. The use of Gaussian colour mixtures for face detection and tracking is also described. Results are presented using data from the integrated system.
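For the colour-mixture component, a minimal Python sketch of the general idea: fit a Gaussian mixture to skin-colour chromaticities and score image pixels against it. The colour space, component count and acceptance threshold are illustrative assumptions.

```python
# Sketch: Gaussian colour mixture as a face/skin pixel detector.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy (r, g) chromaticities of labelled skin pixels.
skin = rng.normal([0.45, 0.35], 0.03, size=(500, 2))

colour_model = GaussianMixture(n_components=2, random_state=0).fit(skin)

pixels = rng.uniform(0.0, 1.0, size=(1000, 2))           # candidate image pixels
cut = np.percentile(colour_model.score_samples(skin), 5)
skin_mask = colour_model.score_samples(pixels) > cut     # likely face-colour pixels
print(skin_mask.sum(), 'pixels accepted')
```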

Proceedings Article
01 Jan 1997
TL;DR: In this paper, the authors proposed a method for fast face localisation and verification (identification) based on a robust form of correlation, where the correlation was estimated from a set of samples drawn from a Sobol sequence.
Abstract: We propose a method for fast face localisation and verification (identification) based on a robust form of correlation. Geometric and photometric normalisation of face images was achieved by direct minimisation. During optimisation, the correlation was estimated from a set of samples drawn from a Sobol sequence. This Monte Carlo technique speeds the evaluation of correlation approximately twenty-five times and makes the optimisation process near real-time. In recognition experiments, the optimised robust correlation outperformed two standard techniques based on the dynamic link architecture [M. Lades et al., IEEE Transactions on Computers, 42(3) (1993) 300–311].
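A minimal Python sketch of the sampling trick: evaluate a robust similarity over a Sobol-sampled subset of pixels rather than the whole image. The Lorentzian kernel and sample count here are our stand-ins; the paper's exact robust correlation form is not reproduced.

```python
# Sketch: robust similarity estimated from a Sobol-sampled pixel subset.
import numpy as np
from scipy.stats import qmc

def robust_similarity(img_a, img_b, m=8, sigma=20.0):
    h, w = img_a.shape
    pts = qmc.Sobol(d=2, scramble=True, seed=0).random_base2(m)  # 2**m samples
    rows = (pts[:, 0] * h).astype(int)
    cols = (pts[:, 1] * w).astype(int)
    diff = img_a[rows, cols].astype(float) - img_b[rows, cols].astype(float)
    # Robust (Lorentzian) score: outliers saturate instead of dominating.
    return -np.log1p(0.5 * (diff / sigma) ** 2).mean()

rng = np.random.default_rng(0)
face = rng.integers(0, 256, size=(128, 128))
print(robust_similarity(face, face))   # identical images score 0 (the maximum)
```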


Proceedings Article
01 Jan 1997
TL;DR: Results on a widely used face database show the continuous n-tuple classifier to be as accurate as any method reported in the literature, while having the advantages of speed and simplicity over other methods.
Abstract: The continuous n-tuple classifier was recently proposed by the author as a new type of n-tuple classifier that is ideally suited to problems where the input is continuous or multi-level rather than binary. Results on a widely used face database show the continuous n-tuple classifier to be as accurate as any method reported in the literature, while having the advantages of speed and simplicity over other methods. This paper summarises the previous work, provides fresh insight into how the system works and discusses its applicability to real-time face recognition.
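Under our reading of the continuous n-tuple idea, a compact Python sketch: each tuple samples a fixed set of pixel addresses; each class stores the sampled vectors of its training images; classification minimises the summed nearest-vector distance over tuples. Tuple counts and sizes are illustrative, not the paper's settings.

```python
# Sketch: continuous n-tuple classification on grey-level images.
import numpy as np

rng = np.random.default_rng(0)
N_TUPLES, N, H, W = 20, 5, 32, 32
tuples = rng.integers(0, H * W, size=(N_TUPLES, N))   # fixed random addresses

def sample(img):
    return img.ravel()[tuples].astype(float)          # (N_TUPLES, N)

def fit(images, labels):
    return {c: np.stack([sample(im) for im, l in zip(images, labels) if l == c],
                        axis=1)                        # (N_TUPLES, n_c, N) per class
            for c in set(labels)}

def classify(model, img):
    s = sample(img)
    def score(store):  # distance to the nearest stored vector, summed over tuples
        return np.linalg.norm(store - s[:, None, :], axis=2).min(axis=1).sum()
    return min(model, key=lambda c: score(model[c]))

images = [rng.integers(0, 256, size=(H, W)) for _ in range(10)]
labels = [i % 2 for i in range(10)]
model = fit(images, labels)
print(classify(model, images[3]))   # a training image recovers its own label (1)
```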

Proceedings Article
01 Jan 1997
TL;DR: A new simple method for achieving feature correspondence across a pair of images which requires no calibration information and draws from the method proposed by Scott and Longuet-Higgins is presented.
Abstract: (Keywords: image analysis, feature correspondence, stereo, singular value decomposition.) This paper presents a new, simple method for achieving feature correspondence across a pair of images which requires no calibration information and draws from the method proposed by Scott and Longuet-Higgins [8]. Despite the well-known combinatorial complexity of the problem, this work shows that an acceptably good solution can be obtained directly by singular value decomposition of an appropriate image-based correspondence strength matrix. The paper includes several experiments, discusses the method, and draws comparisons with a related relaxation-based method [14]. Given its tremendous performance/complexity figure, the method is particularly suitable for research purposes where an off-the-shelf but reliable feature correspondence is needed. For this reason, a succinct MATLAB implementation of the method is included, and a C version will soon be available on the Web.
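The pairing step is compact enough to sketch directly in Python (numpy in place of the paper's MATLAB; plain Gaussian proximity only, with no correlation weighting):

```python
# Sketch: SVD-based feature pairing in the Scott/Longuet-Higgins style.
import numpy as np

def svd_match(pts1, pts2, sigma=10.0):
    d2 = ((pts1[:, None, :] - pts2[None, :, :]) ** 2).sum(-1)
    G = np.exp(-d2 / (2.0 * sigma ** 2))      # proximity matrix
    U, _, Vt = np.linalg.svd(G)
    # Replace singular values with ones: an "orthogonalised" pairing matrix.
    P = U @ np.eye(*G.shape) @ Vt
    # Accept pairs that dominate both their row and their column.
    return [(i, int(P[i].argmax())) for i in range(P.shape[0])
            if P[:, P[i].argmax()].argmax() == i]

rng = np.random.default_rng(0)
pts1 = rng.uniform(0, 100, size=(8, 2))
pts2 = pts1[::-1] + rng.normal(0, 0.5, size=(8, 2))   # reversed, jittered copy
print(svd_match(pts1, pts2))   # should recover the pairs (i, 7 - i)
```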

Proceedings Article
01 Jan 1997
TL;DR: In this article, the problem of how best to select a single mapping from this feasible set as an estimate of the unknown illuminant is addressed. But the problem is not solved by inverting the perspective transform.
Abstract: The requirement that the sensor responses of a camera to a given surface reflectance be constant under changing illumination conditions has led to the development of the so-called colour constancy algorithms. Given an image recorded under an unknown illuminant, the task for a colour constancy algorithm is to recover an estimate of the scene illuminant. One such algorithm, developed by D.A. Forsyth (A novel algorithm for colour constancy, International Journal of Computer Vision 5(1) (1990) 5–36 [1]) and later extended by G.D. Finlayson (Color in perspective, IEEE Transactions on Pattern Analysis and Machine Intelligence 18(10) (1996) 1034–1038 [2]), exploits the constraint that under a canonical illuminant all surface colours fall within a maximal convex set: the canonical gamut. Given a set of image colours, Forsyth's algorithm recovers the set of mappings which take these colours into the canonical gamut. This feasible set of mappings represents all illuminants which are consistent with the recorded image colours. In this article we address the question of how best to select a single mapping from this feasible set as an estimate of the unknown illuminant. We develop our approach in the context of Finlayson's colour-in-perspective algorithm. This algorithm performs a perspective transform on the sensor data to discard intensity information which, without unrealistic constraints (uniform illumination and no specularities) being placed on the world, cannot be recovered accurately. Unfortunately, the feasible set of mappings recovered by this algorithm is also perspectively distorted. Here, we argue that this distortion must be removed prior to carrying out map selection and show that this is easily achieved by inverting the perspective transform. A mean-selection criterion operating on the non-perspective mapping space provides good colour constancy for a variety of synthetic and real images. Importantly, constancy performance surpasses all other existing methods.

Proceedings Article
01 Jan 1997
TL;DR: A vision- based kerb detection system is described which uses the Hough Transform to find clusters of parallel lines in the image as evidence for a kerb, combined with a stereo vision-based obstacle detection algorithm.
Abstract: A key component of a technological aid for the partially sighted (TAPS) is a system to detect kerbs and steps. In this paper, a vision-based kerb detection system is described which uses the Hough Transform to find clusters of parallel lines in the image as evidence for a kerb. This is combined with a stereo vision-based obstacle detection algorithm. Experiments show that kerb regions are identified correctly from the images. An error analysis of the obstacle detection algorithm enables the kerb height, and its uncertainty, to be determined.
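A minimal OpenCV sketch of the line-evidence stage as we read it: detect straight lines with the Hough transform and group them by orientation, treating clusters of near-parallel lines as kerb candidates. All thresholds are illustrative, not the paper's.

```python
# Sketch: parallel-line clusters from a Hough transform as kerb evidence.
import numpy as np
import cv2

def parallel_line_clusters(gray, angle_tol_deg=5.0, min_lines=3):
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLines(edges, rho=1, theta=np.pi / 180, threshold=80)
    if lines is None:
        return []
    thetas = np.sort(np.degrees(lines[:, 0, 1]))   # line orientations, 0-180 deg
    clusters, current = [], [thetas[0]]
    for t in thetas[1:]:                            # greedy 1-D clustering
        if t - current[-1] <= angle_tol_deg:
            current.append(t)
        else:
            clusters.append(current)
            current = [t]
    clusters.append(current)
    # Note: wrap-around near 180 degrees is ignored for brevity.
    return [c for c in clusters if len(c) >= min_lines]
```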

Proceedings Article
01 Jan 1997
TL;DR: An evaluation of a robust visual image tracker on echocardiographic image sequences shows how the tracking framework can be customised to define an appropriate shape-space describing heart shape deformations, which can be learnt from a training data set.

Proceedings Article
01 Jan 1997
TL;DR: In this paper, the authors describe methods to measure the following properties of gray level corners: subtended angle, orientation, contrast, bluntness (or rounding of the apex), and boundary curvature (for cusps).
Abstract: We describe methods to measure the following properties of gray level corners: subtended angle, orientation, contrast, bluntness (or rounding of the apex), and boundary curvature (for cusps). Unlike most of the published methods for extracting these properties these new methods are relatively simple, efficient, and robust. They rely on the corner being predetected by a standard operator, thus making the measurement problem more tractable. Using 13,000 synthetic images the methods are assessed over a range of conditions: corners of varying orientations and subtended angles, as well as different degrees of noise.

Proceedings Article
01 Jan 1997
TL;DR: The motivation here is not in producing a PDM to assist in tracking, but one that describes the real shape behaviour of a group of animals, which may be used to measure the accuracy of simulations.
Abstract: Models of collective animal behaviour, such as the flocking of birds, are usually based upon simulations that appear to exhibit properties of the animals in question. In this paper, we describe an alternative approach; that of automatically extracting a model of animal behaviour from video sequences of real animals. Point distribution models (PDMs) are used to describe the shape of a flock of ducks in response to a robot predator. Additional parameters that govern this interaction are chosen and included in the model by extending the PDM. A suitable method for scaling their influence in the PDM is described. The motivation here is not in producing a PDM to assist in tracking, but one that describes the real shape behaviour of a group of animals, which may be used to measure the accuracy of simulations.


Proceedings Article
01 Sep 1997
TL;DR: A surface segmentation method is presented which uses a simulated inflating balloon model to segment surface structure from volumetric data with a triangular mesh, providing a technique robust to noise regardless of the feature detection scheme used.
Abstract: This paper presents a surface segmentation method which uses a simulated inflating balloon model to segment surface structure from volumetric data using a triangular mesh. The model employs simulated surface tension and an inflationary force to grow from within an object and find its boundary. Mechanisms are described which allow both evenly spaced and minimal polygonal count surfaces to be generated. The work is based on the inflating balloon models of Terzopoulos [8]. Simplifications are made to the model, and an approach proposed which provides a technique robust to noise regardless of the feature detection scheme used. The proposed technique uses no explicit attraction to data features, and as such is less dependent on the initialisation of the model and parameters. The model grows under its own forces, and is never anchored to boundaries, but instead constrained to remain inside the desired object. Results are presented which demonstrate the technique's ability and speed at the segmentation of a complex, concave object with narrow features.
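A minimal Python sketch of the per-vertex update as we read it: a surface-tension (smoothing) term plus an inflation step along the vertex normal, with growth constrained to stay inside the object. Mesh bookkeeping, remeshing and stopping tests are elided; `inside` is a hypothetical containment oracle, not an API from the paper.

```python
# Sketch: one balloon-model iteration over the mesh vertices.
import numpy as np

def balloon_step(verts, neighbours, normals, inside,
                 k_tension=0.4, k_inflate=0.6):
    """verts: (V, 3); neighbours: list of index lists; normals: (V, 3) unit
    vertex normals; inside(p): hypothetical test that p lies in the object."""
    new = verts.copy()
    for i, nbrs in enumerate(neighbours):
        tension = verts[nbrs].mean(axis=0) - verts[i]  # pull toward neighbours
        candidate = verts[i] + k_tension * tension + k_inflate * normals[i]
        if inside(candidate):       # never anchored to boundaries, only contained
            new[i] = candidate
    return new
```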

Proceedings Article
01 Jan 1997
TL;DR: It is found that the standard diffusion-based systems are not as insensitive to noise and occlusion as one might wish.
Abstract: There are several methods for forming a scale-space and they may be classified as being based on diffusion or morphology. However it is rare for the methods to be compared. Here we outline a method for such a comparison based on robustness and give results for linear diffusion, the most widely studied method, and a sieve (a new morphological method). We find that the standard diffusion-based systems are not as insensitive to noise and occlusion as one might wish.
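For reference, the linear-diffusion baseline in this comparison is equivalent to Gaussian smoothing with sigma = sqrt(2t); a one-function Python sketch:

```python
# Sketch: a linear (heat-equation) diffusion scale-space via Gaussian blurs.
import numpy as np
from scipy.ndimage import gaussian_filter

def diffusion_scale_space(image, times):
    # Diffusion to time t equals Gaussian smoothing with sigma = sqrt(2t).
    return [gaussian_filter(image, sigma=np.sqrt(2.0 * t)) for t in times]

rng = np.random.default_rng(0)
stack = diffusion_scale_space(rng.random((64, 64)), times=[0.5, 2.0, 8.0])
```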


Proceedings Article
01 Jan 1997
TL;DR: In this article, the authors presented Point Distribution Models (PDMs) constructed from Magnetic Resonance scanned foetal livers and investigated their use in reconstructing 3D shapes from sparse data, as an aid to volume estimation.
Abstract: In this article we will present Point Distribution Models (PDMs) constructed from Magnetic Resonance scanned foetal livers and will investigate their use in reconstructing 3D shapes from sparse data, as an aid to volume estimation. A solution of the model to data matching problem will be presented that is based on a hybrid Genetic Algorithm (GA). The GA has amongst its genetic operators elements that extend the general Iterative Closest Point (ICP) algorithm to include deformable shape parameters. Results from using the GA to estimate volumes from two sparse sampling schemes will be presented. We will show how the algorithm can estimate liver volumes in the range of 10.26 to 28.84 cc with an accuracy of 0.17 +/- 4.44% when using only three sections through the liver volume.

Proceedings Article
01 Jan 1997
TL;DR: This work investigates the possibility of using stereo vision to provide the range information, in conjunction with a scanning laser radar sensor, and presents details of the real-time implementation of the vision system on a network of C40 DSPs.
Abstract: An important component of the drive towards intelligent vehicles is the ability to maintain a fixed distance from a lead vehicle using feedback provided by range sensors. We investigate the possibility of using stereo vision to provide the range information, in conjunction with a scanning laser radar sensor. The tracker utilizes a layered architecture wherein the bottom layer computes motion in both images using a simple correlation algorithm, and the upper level performs stereo fixation and reconstruction using an algorithm designed for active vision systems. We present details of the real-time implementation of the vision system on a network of C40 DSPs, and some initial results comparing the quality of range measurements provided by a vision system with the laser radar system.

Proceedings Article
01 Jan 1997
TL;DR: A method for corresponding the triangulated mesh surface representations of two shapes with a global Euclidean measure of similarity and results are presented for the production of a binary tree of merged biological shapes which may be used as a basis for automated landmarking.
Abstract: A method for corresponding the triangulated mesh surface representations of two shapes is presented. The algorithm produces a matching pair of sparse polyhedral approximations, one for each shape surface, using a global Euclidean measure of similarity. A method of surface patch parameterisation is presented and its use in the interpolation of surfaces for the construction of a merged mean shape with a densely triangulated surface is described. Results are presented for the production of a binary tree of merged biological shapes which may be used as a basis for automated landmarking.

Proceedings Article
01 Jan 1997
TL;DR: The model based approach is shown to accurately describe the pollen grains used in the test but requires considerable refinement to be useful whereas the neural network provides excellent results comparable to a human operator.
Abstract: The analysis of pollen grains taken from core samples is an extremely valuable technique for climate reconstruction. There is a great need for an automated classification system which can provide a swift and accurate analysis of the relative amounts of pollen on a microscope slide. Pollen grains have a complex three-dimensional structure and can appear on the microscope slide in any orientation. Despite efforts to improve the preparation of the slides a large amount of debris is also present. This paper describes work being conducted to reliably separate pollen grains from debris. Two methods of pollen identification are compared, a model based approach and a self-training, deformable template neural network. The model based approach is shown to accurately describe the pollen grains used in the test but requires considerable refinement to be useful whereas the neural network provides excellent results comparable to a human operator.