Stephen J. Maybank
Bio: Stephen J. Maybank is an academic researcher from Birkbeck, University of London. The author has contributed to research in topics: Video tracking & Motion estimation. The author has an hindex of 50, co-authored 166 publications receiving 15225 citations. Previous affiliations of Stephen J. Maybank include University of Oxford & University of Reading.
Papers published on a yearly basis
••01 Aug 2004
TL;DR: This paper reviews recent developments and general strategies of the processing framework of visual surveillance in dynamic scenes, and analyzes possible research directions, e.g., occlusion handling, a combination of two and three-dimensional tracking, and fusion of information from multiple sensors, and remote surveillance.
Abstract: Visual surveillance in dynamic scenes, especially for humans and vehicles, is currently one of the most active research topics in computer vision. It has a wide spectrum of promising applications, including access control in special areas, human identification at a distance, crowd flux statistics and congestion analysis, detection of anomalous behaviors, and interactive surveillance using multiple cameras, etc. In general, the processing framework of visual surveillance in dynamic scenes includes the following stages: modeling of environments, detection of motion, classification of moving objects, tracking, understanding and description of behaviors, human identification, and fusion of data from multiple cameras. We review recent developments and general strategies of all these stages. Finally, we analyze possible research directions, e.g., occlusion handling, a combination of twoand three-dimensional tracking, a combination of motion analysis and biometrics, anomaly detection and behavior prediction, content-based retrieval of surveillance videos, behavior understanding and natural language description, fusion of information from multiple sensors, and remote surveillance.
TL;DR: A general tensor discriminant analysis (GTDA) is developed as a preprocessing step for LDA for face recognition and achieves good performance for gait recognition based on image sequences from the University of South Florida (USF) HumanID Database.
Abstract: Traditional image representations are not suited to conventional classification methods such as the linear discriminant analysis (LDA) because of the undersample problem (USP): the dimensionality of the feature space is much higher than the number of training samples. Motivated by the successes of the two-dimensional LDA (2DLDA) for face recognition, we develop a general tensor discriminant analysis (GTDA) as a preprocessing step for LDA. The benefits of GTDA, compared with existing preprocessing methods such as the principal components analysis (PCA) and 2DLDA, include the following: 1) the USP is reduced in subsequent classification by, for example, LDA, 2) the discriminative information in the training tensors is preserved, and 3) GTDA provides stable recognition rates because the alternating projection optimization algorithm to obtain a solution of GTDA converges, whereas that of 2DLDA does not. We use human gait recognition to validate the proposed GTDA. The averaged gait images are utilized for gait representation. Given the popularity of Gabor-function-based image decompositions for image understanding and object recognition, we develop three different Gabor-function-based image representations: 1) GaborD is the sum of Gabor filter responses over directions, 2) GaborS is the sum of Gabor filter responses over scales, and 3) GaborSD is the sum of Gabor filter responses over scales and directions. The GaborD, GaborS, and GaborSD representations are applied to the problem of recognizing people from their averaged gait images. A large number of experiments were carried out to evaluate the effectiveness (recognition rate) of gait recognition based on first obtaining a Gabor, GaborD, GaborS, or GaborSD image representation, then using GDTA to extract features and, finally, using LDA for classification. The proposed methods achieved good performance for gait recognition based on image sequences from the University of South Florida (USF) HumanID Database. Experimental comparisons are made with nine state-of-the-art classification methods in gait recognition.
TL;DR: A comprehensive survey of knowledge distillation from the perspectives of knowledge categories, training schemes, teacher-student architecture, distillation algorithms, performance comparison and applications can be found in this paper.
Abstract: In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks. The great success of deep learning is mainly due to its scalability to encode large-scale data and to maneuver billions of model parameters. However, it is a challenge to deploy these cumbersome deep models on devices with limited resources, e.g., mobile phones and embedded devices, not only because of the high computational complexity but also the large storage requirements. To this end, a variety of model compression and acceleration techniques have been developed. As a representative type of model compression and acceleration, knowledge distillation effectively learns a small student model from a large teacher model. It has received rapid increasing attention from the community. This paper provides a comprehensive survey of knowledge distillation from the perspectives of knowledge categories, training schemes, teacher-student architecture, distillation algorithms, performance comparison and applications. Furthermore, challenges in knowledge distillation are briefly reviewed and comments on future research are discussed and forwarded.
••19 May 1992
TL;DR: It is shown, using experiments with noisy data, that it is possible to calibrate a camera just by pointing it at the environment, selecting points of interest and then tracking them in the image as the camera moves.
Abstract: The problem of finding the internal orientation of a camera (camera calibration) is extremely important for practical applications. In this paper a complete method for calibrating a camera is presented. In contrast with existing methods it does not require a calibration object with a known 3D shape. The new method requires only point matches from image sequences. It is shown, using experiments with noisy data, that it is possible to calibrate a camera just by pointing it at the environment, selecting points of interest and then tracking them in the image as the camera moves. It is not necessary to know the camera motion.
TL;DR: The feasibility of camera calibration based on the epipolar transformation is demonstrated and two curves of degree six can be obtained in the dual plane such that one of the real intersections of the two yields the correct camera calibration.
Abstract: There is a close connection between the calibration of a single camera and the epipolar transformation obtained when the camera undergoes a displacement. The epipolar transformation imposes two algebraic constraints on the camera calibration. If two epipolar transformations, arising from different camera displacements, are available then the compatible camera calibrations are parameterized by an algebraic curve of genus four. The curve can be represented either by a space curve of degree seven contained in the intersection of two cubic surfaces, or by a curve of degree six in the dual of the image plane. The curve in the dual plane has one singular point of order three and three singular points of order two. If three epipolar transformations are available, then two curves of degree six can be obtained in the dual plane such that one of the real intersections of the two yields the correct camera calibration. The two curves have a common singular point of order three. Experimental results are given to demonstrate the feasibility of camera calibration based on the epipolar transformation. The real intersections of the two dual curves are found by locating the zeros of a function defined on the interval [0, 2π]. The intersection yielding the correct camera calibration is picked out by referring back to the three epipolar transformations.
01 Jan 2001
TL;DR: This book is referred to read because it is an inspiring book to give you more chance to get experiences and also thoughts and it will show the best book collections and completed collections.
Abstract: Downloading the book in this website lists can give you more advantages. It will show you the best book collections and completed collections. So many books can be found in this website. So, this is not only this multiple view geometry in computer vision. However, this book is referred to read because it is an inspiring book to give you more chance to get experiences and also thoughts. This is simple, read the soft file of the book and you get it.
TL;DR: A flexible technique to easily calibrate a camera that only requires the camera to observe a planar pattern shown at a few (at least two) different orientations is proposed and advances 3D computer vision one more step from laboratory environments to real world use.
Abstract: We propose a flexible technique to easily calibrate a camera. It only requires the camera to observe a planar pattern shown at a few (at least two) different orientations. Either the camera or the planar pattern can be freely moved. The motion need not be known. Radial lens distortion is modeled. The proposed procedure consists of a closed-form solution, followed by a nonlinear refinement based on the maximum likelihood criterion. Both computer simulation and real data have been used to test the proposed technique and very good results have been obtained. Compared with classical techniques which use expensive equipment such as two or three orthogonal planes, the proposed technique is easy to use and flexible. It advances 3D computer vision one more step from laboratory environments to real world use.
01 Jan 2006
TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.
University of Vermont1, University of Minnesota2, University of Texas at Austin3, Hong Kong University of Science and Technology4, Osaka University5, University of Queensland6, Griffith University7, University of Illinois at Chicago8, IBM9, Nanjing University10, Imperial College London11, University of Salford12
TL;DR: This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART.
Abstract: This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. With each algorithm, we provide a description of the algorithm, discuss the impact of the algorithm, and review current and further research on the algorithm. These 10 algorithms cover classification, clustering, statistical learning, association analysis, and link mining, which are all among the most important topics in data mining research and development.
30 Sep 2010
TL;DR: Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images and takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene.
Abstract: Humans perceive the three-dimensional structure of the world with apparent ease. However, despite all of the recent advances in computer vision research, the dream of having a computer interpret an image at the same level as a two-year old remains elusive. Why is computer vision such a challenging problem and what is the current state of the art? Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos. More than just a source of recipes, this exceptionally authoritative and comprehensive textbook/reference also takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene. These problems are also analyzed using statistical models and solved using rigorous engineering techniques Topics and features: structured to support active curricula and project-oriented courses, with tips in the Introduction for using the book in a variety of customized courses; presents exercises at the end of each chapter with a heavy emphasis on testing algorithms and containing numerous suggestions for small mid-term projects; provides additional material and more detailed mathematical topics in the Appendices, which cover linear algebra, numerical techniques, and Bayesian estimation theory; suggests additional reading at the end of each chapter, including the latest research in each sub-field, in addition to a full Bibliography at the end of the book; supplies supplementary course material for students at the associated website, http://szeliski.org/Book/. Suitable for an upper-level undergraduate or graduate-level course in computer science or engineering, this textbook focuses on basic techniques that work under real-world conditions and encourages students to push their creative boundaries. Its design and exposition also make it eminently suitable as a unique reference to the fundamental techniques and current research literature in computer vision.