Showing papers on "Feature extraction" published in 2000


Journal ArticleDOI
TL;DR: The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.
Abstract: The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques and methods imported from statistical learning theory have been receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved. New and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, and cursive handwriting recognition, require robust and efficient pattern recognition techniques. The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.

6,527 citations


Journal ArticleDOI
TL;DR: The primary goal of pattern recognition is supervised or unsupervised classification; among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been the most intensively studied and used in practice.
Abstract: The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical ap...

4,307 citations


Proceedings Article
01 Jan 2000
TL;DR: This paper describes a database designed to evaluate the performance of speech recognition algorithms in noisy conditions, and presents recognition results for the first standard DSR feature extraction scheme, which is based on a cepstral analysis.
Abstract: This paper describes a database designed to evaluate the performance of speech recognition algorithms in noisy conditions. The database may be used to evaluate either front-end feature extraction algorithms using a defined HMM recognition back-end or complete recognition systems. The source speech for this database is TIdigits, a connected-digits task spoken by American English talkers (downsampled to 8 kHz). A selection of 8 different real-world noises has been added to the speech over a range of signal-to-noise ratios, and special care has been taken to control the filtering of both the speech and noise. The framework was prepared as a contribution to the ETSI STQ-AURORA DSR Working Group [1]. Aurora is developing standards for Distributed Speech Recognition (DSR) where the speech analysis is done in the telecommunication terminal and the recognition at a central location in the telecom network. The framework is currently being used to evaluate alternative proposals for front-end feature extraction. The database has been made publicly available through ELRA so that other speech researchers can evaluate and compare the performance of noise-robust algorithms. Recognition results are presented for the first standard DSR feature extraction scheme that is based on a cepstral analysis.

1,909 citations
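
The noisy-speech database entry above describes adding noise to clean speech over a range of signal-to-noise ratios. A minimal Python sketch of that mixing step follows, assuming SNR is defined as the ratio of mean signal power to mean noise power in decibels. The function names and the synthetic signals are hypothetical, and the filtering that the abstract mentions is not modelled.

```python
import math

def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db`, then add it."""
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]

def measured_snr_db(speech, noisy):
    """SNR in dB recovered from the clean signal and the noisy mixture."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(power(speech) / power(residual))

# Synthetic stand-ins for a speech waveform and a recorded noise (assumptions).
speech = [math.sin(0.1 * i) for i in range(400)]
noise = [((i * 7919) % 13 - 6) / 6 for i in range(400)]
noisy = mix_at_snr(speech, noise, snr_db=10)
```

Calling `mix_at_snr` once per target SNR would give one noisy copy of the utterance per condition.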


Proceedings Article
Mark Hall
29 Jun 2000
TL;DR: In this article, a fast, correlation-based filter algorithm that can be applied to continuous and discrete problems is described, which often outperforms the ReliefF attribute estimator when used as a preprocessing step for naive Bayes, instance-based learning, decision trees, locally weighted regression, and model trees.
Abstract: Algorithms for feature selection fall into two broad categories: wrappers that use the learning algorithm itself to evaluate the usefulness of features and filters that evaluate features according to heuristics based on general characteristics of the data. For application to large databases, filters have proven to be more practical than wrappers because they are much faster. However, most existing filter algorithms only work with discrete classification problems. This paper describes a fast, correlation-based filter algorithm that can be applied to continuous and discrete problems. The algorithm often outperforms the well-known ReliefF attribute estimator when used as a preprocessing step for naive Bayes, instance-based learning, decision trees, locally weighted regression, and model trees. It performs more feature selection than ReliefF does—reducing the data dimensionality by fifty percent in most cases. Also, decision and model trees built from the preprocessed data are often significantly smaller.

1,511 citations
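
The correlation-based filter entry above does not state its scoring rule, so the following Python sketch is an illustration rather than the paper's algorithm. It assumes the subset merit commonly associated with correlation-based feature selection, merit = k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-to-target correlation and r_ff the mean feature-to-feature correlation. Pearson correlation and greedy forward search are stand-ins chosen here; all names and data are hypothetical.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def merit(subset, features, target):
    """High when features track the target but are not redundant with each other."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def forward_select(features, target):
    """Greedily add the feature that most improves merit; stop when nothing improves it."""
    selected, best = [], 0.0
    remaining = set(features)
    while remaining:
        score, name = max((merit(selected + [f], features, target), f) for f in sorted(remaining))
        if score <= best:
            break
        selected.append(name)
        best = score
        remaining.remove(name)
    return selected, best

target = [1, 2, 3, 4, 5, 6]
features = {
    "signal": [1, 2, 3, 4, 5, 6],       # perfectly predictive
    "duplicate": [2, 4, 6, 8, 10, 12],  # redundant with "signal"
    "noise": [3, 1, 4, 1, 5, 2],        # weakly related to the target
}
selected, score = forward_select(features, target)
```

In this toy data the redundant copy and the noisy feature add nothing to the merit, so only one feature is kept.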


Journal ArticleDOI
TL;DR: It is proved that the most expressive vectors derived in the null space of the within-class scatter matrix using principal component analysis (PCA) are equal to the optimal discriminant vectors derived in the original space using LDA.

1,447 citations


Journal ArticleDOI
TL;DR: This work presents a new approach to feature extraction in which feature selection and extraction and classifier training are performed simultaneously using a genetic algorithm, and employs this technique in combination with the k nearest neighbor classification rule.
Abstract: Pattern recognition generally requires that objects be described in terms of a set of measurable features. The selection and quality of the features representing each pattern affect the success of subsequent classification. Feature extraction is the process of deriving new features from original features to reduce the cost of feature measurement, increase classifier efficiency, and allow higher accuracy. Many feature extraction techniques involve linear transformations of the original pattern vectors to new vectors of lower dimensionality. While this is useful for data visualization and classification efficiency, it does not necessarily reduce the number of features to be measured since each new feature may be a linear combination of all of the features in the original pattern vector. Here, we present a new approach to feature extraction in which feature selection and extraction and classifier training are performed simultaneously using a genetic algorithm. The genetic algorithm optimizes a feature weight vector used to scale the individual features in the original pattern vectors. A masking vector is also employed for simultaneous selection of a feature subset. We employ this technique in combination with the k nearest neighbor classification rule, and compare the results with classical feature selection and extraction techniques, including sequential floating forward feature selection, and linear discriminant analysis. We also present results for the identification of favorable water-binding sites on protein surfaces.

849 citations
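
The genetic-algorithm entry above scales each feature by a weight and switches features on or off with a masking vector before nearest-neighbor classification. The Python sketch below shows only how one candidate (weights, mask) pair might be scored, using leave-one-out k-NN accuracy as the fitness. That fitness choice, the function names, and the toy data are assumptions, and the genetic search itself is omitted.

```python
def scaled(x, weights, mask):
    """Apply per-feature weights; a mask entry of 0 removes the feature entirely."""
    return [w * m * v for w, m, v in zip(weights, mask, x)]

def loo_knn_accuracy(X, y, weights, mask, k=1):
    """Leave-one-out accuracy of k-NN on the weighted, masked feature vectors."""
    Z = [scaled(x, weights, mask) for x in X]
    correct = 0
    for i, zi in enumerate(Z):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(zi, zj)), y[j])
            for j, zj in enumerate(Z) if j != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)
        correct += predicted == y[i]
    return correct / len(X)

# Feature 0 separates the classes; feature 1 is large-scale and misleading.
X = [[0.0, 10], [0.2, -10], [0.4, 0], [1.0, 10], [1.2, -10], [1.4, 0]]
y = [0, 0, 0, 1, 1, 1]

fitness_all = loo_knn_accuracy(X, y, weights=[1, 1], mask=[1, 1])
fitness_masked = loo_knn_accuracy(X, y, weights=[1, 1], mask=[1, 0])
```

A search procedure would prefer the second chromosome, which masks out the misleading feature and therefore also reduces the number of features that must be measured.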


Journal ArticleDOI
TL;DR: In this paper, a denoising method based on wavelet analysis is applied to feature extraction for mechanical vibration signals; the method is an advanced version of the well-known soft-thresholding denoising method proposed by Donoho and Johnstone.

823 citations
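
For reference, the classic soft-thresholding rule named in the entry above shrinks each wavelet coefficient toward zero by a threshold t and zeroes anything smaller than t. The Python sketch below shows that baseline rule together with the universal threshold sigma * sqrt(2 ln n); it is not the paper's advanced variant, which the summary does not describe. The function names are hypothetical.

```python
import math

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; magnitudes below t become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def universal_threshold(sigma, n):
    """Universal threshold for n coefficients with noise standard deviation sigma."""
    return sigma * math.sqrt(2 * math.log(n))

shrunk = soft_threshold([3.0, -0.5, -2.0], t=1.0)
```

Large coefficients, which carry the signal's structure, survive with reduced magnitude, while small, noise-dominated ones are removed.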


Proceedings ArticleDOI
05 Jun 2000
TL;DR: A large improvement in word recognition performance is shown by combining neural-net discriminative feature processing with Gaussian-mixture distribution modeling.
Abstract: Hidden Markov model speech recognition systems typically use Gaussian mixture models to estimate the distributions of decorrelated acoustic feature vectors that correspond to individual subword units. By contrast, hybrid connectionist-HMM systems use discriminatively-trained neural networks to estimate the probability distribution among subword units given the acoustic observations. In this work we show a large improvement in word recognition performance by combining neural-net discriminative feature processing with Gaussian-mixture distribution modeling. By training the network to generate the subword probability posteriors, then using transformations of these estimates as the base features for a conventionally-trained Gaussian-mixture based system, we achieve relative error rate reductions of 35% or more on the multicondition Aurora noisy continuous digits task.

803 citations


Proceedings ArticleDOI
01 Jun 2000
TL;DR: A robust method for automatically matching features in images corresponding to the same physical point on an object seen from two arbitrary viewpoints; the matching is optimised for a structure-from-motion application in which unreliable matches are ignored at the expense of reducing the number of feature matches.
Abstract: We present a robust method for automatically matching features in images corresponding to the same physical point on an object seen from two arbitrary viewpoints. Unlike conventional stereo matching approaches, we assume no prior knowledge about the relative camera positions and orientations. In fact, in our application this is the information we wish to determine from the image feature matches. Features are detected in two or more images and characterised using affine texture invariants. The problem of window effects is explicitly addressed by our method: our feature characterisation is invariant to linear transformations of the image data including rotation, stretch and skew. The feature matching process is optimised for a structure-from-motion application where we wish to ignore unreliable matches at the expense of reducing the number of feature matches.

738 citations


Journal ArticleDOI
TL;DR: This work presents algorithms for detecting and tracking text in digital video; the system implements a scale-space feature extractor that feeds an artificial neural processor to detect text blocks.
Abstract: Text that appears in a scene or is graphically added to video can provide an important supplemental source of index information as well as clues for decoding the video's structure and for classification. In this work, we present algorithms for detecting and tracking text in digital video. Our system implements a scale-space feature extractor that feeds an artificial neural processor to detect text blocks. Our text tracking scheme consists of two modules: a sum of squared difference (SSD) based module to find the initial position and a contour-based module to refine the position. Experiments conducted with a variety of video sources show that our scheme can detect and track text robustly.

635 citations
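
The text-tracking entry above uses a sum of squared differences (SSD) module to find the initial position of a text block. A minimal Python sketch of exhaustive SSD template search follows; the tiny grayscale arrays and function names are hypothetical, and the contour-based refinement stage is not shown.

```python
def ssd(frame, template, top, left):
    """Sum of squared pixel differences with the template placed at (top, left)."""
    return sum(
        (frame[top + r][left + c] - template[r][c]) ** 2
        for r in range(len(template))
        for c in range(len(template[0]))
    )

def best_match(frame, template):
    """Return the (row, col) placement of the template with the lowest SSD."""
    rows = len(frame) - len(template) + 1
    cols = len(frame[0]) - len(template[0]) + 1
    return min(
        (ssd(frame, template, r, c), (r, c))
        for r in range(rows)
        for c in range(cols)
    )[1]

frame = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 8], [7, 6]]
position = best_match(frame, template)
```

In a tracker, the search would normally be restricted to a window around the block's previous position rather than the whole frame.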


Journal ArticleDOI
TL;DR: A speech recognition system that uses both acoustic and visual speech information to improve recognition performance in noisy environments and is demonstrated on a large multispeaker database of continuously spoken digits.
Abstract: This paper describes a speech recognition system that uses both acoustic and visual speech information to improve recognition performance in noisy environments. The system consists of three components: a visual module; an acoustic module; and a sensor fusion module. The visual module locates and tracks the lip movements of a given speaker and extracts relevant speech features. This task is performed with an appearance-based lip model that is learned from example images. Visual speech features are represented by contour information of the lips and grey-level information of the mouth area. The acoustic module extracts noise-robust features from the audio signal. Finally the sensor fusion module is responsible for the joint temporal modeling of the acoustic and visual feature streams and is realized using multistream hidden Markov models (HMMs). The multistream method allows the definition of different temporal topologies and levels of stream integration and hence enables the modeling of temporal dependencies more accurately than traditional approaches. We present two different methods to learn the asynchrony between the two modalities and how to incorporate them in the multistream models. The superior performance for the proposed system is demonstrated on a large multispeaker database of continuously spoken digits. On a recognition task at 15 dB acoustic signal-to-noise ratio (SNR), acoustic perceptual linear prediction (PLP) features lead to 56% error rate, noise robust RASTA-PLP (relative spectra) acoustic features to 7.2% error rate and combined noise robust acoustic features and visual features to 2.5% error rate.

Journal ArticleDOI
TL;DR: The wavelet packet transform (WPT) is introduced as an alternative means of extracting time-frequency information from vibration signatures and significantly reduces the long training time that is often associated with the neural network classifier and improves its generalization capability.
Abstract: Condition monitoring of dynamic systems based on vibration signatures has generally relied upon Fourier-based analysis as a means of translating vibration signals in the time domain into the frequency domain. However, Fourier analysis provides a poor representation of signals well localized in time. In this case, it is difficult to detect and identify the signal pattern from the expansion coefficients because the information is diluted across the whole basis. The wavelet packet transform (WPT) is introduced as an alternative means of extracting time-frequency information from vibration signatures. The resulting WPT coefficients provide one with arbitrary time-frequency resolution of a signal. With the aid of statistical-based feature selection criteria, many of the feature components containing little discriminant information could be discarded, resulting in a feature subset having a reduced number of parameters without compromising the classification performance. The extracted reduced dimensional feature vector is then used as input to a neural network classifier. This significantly reduces the long training time that is often associated with the neural network classifier and improves its generalization capability.
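
To make the wavelet packet idea in this entry concrete, the Python sketch below recursively splits a signal into low-pass and high-pass halves and uses the energy of each terminal node as a feature. The Haar filter and the energy features are assumptions chosen for brevity; the abstract does not say which wavelet or which node statistics the authors used. The signal length must be divisible by 2 ** depth.

```python
import math

def haar_split(signal):
    """One orthonormal Haar step: pairwise sums (approx) and differences (detail)."""
    s = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def wavelet_packet_energies(signal, depth):
    """Split every node at every level (unlike the plain DWT) and return node energies."""
    nodes = [list(signal)]
    for _ in range(depth):
        nodes = [band for node in nodes for band in haar_split(node)]
    return [sum(c * c for c in node) for node in nodes]

# Nodes come out in natural (filter-tree) order, not sorted by frequency.
smooth = wavelet_packet_energies([1, 1, 1, 1, 1, 1, 1, 1], depth=2)
rapid = wavelet_packet_energies([1, -1, 1, -1, 1, -1, 1, -1], depth=2)
```

The constant signal puts all of its energy in the first node and the alternating signal in a different node, so the energy vector distinguishes them; a feature selection step could then drop nodes that never vary between classes.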

Proceedings ArticleDOI
01 Sep 2000
TL;DR: A new system for personal identification based on iris patterns is presented, which employs the rich 2D information of the iris and is translation, rotation, and scale invariant.
Abstract: A new system for personal identification based on iris patterns is presented in this paper. It is composed of iris image acquisition, image preprocessing, feature extraction and classifier design. The algorithm for iris feature extraction is based on texture analysis using multichannel Gabor filtering and wavelet transform. Compared with existing methods, our method employs the rich 2D information of the iris and is translation, rotation, and scale invariant.

Journal ArticleDOI
TL;DR: An effective fingerprint verification system is presented, which assumes that an existing reference fingerprint image must validate the identity of a person by means of a test fingerprint image acquired online and in real-time using minutiae matching.
Abstract: An effective fingerprint verification system is presented. It assumes that an existing reference fingerprint image must validate the identity of a person by means of a test fingerprint image acquired online and in real-time using minutiae matching. The matching system consists of two main blocks: The first allows for the extraction of essential information from the reference image off-line, the second performs the matching itself online. The information is obtained from the reference image by filtering and careful minutiae extraction procedures. The fingerprint identification is based on triangular matching to cope with the strong deformation of fingerprint images due to static friction or finger rolling. The matching is finally validated by dynamic time warping. Results reported on the NIST Special Database 4 reference set, featuring 85 percent correct verification (15 percent false negative) and 0.05 percent false positive, demonstrate the effectiveness of the verification technique.
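
The final validation step in this entry is dynamic time warping (DTW). The Python sketch below gives the standard DTW recurrence on two scalar sequences; how the paper encodes matched minutiae as sequences is not stated in the abstract, so the inputs here are purely hypothetical.

```python
def dtw_distance(a, b):
    """Minimum cumulative |a_i - b_j| cost over all monotonic alignments of a and b."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]

stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

A locally stretched copy of a sequence aligns at zero cost, which is why DTW tolerates the kind of elastic deformation the abstract attributes to finger rolling.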

Journal ArticleDOI
TL;DR: Overall, using the Gabor filter magnitude response given a frequency bandwidth and spacing of one octave and orientation bandwidth and spacing of 30°, augmented by a measure of the texture complexity, generated preferred results.

Patent
19 Oct 2000
TL;DR: A natural language interface control system for operating a plurality of devices (114) consists of a first microphone array (108), a feature extraction module (202) coupled to the first microphone array, and a speech recognition module (204) coupled to the feature extraction module, wherein the speech recognition module utilizes hidden Markov models.
Abstract: A natural language interface control system (206) for operating a plurality of devices (114) consists of a first microphone array (108), a feature extraction module (202) coupled to the first microphone array, and a speech recognition module (204) coupled to the feature extraction module, wherein the speech recognition module utilizes hidden Markov models. The system also comprises a natural language interface module (222) coupled to the speech recognition module (204) and a device interface (210) coupled to the natural language interface module (222), wherein the natural language interface module is for operating a plurality of devices coupled to the device interface based upon non-prompted, open-ended natural language requests from a user.

Proceedings ArticleDOI
26 Mar 2000
TL;DR: A novel face recognition algorithm based on the point signature, a representation for free-form surfaces, in which models are indexed from a library and ranked according to their similarity with the test face.
Abstract: We present a novel face recognition algorithm based on the point signature, a representation for free-form surfaces. We treat the face recognition problem as a non-rigid object recognition problem. The rigid parts of the face of one person are extracted after registering the range data sets of faces having different facial expressions. These rigid parts are used to create a model library for efficient indexing. For a test face, models are indexed from the library and the most appropriate models are ranked according to their similarity with the test face. Each model face can then be verified quickly and efficiently. Experimental results with range data involving six human subjects, each with four different facial expressions, have demonstrated the validity and effectiveness of our algorithm.

Proceedings ArticleDOI
26 Mar 2000
TL;DR: The estimation of head pose, which is achieved by using the support vector regression technique, provides crucial information for choosing the appropriate face detector, which helps to improve the accuracy and reduce the computation in multi-view face detection compared to other methods.
Abstract: A support vector machine-based multi-view face detection and recognition framework is described. Face detection is carried out by constructing several detectors, each of them in charge of one specific view. The symmetrical property of face images is employed to simplify the complexity of the modelling. The estimation of head pose, which is achieved by using the support vector regression technique, provides crucial information for choosing the appropriate face detector. This helps to improve the accuracy and reduce the computation in multi-view face detection compared to other methods. For video sequences, further computational reduction can be achieved by using a pose change smoothing strategy. When face detectors find a face in frontal view, a support vector machine-based multi-class classifier is activated for face recognition. All the above issues are integrated under a support vector machine framework. Test results on four video sequences are presented: the detection rate is above 95%, recognition accuracy is above 90%, average pose estimation error is around 10°, and the full detection and recognition speed is up to 4 frames/second on a Pentium II 300 PC.

Journal ArticleDOI
TL;DR: The similarities between these results and the observed properties of simple cells in the primary visual cortex are further evidence for the hypothesis that visual cortical neurons perform some type of redundancy reduction, which was one of the original motivations for ICA in the first place.
Abstract: Previous work has shown that independent component analysis (ICA) applied to feature extraction from natural image data yields features resembling Gabor functions and simple-cell receptive fields. This article considers the effects of including chromatic and stereo information. The inclusion of colour leads to features divided into separate red/green, blue/yellow, and bright/dark channels. Stereo image data, on the other hand, leads to binocular receptive fields which are tuned to various disparities. The similarities between these results and the observed properties of simple cells in the primary visual cortex are further evidence for the hypothesis that visual cortical neurons perform some type of redundancy reduction, which was one of the original motivations for ICA in the first place. In addition, ICA provides a principled method for feature extraction from colour and stereo images; such features could be used in image processing operations such as denoising and compression, as well as in pattern recognition.

Journal ArticleDOI
TL;DR: Two simple ways to use a genetic algorithm (GA) to design a multiple-classifier system are suggested that can be made less prone to overtraining by including penalty terms in the fitness function accounting for the number of features used.
Abstract: We suggest two simple ways to use a genetic algorithm (GA) to design a multiple-classifier system. The first GA version selects disjoint feature subsets to be used by the individual classifiers, whereas the second version selects (possibly) overlapping feature subsets, and also the types of the individual classifiers. The two GAs have been tested with four real data sets: heart, Satimage, letters, and forensic glasses. We used three-classifier systems and basic types of individual classifiers (the linear and quadratic discriminant classifiers and the logistic classifier). The multiple-classifier systems designed with the two GAs were compared against classifiers using: all features; the best feature subset found by the sequential backward selection method; and the best feature subset found by a GA. The GA design can be made less prone to overtraining by including penalty terms in the fitness function accounting for the number of features used.

Proceedings ArticleDOI
26 Mar 2000
TL;DR: A state-based approach to gesture learning and recognition is proposed, using spatial clustering and temporal alignment to build a finite state machine (FSM) recognizer.
Abstract: We propose a state-based approach to gesture learning and recognition. Using spatial clustering and temporal alignment, each gesture is defined to be an ordered sequence of states in spatial-temporal space. The 2D image positions of the centers of the head and both hands of the user are used as features; these are located by a color-based tracking method. From training data of a given gesture, we first learn the spatial information and then group the data into segments that are automatically aligned temporally. The temporal information is further integrated to build a finite state machine (FSM) recognizer. Each gesture has a FSM corresponding to it. The computational efficiency of the FSM recognizers allows us to achieve real-time on-line performance. We apply this technique to build an experimental system that plays a game of "Simon Says" with the user.
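
A gesture defined as an ordered sequence of spatial states can be recognized with a very small state machine. The Python sketch below hard-codes each state as a circle (center and radius) and advances whenever the tracked point enters the next state in order. In the paper the states are learned by clustering and temporal alignment, which is not reproduced here; the state values, the single tracked point, and the function names are hypothetical.

```python
import math

def make_fsm(states):
    """states: ordered list of ((x, y), radius) regions a gesture must pass through."""
    def accepts(trajectory):
        current = 0
        for point in trajectory:
            if current == len(states):
                break  # every state has already been visited in order
            center, radius = states[current]
            if math.dist(point, center) <= radius:
                current += 1
        return current == len(states)
    return accepts

# Hypothetical "wave" gesture: start at rest, move right, return to rest.
wave = make_fsm([((0, 0), 1.0), ((5, 0), 1.0), ((0, 0), 1.0)])

full_wave = wave([(0.2, 0.1), (2, 0), (4.8, 0.3), (2, 0), (0.1, -0.2)])
half_wave = wave([(0, 0), (2, 0), (5, 0)])
wrong_order = wave([(5, 0), (0, 0)])
```

Each input point costs one distance computation, which is consistent with the real-time performance the abstract reports for FSM recognizers.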

Proceedings ArticleDOI
01 Sep 2000
TL;DR: Two methods for fingerprint image enhancement are proposed using a unique anisotropic filter for direct grayscale enhancement and show some improvement in the minutiae detection process in terms of either efficiency or time required.
Abstract: Extracting minutiae from fingerprint images is one of the most important steps in automatic fingerprint identification and classification. Minutiae are local discontinuities in the fingerprint pattern, mainly terminations and bifurcations. In this work we propose two methods for fingerprint image enhancement. The first one is carried out using local histogram equalization, Wiener filtering, and image binarization. The second method uses a unique anisotropic filter for direct grayscale enhancement. The results achieved are compared with those obtained through some other methods. Both methods show some improvement in the minutiae detection process in terms of either efficiency or time required.

Proceedings ArticleDOI
11 Jun 2000
TL;DR: The authors formulate feature registration problems as maximum likelihood or Bayesian maximum a posteriori estimation problems using mixture models, and embed the EM algorithm within a deterministic annealing scheme in order to directly control the fuzziness of the correspondences.
Abstract: The authors formulate feature registration problems as maximum likelihood or Bayesian maximum a posteriori estimation problems using mixture models. An EM-like algorithm is proposed to jointly solve for the feature correspondences as well as the geometric transformations. A novel aspect of the authors' approach is the embedding of the EM algorithm within a deterministic annealing scheme in order to directly control the fuzziness of the correspondences. The resulting algorithm, termed mixture point matching (MPM), can solve for both rigid and high dimensional (thin-plate spline-based) non-rigid transformations between point sets in the presence of noise and outliers. The authors demonstrate the algorithm's performance on 2D and 3D data.
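
The interplay between soft correspondences and a temperature schedule can be illustrated with a deliberately reduced Python sketch. It estimates only a 2D translation, uses an isotropic Gaussian with variance T for the correspondences, and starts from the centroid difference; these are all simplifying assumptions. It is not the MPM algorithm itself, which also handles rigid and thin-plate spline transformations, noise, and outliers. All names and data are hypothetical.

```python
import math

def soft_correspondences(X, Y, shift, T):
    """E-step: row i holds the probabilities that X[i] + shift matches each Y[j]."""
    rows = []
    for x in X:
        w = [
            math.exp(-((y[0] - x[0] - shift[0]) ** 2 + (y[1] - x[1] - shift[1]) ** 2) / (2 * T))
            for y in Y
        ]
        total = sum(w)
        rows.append([v / total for v in w])
    return rows

def estimate_translation(X, Y, temperatures=(16.0, 4.0, 1.0, 0.25)):
    """Alternate E- and M-steps while lowering T, so matches go from fuzzy to sharp."""
    n, m = len(X), len(Y)
    shift = tuple(sum(y[d] for y in Y) / m - sum(x[d] for x in X) / n for d in range(2))
    for T in temperatures:
        M = soft_correspondences(X, Y, shift, T)
        # M-step: correspondence-weighted average displacement.
        shift = tuple(
            sum(M[i][j] * (Y[j][d] - X[i][d]) for i in range(n) for j in range(m)) / n
            for d in range(2)
        )
    return shift

X = [(0, 0), (4, 0), (0, 4), (4, 4)]
Y = [(5, 6), (1, 2), (5, 2), (1, 6)]  # X shifted by (1, 2), listed in a different order
shift = estimate_translation(X, Y)
final_rows = soft_correspondences(X, Y, shift, T=0.25)
```

At high temperature every point in Y receives noticeable weight; as T drops, each row of the correspondence matrix concentrates on a single match, which is the fuzziness control the abstract describes.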

Journal ArticleDOI
TL;DR: A new subband-based classification scheme is developed for classifying underwater mines and mine-like targets from the acoustic backscattered signals using a feature extractor using wavelet packets in conjunction with linear predictive coding, a feature selection scheme, and a backpropagation neural-network classifier.
Abstract: In this paper, a new subband-based classification scheme is developed for classifying underwater mines and mine-like targets from the acoustic backscattered signals. The system consists of a feature extractor using wavelet packets in conjunction with linear predictive coding (LPC), a feature selection scheme, and a backpropagation neural-network classifier. The data set used for this study consists of the backscattered signals from six different objects: two mine-like targets and four nontargets for several aspect angles. Simulation results on ten different noisy realizations and for signal-to-noise ratio (SNR) of 12 dB are presented. The receiver operating characteristic (ROC) curve of the classifier generated based on these results demonstrated excellent classification performance of the system. The generalization ability of the trained network was demonstrated by computing the error and classification rate statistics on a large data set. A multiaspect fusion scheme was also adopted in order to further improve the classification performance.

Proceedings ArticleDOI
04 Nov 2000
TL;DR: This paper investigates the use of video content analysis and feature extraction and clustering techniques for further video semantic classifications and a supervised rule based video classification system is proposed that can be applied to applications such as on-line video indexing, filtering and video summaries, etc.
Abstract: Current information and communication technologies provide the infrastructure to send bits anywhere, but do not presume to handle information at the semantic level. This paper investigates the use of video content analysis, feature extraction, and clustering techniques for further video semantic classification, and proposes a supervised rule-based video classification system. This system can be applied to applications such as on-line video indexing, filtering, and video summaries. As an experiment, basketball video structure is examined and categorized into different classes according to distinct visual and motion characteristics by a rule-based classifier. The semantic classes, the visual/motion feature descriptors, and their statistical relationships are then studied in detail, and experimental results based on basketball video are provided and analyzed.

Journal ArticleDOI
TL;DR: This article presents an effective traffic feature‐extraction model using discrete wavelet transform (DWT) and linear discriminant analysis (LDA), which is used as input to a neural network model for traffic incident detection.
Abstract: An effective traffic incident detection algorithm must be able to extract incident-related features from traffic patterns in order to eliminate false alarms. A robust feature-extraction algorithm also helps to reduce the dimension of the input space for a neural network model without any significant loss of related traffic information, resulting in a substantial reduction in the network size, the effect of random traffic fluctuations, the number of required training samples, and the computational resources required to train the neural network. This paper offers an effective travel feature-extraction model using discrete wavelet transform (DWT) and linear discriminant analysis (LDA). The DWT is first applied to raw traffic data, and the finest resolution coefficients representing the random fluctuations of traffic are discarded. Next, the LDA is applied to the filtered signal for further feature extraction and to reduce the dimensionality of the problem. Results of LDA are used as input to a neural network model for traffic incident detection.
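
The first stage described in this entry, applying the DWT and discarding the finest-resolution coefficients, can be sketched in a few lines of Python. The Haar wavelet and a single decomposition level are assumptions made for brevity; the abstract does not name the wavelet or the number of levels, and the later LDA and neural network stages are omitted. The sample values are hypothetical, and the signal length must be even.

```python
import math

def haar_dwt(signal):
    """One-level orthonormal Haar transform: coarse averages and finest details."""
    s = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) / s, (a - d) / s]
    return out

def drop_finest_detail(signal):
    """Zero the finest-resolution coefficients, then reconstruct the signal."""
    approx, detail = haar_dwt(signal)
    return haar_idwt(approx, [0.0] * len(detail))

# Hypothetical detector readings with small sample-to-sample fluctuations.
readings = [10, 12, 30, 34]
filtered = drop_finest_detail(readings)
```

With the Haar wavelet this replaces each pair of samples by its mean, removing the fastest fluctuations while keeping the slower trend that an incident would produce.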

Journal ArticleDOI
TL;DR: The developed feature extraction system takes as input a STEP file defining the geometry and topology of a part and generates as output a STEP file with form-feature information in AP224 format for form feature-based process planning.

Proceedings ArticleDOI
28 Mar 2000
TL;DR: The system is trained from examples to classify faces, represented with Gabor wavelet features, on the basis of high-level attributes such as sex, "race", and expression, using linear discriminant analysis (LDA).
Abstract: A method for automatically classifying facial images is proposed. Faces are represented using elastic graphs labelled with 2D Gabor wavelet features. The system is trained from examples to classify faces on the basis of high-level attributes, such as sex, "race", and expression, using linear discriminant analysis (LDA). Use of the Gabor representation relaxes the requirement for precise normalization of the face: approximate registration of a facial graph is sufficient. LDA allows simple and rapid training from examples, as well as straightforward interpretation of the role of the input features for classification. The algorithm is tested on three different facial image datasets, one of which was acquired under relatively uncontrolled conditions, on tasks of sex, "race" and expression classification. Results of these tests are presented. The discriminant vectors may be interpreted in terms of the saliency of the input features for the different classification tasks, which we portray visually with feature saliency maps for node position as well as filter spatial frequency and orientation.
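
The claim that LDA permits a direct reading of feature saliency can be illustrated with the two-class Fisher discriminant, w = S_w^{-1} (m_b - m_a), where S_w is the within-class scatter matrix and m_a, m_b are the class means. The Python sketch below restricts itself to two features so the matrix inverse can be written out by hand; the data and names are hypothetical and bear no relation to the paper's Gabor features.

```python
def mean(rows):
    """Per-feature mean of a list of feature vectors."""
    n = len(rows)
    return [sum(r[d] for r in rows) / n for d in range(len(rows[0]))]

def fisher_direction(class_a, class_b):
    """Two-class Fisher discriminant for 2-D features: w = Sw^-1 (mean_b - mean_a)."""
    ma, mb = mean(class_a), mean(class_b)
    s = [[0.0, 0.0], [0.0, 0.0]]  # within-class scatter matrix Sw
    for rows, m in ((class_a, ma), (class_b, mb)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    # Closed-form 2x2 inverse applied to the mean difference.
    return [
        (s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
        (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det,
    ]

# Feature 0 differs between the classes; feature 1 varies equally within both.
class_a = [[0, 1], [1, -1], [0, -1], [1, 1]]
class_b = [[4, 1], [5, -1], [4, -1], [5, 1]]
w = fisher_direction(class_a, class_b)
```

The weight on the second feature is zero because it carries no class information, which is the kind of saliency reading the abstract refers to.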

Journal ArticleDOI
TL;DR: This article presents a computational model for automatic traffic incident detection using discrete wavelet transform, linear discriminant analysis, and neural networks and yields a detection rate of nearly 100 percent and a false‐alarm rate of about 1 percent for two‐ or three‐lane freeways.
Abstract: Artificial neural networks are known to be effective in solving problems involving pattern recognition and classification. The traffic incident-detection problem can be viewed as the recognition of incident patterns from incident-free patterns. A neural network classifier must be trained first using incident and incident-free traffic data. The dimensionality of the training input is high, and the embedded incident characteristics are not readily detectable. This paper presents a computational model for automatic traffic incident detection using discrete wavelet transform (DWT), linear discriminant analysis (LDA), and neural networks. DWT and LDA are used for feature extraction, denoising, and effective preprocessing of data before an adaptive neural network model is used for traffic incident detection. Simulated and actual traffic data are used to test the model. For incidents with a duration of more than 5 minutes, the model yields a detection rate of nearly 100% and a false-alarm rate of about 1% for 2- or 3-lane freeways.

Journal ArticleDOI
TL;DR: This paper proposes a fast scene change detection algorithm using direct feature extraction from MPEG compressed videos, evaluates this technique using sample video data, and shows that the proposed algorithm is faster or more accurate than previously known scene change detection algorithms.
Abstract: In order to process video data efficiently, a video segmentation technique based on scene change detection is required. This is a fundamental operation used in many digital video applications such as digital libraries, video on demand (VOD), etc. Many of these advanced video applications require manipulations of compressed video signals. So, the scene change detection process is achieved by analyzing the video directly in the compressed domain, thereby avoiding the overhead of decompressing video into individual frames in the pixel domain. In this paper, we propose a fast scene change detection algorithm using direct feature extraction from MPEG compressed videos, and evaluate this technique using sample video data. First, we derive binary edge maps from the AC coefficients in blocks which were discrete cosine transformed. Second, we measure edge orientation, strength and offset using correlation between the AC coefficients in the derived binary edge maps. Finally, we match two consecutive frames using two of these features (edge orientation and strength). This process was made possible by a new mathematical formulation for deriving the edge information directly from the discrete cosine transform (DCT) coefficients. We have shown that the proposed algorithm is faster or more accurate than the previously known scene change detection algorithms.