
Showing papers in "Pattern Analysis and Applications in 2007"


Journal ArticleDOI
TL;DR: An off-line, text independent system for writer identification and verification of handwritten text lines using Hidden Markov Model (HMM) based recognizers is presented.
Abstract: In this paper, an off-line, text independent system for writer identification and verification of handwritten text lines using Hidden Markov Model (HMM) based recognizers is presented. For each writer, an individual recognizer is built and trained on text lines of that writer. This results in a number of recognizers, each of which is an expert on the handwriting of exactly one writer. In the identification and verification phase, a text line of unknown origin is presented to each of these recognizers and each one returns a transcription that includes the log-likelihood score for the generated output. These scores are sorted and the resulting ranking is used for both identification and verification. Several confidence measures are defined on this ranking. The proposed writer identification and verification system is evaluated using different experimental setups.

95 citations
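A minimal sketch of the ranking step only, assuming each per-writer HMM recognizer has already returned a log-likelihood score for the questioned text line; the recognizers themselves, the score values, and the margin-based confidence measure below are illustrative, not the paper's.

```python
# Rank per-writer log-likelihood scores for identification and verification.
def rank_writers(log_likelihoods):
    """log_likelihoods: dict writer_id -> log-likelihood of the text line."""
    return sorted(log_likelihoods.items(), key=lambda kv: kv[1], reverse=True)

def identify(log_likelihoods):
    return rank_writers(log_likelihoods)[0][0]          # top-ranked writer

def verify(log_likelihoods, claimed_writer, margin=5.0):
    """Accept the claim if its score is within `margin` of the best competing writer
    (one possible confidence measure defined on the ranking)."""
    best_other = max(s for w, s in log_likelihoods.items() if w != claimed_writer)
    return log_likelihoods[claimed_writer] >= best_other - margin

scores = {"w01": -812.3, "w02": -790.1, "w03": -805.7}   # toy values
print(identify(scores))                  # -> "w02"
print(verify(scores, "w01"))             # -> False with the default margin
```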


Journal ArticleDOI
TL;DR: The present analysis focuses on the use of some data complexity measures to describe class overlapping, feature space dimensionality and class density, and discover their relation with the practical accuracy of this classifier.
Abstract: The k-nearest neighbors (k-NN) classifier is one of the most popular supervised classification methods. It is very simple, intuitive and accurate in a great variety of real-world domains. Nonetheless, despite its simplicity and effectiveness, practical use of this rule has been historically limited due to its high storage requirements and the computational costs involved. On the other hand, the performance of this classifier appears strongly sensitive to training data complexity. In this context, by means of several problem difficulty measures, we try to characterize the behavior of the k-NN rule when working under certain situations. More specifically, the present analysis focuses on the use of some data complexity measures to describe class overlapping, feature space dimensionality and class density, and discover their relation with the practical accuracy of this classifier.

89 citations
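A hedged sketch pairing k-NN accuracy with one commonly used complexity measure (Fisher's maximum discriminant ratio, often denoted F1); the dataset and the choice of measure are illustrative, as the paper evaluates several measures.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def fisher_discriminant_ratio(X, y):
    """Maximum per-feature (mu0 - mu1)^2 / (var0 + var1) for a two-class set."""
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    den = X0.var(axis=0) + X1.var(axis=0) + 1e-12
    return float(np.max(num / den))

X, y = load_breast_cancer(return_X_y=True)
acc = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=5).mean()
print(f"F1 complexity: {fisher_discriminant_ratio(X, y):.2f}, 5-NN accuracy: {acc:.3f}")
```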


Journal ArticleDOI
TL;DR: This work describes a model for understanding people motion in video sequences using Voronoi diagrams, focusing on group detection and classification, and determines the temporal evolution of some sociological and psychological parameters that are used to compute individual characteristics.
Abstract: This work describes a model for understanding people motion in video sequences using Voronoi diagrams, focusing on group detection and classification. We use the position of each individual as a site for the Voronoi diagram at each frame, and determine the temporal evolution of some sociological and psychological parameters, such as distance to neighbors and personal spaces. These parameters are used to compute individual characteristics (such as perceived personal space and comfort levels), that are analyzed to detect the formation of groups and their classification as voluntary or involuntary. Experimental results based on videos obtained from real life as well as from a crowd simulator were analyzed and discussed.

56 citations
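A rough sketch of the per-frame geometric step: each person's position is used as a Voronoi site and the distances to Voronoi neighbours are collected, from which quantities such as distance to neighbours or personal space could be derived. The aggregation into the sociological and psychological parameters, and the group detection itself, are not shown.

```python
import numpy as np
from scipy.spatial import Voronoi

def neighbour_distances(positions):
    """positions: (n, 2) array of individual positions in one frame.
    Returns dict person_index -> list of distances to Voronoi neighbours."""
    vor = Voronoi(positions)
    dists = {i: [] for i in range(len(positions))}
    for i, j in vor.ridge_points:          # pairs of sites sharing a Voronoi ridge
        d = float(np.linalg.norm(positions[i] - positions[j]))
        dists[i].append(d)
        dists[j].append(d)
    return dists

frame = np.array([[0.0, 0.0], [1.0, 0.2], [0.4, 1.1], [3.0, 3.0], [3.2, 2.8]])
for person, ds in neighbour_distances(frame).items():
    print(person, round(min(ds), 2))       # distance to the closest neighbour
```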


Journal ArticleDOI
TL;DR: The main focus in this paper is on integrated color, texture and shape extraction methods for CBIR that uses Gabor filtration for determining the number of regions of interest (ROIs), in which fast and effective feature extraction is performed.
Abstract: Feature extraction and the use of the features as query terms are crucial problems in content-based image retrieval (CBIR) systems. The main focus in this paper is on integrated color, texture and shape extraction methods for CBIR. We have developed original CBIR methodology that uses Gabor filtration for determining the number of regions of interest (ROIs), in which fast and effective feature extraction is performed. In the ROIs extracted, texture features based on thresholded Gabor features, color features based on histograms, color moments in YUV space, and shape features based on Zernike moments are then calculated. The features presented proved to be efficient in determining similarity between images. Our system was tested on postage stamp images and Corel photo libraries and can be used in CBIR applications such as postal services.

51 citations
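A hedged sketch of one of the feature groups only: the first three colour moments (mean, standard deviation, skewness) per channel in YUV space, computed for a region of interest. The Gabor-based ROI detection, texture features and Zernike moments are not reproduced here, and the ROI below is a random stand-in.

```python
import cv2
import numpy as np

def yuv_color_moments(roi_bgr):
    """roi_bgr: HxWx3 uint8 region of interest in OpenCV's BGR order."""
    yuv = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2YUV).astype(np.float64)
    feats = []
    for c in range(3):
        ch = yuv[:, :, c].ravel()
        mu, sigma = ch.mean(), ch.std()
        skew = np.cbrt(((ch - mu) ** 3).mean())      # signed cube root of the 3rd moment
        feats.extend([mu, sigma, skew])
    return np.array(feats)                            # 9-dimensional colour descriptor

roi = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)   # stand-in for a detected ROI
print(yuv_color_moments(roi).round(2))
```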


Journal ArticleDOI
TL;DR: A decision support system to distinguish among hematology cases directly from microscopic specimens using an image database containing digitized specimens from normal and four different hematologic malignancies is described.
Abstract: We describe a decision support system to distinguish among hematology cases directly from microscopic specimens. The system uses an image database containing digitized specimens from normal and four different hematologic malignancies. Initially, the nuclei and cytoplasmic components of the specimens are segmented using a robust color gradient vector flow active contour model. Using a few cell images from each class, the basic texture elements (textons) for the nuclei and cytoplasm are learned, and the cells are represented through texton histograms. We propose to use support vector machines on the texton histogram based cell representation and achieve major improvement over the commonly used classification methods in texture research. Experiments with 3,691 cell images from 105 patients which originated from four different hospitals indicate more than 84% classification performance for individual cells and 89% for case based classification for the five class problem.

46 citations
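A simplified sketch of the texton pipeline described: cluster per-pixel filter responses into textons with k-means, describe each cell image by its texton histogram, and classify the histograms with an SVM. The filter bank, data, class labels and parameters here are stand-ins, not the paper's.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_cells, pixels_per_cell, n_filters, n_textons = 40, 200, 8, 16

# Stand-in for per-pixel filter-bank responses of segmented nuclei/cytoplasm.
responses = [rng.normal(size=(pixels_per_cell, n_filters)) for _ in range(n_cells)]
labels = np.arange(n_cells) % 2                           # toy 2-class labels

kmeans = KMeans(n_clusters=n_textons, n_init=10, random_state=0)
kmeans.fit(np.vstack(responses))                          # learn the texton dictionary

def texton_histogram(resp):
    assignment = kmeans.predict(resp)
    hist = np.bincount(assignment, minlength=n_textons).astype(float)
    return hist / hist.sum()

X = np.array([texton_histogram(r) for r in responses])
clf = SVC(kernel="rbf", gamma="scale").fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```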


Journal ArticleDOI
Shigeo Abe1
TL;DR: It is shown that training and testing of kernel-based methods can be done in the empirical feature space and that training of LS SVMs in the empirical feature space results in solving a set of linear equations.
Abstract: In this paper we discuss sparse least squares support vector machines (sparse LS SVMs) trained in the empirical feature space, which is spanned by the mapped training data. First, we show that the kernel associated with the empirical feature space gives the same value with that of the kernel associated with the feature space if one of the arguments of the kernels is mapped into the empirical feature space by the mapping function associated with the feature space. Using this fact, we show that training and testing of kernel-based methods can be done in the empirical feature space and that training of LS SVMs in the empirical feature space results in solving a set of linear equations. We then derive the sparse LS SVMs restricting the linearly independent training data in the empirical feature space by the Cholesky factorization. Support vectors correspond to the selected training data and they do not change even if the value of the margin parameter is changed. Thus for linear kernels, the number of support vectors is the number of input variables at most. By computer experiments we show that we can reduce the number of support vectors without deteriorating the generalization ability.

42 citations
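A minimal, generic LS-SVM classifier sketch (the standard dual linear system), shown only to illustrate the point that LS-SVM training amounts to solving a set of linear equations; the sparsification via Cholesky factorization in the empirical feature space is not reproduced, and the kernel and data are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def train_lssvm(X, y, C=10.0):
    """y must be in {-1, +1}. Returns (alpha, b)."""
    n = len(y)
    Omega = np.outer(y, y) * rbf_kernel(X, X)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / C
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)           # training == one linear solve
    return sol[1:], sol[0]

def predict_lssvm(X_train, y_train, alpha, b, X_test):
    return np.sign(rbf_kernel(X_test, X_train) @ (alpha * y_train) + b)

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 2))
y = np.sign(X[:, 0] + X[:, 1]); y[y == 0] = 1.0
alpha, b = train_lssvm(X, y)
print("training accuracy:", (predict_lssvm(X, y, alpha, b, X) == y).mean())
```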


Journal ArticleDOI
TL;DR: The technique was tested on artificial images, based on images of living cells and on real sequences acquired from microscope observations of neutrophils and lymphocytes as well as on a sequence of MRI images, showing that the method is both effective and practical.
Abstract: This paper describes a segmentation method combining a texture based technique with a contour based method. The technique is designed to enable the study of cell behaviour over time by segmenting brightfield microscope image sequences. The technique was tested on artificial images, based on images of living cells and on real sequences acquired from microscope observations of neutrophils and lymphocytes as well as on a sequence of MRI images. The results of the segmentation are compared with the results of the watershed and snake segmentation methods. The results show that the method is both effective and practical.

41 citations


Journal ArticleDOI
TL;DR: A novel neural network architecture suitable for image processing applications and comprising three interconnected fuzzy layers of neurons and devoid of any back-propagation algorithm for weight adjustment is proposed in this article.
Abstract: A novel neural network architecture suitable for image processing applications and comprising three interconnected fuzzy layers of neurons and devoid of any back-propagation algorithm for weight adjustment is proposed in this article. The fuzzy layers of neurons represent the fuzzy membership information of the image scene to be processed. One of the fuzzy layers of neurons acts as an input layer of the network. The two remaining layers, viz. the intermediate layer and the output layer, are counter-propagating fuzzy layers of neurons. These layers are meant for processing the input image information available from the input layer. The constituent neurons within each layer of the network architecture are fully connected to each other. The intermediate layer neurons are also connected to the corresponding neurons and to a set of neighbors in the input layer. The neurons at the intermediate layer and the output layer are also connected to each other and to the respective neighbors of the corresponding other layer following a neighborhood-based connectivity. The proposed architecture uses fuzzy membership-based weight assignment and a subsequent updating procedure. Some fuzzy cardinality-based, image context-sensitive information is used for deciding the thresholding capabilities of the network. The network self-organizes the input image information by counter-propagation of the fuzzy network states between the intermediate and the output layers of the network. The attainment of stability of the fuzzy neighborhood hostility measures at the output layer of the network, or of the corresponding fuzzy entropy measures, determines the convergence of the network operation. An application of the proposed architecture for the extraction of binary objects from various degrees of noisy backgrounds is demonstrated using a synthetic and a real-life image.

38 citations


Journal ArticleDOI
TL;DR: The novelty of this algorithm includes improving the speed and accuracy of the iris segmentation process, assessing the iris image quality such that only the clear images are accepted so as to reduce the recognition error, and producing a feature vector with discriminating texture features and a proper dimensionality to improve the recognition accuracy and computational efficiency.
Abstract: In general, a typical iris recognition system includes iris imaging, iris liveness detection, iris image quality assessment, and iris recognition. This paper presents an algorithm focusing on the last two steps. The novelty of this algorithm includes improving the speed and accuracy of the iris segmentation process, assessing the iris image quality such that only the clear images are accepted so as to reduce the recognition error, and producing a feature vector with discriminating texture features and a proper dimensionality so as to improve the recognition accuracy and computational efficiency. The Hough transform, a polynomial fitting technique, and some morphological operations are used for the segmentation process. The phase data from a 1D Log-Gabor filter is extracted and encoded efficiently to produce a proper feature vector. Experimental tests were performed using the CASIA iris database (756 samples). These tests prove that the proposed algorithm has an encouraging performance.

30 citations
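A hedged sketch of the encoding and matching idea: one row of a normalised iris image is filtered with a 1D Log-Gabor filter built in the frequency domain, the phase is quantised into 2 bits per sample, and codes are compared with the Hamming distance. The filter parameters, the stand-in signal and the thresholds are illustrative, not the paper's.

```python
import numpy as np

def log_gabor_1d(n, f0=0.1, sigma_ratio=0.55):
    f = np.fft.fftfreq(n)
    g = np.zeros(n)
    nz = f > 0                                 # analytic filter: positive frequencies only
    g[nz] = np.exp(-(np.log(f[nz] / f0) ** 2) / (2 * np.log(sigma_ratio) ** 2))
    return g                                   # transfer function, zero DC response

def encode_row(row):
    response = np.fft.ifft(np.fft.fft(row) * log_gabor_1d(len(row)))
    return np.column_stack([response.real >= 0, response.imag >= 0])   # 2 bits/sample

def hamming(code_a, code_b):
    return np.mean(code_a != code_b)

rng = np.random.default_rng(2)
row = rng.normal(size=256)                     # stand-in for one unwrapped iris row
same = encode_row(row + 0.05 * rng.normal(size=256))
other = encode_row(rng.normal(size=256))
print("same eye HD:", round(hamming(encode_row(row), same), 3))
print("different eye HD:", round(hamming(encode_row(row), other), 3))
```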


Journal ArticleDOI
TL;DR: Experiments show that MSD classifier can compete with top-performance linear classifiers such as linear support vector machines, and is better than or equivalent to combinations of well known facial feature extraction methods, such as eigenfaces, Fisherfaces, orthogonal complementary space, nullspace, direct linear discriminant analysis, and the nearest neighbor classifier.
Abstract: As an effective technique for feature extraction and pattern classification, Fisher linear discriminant (FLD) has been successfully applied in many fields. However, for a task with very high-dimensional data such as face images, the conventional FLD technique encounters a fundamental difficulty caused by the singular within-class scatter matrix. To avoid this problem, many improvements on the feature extraction aspect of FLD have been proposed. In contrast, studies on the pattern classification aspect of FLD are quite few. In this paper, we will focus our attention on a possible improvement on the pattern classification aspect of FLD by presenting a novel linear discriminant criterion called maximum scatter difference (MSD). Theoretical analysis demonstrates that the MSD criterion is a generalization of the Fisher discriminant criterion, and is the asymptotic form of the large margin linear projection discriminant criterion. The performance of the MSD classifier is tested in face recognition. Experiments performed on the ORL, Yale, FERET and AR databases show that the MSD classifier can compete with top-performance linear classifiers such as linear support vector machines, and is better than or equivalent to combinations of well known facial feature extraction methods, such as eigenfaces, Fisherfaces, orthogonal complementary space, nullspace, direct linear discriminant analysis, and the nearest neighbor classifier.

26 citations
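A small sketch of the maximum scatter difference idea as described: maximise w^T(S_b - C·S_w)w, whose solution is the leading eigenvector of S_b - C·S_w, so no inversion of a possibly singular S_w is needed. The data and the balance constant C below are toy values.

```python
import numpy as np

def scatter_matrices(X, y):
    mean_all = X.mean(axis=0)
    Sw = np.zeros((X.shape[1], X.shape[1]))
    Sb = np.zeros_like(Sw)
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)                          # within-class scatter
        Sb += len(Xc) * np.outer(mc - mean_all, mc - mean_all)  # between-class scatter
    return Sb, Sw

def msd_projection(X, y, C=1.0, n_components=1):
    Sb, Sw = scatter_matrices(X, y)
    eigvals, eigvecs = np.linalg.eigh(Sb - C * Sw)   # symmetric matrix, use eigh
    return eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(2, 1, (30, 5))])
y = np.repeat([0, 1], 30)
W = msd_projection(X, y, C=1.0)
print("projected class means:", (X @ W)[y == 0].mean(), (X @ W)[y == 1].mean())
```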


Journal ArticleDOI
TL;DR: The proposed serial multi-stage intrusion detection system behaves significantly better than other multiple classifier systems performing classification in a single stage and is tested on three different services of a standard database used for benchmarking intrusion detection systems.
Abstract: A serial multi-stage classification system for facing the problem of intrusion detection in computer networks is proposed. The whole decision process is organized into successive stages, each one using a set of features tailored for recognizing a specific attack category. All the stages employ suitable criteria for estimating the reliability of the performed classification, so that, in case of uncertainty, information related to a possible attack is only logged for further processing, without raising an alert for the system manager. This makes it possible to reduce the number of false alarms. On the other hand, in order to keep the number of missed detections low, the proposed system declares a connection as normal traffic only if none of the stages detects an attack. The proposed multi-stage intrusion detection system has been tested on three different services (http, telnet and ftp) of a standard database used for benchmarking intrusion detection systems and also on real network traffic data. The experimental analysis highlights the effectiveness of the approach: the proposed system behaves significantly better than other multiple classifier systems performing classification in a single stage.
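A generic sketch of the serial decision logic only: each stage looks for one attack category with its own features, an uncertain stage merely logs the connection, and "normal" is declared only if every stage passes. The stage models, feature names, scores and thresholds here are assumptions for illustration, not the paper's classifiers.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class Stage:
    name: str                                   # attack category, e.g. "dos"
    score: Callable[[dict], float]              # confidence that this attack is present
    threshold: float                            # reliability threshold for raising an alert

def classify(connection: dict, stages: Sequence[Stage], log: list) -> str:
    for stage in stages:
        s = stage.score(connection)
        if s >= stage.threshold:
            return f"alert:{stage.name}"          # reliable detection -> alert
        if s >= 0.5:
            log.append((stage.name, connection))  # uncertain -> log only, keep checking
    return "normal"                               # declared normal only if all stages pass

stages = [
    Stage("dos",   lambda c: c.get("syn_rate", 0.0),     0.9),
    Stage("probe", lambda c: c.get("port_entropy", 0.0), 0.8),
]
log = []
print(classify({"syn_rate": 0.95}, stages, log))                    # alert:dos
print(classify({"syn_rate": 0.6, "port_entropy": 0.2}, stages, log), log)
```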

Journal ArticleDOI
TL;DR: In this paper a method dealing with correct location of the borders in the epi-metaphyseal regions of interest is described and a meaningful improvement in terms of ultimate contour location and smoothing has been observed in regions with cartilage or bone convexity developed near the bottom region of the epiphysis.
Abstract: Segmentation of anatomical structures in radiological images is one of the important steps in the computerized approach to bone age assessment. In this paper, a method for correctly locating the borders in the epi-metaphyseal regions of interest is described. Well-segmented bone structures are obtained by utilizing Gibbs random fields as the first segmentation step; however, this method does not prove adequate for correctly outlining other tissues in the epi-metaphyseal area. In order to correctly delineate the cartilage in this region, a second segmentation step utilizing active contours as a post-segmentation edge-location technique is applied. Controlling the tension and bending of the active contour requires setting a set of weights in the energy functional. To adjust the weights and to initially test the methodology, a model region of interest containing three different anatomical structures corrupted with Gaussian noise was designed. The combined methodology of Gibbs random fields and active contours with the final set of weights was applied to 200 regions of interest randomly selected from 1100 left-hand radiographs. A meaningful improvement in terms of ultimate contour location and smoothing has been observed in regions with cartilage or bone convexity developed near the bottom region of the epiphysis.

Journal ArticleDOI
TL;DR: It is demonstrated that by decreasing the multiplicity of the eigenvalues of the AdNN’s control system, the system can effectively be driven into chaos, and it is shown that such a Modified AdNN (M-AdNN) has the desirable property that it recognizes various input patterns.
Abstract: Traditional pattern recognition (PR) systems work with the model that the object to be recognized is characterized by a set of features, which are treated as the inputs. In this paper, we propose a new model for PR, namely one that involves chaotic neural networks (CNNs). To achieve this, we enhance the basic model proposed by Adachi (Neural Netw 10:83–98, 1997), referred to as Adachi’s Neural Network (AdNN), which though dynamic, is not chaotic. We demonstrate that by decreasing the multiplicity of the eigenvalues of the AdNN’s control system, we can effectively drive the system into chaos. We prove this result here by eigenvalue computations and the evaluation of the Lyapunov exponent. With this premise, we then show that such a Modified AdNN (M-AdNN) has the desirable property that it recognizes various input patterns. The way that this PR is achieved is by the system essentially sympathetically “resonating” with a finite periodicity whenever these samples (or their reasonable resemblances) are presented. In this paper, we analyze the M-AdNN for its periodicity, stability and the length of the transient phase of the retrieval process. The M-AdNN has been tested for Adachi’s dataset and for a real-life PR problem involving numerals. We believe that this research also opens a host of new research avenues.

Journal ArticleDOI
TL;DR: A chain of mathematical morphology operations over the longest common subsequence between two strings is developed leading to the detection of the most frequent video transitions, namely, cut, fade, and wipe.
Abstract: The detection of shot boundaries in videos captures the structure of the image sequences by the identification of transitional effects. This task is important in the video indexing and retrieval domain. The video slice or visual rhythm is a single two-dimensional image sampling that has been used to detect several types of video events, including transitions. We use the longest common subsequence (LCS) between two strings to transform the video slice into one-dimensional signals obtaining a highly simplified representation of the video content. We also developed a chain of mathematical morphology operations over these signals leading to the detection of the most frequent video transitions, namely, cut, fade, and wipe. The algorithms are tested with success with various genres of videos.
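The standard dynamic-programming computation of the longest common subsequence (LCS) length between two strings, which is the primitive applied to columns of the video slice; the transformation into 1D signals and the chain of mathematical morphology operations are not shown, and the example strings are toy data.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (O(len(a)*len(b)) DP)."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

# Similar columns of the slice give a high LCS; a transition produces a sharp drop.
print(lcs_length("aabbccdd", "aabxccdd"))   # 7
print(lcs_length("aabbccdd", "zzzzyyyy"))   # 0
```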

Journal ArticleDOI
TL;DR: Fractal scale wavelet analysis is applied to describe and automatically recognize gait, and by introducing the Mallat algorithm of wavelet, it reduces the computation complexity.
Abstract: Gait is an identifying biometric feature. Video-based gait recognition has now become a new challenging topic in computer vision. In this paper, fractal scale wavelet analysis is applied to describe and automatically recognize gait. Fractal scale, which is based on wavelet analysis, represents the self-similarity of signals and improves the flexibility of wavelet moments. It is translation, scale and rotation invariant, and is robust to noise and partial occlusion. Moreover, by introducing the Mallat algorithm of wavelets, it reduces the computational complexity. Experiments on three databases show that fractal scale is simple to compute and is an efficient descriptor for gait recognition.

Journal ArticleDOI
Shumeet Baluja1
TL;DR: A novel system, based on combining the outputs of hundreds of classifiers trained with AdaBoost, to determine the upright orientation of an image, which surpasses similar methods based on Support Vector Machines, in terms of both accuracy and feasibility of deployment.
Abstract: With the proliferation of digital cameras and self-publishing of photos, automatic detection of image orientation has become an important part of photo-management systems. In this paper, we present a novel system, based on combining the outputs of hundreds of classifiers trained with AdaBoost, to determine the upright orientation of an image. We thoroughly test our system on photos gathered from professional and amateur photo collections that have been taken with a variety of cameras (digital, film, camera phones). The test images include photos that are in color and black and white, realistic and abstract, and outdoor and indoor. As this system is intended for mass consumer deployment, efficiency in use and accessibility is paramount. Results show that the presented method surpasses similar methods based on Support Vector Machines, in terms of both accuracy and feasibility of deployment.

Journal ArticleDOI
TL;DR: The finding is that a pairwise selection may improve over traditional procedures, and some artificial and real-world examples are presented to support this claim; it is also discovered that the set of problems for which the pairwise selection may be effective is small.
Abstract: Feature selection methods are often used to determine a small set of informative features that guarantee good classification results. Such procedures usually consist of two components: a separability criterion and a selection strategy. The most basic choices for the latter are individual ranking, forward search and backward search. Many intermediate methods such as floating search are also available. The forward as well as backward selection may cause lossy evaluation of the criterion and/or overtraining of the final classifier in case of high-dimensional spaces and small sample size problems. Backward selection may also become computationally prohibitive. Individual ranking, on the other hand, suffers as it neglects dependencies between features. A new strategy based on a pairwise evaluation has recently been proposed by Bo and Jonassen (Genome Biol 3, 2002) and Pękalska et al. (International Conference on Computer Recognition Systems, Poland, pp 271–278, 2005). Since it considers interactions between features, but always restricted to two-dimensional spaces, it may circumvent the small sample size problem. In this paper, we evaluate this idea in a more general framework for the selection of features as well as prototypes. Our finding is that such a pairwise selection may improve over traditional procedures and we present some artificial and real-world examples to support this claim. Additionally, we have also discovered that the set of problems for which the pairwise selection may be effective is small.
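An illustrative sketch of a pairwise selection strategy: score every feature pair with a cross-validated classifier and keep the features occurring in the best pairs. The classifier, the dataset, the scoring and the rule for merging pairs into a final subset are choices made here for illustration, not the paper's exact procedure.

```python
from itertools import combinations
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

X, y = load_wine(return_X_y=True)

# Evaluate every two-dimensional feature subspace with a simple classifier.
pair_scores = []
for i, j in combinations(range(X.shape[1]), 2):
    score = cross_val_score(LinearDiscriminantAnalysis(), X[:, [i, j]], y, cv=3).mean()
    pair_scores.append((score, (i, j)))

pair_scores.sort(reverse=True)
selected = sorted({f for _, pair in pair_scores[:5] for f in pair})
print("best pairs:", [p for _, p in pair_scores[:3]])
print("selected features:", selected)
```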

Journal ArticleDOI
TL;DR: It is argued that in order to understand which features are used by humans to group textures, one must start by computing thousands of features of diverse nature, and select from those features those that allow the reproduction of perceptual groups or perceptual ranking created by humans.
Abstract: We argue that in order to understand which features are used by humans to group textures, one must start by computing thousands of features of diverse nature, and select from those features those that allow the reproduction of perceptual groups or perceptual ranking created by humans. We use the Trace transform to produce such features here. We compare these features with those produced from the co-occurrence matrix and its variations. We show that when one is not interested in reproducing human behaviour, the elements of the co-occurrence matrix used as features perform best in terms of texture classification accuracy. However, these features cannot be "trained" or "selected" to imitate human ranking, while the features produced from the Trace transform can. We attribute this to the diverse nature of the features computed from the Trace transform.

Journal ArticleDOI
TL;DR: It is shown how performance can be improved by exploiting the ability of superscalar processors to issue multiple instructions per cycle and by using the memory hierarchy adequately; regular codes can run faster than more complex irregular codes on standard data sets.
Abstract: Modern computers provide excellent opportunities for performing fast computations. They are equipped with powerful microprocessors and large memories. However, programs are not necessarily able to exploit those computer resources effectively. In this paper, we present the way in which we have implemented a nearest neighbor classification. We show how performance can be improved by exploiting the ability of superscalar processors to issue multiple instructions per cycle and by using the memory hierarchy adequately. This is accomplished by the use of floating-point arithmetic which usually outperforms integer arithmetic, and block (tiled) algorithms which exploit the data locality of programs allowing for an efficient use of the data stored in the cache memory. Our results are validated with both an analytical model and empirical results. We show that regular codes could be performed faster than more complex irregular codes using standard data sets.
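A rough NumPy stand-in for the blocked (tiled) organisation described: distances are computed one tile of test points against one tile of training points at a time, so that a tile is reused while still resident in cache. Tile size and data are illustrative; the paper's low-level loop structure and instruction-level analysis are not reproduced here.

```python
import numpy as np

def blocked_1nn(train, labels, test, tile=256):
    n_test = len(test)
    best_d = np.full(n_test, np.inf)
    best_i = np.zeros(n_test, dtype=int)
    for t0 in range(0, n_test, tile):                  # tile over test points
        te = test[t0:t0 + tile]
        for r0 in range(0, len(train), tile):          # tile over training points
            tr = train[r0:r0 + tile]
            d = ((te[:, None, :] - tr[None, :, :]) ** 2).sum(-1)
            local = d.argmin(axis=1)
            dist = d[np.arange(len(te)), local]
            better = dist < best_d[t0:t0 + len(te)]
            best_d[t0:t0 + len(te)][better] = dist[better]
            best_i[t0:t0 + len(te)][better] = local[better] + r0
    return labels[best_i]

rng = np.random.default_rng(4)
train = rng.normal(size=(2000, 16)).astype(np.float32)   # floating-point arithmetic
labels = rng.integers(0, 3, size=2000)
test = rng.normal(size=(500, 16)).astype(np.float32)
print(blocked_1nn(train, labels, test)[:10])
```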

Journal ArticleDOI
TL;DR: A battery of computer-based hand-drawing tests is developed and shown to be effective in distinguishing between stroke subjects with and without neglect, providing a novel diagnostic capability which results in increased test sensitivity, a more objective assessment and a reduction in overall evaluation time.
Abstract: Visuo-spatial neglect (VSN) is a post-stroke condition in which a patient fails to respond to stimuli on one side of the visual field. Using an established pencil-and-paper-based method for the assessment of VSN (the Rivermead Behavioural Inattention Test) as a reference, a battery of computer-based hand-drawing tests is developed and shown to be effective in distinguishing between stroke subjects with and without neglect. The novel approach adopts measurements both of the outcome and the process by which the drawing tasks are executed. This approach provides a novel diagnostic capability which results in increased test sensitivity, a more objective assessment and a reduction in overall evaluation time. The paper describes the development of a binary assessment system using the computer-based acquisition and analysis of task data alongside feature selection techniques to maximise performance.

Journal ArticleDOI
TL;DR: This paper presents a robust approach to extracting content from instructional videos for handwritten recognition, indexing and retrieval, and other e-learning applications by combining top-hat morphological processing with a gradient-based adaptive thresholding technique to retrieve content pixels from the board regions.
Abstract: This paper presents a robust approach to extracting content from instructional videos for handwritten recognition, indexing and retrieval, and other e-learning applications. For the instructional videos of chalkboard presentations, retrieving the handwritten content (e.g., characters, drawings, figures) on boards is the first and prerequisite step towards further exploration of instructional video content. However, content extraction in instructional videos is still challenging due to video noise, non-uniformity of the color in board regions, light condition changes in a video session, camera movements, and unavoidable occlusions by instructors. To solve this problem, we first segment video frames into multiple regions and estimate the parameters of the board regions based on statistical analysis of the pixels in dominant regions. Then we accurately separate the board regions from irrelevant regions using a probabilistic classifier. Finally, we combine top-hat morphological processing with a gradient-based adaptive thresholding technique to retrieve content pixels from the board regions. Evaluation of the content extraction results on four full-length instructional videos shows the high performance of the proposed method. The extraction of content text facilitates the research on full exploitation of instructional videos, such as content enhancement, indexing, and retrieval.
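A hedged sketch of the final content-extraction step using OpenCV: a white top-hat emphasises thin bright chalk strokes on the darker board, followed by an adaptive threshold. Note the paper uses a gradient-based adaptive threshold; OpenCV's mean-based variant is used here only as a stand-in, and the kernel and block sizes are illustrative.

```python
import cv2
import numpy as np

def extract_board_content(board_gray):
    """board_gray: uint8 grayscale image already cropped to the board region."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
    tophat = cv2.morphologyEx(board_gray, cv2.MORPH_TOPHAT, kernel)   # bright detail
    content = cv2.adaptiveThreshold(tophat, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                    cv2.THRESH_BINARY, 31, -5)        # local threshold
    return content                            # binary mask of candidate content pixels

board = (np.random.rand(120, 160) * 40).astype(np.uint8)     # stand-in board region
cv2.line(board, (10, 20), (150, 20), 230, 1)                 # fake chalk stroke
mask = extract_board_content(board)
print("content pixels:", int(mask.sum() / 255))
```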

Journal ArticleDOI
TL;DR: This work presents an algorithm that is based on applying eigendecomposition to a quadtree representation of the image dataset used to describe the appearance of an object, which allows decisions concerning the pose of an object to be based on only those portions of the image in which the algorithm has determined that the object is not occluded.
Abstract: Eigendecomposition-based techniques are popular for a number of computer vision problems, e.g., object and pose estimation, because they are purely appearance based and they require few on-line computations. Unfortunately, they also typically require an unobstructed view of the object whose pose is being detected. The presence of occlusion and background clutter precludes the use of the normalizations that are typically applied and significantly alters the appearance of the object under detection. This work presents an algorithm that is based on applying eigendecomposition to a quadtree representation of the image dataset used to describe the appearance of an object. This allows decisions concerning the pose of an object to be based on only those portions of the image in which the algorithm has determined that the object is not occluded. The accuracy and computational efficiency of the proposed approach are evaluated on 16 different objects with up to 50% of the object being occluded and on images of ships in a dockyard.

Journal ArticleDOI
TL;DR: This paper shows how to optimize dictionary-based syntactic pattern recognition of strings by incorporating breadth-first search schemes on the underlying graph structure, and demonstrates improvements of up to 21% in the number of operations needed, while at the same time maintaining the same accuracy.
Abstract: Dictionary-based syntactic pattern recognition of strings attempts to recognize a transmitted string X*, by processing its noisy version, Y, without sequentially comparing Y with every element X in the finite (but possibly large) dictionary, H. The best estimate X+ of X* is defined as that element of H which minimizes the generalized Levenshtein distance (GLD) D(X, Y) between X and Y, for all X ∈ H. The non-sequential PR computation of X+ involves a compact trie-based representation of H. In this paper, we show how we can optimize this computation by incorporating breadth-first search schemes on the underlying graph structure. This heuristic emerges from the trie-based dynamic programming recursive equations, which can be effectively implemented using a new data structure called the linked list of prefixes that can be built separately or "on top of" the trie representation of H. The new scheme does not restrict the number of errors in Y to be merely a small constant, as is done in most of the available methods. The main contribution is that our new approach can be used for generalized GLDs and not merely for 0/1 costs. It is also applicable when all possible correct candidates need to be known, and not just the best match. These constitute the cases when the "cutoffs" cannot be used in the DFS trie-based technique (Shang and Merrett in IEEE Trans Knowl Data Eng 8(4):540–547, 1996). The new technique is compared with the DFS trie-based technique (Risvik in US Patent 6377945 B1, 23 April 2002; Shang and Merrett in IEEE Trans Knowl Data Eng 8(4):540–547, 1996) using three large and small benchmark dictionaries with different errors. In each case, we demonstrate marked improvements of up to 21% in the number of operations needed, while at the same time maintaining the same accuracy. Additionally, some further improvements can be obtained by introducing knowledge of the maximum number or percentage of errors in Y.
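The plain dynamic program for a generalised Levenshtein distance with arbitrary edit costs, i.e. the quantity that is minimised over the dictionary H. The trie, the linked list of prefixes and the breadth-first pruning that make the dictionary search fast are not reproduced here, and the cost functions below are illustrative.

```python
def gld(x, y, sub_cost, ins_cost, del_cost):
    """Generalised Levenshtein distance D(x, y) with user-supplied edit costs."""
    m, n = len(x), len(y)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + del_cost(x[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + ins_cost(y[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j - 1] + sub_cost(x[i - 1], y[j - 1]),
                          d[i - 1][j] + del_cost(x[i - 1]),
                          d[i][j - 1] + ins_cost(y[j - 1]))
    return d[m][n]

# Example costs: cheap substitutions between a few adjacent keyboard letters.
cheap = {("q", "w"), ("w", "e"), ("e", "r"), ("o", "p")}
sub = lambda a, b: 0.0 if a == b else (0.3 if (a, b) in cheap or (b, a) in cheap else 1.0)
print(gld("weather", "qeather", sub, lambda c: 1.0, lambda c: 1.0))   # 0.3
```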

Journal ArticleDOI
TL;DR: The proposed algorithm was able to detect salient humans regardless of the amount of movement, and also distinguish salient humans from non-salient humans, and can be easily applied to human robot interfaces for human-like vision systems.
Abstract: In this paper, we propose a salient human detection method that uses pre-attentive features and a support vector machine (SVM) for robot vision. From three pre-attentive features (color, luminance and motion), we extracted three feature maps and combined them as a salience map. By using these features, we estimated a given object’s location without pre-assumptions or semi-automatic interaction. We were able to choose the most salient object even if multiple objects existed. We also used the SVM to decide whether a given object was human (among the candidate object regions). For the SVM, we used a new feature extraction method to reduce the feature dimensions and reflect the variations of local features to classifiers by using an edged-mosaic image. The main advantage of the proposed method is that our algorithm was able to detect salient humans regardless of the amount of movement, and also distinguish salient humans from non-salient humans. The proposed algorithm can be easily applied to human robot interfaces for human-like vision systems.

Journal ArticleDOI
TL;DR: In the analysis of ductile cast iron specimen pictures, the material specimen pictures are binarized in order to provide a quantitative evaluation of the graphite nodule shape; this binarization can be formulated as an optimal segmentation problem.
Abstract: This work aims to characterize different objects on a scene by means of some of their morphological properties. The leading application consists in the analysis of ductile cast iron specimen pictures, in order to provide a quantitative evaluation of the graphite nodules shape; to this aim the material specimen pictures are binarized. Such a binarization process can be formulated as an optimal segmentation problem. The search for the optimal solution is solved efficiently by training a neural network on a suitable set of binary templates. A robust procedure is obtained, amenable for parallel or hardware implementation, so that real-time applications can be effectively dealt with. The method was developed as the core of an expert system aimed at the unsupervised analysis of ductile cast iron mechanical properties that are influenced by the microstructure and the peculiar morphology of graphite elements.

Journal ArticleDOI
TL;DR: An innovative architecture is proposed to segment a news video into the so-called “stories” by using both the included video and audio information, with a novel anchor shot detection method based on features extracted from the audio track.
Abstract: In this paper, we propose an innovative architecture to segment a news video into the so-called "stories" by both using the included video and audio information. Segmentation of news into stories is one of the key issues for achieving efficient treatment of news-based digital libraries. While the relevance of this research problem is widely recognized in the scientific community, we are in presence of a few established solutions in the field. In our approach, the segmentation is performed in two steps: first, shots are classified by combining three different anchor shot detection algorithms using video information only. Then, the shot classification is improved by using a novel anchor shot detection method based on features extracted from the audio track. Tests on a large database confirm that the proposed system outperforms each single video-based method as well as their combination.

Journal ArticleDOI
TL;DR: This paper robustly estimates registration parameters for the high dynamic range global mosaic, simultaneously estimating scene radiances and distortion parameters in a single framework using a computationally optimized Levenberg–Marquardt approach.
Abstract: In this paper, we present a global approach for constructing high dynamic range mosaics from multiple images with large exposure differences. To minimize registration errors caused by intensity mismatches in the image intensity space with low dynamic range, we propose the use of a scene radiance space with high dynamic range. By relating image intensities to scene radiances with a convenient distortion model, we robustly estimate registration parameters for the high dynamic range global mosaic, simultaneously estimating scene radiances and distortion parameters in a single framework using a computationally optimized Levenberg–Marquardt approach.
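A heavily simplified illustration of the joint-estimation idea: two aligned observations of the same scene patch differ by an unknown exposure gain, and the gain and the underlying radiances are estimated together with SciPy's Levenberg–Marquardt solver. The real system also estimates geometric registration and a full radiometric distortion model, which are omitted here; all quantities below are synthetic.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(5)
radiance_true = rng.uniform(0.1, 1.0, size=20)          # unknown scene radiances
gain_true = 3.5                                         # unknown exposure ratio
obs1 = radiance_true + 0.01 * rng.normal(size=20)       # short exposure (gain fixed to 1)
obs2 = gain_true * radiance_true + 0.01 * rng.normal(size=20)   # long exposure

def residuals(params):
    gain, radiance = params[0], params[1:]
    return np.concatenate([radiance - obs1, gain * radiance - obs2])

x0 = np.concatenate([[1.0], obs1])                      # initial guess
fit = least_squares(residuals, x0, method="lm")         # Levenberg-Marquardt
print("estimated gain:", round(fit.x[0], 2))            # close to 3.5
```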


Journal ArticleDOI
TL;DR: Experimental results demonstrate that the combined NNSRM-LDA method can achieve a better performance than NN by selecting different distances and a comparable performance with SVM but costing less computational time.
Abstract: NNSRM is an implementation of the structural risk minimization (SRM) principle using the nearest neighbor (NN) rule, and linear discriminant analysis (LDA) is a dimension-reducing method, which is usually used in classifications. This paper combines the two methods for face recognition. We first project the face images into a PCA subspace, then project the results into a much lower-dimensional LDA subspace, and then use an NNSRM classifier to recognize them in the LDA subspace. Experimental results demonstrate that the combined method can achieve a better performance than NN by selecting different distances and a comparable performance with SVM but costing less computational time.