Journal Article

Supervised chromosome clustering and image classification

01 Apr 2011-Future Generation Computer Systems (North-Holland)-Vol. 27, Iss: 4, pp 372-376

TL;DR: Distance and similarity measures play a central role in this paper: the greater the dissimilarity measure or distance between genes, the more dissimilar the two chromosomes.

Abstract: In this paper we propose handwritten signature classification using a supervised chromosome clustering technique. Because human handwriting varies over time, a set of one hundred sample handwritten signatures is first collected from the user in the form of same-sized grayscale images. These grayscale handwritten signature images are used as the training set in our classification algorithm. The proposed algorithm then decides whether a future incoming handwritten signature of an individual can be a member of the training set or not. Distance and similarity play an important role in this paper: the greater the dissimilarity measure or distance between genes, the more dissimilar the two chromosomes.
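The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes. It assumes that each signature image is flattened into a "chromosome" whose "genes" are pixel intensities, that dissimilarity is the Euclidean distance, and that membership is decided by a fixed threshold on the distance to the nearest training sample. The names `chromosome_distance`, `is_member`, and `threshold` are hypothetical and do not come from the paper.

```python
import math


def chromosome_distance(a, b):
    """Euclidean distance between two equal-length gene sequences (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("chromosomes must have the same number of genes")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_member(candidate, training_set, threshold):
    """Accept the candidate if its nearest training chromosome is within the threshold."""
    nearest = min(chromosome_distance(candidate, t) for t in training_set)
    return nearest <= threshold


# Toy 4-pixel "images"; real signatures would be full-size grayscale images.
training = [[10, 200, 30, 40], [12, 198, 33, 41], [9, 202, 29, 38]]
genuine_like = [11, 199, 31, 40]
different = [200, 10, 220, 5]

print(is_member(genuine_like, training, threshold=10.0))  # close to the training samples
print(is_member(different, training, threshold=10.0))     # far from every training sample
```

The threshold is the main design choice in such a scheme: a larger value tolerates more natural variation in handwriting but also accepts more dissimilar signatures.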



Citations
Journal Article
01 Jul 2014
TL;DR: The proposed GOA first combines multiple well-known FS techniques to yield possible optimal feature subsets across different traffic datasets; it then applies the proposed adaptive threshold, which is based on entropy, to extract the stable features.
Abstract: There is significant interest in the network management community in identifying the optimal and stable features for network traffic data. In practice, feature selection techniques are used as a pre-processing step to eliminate meaningless features, and also as a tool to reveal the set of optimal features. Unfortunately, such techniques are often sensitive to small variations in the traffic data. Thus, obtaining a stable feature set is crucial in enhancing the confidence of network operators. This paper proposes a robust approach, called the Global Optimization Approach (GOA), to identify both optimal and stable features, relying on a multi-criterion fusion-based feature selection technique and an information-theoretic method. The proposed GOA first combines multiple well-known FS techniques to yield possible optimal feature subsets across different traffic datasets; it then applies the proposed adaptive threshold, which is based on entropy, to extract the stable features. A new goodness measure is proposed within a Random Forest framework to estimate the final optimum feature subset. Experimental studies on network traffic data in spatial and temporal domains show that the proposed GOA approach outperforms the commonly used feature selection techniques for the traffic classification task.

43 citations

01 Jan 2015
TL;DR: In recent years, knowing what information is passing through the networks is rapidly becoming more and more complex due to the ever-growing list of applications shaping today's Internet traffic.
Abstract: In recent years, knowing what information is passing through the networks is rapidly becoming more and more complex due to the ever-growing list of applications shaping today's Internet traffic. Consequently, traffic monitoring and analysis have become cr

5 citations

28 Feb 2016
TL;DR: The experimental results demonstrate that the non-fuzzy algorithms have higher accuracies than the fuzzy algorithms, especially when dealing with large data sizes and different types of images.
Abstract: This paper classifies different digital images using two types of clustering algorithms. The first type is the fuzzy clustering methods, while the second type is the non-fuzzy methods. For the performance comparison, we apply four clustering algorithms, two from the fuzzy type and two from the non-fuzzy (partitional) clustering type. The automatic partitional clustering algorithm and the partitional k-means algorithm are chosen as the two examples of the non-fuzzy clustering techniques, while the automatic fuzzy algorithm and the fuzzy C-means clustering algorithm are taken as the examples of the fuzzy clustering techniques. The four algorithms are evaluated by applying them to three different types of image databases, based on the comparison criteria of dataset size, cluster number, execution time, classification accuracy, and k-cross validation. The experimental results demonstrate that the non-fuzzy algorithms have higher accuracies than the fuzzy algorithms, especially when dealing with large data sizes and different types of images. Three types of image databases (human face images, handwritten digits, and natural scenes) are used for the performance evaluation.

References
Journal Article
TL;DR: It is proved the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density.
Abstract: A general non-parametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure: the mean shift. For discrete data, we prove the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density. The relation of the mean shift procedure to the Nadaraya-Watson estimator from kernel regression and the robust M-estimators of location is also established. Algorithms for two low-level vision tasks, discontinuity-preserving smoothing and image segmentation, are described as applications. In these algorithms, the only user-set parameter is the resolution of the analysis, and either gray-level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.
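To make the recursive mean shift procedure concrete, here is a minimal one-dimensional sketch. It assumes a flat (uniform) kernel, which is a simplification rather than the formulation used in the referenced paper, and the function name `mean_shift_mode` and the parameters `bandwidth`, `tol`, and `max_iter` are hypothetical. The estimate is repeatedly moved to the mean of the points inside its window until it stops changing, at which point it sits near a mode of the data.

```python
def mean_shift_mode(points, start, bandwidth, tol=1e-6, max_iter=100):
    """Shift an estimate toward a nearby mode using a flat kernel (1-D sketch)."""
    x = start
    for _ in range(max_iter):
        # Points that fall inside the current window around x.
        window = [p for p in points if abs(p - x) <= bandwidth]
        if not window:
            return x  # no neighbours: nothing to shift toward
        new_x = sum(window) / len(window)  # mean of the window
        if abs(new_x - x) < tol:
            return new_x  # converged to a stationary point
        x = new_x
    return x


# Two well-separated groups of samples, around 1 and around 5.
data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.9, 5.1]

print(mean_shift_mode(data, start=0.5, bandwidth=1.0))  # settles in the group near 1
print(mean_shift_mode(data, start=4.0, bandwidth=1.0))  # settles in the group near 5
```

Here `bandwidth` plays the role of the resolution parameter mentioned in the abstract: a wider window merges nearby modes, while a narrower one separates them.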

11,014 citations


"Supervised chromosome clustering an..." refers background in this paper

  • ...Image segmentation is the decomposition of a graylevel or color image into homogeneous tiles [5]....


Book
01 Jan 1974
TL;DR: This fourth edition of the highly successful Cluster Analysis represents a thorough revision of the third edition and covers new and developing areas such as classification likelihood and neural networks for clustering.
Abstract: Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organising multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques are applicable in a wide range of areas such as medicine, psychology and market research. This fourth edition of the highly successful Cluster Analysis represents a thorough revision of the third edition and covers new and developing areas such as classification likelihood and neural networks for clustering. Real life examples are used throughout to demonstrate the application of the theory, and figures are used extensively to illustrate graphical techniques. The book is comprehensive yet relatively non-mathematical, focusing on the practical aspects of cluster analysis.

9,845 citations


"Supervised chromosome clustering an..." refers background in this paper

  • ...Everitt suggested [2] that if using a term such as cluster produces an answer of a value to the investigators, then it is all that is required....


Book
01 Jan 2007
TL;DR: The second edition of The Handbook of Brain Theory and Neural Networks contains 287 articles covering the whole spectrum of topics in brain theory and neural networks.
Abstract: From the Publisher: Dramatically updating and extending the first edition, published in 1995, the second edition of The Handbook of Brain Theory and Neural Networks presents the enormous progress made in recent years in the many subfields related to the two great questions: How does the brain work? and, How can we build intelligent machines? Once again, the heart of the book is a set of almost 300 articles covering the whole spectrum of topics in brain theory and neural networks. The first two parts of the book, prepared by Michael Arbib, are designed to help readers orient themselves in this wealth of material. Part I provides general background on brain modeling and on both biological and artificial neural networks. Part II consists of "Road Maps" to help readers steer through articles in part III on specific topics of interest. The articles in part III are written so as to be accessible to readers of diverse backgrounds. They are cross-referenced and provide lists of pointers to Road Maps, background material, and related reading. The second edition greatly increases the coverage of models of fundamental neurobiology, cognitive neuroscience, and neural network approaches to language. It contains 287 articles, compared to the 266 in the first edition. Articles on topics from the first edition have been updated by the original authors or written anew by new authors, and there are 106 articles on new topics.

3,442 citations

Journal Article
TL;DR: The problems of determining the number of clusters and the clustering method are solved simultaneously by choosing the best model, and the EM result provides a measure of uncertainty about the associated classification of each data point.
Abstract: We consider the problem of determining the structure of clustered data, without prior knowledge of the number of clusters or any other information about their composition. Data are represented by a mixture model in which each component corresponds to a different cluster. Models with varying geometric properties are obtained through Gaussian components with different parametrizations and cross-cluster constraints. Noise and outliers can be modelled by adding a Poisson process component. Partitions are determined by the expectation-maximization (EM) algorithm for maximum likelihood, with initial values from agglomerative hierarchical clustering. Models are compared using an approximation to the Bayes factor based on the Bayesian information criterion (BIC); unlike significance tests, this allows comparison of more than two models at the same time, and removes the restriction that the models compared be nested. The problems of determining the number of clusters and the clustering method are solved simultaneously by choosing the best model. Moreover, the EM result provides a measure of uncertainty about the associated classification of each data point. Examples are given, showing that this approach can give performance that is much better than standard procedures, which often fail to identify groups that are either overlapping or of varying sizes and shapes.

2,422 citations

Book
12 Jul 2007
TL;DR: Clustering, Data and Similarity Measures: 1. data clustering 2. data types 3. scale conversion 4. data standardization and transformation 5. data visualization 6. Similarity and dissimilarity measures 7. clustering Algorithms.
Abstract: Preface Part I. Clustering, Data and Similarity Measures: 1. Data clustering 2. Data types 3. Scale conversion 4. Data standardization and transformation 5. Data visualization 6. Similarity and dissimilarity measures Part II. Clustering Algorithms: 7. Hierarchical clustering techniques 8. Fuzzy clustering algorithms 9. Center based clustering algorithms 10. Search based clustering algorithms 11. Graph based clustering algorithms 12. Grid based clustering algorithms 13. Density based clustering algorithms 14. Model based clustering algorithms 15. Subspace clustering 16. Miscellaneous algorithms 17. Evaluation of clustering algorithms Part III. Applications of Clustering: 18. Clustering gene expression data Part IV. Matlab and C++ for Clustering: 19. Data clustering in Matlab 20. Clustering in C/C++ A. Some clustering algorithms B. The kd-tree data structure C. Matlab codes D. C++ codes Subject index Author index.

1,273 citations


"Supervised chromosome clustering an..." refers background in this paper

  • ...In data clustering, the classes are also to be defined [1]....


  • ...In cluster analysis [1], the terms cluster, group, and class have been used in an essentially intuitive manner without a uniform definition....
