Author

Guojun Gan

Bio: Guojun Gan is an academic researcher at the University of Connecticut. He has contributed to research on cluster analysis and the canopy clustering algorithm, has an h-index of 21, and has co-authored 75 publications receiving 2,995 citations. His previous affiliations include the University of Toronto and York University.


Papers
Book
12 Jul 2007
TL;DR: A comprehensive textbook on data clustering, covering data types, scale conversion, data standardization and transformation, data visualization, similarity and dissimilarity measures, and the major families of clustering algorithms, from hierarchical to subspace methods.
Abstract: Preface Part I. Clustering, Data and Similarity Measures: 1. Data clustering 2. Data types 3. Scale conversion 4. Data standardization and transformation 5. Data visualization 6. Similarity and dissimilarity measures Part II. Clustering Algorithms: 7. Hierarchical clustering techniques 8. Fuzzy clustering algorithms 9. Center-based clustering algorithms 10. Search-based clustering algorithms 11. Graph-based clustering algorithms 12. Grid-based clustering algorithms 13. Density-based clustering algorithms 14. Model-based clustering algorithms 15. Subspace clustering 16. Miscellaneous algorithms 17. Evaluation of clustering algorithms Part III. Applications of Clustering: 18. Clustering gene expression data Part IV. MATLAB and C++ for Clustering: 19. Data clustering in MATLAB 20. Clustering in C/C++ A. Some clustering algorithms B. The kd-tree data structure C. MATLAB codes D. C++ codes Subject index Author index.
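
As a minimal illustration of the Part I and Part II material, the Python sketch below standardizes a toy data set, computes two of the dissimilarity measures discussed in Chapter 6, and applies the hierarchical clustering of Chapter 7 via scipy. The data and parameter choices are illustrative, not taken from the book.

```python
# A minimal sketch of data standardization, dissimilarity measures, and
# hierarchical clustering, using numpy and scipy (not the book's own code).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

X = np.array([[1.0, 2.0], [1.2, 1.9], [8.0, 8.5], [7.8, 8.2]])

# Standardize each attribute to zero mean and unit variance (cf. Chapter 4).
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Pairwise dissimilarities (cf. Chapter 6): Euclidean and Manhattan.
d_euclid = pdist(Z, metric="euclidean")
d_manhattan = pdist(Z, metric="cityblock")

# Agglomerative clustering with single linkage (cf. Chapter 7).
tree = linkage(d_euclid, method="single")
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)  # e.g. [1 1 2 2]: the two tight groups are recovered
```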

1,367 citations

Journal ArticleDOI
TL;DR: This paper proposes a k-means-type algorithm that provides data clustering and outlier detection simultaneously by incorporating an additional cluster into the objective function. An iterative procedure is designed to optimize the objective function, and the convergence of this procedure is established.
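
A minimal numpy sketch of the idea, not the authors' exact formulation: each point is assigned to its nearest center unless its cost is too high, in which case it joins the extra outlier cluster and is excluded from the center updates. The threshold rule used here (gamma times the mean assignment cost) is an illustrative assumption, not the paper's objective function.

```python
# Sketch of a k-means-type algorithm with an extra "outlier cluster":
# far-away points are set aside instead of distorting the centers.
import numpy as np

def kmeans_with_outliers(X, k, gamma=3.0, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        nearest = d2.argmin(axis=1)
        cost = d2[np.arange(len(X)), nearest]
        # The (k+1)-th cluster: points whose cost exceeds the threshold.
        outlier = cost > gamma * cost.mean()
        # Update each center from its non-outlier members only.
        for j in range(k):
            members = X[(nearest == j) & ~outlier]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return nearest, outlier, centers
```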

167 citations

Proceedings ArticleDOI
14 Jun 2020
TL;DR: Weight Aligning (WA) is proposed to correct the biased weights in the last fully connected (FC) layer after the normal training process; it does not require any extra parameters or a validation set in advance.
Abstract: Deep neural networks (DNNs) have been applied in class incremental learning, which aims to solve common real-world problems of learning new classes continually. One drawback of standard DNNs is that they are prone to catastrophic forgetting. Knowledge distillation (KD) is a commonly used technique to alleviate this problem. In this paper, we demonstrate that KD can indeed help the model to output more discriminative results within old classes. However, it cannot alleviate the model's tendency to classify objects into new classes, which causes the positive effect of KD to be hidden and limited. We observe that an important factor causing catastrophic forgetting is that the weights in the last fully connected (FC) layer are highly biased in class incremental learning. Motivated by these observations, we propose a simple and effective solution to address catastrophic forgetting. First, we utilize KD to maintain the discrimination within old classes. Then, to further maintain fairness between old classes and new classes, we propose Weight Aligning (WA), which corrects the biased weights in the FC layer after the normal training process. Unlike previous work, WA does not require any extra parameters or a validation set in advance, as it utilizes the information provided by the biased weights themselves. The proposed method is evaluated on ImageNet-1000, ImageNet-100, and CIFAR-100 under various settings. Experimental results show that it effectively alleviates catastrophic forgetting and significantly outperforms state-of-the-art methods.
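
A minimal numpy sketch of the weight-aligning step described above: the FC weight vectors of the new classes are rescaled so that their average norm matches that of the old classes. The function name and array layout are illustrative assumptions; the paper applies this correction to the last FC layer of a trained DNN.

```python
# Sketch of Weight Aligning (WA): scale new-class weight rows by the ratio
# of mean old-class norm to mean new-class norm.
import numpy as np

def weight_align(fc_weights, n_old):
    """fc_weights: (num_classes, feat_dim); each row is one class's weight
    vector. The first n_old rows belong to old classes, the rest to new."""
    norms = np.linalg.norm(fc_weights, axis=1)
    ratio = norms[:n_old].mean() / norms[n_old:].mean()
    aligned = fc_weights.copy()
    aligned[n_old:] *= ratio  # shrink the (typically larger) new-class weights
    return aligned
```

Because the correction uses only the norms of the biased weights themselves, no extra parameters or validation data are needed, consistent with the abstract.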

137 citations

Posted Content
TL;DR: Weight Aligning (WA) is proposed to correct the biased weights in the FC layer after the normal training process; this bias is an important cause of catastrophic forgetting in class incremental learning, and the method significantly outperforms state-of-the-art approaches.

136 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This textbook covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, and combining models in the context of machine learning.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: A thorough exposition of community structure, or clustering, is attempted, from the definition of the main elements of the problem, to the presentation of most methods developed, with a special focus on techniques designed by statistical physicists.
Abstract: The modern science of networks has brought significant advances to our understanding of complex systems. One of the most relevant features of graphs representing real systems is community structure, or clustering, i.e. the organization of vertices in clusters, with many edges joining vertices of the same cluster and comparatively few edges joining vertices of different clusters. Such clusters, or communities, can be considered fairly independent compartments of a graph, playing a role similar to that of, e.g., the tissues or the organs in the human body. Detecting communities is of great importance in sociology, biology and computer science, disciplines where systems are often represented as graphs. This problem is very hard and not yet satisfactorily solved, despite the huge effort of a large interdisciplinary community of scientists working on it over the past few years. We attempt a thorough exposition of the topic, from the definition of the main elements of the problem to the presentation of most methods developed, with a special focus on techniques designed by statistical physicists; from the discussion of crucial issues such as the significance of clustering and how methods should be tested and compared against each other, to the description of applications to real networks.
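
As a small illustration of this notion of community structure (assuming the networkx library is available), the sketch below detects communities in Zachary's karate-club network by greedy modularity maximization, one of the method families surveyed in the paper, and reports the resulting modularity score.

```python
# Sketch: find groups with many internal and few external edges by
# modularity maximization, using networkx's built-in routines.
import networkx as nx
from networkx.algorithms import community

G = nx.karate_club_graph()  # a classic small social network
parts = community.greedy_modularity_communities(G)
Q = community.modularity(G, parts)
print(len(parts), "communities, modularity =", round(Q, 3))
```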

9,057 citations

Journal ArticleDOI
TL;DR: A thorough exposition of the clustering problem, from the definition of its main elements to the presentation of most methods developed, with a special focus on techniques designed by statistical physicists; it also discusses crucial issues such as the significance of clustering and how methods should be tested and compared against each other, and describes applications to real networks.

8,432 citations

Journal ArticleDOI

6,278 citations