Cluster analysis

About: Cluster analysis is a research topic. Over its lifetime, 146,546 publications have been published within this topic, receiving 2,962,017 citations. The topic is also known as: clustering, and cluster analysis in marketing.

Papers published on a yearly basis (chart; year axis from 1970 to 2024)

Papers

Proceedings Article

Network-aware behavior clustering of Internet end hosts

Kuai Xu¹, Feng Wang¹, Lin Gu²

Arizona State University¹, Hong Kong University of Science and Technology²

10 Apr 2011

TL;DR: A simple and efficient spectral clustering algorithm is applied to perform network-aware clustering of end hosts in the same prefixes into different behavior clusters. These clusters exhibit distinct traffic characteristics, which gives a clearer interpretation of the separated traffic than of the aggregated traffic of the prefixes.

Abstract: This paper explores the behavior similarity of Internet end hosts in the same network prefixes. We use bipartite graphs to model network traffic, and then construct one-mode projection graphs for capturing social-behavior similarity of end hosts. By applying a simple and efficient spectral clustering algorithm, we perform network-aware clustering of end hosts in the same prefixes into different behavior clusters. Based on information-theoretical measures, we find that the clusters exhibit distinct traffic characteristics, which provide improved interpretations of the separated traffic compared with the aggregated traffic of the prefixes. Finally, we demonstrate the applications of exploring behavior similarity in profiling network behaviors and detecting anomalous behaviors through synthetic traffic that combines Internet backbone traffic and packet traces from real scenarios of worm propagations and denial of service attacks.

57 citations

Journal Article

Comparison of different strategies of utilizing fuzzy clustering in structure identification

Kemal Kilic¹, O. Uncu², I. Burhan Turksen³

Sabancı University¹, Simon Fraser University², TOBB University of Economics and Technology³

01 Dec 2007-Information Sciences

TL;DR: This study discusses each of the algorithms in detail, offers a thorough comparative analysis, and compares their performance on a medical diagnosis classification problem, namely the Aachen Aphasia Test.

57 citations

Posted Content

Minimax Theory for High-dimensional Gaussian Mixtures with Sparse Mean Separation

Martin Azizyan¹, Aarti Singh¹, Larry Wasserman¹

Carnegie Mellon University¹

09 Jun 2013-arXiv: Machine Learning

TL;DR: This paper provides precise information-theoretic bounds on the clustering accuracy and sample complexity of learning a mixture of two isotropic Gaussians in high dimensions under small mean separation.

Abstract: While several papers have investigated computationally and statistically efficient methods for learning Gaussian mixtures, precise minimax bounds for their statistical performance as well as fundamental limits in high-dimensional settings are not well understood. In this paper, we provide precise information-theoretic bounds on the clustering accuracy and sample complexity of learning a mixture of two isotropic Gaussians in high dimensions under small mean separation. If there is a sparse subset of relevant dimensions that determine the mean separation, then the sample complexity only depends on the number of relevant dimensions and mean separation, and can be achieved by a simple computationally efficient procedure. Our results provide the first step of a theoretical basis for recent methods that combine feature selection and clustering.

57 citations
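
The idea of combining feature selection with clustering when only a few dimensions carry the mean separation can be illustrated with a small sketch. It is an assumed, simplified procedure and not necessarily the one analyzed in the paper: coordinates are ranked by sample variance (relevant coordinates of a two-component mixture have inflated marginal variance), and the points are then split by the sign of their projection onto the top principal direction of the selected coordinates. The function name, the synthetic data, and the deliberately large separation are illustrative choices.

```python
import numpy as np

def select_then_cluster(X, n_relevant):
    """Keep the n_relevant highest-variance coordinates, then split the
    points by the sign of their projection on the top principal direction."""
    variances = X.var(axis=0)
    selected = np.argsort(variances)[-n_relevant:]
    Z = X[:, selected] - X[:, selected].mean(axis=0)
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ vt[0]
    return np.sort(selected), (scores > 0).astype(int)

# Synthetic mixture of two isotropic Gaussians in d dimensions whose
# means differ only in the first s coordinates (a sparse separation).
rng = np.random.default_rng(0)
n, d, s = 400, 50, 3
true_labels = rng.integers(0, 2, size=n)
mean = np.zeros(d)
mean[:s] = 2.0
signs = 2 * true_labels - 1            # map labels {0, 1} to {-1, +1}
X = rng.normal(size=(n, d)) + signs[:, None] * mean

selected, labels = select_then_cluster(X, s)
agreement = (labels == true_labels).mean()
accuracy = max(agreement, 1 - agreement)  # cluster labels are only defined up to a swap
print("selected coordinates:", selected.tolist())
print("clustering accuracy:", accuracy)
```

The sketch assumes the number of relevant coordinates is known in advance. The separation used here is far larger than the small-separation regime the paper studies, so it only shows the mechanics of the two steps.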

Journal Article

A comparative dimensionality reduction study in telecom customer segmentation using deep learning and PCA

Maha Alkhayrat¹, Mohamad Aljnidi¹, Kadan Aljoumaa¹

Higher Institute for Applied Sciences and Technology¹

01 Feb 2020-Journal of Big Data

TL;DR: This paper explores dimensionality reduction on a real telecom dataset and evaluates customer clustering in the reduced and latent spaces against clustering in the original space, with the aim of achieving better-quality clustering results.

Abstract: Telecom companies log customers' actions, which generates a huge amount of data that can yield important findings about customers' behavior and needs. The main characteristics of such data are the large number of features and the high sparsity, which impose challenges on the analytics steps. This paper aims to explore dimensionality reduction on a real telecom dataset and evaluate customers' clustering in reduced and latent space, compared to original space, in order to achieve better quality clustering results. The original dataset contains 220 features belonging to 100,000 customers. Dimensionality reduction is an important data preprocessing step in the data mining process, especially in the presence of the curse of dimensionality. In particular, the aim of data reduction techniques is to filter out irrelevant features and noisy data samples. To reduce the high-dimensional data, we projected it down to a subspace using the well-known Principal Component Analysis (PCA) decomposition and a novel approach based on an Autoencoder Neural Network, performing in this way dimensionality reduction of the original data. K-Means clustering is then applied to both the original and the reduced data sets. Different internal measures were computed to evaluate clustering for different numbers of dimensions, and then we evaluated how the reduction method impacts the clustering task.

57 citations
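
The workflow in the abstract above (reduce dimensionality, run K-Means on both the original and the reduced data, compare an internal measure) can be sketched as below. This is a hedged illustration only: it covers the PCA branch and omits the autoencoder, it uses small synthetic data instead of the telecom dataset, and the function names, the farthest-point initialization, and the within-cluster ratio used as the internal measure are assumptions of the example.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project centered data onto its top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def assign(X, centers):
    """Index of the nearest center for every row of X."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=50):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(k - 1):
        dist = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers

def within_cluster_ratio(X, labels, centers):
    """Within-cluster sum of squares over total sum of squares (lower is tighter)."""
    within = ((X - centers[labels]) ** 2).sum()
    total = ((X - X.mean(axis=0)) ** 2).sum()
    return within / total

# Synthetic stand-in data: 3 groups in 20 dimensions, separated in 2 of them.
rng = np.random.default_rng(0)
true_labels = np.repeat([0, 1, 2], 100)
group_means = np.zeros((3, 20))
group_means[0, :2] = (20.0, 0.0)
group_means[1, :2] = (-20.0, 0.0)
group_means[2, :2] = (0.0, 20.0)
X = group_means[true_labels] + rng.normal(size=(300, 20))

labels_full, centers_full = kmeans(X, 3)
X_reduced = pca_reduce(X, 2)
labels_reduced, centers_reduced = kmeans(X_reduced, 3)

ratio_full = within_cluster_ratio(X, labels_full, centers_full)
ratio_reduced = within_cluster_ratio(X_reduced, labels_reduced, centers_reduced)
print("within/total, original space:", ratio_full)
print("within/total, PCA space:", ratio_reduced)
```

On data like this, where most dimensions carry only noise, the reduced space should give tighter clusters by this measure. Real customer data may behave differently, which is why the paper evaluates several internal measures across different numbers of dimensions.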

Proceedings Article

From Videos to Verbs: Mining Videos for Activities using a Cascade of Dynamical Systems

Pavan Turaga¹, Ashok Veeraraghavan¹, Rama Chellappa¹

University of Maryland, College Park¹

17 Jun 2007

TL;DR: This work builds a generative model for activities (in video) using a cascade of dynamical systems and shows that this model is able to capture and represent a diverse class of activities.

Abstract: Clustering video sequences in order to infer and extract activities from a single video stream is an extremely important problem and has significant potential in video indexing, surveillance, activity discovery and event recognition. Clustering a video sequence into activities requires one to simultaneously recognize activity boundaries (activity consistent subsequences) and cluster these activity subsequences. In order to do this, we build a generative model for activities (in video) using a cascade of dynamical systems and show that this model is able to capture and represent a diverse class of activities. We then derive algorithms to learn the model parameters from a video stream and also show how a single video sequence may be clustered into different clusters where each cluster represents an activity. We also propose a novel technique to build affine, view, rate invariance of the activity into the distance metric for clustering. Experiments show that the clusters found by the algorithm correspond to semantically meaningful activities.

57 citations

Network Information

Performance

Metrics

Papers: 171,579
Citations: 3,476,127

No. of papers in the topic in previous years
Year	Papers
2024	16
2023	7,685
2022	17,389
2021	9,145
2020	10,460
2019	11,543
