Proceedings ArticleDOI

A Multi-view Non-parametric Clustering Approach to Mobile Subscriber Segmentation

TL;DR: This paper proposes a novel method for performing multi-view clustering wherein multiple groupings are generated using a non-parametric clustering algorithm and are then combined and visualized using a cross-tab/sunburst based visualization scheme.
Abstract: Marketers use segmentation as an important tool to better understand and effectively target customers by adopting marketing strategies catering to the needs and characteristics of each segment. Traditionally, customer segmentation using clustering algorithms is performed using a single attribute set. However, there may be multiple meaningful natural customer groupings possible if independent subsets of attributes were considered while discovering the clusters. Multi-view clustering provides a meaningful way to combine groupings in different feature subsets by assigning customers to multiple segments, representing different perspectives of customer behavior. In this paper we propose a novel method for performing multi-view clustering wherein multiple groupings are generated using a non-parametric clustering algorithm and are then combined and visualized using a cross-tab/sunburst based visualization scheme. We also demonstrate the effectiveness of the proposed approach by applying it to a variety of real-world problems related to mobile subscriber segmentation.
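
The idea of clustering each attribute subset separately and then combining the groupings in a cross-tab can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the non-parametric algorithm, so the sketch swaps in a simple single-pass "leader" clustering, which does not fix the number of clusters in advance. The subscriber records, the attribute names, the split into a usage view and a spend view, and the `radius` values are all hypothetical. The sunburst visualization is not reproduced; only the cross-tab counts are computed.

```python
from collections import Counter


def leader_cluster(rows, radius):
    """Single-pass leader clustering (a stand-in non-parametric method).

    Each row joins the nearest existing leader if it lies within `radius`
    (Euclidean distance); otherwise it becomes the leader of a new cluster.
    The number of clusters is therefore not specified up front.
    """
    leaders, labels = [], []
    for row in rows:
        dists = [sum((a - b) ** 2 for a, b in zip(row, lead)) ** 0.5
                 for lead in leaders]
        if dists and min(dists) <= radius:
            labels.append(dists.index(min(dists)))
        else:
            leaders.append(row)
            labels.append(len(leaders) - 1)
    return labels


# Hypothetical subscribers:
# (voice_minutes, data_gb, monthly_spend, recharges_per_month)
subscribers = [
    (900, 1, 50, 1),
    (880, 2, 12, 6),
    (100, 20, 55, 1),
    (120, 22, 10, 5),
    (950, 1, 48, 1),
]

# Two assumed views: usage behavior and spending behavior.
usage_view = [(s[0], s[1]) for s in subscribers]
spend_view = [(s[2], s[3]) for s in subscribers]

# Cluster each view independently (radius values are illustrative).
usage_labels = leader_cluster(usage_view, radius=100)
spend_labels = leader_cluster(spend_view, radius=10)

# Cross-tab: how many subscribers fall into each pair of segments.
crosstab = Counter(zip(usage_labels, spend_labels))
for (usage_seg, spend_seg), count in sorted(crosstab.items()):
    print(f"usage segment {usage_seg} x spend segment {spend_seg}: {count}")
```

Each subscriber ends up with one label per view, so the cross-tab cells correspond to combinations of perspectives (for example, heavy voice users who also spend little).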
Citations
Journal ArticleDOI
TL;DR: This work proposes using a subscriber-centric clustering approach, based on subscribers’ behavior, leading to the concept of intelligent 5G networks, ultimately resulting in relevant advantages and improvements to the cellular planning process.
Abstract: This work focuses on providing enhanced capacity planning and resource management for 5G networks through bridging data science concepts with usual network planning processes. For this purpose, we propose using a subscriber-centric clustering approach, based on subscribers’ behavior, leading to the concept of intelligent 5G networks, ultimately resulting in relevant advantages and improvements to the cellular planning process. Such advanced data-science-related techniques provide powerful insights into subscribers’ characteristics that can be extremely useful for mobile network operators. We demonstrate the advantages of using such techniques, focusing on the particular case of subscribers’ behavior, which has not yet been the subject of relevant studies. In this sense, we extend previously developed work, contributing further by showing that by applying advanced clustering, two new behavioral clusters appear, whose traffic generation and capacity demand profiles are very relevant for network planning and resource management and, therefore, should be taken into account by mobile network operators. As far as we are aware, for network, capacity, and resource management planning processes, it is the first time that such groups have been considered. We also contribute by demonstrating that there are extensive advantages for both operators and subscribers by performing advanced subscriber clustering and analytics.

2 citations


Cites background or result from "A Multi-view Non-parametric Cluster..."

  • ...All resulting clusters should allow the MNO to distinguish them apart [13], allowing it to respond and plan the aspects of the whole network differently....

    [...]

  • ...Demographic clustering can be performed by looking at subscriber’s information such as age, gender, marital status, and also financial income [12,13,15]....

    [...]

  • ...This is a process that has not been substantially explored in this area in question, even though there is some previous work [12,13] that can be found about techniques to cluster subscribers....

    [...]

Proceedings ArticleDOI
24 Sep 2020
TL;DR: Dimension reduction via Principal Component Analysis can, therefore, be used to achieve the segmentation of existing customers and also be used to classify new customers.
Abstract: In today's competitive environment, companies must identify their most profitable customer groups and the groups that have the biggest potential to become such. By identifying these critical groups, they can target their actions, such as launching tailored products and targeted one-to-one marketing, to meet customer expectations. With the profound advancements in clustering algorithms, segmentation has emerged as the method of choice for isolating the various groups of interest. However, the quality of the segments is affected by the type of input data to the clustering algorithms and the associated high dimensionality. In this study, Principal Component Analysis has been used to solve the problem of high data dimensionality. Subscriber data from nine transactions were first tested for suitability for factor analysis. Principal Component Analysis was then used to reduce the nine variables to five inputs. The factored data was then used to cluster the various customers into segments. The elbow criterion was used to determine the optimum number of clusters. The data was then clustered via several methods: k-means, FCM, PCM, and hierarchical. Results showed that k-means was not just the simplest method but also performed best with dimensionally reduced data. By using real case data, the study was able to verify that dimensional reduction can be applied before clustering algorithms. The dimension reduction of telecom data can thus be solved via Principal Component Analysis. The study was extended to include the classification of new subscribers based on the dimensionally reduced data. For that purpose, a perceptron neural network was created. Using the k-means clusters as targets, a perceptron capable of classifying was created and validated. The perceptron was able to classify new subscribers with acceptable accuracy.
Dimension reduction via Principal Component Analysis can, therefore, be used to achieve the segmentation of existing customers and also be used to classify new customers. The use of a perceptron is also important for automating the process of customer classification. Companies can therefore easily identify profitable customers from both old and new customers.

1 citation

References
Journal ArticleDOI
01 May 2007
TL;DR: The IPython project provides an enhanced interactive environment that includes, among other features, support for data visualization and facilities for distributed and parallel computation.
Abstract: Python offers basic facilities for interactive work and a comprehensive library on top of which more sophisticated systems can be built. The IPython project provides an enhanced interactive environment that includes, among other features, support for data visualization and facilities for distributed and parallel computation.

3,355 citations

Journal ArticleDOI
TL;DR: Findings of this paper indicate that the research area of customer retention received most research attention and classification and association models are the two commonly used models for data mining in CRM.
Abstract: Despite the importance of data mining techniques to customer relationship management (CRM), there is a lack of a comprehensive literature review and a classification scheme for it. This is the first identifiable academic literature review of the application of data mining techniques to CRM. It provides an academic database of literature between the period of 2000-2006 covering 24 journals and proposes a classification scheme to classify the articles. Nine hundred articles were identified and reviewed for their direct relevance to applying data mining techniques to CRM. Eighty-seven articles were subsequently selected, reviewed and classified. Each of the 87 selected papers was categorized on four CRM dimensions (Customer Identification, Customer Attraction, Customer Retention and Customer Development) and seven data mining functions (Association, Classification, Clustering, Forecasting, Regression, Sequence Discovery and Visualization). Papers were further classified into nine sub-categories of CRM elements under different data mining techniques based on the major focus of each paper. The review and classification process was independently verified. Findings of this paper indicate that the research area of customer retention received most research attention. Of these, most are related to one-to-one marketing and loyalty programs respectively. On the other hand, classification and association models are the two commonly used models for data mining in CRM. Our analysis provides a roadmap to guide future research and facilitate knowledge accumulation and creation concerning the application of data mining techniques in CRM.

1,135 citations


"A Multi-view Non-parametric Cluster..." refers methods in this paper

  • ...Customer segmentation using clustering techniques has been well studied in literature [2]....

    [...]

Proceedings ArticleDOI
01 Nov 2004
TL;DR: It is found empirically that the multi-view versions of k-means and EM greatly improve on their single-view counterparts, and negative results for agglomerative hierarchical multi-view clustering are obtained.
Abstract: We consider clustering problems in which the available attributes can be split into two independent subsets, such that either subset suffices for learning. Example applications of this multi-view setting include clustering of Web pages which have an intrinsic view (the pages themselves) and an extrinsic view (e.g., anchor texts of inbound hyperlinks); multi-view learning has so far been studied in the context of classification. We develop and study partitioning and agglomerative, hierarchical multi-view clustering algorithms for text data. We find empirically that the multi-view versions of k-means and EM greatly improve on their single-view counterparts. By contrast, we obtain negative results for agglomerative hierarchical multi-view clustering. Our analysis explains this surprising phenomenon.

741 citations


"A Multi-view Non-parametric Cluster..." refers methods in this paper

  • ...The multi-view versions of partitioning and agglomerative, hierarchical methods for text data have been developed and studied [3]....

    [...]

Proceedings Article
26 Jun 2012
TL;DR: In this article, the authors revisited the k-means clustering algorithm from a Bayesian nonparametric viewpoint and showed that a Gibbs sampling algorithm for the Dirichlet process mixture approaches a hard clustering algorithm in the limit, and further that the resulting algorithm monotonically minimizes an elegant underlying k-means-like clustering objective that includes a penalty for the number of clusters.
Abstract: Bayesian models offer great flexibility for clustering applications--Bayesian nonparametrics can be used for modeling infinite mixtures, and hierarchical Bayesian models can be utilized for sharing clusters across multiple data sets. For the most part, such flexibility is lacking in classical clustering methods such as k-means. In this paper, we revisit the k-means clustering algorithm from a Bayesian nonparametric viewpoint. Inspired by the asymptotic connection between k-means and mixtures of Gaussians, we show that a Gibbs sampling algorithm for the Dirichlet process mixture approaches a hard clustering algorithm in the limit, and further that the resulting algorithm monotonically minimizes an elegant underlying k-means-like clustering objective that includes a penalty for the number of clusters. We generalize this analysis to the case of clustering multiple data sets through a similar asymptotic argument with the hierarchical Dirichlet process. We also discuss further extensions that highlight the benefits of our analysis: i) a spectral relaxation involving thresholded eigenvectors, and ii) a normalized cut graph clustering algorithm that does not fix the number of clusters in the graph.

326 citations
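
The hard clustering algorithm described in the abstract above, commonly called DP-means, can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' code: it minimizes the sum of squared distances to cluster centers plus a penalty `lam` for each cluster, starts from a single cluster at the global mean, and opens a new cluster whenever a point's squared distance to every existing center exceeds `lam`. The function name `dp_means`, the sample points, and the value of `lam` are illustrative choices.

```python
def dp_means(points, lam, max_iter=100):
    """DP-means-style clustering: k-means with a per-cluster penalty `lam`.

    A point whose squared distance to every current center exceeds `lam`
    starts a new cluster, so the number of clusters is not fixed in advance.
    """
    n, dim = len(points), len(points[0])
    # Start with one cluster centered at the global mean.
    centers = [[sum(p[d] for p in points) / n for d in range(dim)]]
    labels = [0] * n
    for _ in range(max_iter):
        new_labels = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            j = min(range(len(centers)), key=dists.__getitem__)
            if dists[j] > lam:
                # Too far from every center: open a new cluster at this point.
                centers.append(list(p))
                j = len(centers) - 1
            new_labels.append(j)
        # Drop clusters that ended up empty and renumber the rest from 0.
        used = sorted(set(new_labels))
        remap = {old: new for new, old in enumerate(used)}
        new_labels = [remap[j] for j in new_labels]
        # Recompute each center as the mean of its members.
        centers = []
        for c in range(len(used)):
            members = [p for p, j in zip(points, new_labels) if j == c]
            centers.append([sum(m[d] for m in members) / len(members)
                            for d in range(dim)])
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels, centers


# Two well-separated groups of illustrative 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = dp_means(points, lam=9)
print(labels)        # cluster index per point
print(len(centers))  # number of clusters discovered
```

A larger `lam` makes new clusters more expensive and so yields fewer of them; with a sufficiently large value all points stay in the initial cluster.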

Proceedings ArticleDOI
13 Dec 2010
TL;DR: This tutorial describes several real world application scenarios for multiple clustering solutions, describes state-of-the-art paradigms, highlights specific techniques, and gives an overview of this topic by providing a taxonomy of the existing clustering methods.
Abstract: Traditional clustering algorithms identify just a single clustering of the data. Today's complex data, however, allow multiple interpretations leading to several valid groupings hidden in different views of the database. Each of these multiple clustering solutions is valuable and interesting as different perspectives on the same data and several meaningful groupings for each object are given. Especially for high dimensional data where each object is described by multiple attributes, alternative clusters in different attribute subsets are of major interest. In this tutorial, we describe several real world application scenarios for multiple clustering solutions. We abstract from these scenarios and provide the general challenges in this emerging research area. We describe state-of-the-art paradigms, we highlight specific techniques, and we give an overview of this topic by providing a taxonomy of the existing methods. By focusing on open challenges, we try to attract young researchers for participating in this emerging research field.

57 citations


"A Multi-view Non-parametric Cluster..." refers background in this paper

  • ...According to the taxonomy for multiple clustering solutions discussed in [1], multiple clusterings could be discovered: • in the original feature space • by orthogonal space transformations • by different subspace projections • in multiple views/sources The algorithms to perform multiple…...

    [...]

  • ...One way to cope with this effect is to identify relevant dimensions (views/subspaces/space transformations) and restrict distance computation to these views [1]....

    [...]