
Showing papers by "Michael K. Ng" published in 2004


Journal ArticleDOI
TL;DR: A new approach is developed, which allows the use of the k-means-type paradigm to efficiently cluster large data sets by using weighted dissimilarity measures for objects.
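
To illustrate the weighted-dissimilarity idea, here is a minimal NumPy sketch of a W-k-means-style iteration in which per-feature weights are re-estimated from within-cluster dispersions; the update rule and the synthetic data are illustrative assumptions, not the published algorithm.

```python
import numpy as np

def weighted_kmeans(X, k, beta=2.0, n_iter=20, seed=0):
    """Toy k-means with per-feature weights in the dissimilarity measure.

    The weight update (w_j proportional to (1/D_j)^(1/(beta-1)), with D_j the
    within-cluster dispersion of feature j) mimics the W-k-means heuristic;
    details of the published algorithm may differ.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    centers = X[rng.choice(n, k, replace=False)]
    w = np.full(m, 1.0 / m)                          # feature weights, sum to 1
    for _ in range(n_iter):
        # assign each object to the nearest center under the weighted metric
        d = ((X[:, None, :] - centers[None, :, :]) ** 2 * w ** beta).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):                           # update cluster centers
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
        # re-estimate weights from per-feature within-cluster dispersions
        D = np.array([((X[:, f] - centers[labels, f]) ** 2).sum() for f in range(m)])
        w = (1.0 / np.maximum(D, 1e-12)) ** (1.0 / (beta - 1.0))
        w /= w.sum()
    return labels, centers, w

rng = np.random.default_rng(1)
# two clusters separated only in the first feature; the second is pure noise
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal([6, 0], 1, (100, 2))])
labels, centers, w = weighted_kmeans(X, k=2)
print("feature weights:", w)   # the informative feature should get most weight
```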

237 citations


Book
16 Dec 2004
TL;DR: This book presents the theory of iterative methods and preconditioners for solving Toeplitz systems, together with applications to ordinary and partial differential equations, queuing networks, signal and image processing, and integral equations.
Abstract: Contents: 1. Notations and definitions; 2. Iterative methods; Theory: 3. Toeplitz systems; 4. Circulant preconditioners; 5. Non-circulant type preconditioners; 6. Ill-conditioned Toeplitz systems; 7. Structured systems; Applications: 8. Applications to ordinary and partial differential equations; 9. Applications to queuing networks; 10. Applications to signal processing; 11. Applications to image processing; 12. Applications to integral equations.
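
As a flavor of the book's central theme, the sketch below builds a Strang-type circulant preconditioner for a symmetric positive definite Toeplitz system and solves it with preconditioned conjugate gradients using NumPy/SciPy. This is a standard textbook construction, not code from the book, and the Kac-Murdock-Szego test matrix is an arbitrary choice.

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.sparse.linalg import cg, LinearOperator

# Symmetric positive definite Toeplitz test matrix (Kac-Murdock-Szego, rho = 0.9)
n = 512
t = 0.9 ** np.arange(n)
T = toeplitz(t)

# Strang's circulant preconditioner: copy the central diagonals of T and wrap around
c = t.copy()
half = n // 2
c[half + 1:] = t[1:n - half][::-1]        # c_j = t_{n-j} for j > n/2
lam = np.fft.fft(c).real                  # eigenvalues of the circulant (real here)

def apply_circulant_inverse(r):
    # Solving C z = r costs one FFT/IFFT pair
    return np.fft.ifft(np.fft.fft(r) / lam).real

M = LinearOperator((n, n), matvec=apply_circulant_inverse)
b = np.ones(n)
x, info = cg(T, b, M=M)
print("CG converged:", info == 0, " residual:", np.linalg.norm(T @ x - b))
```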

236 citations


Journal ArticleDOI
TL;DR: A new algorithm is proposed that exploits the clustering status to adjust the internal thresholds dynamically without the assistance of user parameters and has excellent accuracy and usability.
Abstract: In high-dimensional data, clusters can exist in subspaces that hide themselves from traditional clustering methods. A number of algorithms have been proposed to identify such projected clusters, but most of them rely on some user parameters to guide the clustering process. The clustering accuracy can be seriously degraded if incorrect values are used. Unfortunately, in real situations, it is rarely possible for users to supply the parameter values accurately, which causes practical difficulties in applying these algorithms to real data. In this paper, we analyze the major challenges of projected clustering and suggest why these algorithms need to depend heavily on user parameters. Based on the analysis, we propose a new algorithm that exploits the clustering status to adjust the internal thresholds dynamically without the assistance of user parameters. According to the results of extensive experiments on real and synthetic data, the new method has excellent accuracy and usability. It outperformed the other algorithms even when correct parameter values were artificially supplied to them. The encouraging results suggest that projected clustering can be a practical tool for various kinds of real applications.
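
A rough illustration of data-driven (rather than user-supplied) thresholds for picking cluster-relevant dimensions is sketched below on synthetic data; it is only a toy heuristic, not the algorithm proposed in the paper.

```python
import numpy as np

def relevant_dimensions(cluster_pts, all_pts):
    """Dimensions along which a candidate cluster is unusually compact.

    The cutoff is derived from the data itself (ratios well below the typical
    ratio), mimicking an internally adjusted threshold instead of a user
    parameter. Purely illustrative; not the paper's algorithm.
    """
    ratio = cluster_pts.var(axis=0) / (all_pts.var(axis=0) + 1e-12)
    cutoff = 0.5 * np.median(ratio)
    return np.where(ratio < cutoff)[0]

rng = np.random.default_rng(0)
d = 20
background = rng.uniform(0, 10, size=(500, d))
cluster = rng.uniform(0, 10, size=(60, d))
cluster[:, [2, 7, 11]] = rng.normal(5.0, 0.2, size=(60, 3))   # compact in 3 dims only
dims = relevant_dimensions(cluster, np.vstack([background, cluster]))
print("detected relevant dimensions:", dims)                   # expect roughly [2 7 11]
```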

165 citations


Journal ArticleDOI
TL;DR: It is concluded that the proposed super‐resolution techniques can both improve the signal‐to‐noise ratio and augment the detectability of small activated areas in fMRI image sets acquired with thicker slices.
Abstract: The problem of increasing the slice resolution of functional MRI (fMRI) images without a loss in signal-to-noise ratio is considered. In standard fMRI experiments, increasing the slice resolution by a certain factor decreases the signal-to-noise ratio of the images by the same factor. For this purpose an adapted EPI MRI acquisition protocol is proposed, allowing one to acquire slice-shifted images from which one can generate interpolated super-resolution images, with an increased resolution in the slice direction. To solve the problem of correctness and robustness of the super-resolution images created from these slice-shifted datasets, the use of discontinuity-preserving regularization methods is proposed. Tests on real morphological, synthetic functional, and real functional MR datasets have been performed, by comparing the obtained super-resolution datasets with high-resolution reference datasets. In the morphological experiments the image spatial resolution of the different types of images is compared. In the synthetic and real fMRI experiments, on the other hand, the quality of the different datasets is studied as a function of their resulting activation maps. From the results obtained in this study, we conclude that the proposed super-resolution techniques can both improve the signal-to-noise ratio and augment the detectability of small activated areas in fMRI image sets acquired with thicker slices.
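
The structure of the reconstruction problem can be sketched in one dimension: several slice-shifted low-resolution acquisitions are stacked into a linear system and solved by regularized least squares. The sketch uses plain Tikhonov (first-difference) smoothing, whereas the paper advocates discontinuity-preserving regularization; all sizes and noise levels are made up for illustration.

```python
import numpy as np

def downsample_operator(n_hi, factor, shift):
    """Average `factor` consecutive high-resolution samples, starting at `shift`."""
    n_lo = (n_hi - shift) // factor
    A = np.zeros((n_lo, n_hi))
    for i in range(n_lo):
        A[i, i * factor + shift : i * factor + shift + factor] = 1.0 / factor
    return A

rng = np.random.default_rng(0)
n_hi, factor = 120, 3
x_true = np.zeros(n_hi)
x_true[40:80] = 1.0                       # a block of "activation"

# three slice-shifted, noisy low-resolution acquisitions
ops = [downsample_operator(n_hi, factor, s) for s in range(factor)]
obs = [A @ x_true + 0.02 * rng.standard_normal(A.shape[0]) for A in ops]
A = np.vstack(ops)
b = np.concatenate(obs)

# Tikhonov (first-difference) regularization; the paper argues for
# discontinuity-preserving terms instead of this quadratic smoother
L = np.diff(np.eye(n_hi), axis=0)
lam = 0.1
x_sr = np.linalg.solve(A.T @ A + lam * L.T @ L, A.T @ b)
print("relative reconstruction error:",
      np.linalg.norm(x_sr - x_true) / np.linalg.norm(x_true))
```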

78 citations


Journal ArticleDOI
TL;DR: This paper applies the developed higher-order Markov chain model for analyzing categorical data sequences to Web server log data, modelling users' behavior in accessing information and predicting their future behavior.
Abstract: In this paper we study higher-order Markov chain models for analyzing categorical data sequences. We propose an efficient estimation method for the model parameters. Data sequences such as DNA and sales demand are used to illustrate the predictive power of our proposed models. In particular, we apply the developed higher-order Markov chain model to Web server log data. The objective here is to model the users' behavior in accessing information and to predict their behavior in the future. Our tests are based on a realistic web log and our model shows an improvement in prediction. © 2004 Wiley Periodicals, Inc. Naval Research Logistics, 2004
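
A sketch of the kind of higher-order model considered, in which the next-state distribution is a convex combination of lag-specific transition matrices; the paper estimates the mixture weights with an efficient method, while the toy code below simply grid-searches them on a synthetic sequence.

```python
import numpy as np

def lag_transition_matrix(seq, n_states, lag):
    """Empirical matrix of P(X_t = j | X_{t-lag} = i), with rows normalized."""
    C = np.zeros((n_states, n_states))
    for a, b in zip(seq[:-lag], seq[lag:]):
        C[a, b] += 1
    C += 1e-9                                   # guard against empty rows
    return C / C.sum(axis=1, keepdims=True)

def next_state_distribution(history, Qs, lam):
    """Mixture model: P(next) = sum_i lam_i * Q_i[history[-i]]."""
    return sum(l * Q[history[-i - 1]] for i, (l, Q) in enumerate(zip(lam, Qs)))

# toy categorical sequence whose next symbol depends mainly on the value two steps back
rng = np.random.default_rng(0)
seq = [0, 1]
for _ in range(2000):
    seq.append((seq[-2] + int(rng.integers(0, 2))) % 3)

n_states, order = 3, 2
Qs = [lag_transition_matrix(seq, n_states, i + 1) for i in range(order)]

# crude grid search over the mixture weights (the paper estimates them efficiently)
best = None
for l1 in np.linspace(0, 1, 21):
    lam = (l1, 1 - l1)
    hits = sum(next_state_distribution(seq[:t], Qs, lam).argmax() == seq[t]
               for t in range(order, len(seq)))
    if best is None or hits > best[0]:
        best = (hits, lam)
print("weights (lag 1, lag 2):", best[1], " prediction accuracy:", best[0] / (len(seq) - order))
```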

77 citations


Journal ArticleDOI
TL;DR: This work modifies the algorithm of [1], based on Newton's iteration and on the concept of ε-displacement rank, for the computation of the Moore-Penrose inverse of a rank-deficient Toeplitz matrix.
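
For context, the classical Newton iteration for the Moore-Penrose inverse is shown below in dense form; the paper's contribution is to carry out such an iteration in compressed (ε-displacement rank) form for Toeplitz matrices, which this sketch ignores. The rank-2 Toeplitz test matrix is arbitrary.

```python
import numpy as np
from scipy.linalg import toeplitz

def newton_pinv(A, n_iter=20):
    """Newton's iteration X <- 2X - X A X for the Moore-Penrose inverse.

    Starting from X0 = A^T / ||A||_2^2, the iteration converges to pinv(A),
    also when A is rank deficient. The paper runs this kind of iteration in
    compressed displacement form for Toeplitz matrices; this dense version
    ignores that structure.
    """
    X = A.T / np.linalg.norm(A, 2) ** 2
    for _ in range(n_iter):
        X = 2 * X - X @ A @ X
    return X

# a 6x6 real Toeplitz matrix of exact rank 2: entries cos(0.7*(j - k) + 0.3)
d = np.arange(6)
T = toeplitz(np.cos(0.7 * d + 0.3), np.cos(-0.7 * d + 0.3))
X = newton_pinv(T)
print("rank of T:", np.linalg.matrix_rank(T))
print("max |X - pinv(T)|:", np.abs(X - np.linalg.pinv(T)).max())
```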

52 citations


Journal ArticleDOI
TL;DR: A stochastic dynamic programming model with a Markov chain for the optimization of CLV is proposed and then applied to practical data of a computer service company.
Abstract: Since the early 1980s, the concept of relationship marketing has become increasingly important in general marketing, especially in the area of direct and interactive marketing. The core of relationship marketing is the maintenance of long-term relationships with customers. However, relationship marketing is costly, and therefore the determination of the customer lifetime value (CLV) is an important element in making strategic decisions in both advertising and promotion. In this paper, we propose a stochastic dynamic programming model with a Markov chain for the optimization of CLV. Both the infinite-horizon and finite-horizon cases are discussed. The model is then applied to practical data of a computer service company.
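
A minimal sketch of the Markov-chain dynamic programming idea, with invented customer states, transition probabilities, revenues, and a promotion action; it runs value iteration for the infinite-horizon discounted case and is not calibrated to the company data used in the paper.

```python
import numpy as np

# Hypothetical 3-state customer model: 0 = active, 1 = at-risk, 2 = lost.
# Two actions per period: 0 = do nothing, 1 = send a promotion (costly but
# improves retention). All numbers are invented for illustration.
P = np.array([
    [[0.80, 0.15, 0.05],        # transition probabilities under action 0
     [0.30, 0.40, 0.30],
     [0.00, 0.00, 1.00]],
    [[0.90, 0.08, 0.02],        # transition probabilities under action 1
     [0.50, 0.35, 0.15],
     [0.05, 0.05, 0.90]],
])
r = np.array([
    [100.0, 20.0,   0.0],       # expected revenue per period under action 0
    [ 85.0, 10.0, -15.0],       # under action 1 (revenue minus promotion cost)
])
gamma = 0.95                    # per-period discount factor

# Infinite-horizon value iteration:
#   V(s) = max_a [ r(s, a) + gamma * sum_s' P(s' | s, a) V(s') ]
V = np.zeros(3)
for _ in range(2000):
    Q = r + gamma * (P @ V)     # action values, shape (actions, states)
    V_new = Q.max(axis=0)
    if np.abs(V_new - V).max() < 1e-9:
        break
    V = V_new
print("CLV per state:", np.round(V, 1))
print("optimal action per state:", Q.argmax(axis=0))   # 1 where the promotion pays off
```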

49 citations


Journal ArticleDOI
TL;DR: This paper presents an approximate inversion method for triangular Toeplitz matrices based on trigonometric polynomial interpolation and revises the approximate method proposed by Bini.
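
For reference, the inverse of a nonsingular lower triangular Toeplitz matrix is again lower triangular Toeplitz, and its first column obeys a simple convolution recurrence; the exact O(n^2) baseline below is what fast approximate schemes such as Bini's method and the trigonometric-interpolation variant in this paper aim to beat. It is not the paper's algorithm.

```python
import numpy as np
from scipy.linalg import toeplitz

def tri_toeplitz_inverse_column(a):
    """First column b of L^{-1}, where L is lower triangular Toeplitz with first column a.

    Exact O(n^2) recurrence: b_0 = 1/a_0, b_k = -(1/a_0) * sum_{j=1..k} a_j b_{k-j}.
    Fast approximate methods (Bini's, and the trigonometric-interpolation variant
    in the paper) trade this exactness for O(n log n) FFT work.
    """
    n = len(a)
    b = np.zeros(n)
    b[0] = 1.0 / a[0]
    for k in range(1, n):
        b[k] = -np.dot(a[1:k + 1], b[k - 1::-1]) / a[0]
    return b

a = np.array([1.0, 0.6, 0.3, 0.1, 0.05])
L = toeplitz(a, np.zeros_like(a))          # lower triangular Toeplitz matrix
b = tri_toeplitz_inverse_column(a)
L_inv = toeplitz(b, np.zeros_like(b))      # the inverse is again lower triangular Toeplitz
print("|| L @ L_inv - I || =", np.linalg.norm(L @ L_inv - np.eye(len(a))))
```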

37 citations



Journal ArticleDOI
TL;DR: An iterative deblurring algorithm is derived within a wavelet framework, together with a methodology for finding deblurring filters, and its convergence is proved.
Abstract: Blur removal is an important problem in signal and image processing. In this article, we formulate the deblurring problem within a wavelet framework and design a methodology to find deblurring filters. Using these deblurring filters, we derive an iterative deblurring algorithm and prove its convergence. Simulation results are reported to illustrate the proposed framework and methodology. © 2004 Wiley Periodicals, Inc. Int J Imaging Syst Technol, 14, 113–121, 2004; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ima.20014
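
The flavor of such iterative schemes can be conveyed with a generic sketch: a Landweber (gradient) deblurring step alternated with soft-thresholding of one-level Haar wavelet coefficients. This is a standard wavelet-domain deblurring pattern on made-up 1-D data, not the specific filters or convergence framework derived in the article.

```python
import numpy as np

def haar(x):
    """One-level Haar transform (approximation, detail); len(x) must be even."""
    return (x[0::2] + x[1::2]) / np.sqrt(2), (x[0::2] - x[1::2]) / np.sqrt(2)

def ihaar(a, d):
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def soft(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(0)
n = 128
x_true = np.zeros(n)
x_true[30:60] = 1.0
x_true[80:90] = -0.5                                   # piecewise-constant test signal

kernel = np.array([0.25, 0.5, 0.25])                   # simple blur
H = sum(np.eye(n, k=k - 1) * w for k, w in enumerate(kernel))
y = H @ x_true + 0.01 * rng.standard_normal(n)         # blurred, noisy observation

tau = 1.0 / np.linalg.norm(H, 2) ** 2                  # safe step size for Landweber
x = np.zeros(n)
for _ in range(200):
    x = x + tau * H.T @ (y - H @ x)                    # gradient (Landweber) step
    a, d = haar(x)
    x = ihaar(a, soft(d, 0.005))                       # denoise detail coefficients
print("relative error after deblurring:",
      np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```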

14 citations


Journal ArticleDOI
01 Oct 2004
TL;DR: Experimental results show that the algorithm is capable of identifying some interesting projected clusters from real microarray data and has a low dependency on user parameters while allowing users to input domain knowledge should it be available.
Abstract: In microarray gene expression data, clusters may hide in subspaces. Traditional clustering algorithms that make use of similarity measurements in the full input space may fail to detect the clusters. In recent years a number of algorithms have been proposed to identify this kind of projected clusters, but many of them rely on some critical parameters whose proper values are hard for users to determine. In this paper a new algorithm that dynamically adjusts its internal thresholds is proposed. It has a low dependency on user parameters while allowing users to input domain knowledge should it be available. Experimental results show that the algorithm is capable of identifying some interesting projected clusters from real microarray data.

Journal ArticleDOI
TL;DR: This note presents higher-order Markov chain models for modelling categorical data sequences with an efficient algorithm for solving the model parameters that can be implemented easily in a Microsoft EXCEL worksheet.
Abstract: Categorical data sequences occur in many applications such as forecasting, data mining and bioinformatics. In this note, we present higher-order Markov chain models for modelling categorical data sequences with an efficient algorithm for solving the model parameters. The algorithm can be implemented easily in a Microsoft EXCEL worksheet. We give a detailed description for the implementation which is accessible and useful to anyone who is interested in the applications of higher-order Markov chain models and has some knowledge of EXCEL.

Book ChapterDOI
26 May 2004
TL;DR: Efficient and effective algorithms for identifying dense regions as distinct and meaningful patterns in given data are presented, and extensions of the algorithms for handling data streams are discussed.
Abstract: We introduce the notion of dense regions as distinct and meaningful patterns in given data. Efficient and effective algorithms for identifying such regions are presented. Next, we discuss extensions of the algorithms for handling data streams. Finally, experiments on large-scale data streams such as clickstreams are given, which confirm the usefulness of our algorithms.

Journal ArticleDOI
TL;DR: A simple method for the reconstruction of a Jacobi matrix from eigenvalues is developed, and some necessary conditions for such an inverse eigenvalue problem to have solutions are given.
Abstract: In this paper, we study the inverse eigenvalue problem of a specially structured Jacobi matrix, which arises from the discretization of the differential equation governing the axial vibration of a rod with varying cross section (Ram and Elhay 1998 Commun. Numer. Methods Engng. 14 597-608). We give a sufficient condition and some necessary conditions for such an inverse eigenvalue problem to have solutions. Based on these results, a simple method for the reconstruction of a Jacobi matrix from eigenvalues is developed. Numerical examples are given to demonstrate our results.
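
One classical construction for Jacobi-type inverse eigenvalue problems (not necessarily the method developed in the paper) is the Lanczos process: running Lanczos on diag(λ) with a prescribed weight vector as the starting vector yields a symmetric tridiagonal matrix with exactly those eigenvalues and first eigenvector components. A small sketch with arbitrary data:

```python
import numpy as np

def jacobi_from_spectral_data(lams, w):
    """Jacobi matrix with eigenvalues `lams` and first eigenvector components
    proportional to `w`, built by the Lanczos process applied to diag(lams)."""
    n = len(lams)
    D = np.diag(lams)
    Q = np.zeros((n, n))
    Q[:, 0] = w / np.linalg.norm(w)
    alpha = np.zeros(n)
    beta = np.zeros(n - 1)
    for j in range(n):
        v = D @ Q[:, j]
        alpha[j] = Q[:, j] @ v
        v -= alpha[j] * Q[:, j]
        if j > 0:
            v -= beta[j - 1] * Q[:, j - 1]
        v -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ v)   # full reorthogonalization
        if j < n - 1:
            beta[j] = np.linalg.norm(v)
            Q[:, j + 1] = v / beta[j]
    return np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

lams = np.array([1.0, 2.5, 4.0, 7.0])      # prescribed eigenvalues (arbitrary example)
w = np.array([0.4, 0.3, 0.2, 0.1])         # prescribed first eigenvector components
J = jacobi_from_spectral_data(lams, w)
print(np.round(J, 4))
print("recovered eigenvalues:", np.round(np.linalg.eigvalsh(J), 6))
```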

Book ChapterDOI
17 Mar 2004
TL;DR: In this article, an efficient algorithm is introduced to detect user access patterns from Website topology and Web-log stream data; with this method a Website topology can be modified online so that the new topology improves Website connectivity and adapts to current visitors' access patterns.
Abstract: When people visit Websites, they want to access the contents they are interested in efficiently, precisely, and without delay. However, due to the constant changes of site contents and user patterns, the access efficiency of Websites cannot be optimized, especially in peak hours. In this paper, we first address the problems of access efficiency in Websites during peak hours and then propose new measures to evaluate access efficiency. An efficient algorithm is introduced to detect user access patterns using Website topology and Web-log stream data. Adopting this method, we can modify a Website topology online so that the new topology improves the Website connectivity and adapts to current visitors' access patterns. A real sports Website is used to evaluate the effectiveness of our proposed method of accelerating user access to related contents. The results of the evaluation presented in this paper suggest that this method is feasible for intelligently improving the connectivity of a Website online.

Journal ArticleDOI
TL;DR: In this paper, a method that utilizes the relationship between the Perron root of a nonnegative matrix and the estimates of the row sums of its generalized Perron complement is presented.

Journal ArticleDOI
TL;DR: It is shown that if the Toeplitz matrix is nonsingular and well-conditioned, then the methods considered are numerically forward stable.

Book ChapterDOI
26 May 2004
TL;DR: A new prediction model, based on the Kolmogorov backward equations, is presented for predicting when an online customer will leave the current page and which Web page the customer will visit next.
Abstract: This paper presents a new prediction model for predicting when an online customer will leave the current page and which Web page the customer will visit next. The model can also forecast the total number of visits to a given Web page by all incoming users at the same time. The prediction technique can be used as a component for many Web-based applications. The prediction model regards a Web browsing session as a continuous-time Markov process whose transition probability matrix can be computed from Web log data using the Kolmogorov backward equations. The model is tested against real Web-log data, where the scalability and accuracy of our method are analyzed.
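
A sketch of the continuous-time Markov view of a browsing session, with a hypothetical 3-page generator matrix: the diagonal gives mean dwell times, and the transition probabilities over a horizon t follow from P(t) = exp(Qt), the solution of the Kolmogorov equations. In the paper the generator is estimated from Web-log data; the numbers here are invented.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical generator for a 3-page browsing session (pages A, B, C).
# Off-diagonal Q[i, j] is the rate of jumping from page i to page j (per minute);
# diagonal entries make each row sum to zero.
Q = np.array([
    [-0.50,  0.30,  0.20],
    [ 0.10, -0.40,  0.30],
    [ 0.05,  0.15, -0.20],
])

# Expected time spent on a page before leaving it is 1 / (exit rate).
print("mean dwell time per page (min):", -1.0 / np.diag(Q))

# P(t) = expm(Q t) solves the Kolmogorov equations; row 0 gives the distribution
# over pages t minutes after arriving at page A.
for t in (1.0, 5.0, 30.0):
    print(f"t = {t:4.1f} min, distribution from page A:", np.round(expm(Q * t)[0], 3))

# Which page is the most likely next stop when the visitor leaves page A?
jump = Q[0].copy()
jump[0] = 0.0
print("next-page probabilities from A:", np.round(jump / jump.sum(), 3))
```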

Journal ArticleDOI
TL;DR: A novel integrated data warehousing and data mining framework for Website management and patterns discovery is introduced to analyze Web user behavior and some statistical indexes and practical solutions are proposed to intelligently discover interesting user access patterns.
Abstract: A new challenge in Web usage analysis is how to manage and discover informative patterns from various types of Web data stored in structured or unstructured databases for system monitoring and decision making. In this paper, a novel integrated data warehousing and data mining framework for Website management and pattern discovery is introduced to analyze Web user behavior. The merit of the framework is that it combines multidimensional Web databases to support online analytical processing for improving Web services. Based on the model, we propose some statistical indexes and practical solutions to intelligently discover interesting user access patterns for Website optimization, Web personalization, recommendation, etc. We use Web data from a sports Website as the data source to evaluate the effectiveness of the model. The results show that this integrated data warehousing and mining model is effective and efficient when applied to practical Web applications.

01 May 2004
TL;DR: In this paper, an extension model for MDP, the Higher-order Markov Decision Model (HMDP), is proposed to overcome the limitations of MDP in predicting the profitability of a customer.
Abstract: To predict the profitability of a customer, today's firms have to practice Customer Lifetime Value (CLV) computation. Different approaches have been proposed over the last ten years to analyze this complex customer phenomenon. One of them is the Markov Decision Process (MDP) model. The class of Markov models is an effective and flexible class of decision models, but the use of the MDP model is limited by its assumptions. In this paper, we introduce an extension of MDP: the Higher-order Markov Decision Model (HMDP). The HMDP performs well in CLV calculation and overcomes the limitations of MDP. Using a real application, we demonstrate how it can be used efficiently in a firm's daily operations.

Journal Article
TL;DR: A new clustering model for efficiently generating and maintaining clusters which represent the changing Web user patterns in Websites is proposed; the model can be employed in different Web applications such as personalization and recommendation systems.
Abstract: With the fast growth of the Internet and its Web users all over the world, how to manage and discover useful patterns from tremendous and evolving Web information sources has become a new challenge for data engineering researchers. Also, there is a great demand for designing scalable and flexible data mining algorithms for various time-critical and data-intensive Web applications. In this paper, we propose a new clustering model for efficiently generating and maintaining clusters which represent the changing Web user patterns in Websites. With an effective pruning process, the clusters can be quickly discovered and updated to reflect the current or changing user patterns for Website administrators. This model can also be employed in different Web applications such as personalization and recommendation systems.

Journal ArticleDOI
TL;DR: Some properties of the weighted Tikhonov filter matrices are given, together with their filtering and regularization effects, and perturbation identities for the weighted linear least squares problem and weighted pseudoinverses are presented.
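
For orientation, the standard (unweighted) Tikhonov filter factors can be written down via the SVD, as in the sketch below; the paper studies the weighted analogue, where the 2-norms are replaced by weighted norms, which this sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 30, 20
A = rng.standard_normal((m, n)) @ np.diag(0.5 ** np.arange(n))   # ill-conditioned
x_true = rng.standard_normal(n)
b = A @ x_true + 1e-3 * rng.standard_normal(m)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
mu = 1e-2                                   # regularization parameter

# Tikhonov filter factors f_i = s_i^2 / (s_i^2 + mu^2): close to 1 for large
# singular values and close to 0 for small ones (their "filtering effect").
f = s ** 2 / (s ** 2 + mu ** 2)
x_tik = Vt.T @ (f * (U.T @ b) / s)
x_ls = Vt.T @ ((U.T @ b) / s)               # unfiltered least squares for comparison

print("filter factors:", np.round(f, 3))
print("error, plain least squares:", np.linalg.norm(x_ls - x_true))
print("error, Tikhonov filtered  :", np.linalg.norm(x_tik - x_true))
```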

Journal ArticleDOI
01 Oct 2004-Calcolo
TL;DR: A hybrid algorithm combining an evolutionary algorithm with the successive over-relaxation (SOR) method is proposed for solving linear systems of equations, and its convergence is proved for strictly diagonally dominant linear systems.
Abstract: In this paper, we propose a hybrid algorithm based on [12] for solving linear systems of equations. The hybrid algorithm combines the evolutionary algorithm and the successive over-relaxation (SOR) method. The evolutionary algorithm allows the relaxation parameter w to be adaptive in the SOR method. We prove the convergence of the hybrid algorithm for strictly diagonally dominant linear systems. We then apply it to solve for the steady-state probability distributions of Markovian queueing systems. Numerical examples are given to demonstrate the fast convergence rate of the method.
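
A toy rendering of the hybrid idea, in which a small population of relaxation parameters evolves by selection and mutation while SOR sweeps are performed; the problem data are random, and the actual hybrid scheme and convergence proof in the paper are more careful than this sketch.

```python
import numpy as np

def sor_sweeps(A, b, x, omega, n_sweeps):
    """Run a few SOR sweeps starting from x and return the new iterate."""
    x = x.copy()
    for _ in range(n_sweeps):
        for i in range(len(b)):
            sigma = A[i] @ x - A[i, i] * x[i]
            x[i] = (1 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
    return x

rng = np.random.default_rng(0)
n = 100
A = rng.uniform(0, 1, (n, n))
A += np.diag(A.sum(axis=1))                   # make A strictly diagonally dominant
b = rng.standard_normal(n)

x = np.zeros(n)
omegas = np.array([0.6, 1.0, 1.4, 1.8])       # initial "population" of relaxation parameters
for generation in range(10):
    trials = [sor_sweeps(A, b, x, w, n_sweeps=3) for w in omegas]
    residuals = [np.linalg.norm(b - A @ t) for t in trials]
    best = int(np.argmin(residuals))
    if residuals[best] < np.linalg.norm(b - A @ x):
        x = trials[best]                      # selection: keep the best iterate
    # mutation: the next population clusters around the winning omega
    omegas = np.clip(omegas[best] + 0.2 * rng.standard_normal(4), 0.05, 1.95)
print("final residual:", np.linalg.norm(b - A @ x))
```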

Journal Article
TL;DR: In this paper, the W-k-means algorithm is used to automatically calculate the feature weights from the training data, and the Fastmap technique is used to handle outliers, which increases the stability of the classifier.
Abstract: In using a classified data set to test clustering algorithms, the data points in a class are considered as one cluster (or more than one) in space. In this paper we adopt this principle to build classification models through interactively clustering a training data set to construct a tree of clusters. The leaf clusters of the tree are selected as decision clusters to classify new data based on a distance function. We consider the feature weights in calculating the distances between a new object and the center of a decision cluster. The new algorithm, W-k-means, is used to automatically calculate the feature weights from the training data. The Fastmap technique is used to handle outliers in selecting decision clusters. This step increases the stability of the classifier. Experimental results on public domain data sets have shown that the models built using this clustering approach outperformed some popular classification algorithms.

Book ChapterDOI
01 Aug 2004
TL;DR: This paper adopts the principle of treating the data points in a class as clusters to build classification models by interactively clustering a training data set into a tree of clusters, and considers feature weights in calculating the distances between a new object and the center of a decision cluster.
Abstract: In using a classified data set to test clustering algorithms, the data points in a class are considered as one cluster (or more than one) in space. In this paper we adopt this principle to build classification models through interactively clustering a training data set to construct a tree of clusters. The leaf clusters of the tree are selected as decision clusters to classify new data based on a distance function. We consider the feature weights in calculating the distances between a new object and the center of a decision cluster. The new algorithm, W-k-means, is used to automatically calculate the feature weights from the training data. The Fastmap technique is used to handle outliers in selecting decision clusters. This step increases the stability of the classifier. Experimental results on public domain data sets have shown that the models built using this clustering approach outperformed some popular classification algorithms.

Book ChapterDOI
16 Dec 2004
TL;DR: Several discretization methods for large matrices are discussed and proposed, and it is suggested that they can be employed in practical Web applications, such as user pattern discovery.
Abstract: Dense region discovery is an important knowledge discovery process for finding distinct and meaningful patterns in given data. The challenge in dense region discovery is how to find informative patterns from various types of data stored in structured or unstructured databases, such as mining user patterns from Web data. Therefore, novel approaches are needed to integrate and manage these multi-type data repositories to support new-generation information management systems. In this paper, we focus on discussing and proposing several discretization methods for large matrices. The experiments suggest that the discretization methods can be employed in practical Web applications, such as user pattern discovery.

Journal ArticleDOI
TL;DR: Two exact algorithms, based on a divide-and-conquer procedure, are proposed for computing the steady-state probability distributions of irreducible Markov chains whose generator matrices have tridiagonal structure, together with a parallel algorithm.
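
For tridiagonal generators of birth-death type, the steady-state distribution also follows from the classical detailed-balance recurrence, which makes a handy baseline for checking any divide-and-conquer or parallel solver on small cases; the queueing example below is illustrative only.

```python
import numpy as np

def birth_death_stationary(birth, death):
    """Stationary distribution via the detailed-balance recurrence
    pi_{k+1} = pi_k * birth[k] / death[k], then normalization."""
    pi = np.ones(len(birth) + 1)
    for k in range(len(birth)):
        pi[k + 1] = pi[k] * birth[k] / death[k]
    return pi / pi.sum()

# Example: an M/M/1/K-style queue with arrival rate 0.8, service rate 1.0, capacity 10
K = 10
birth = np.full(K, 0.8)
death = np.full(K, 1.0)
pi = birth_death_stationary(birth, death)

# Cross-check against the tridiagonal generator matrix: pi Q = 0
Q = np.diag(birth, 1) + np.diag(death, -1)
Q -= np.diag(Q.sum(axis=1))
print("max |pi Q| =", np.abs(pi @ Q).max())
print("probability the system is empty:", round(pi[0], 4))
```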

Journal ArticleDOI
TL;DR: The problem of reconstructing a high-resolution image from multiple undersampled, shifted, degraded frames with subpixel displacement errors from multisensors is studied, and it is found that cosine transform based preconditioners are effective when the number of shifted low-resolution frames is large, but are less effective when the number is small.

Journal ArticleDOI
TL;DR: This work extends the multisensor work by Bose and Boo (1998) and considers the perturbations of displacement error that are due to both translation and rotation, and introduces the warping process to obtain the ideal low‐resolution image.
Abstract: We extend the multisensor work by Bose and Boo (1998) and consider the perturbations of displacement error that are due to both translation and rotation. The warping process is introduced to obtain the ideal low-resolution image, which is located at exact horizontal and vertical shifts. In this approach, the problem of high-resolution image reconstruction is turned into the problem of image restoration, and the system becomes spatially invariant rather than spatially variant as in the original problem. An efficient algorithm is presented. Experimental results show that the proposed methods are quite effective, and they perform better than the bilinear image interpolation method. © 2004 Wiley Periodicals, Inc. Int J Imaging Syst Technol 14, 75-83, 2004; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ima.20010

Book ChapterDOI
25 Aug 2004
TL;DR: In this paper, a new clustering model is proposed for efficiently generating and maintaining clusters which represent the changing Web user patterns in Websites; with an effective pruning process, the clusters can be quickly discovered and updated to reflect the current or changing user patterns for Website administrators.
Abstract: With the fast growth of the Internet and its Web users all over the world, how to manage and discover useful patterns from tremendous and evolving Web information sources has become a new challenge for data engineering researchers. Also, there is a great demand for designing scalable and flexible data mining algorithms for various time-critical and data-intensive Web applications. In this paper, we propose a new clustering model for efficiently generating and maintaining clusters which represent the changing Web user patterns in Websites. With an effective pruning process, the clusters can be quickly discovered and updated to reflect the current or changing user patterns for Website administrators. This model can also be employed in different Web applications such as personalization and recommendation systems.