
Pacific-Asia Conference on Knowledge Discovery and Data Mining 

About: The Pacific-Asia Conference on Knowledge Discovery and Data Mining is an academic conference. It publishes mainly in the areas of computer science and cluster analysis. Over its lifetime, the conference has published 1,889 papers, which have received 23,058 citations.


Papers
Book Chapter (DOI)
14 Apr 2013
TL;DR: This work proposes a theoretically and practically improved density-based, hierarchical clustering method that provides a clustering hierarchy from which a simplified tree of significant clusters can be constructed, together with a novel cluster stability measure.
Abstract: We propose a theoretically and practically improved density-based, hierarchical clustering method, providing a clustering hierarchy from which a simplified tree of significant clusters can be constructed. For obtaining a “flat” partition consisting of only the most significant clusters (possibly corresponding to different density thresholds), we propose a novel cluster stability measure, formalize the problem of maximizing the overall stability of selected clusters, and formulate an algorithm that computes an optimal solution to this problem. We demonstrate that our approach outperforms the current, state-of-the-art, density-based clustering methods on a wide variety of real world data.
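The core building block of the hierarchy described above can be sketched with standard NumPy/SciPy tools: single-linkage clustering over the mutual reachability distance, which is the maximum of two points' core distances and their actual distance. This is a simplified illustration under our own parameter choices, not the authors' full algorithm, which additionally condenses the hierarchy and selects clusters by maximizing stability.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist, squareform

def mutual_reachability(X, min_samples=3):
    """Mutual reachability distance: max(core_k(a), core_k(b), d(a, b))."""
    D = squareform(pdist(X))
    # core distance = distance to the min_samples-th nearest neighbour
    core = np.sort(D, axis=1)[:, min_samples]
    mr = np.maximum(D, np.maximum.outer(core, core))
    np.fill_diagonal(mr, 0.0)
    return mr

# two well-separated synthetic blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)),
               rng.normal(5, 0.3, (20, 2))])

# single linkage over mutual reachability yields the density hierarchy
Z = linkage(squareform(mutual_reachability(X), checks=False), method="single")
labels = fcluster(Z, t=2, criterion="maxclust")
```

Cutting the resulting dendrogram at different levels corresponds to clusterings at different density thresholds, which is exactly the hierarchy the paper's stability measure selects from.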

1,132 citations

Book Chapter (DOI)
26 May 2004
TL;DR: This work presents a new technique for combining text features with features indicating relationships between classes, usable with any discriminative algorithm; the new methods beat the accuracy of existing methods with statistically significant improvements.
Abstract: In this paper we present methods of enhancing existing discriminative classifiers for multi-labeled predictions. Discriminative methods like support vector machines perform very well for uni-labeled text classification tasks. Multi-labeled classification is a harder task that has received relatively less attention. In the multi-labeled setting, classes are often related to each other or part of an is-a hierarchy. We present a new technique for combining text features and features indicating relationships between classes, which can be used with any discriminative algorithm. We also present two enhancements to the margin of SVMs for building better models in the presence of overlapping classes. We present results of experiments on real world text benchmark datasets. Our new methods beat the accuracy of existing methods with statistically significant improvements.
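The baseline setting the paper builds on, discriminative one-vs-rest SVMs over text features, can be sketched with scikit-learn. This illustrates the multi-label setup only, not the authors' class-relationship features or margin enhancements; the toy documents and label sets are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# toy corpus: each document may carry several labels at once
docs = ["stocks fell sharply today",
        "the team won the championship game",
        "market rally lifts tech stocks",
        "player injured before the game"]
labels = [{"finance"}, {"sports"}, {"finance", "tech"}, {"sports"}]

# binarize the label sets into a documents x classes indicator matrix
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)
X = TfidfVectorizer().fit_transform(docs)

# one independent binary SVM per class (one-vs-rest decomposition)
clf = OneVsRestClassifier(LinearSVC()).fit(X, Y)
pred = mlb.inverse_transform(clf.predict(X))
```

The paper's contribution is to augment the per-class feature vectors with features derived from related classes, rather than treating each binary problem as fully independent as this sketch does.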

746 citations

Book Chapter (DOI)
18 Apr 2000
TL;DR: A novel data structure, called the Web access pattern tree, or WAP-tree for short, is developed for efficient mining of access patterns from pieces of logs.
Abstract: With the explosive growth of data available on the World Wide Web, discovery and analysis of useful information from the World Wide Web becomes a practical necessity. A Web access pattern, which is a sequence of accesses pursued by users frequently, is a kind of interesting and useful knowledge in practice. In this paper, we study the problem of mining access patterns from Web logs efficiently. A novel data structure, called the Web access pattern tree, or WAP-tree for short, is developed for efficient mining of access patterns from pieces of logs. The Web access pattern tree stores highly compressed, critical information for access pattern mining and facilitates the development of novel algorithms for mining access patterns in large sets of log pieces. Our algorithm can find access patterns from Web logs quite efficiently. The experimental and performance studies show that our method is in general an order of magnitude faster than conventional methods.
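The central compression idea, folding a set of access sequences into a count-annotated prefix tree, can be sketched as follows. This is a simplified illustration: the real WAP-tree additionally maintains header links between nodes carrying the same event label to support conditional pattern mining, which this sketch omits, and the class and function names are ours.

```python
from collections import defaultdict

class WAPNode:
    """Node in a simplified Web access pattern tree: a prefix tree with counts."""
    def __init__(self):
        self.count = 0
        self.children = defaultdict(WAPNode)

def build_wap_tree(sequences):
    """Insert each access sequence along a root path, incrementing counts."""
    root = WAPNode()
    for seq in sequences:
        node = root
        for event in seq:
            node = node.children[event]
            node.count += 1
    return root

# toy session logs: each string is one user's sequence of page accesses
logs = [list("abac"), list("abcac"), list("babac"), list("abacc")]
tree = build_wap_tree(logs)
# number of sessions beginning with the prefix a -> b
print(tree.children["a"].children["b"].count)  # prints 3
```

Shared prefixes are stored once with a count, which is what makes the structure "highly compressed" relative to scanning the raw log pieces.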

572 citations

Book Chapter (DOI)
16 Apr 2001
TL;DR: In this article, a Weight Adjusted k-Nearest Neighbor (WAKNN) classification method is proposed that learns feature weights using a greedy hill-climbing technique, along with two performance optimizations that improve computational performance by a few orders of magnitude without compromising classification quality.
Abstract: Text categorization presents unique challenges due to the large number of attributes present in the data set, the large number of training samples, attribute dependency, and multi-modality of categories. Existing classification techniques have limited applicability to data sets of this nature. In this paper, we present a Weight Adjusted k-Nearest Neighbor (WAKNN) classification method that learns feature weights based on a greedy hill climbing technique. We also present two performance optimizations of WAKNN that improve the computational performance by a few orders of magnitude without compromising classification quality. We experimentally evaluated WAKNN on 52 document data sets from a variety of domains and compared its performance against several classification algorithms, such as C4.5, RIPPER, Naive-Bayesian, PEBLS and VSM. Experimental results on these data sets confirm that WAKNN consistently outperforms the other existing classification algorithms.
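The core classification rule, k-nearest-neighbor voting under a feature-weighted similarity, can be sketched as follows. This omits the greedy hill-climbing loop that WAKNN uses to learn the weight vector; the function and parameter names are ours, and the toy data is invented.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, w, k=3):
    """Classify x by majority vote among the k most similar training points
    under a weighted cosine similarity (weights rescale each feature)."""
    Xw, xw = X_train * w, x * w
    sims = (Xw @ xw) / (np.linalg.norm(Xw, axis=1) * np.linalg.norm(xw) + 1e-12)
    nearest = np.argsort(sims)[-k:]
    vals, counts = np.unique(y_train[nearest], return_counts=True)
    return vals[np.argmax(counts)]

# toy 2-feature data: class 0 dominated by feature 0, class 1 by feature 1
X_train = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
y_train = np.array([0, 0, 1, 1])
pred = weighted_knn_predict(X_train, y_train, np.array([0.8, 0.2]), w=np.ones(2))
print(pred)  # predicts class 0
```

In WAKNN proper, the weight vector w is not fixed at ones: hill climbing perturbs one weight at a time and keeps the change if a classification-quality objective on the training set improves.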

446 citations

Book Chapter (DOI)
26 May 2004
TL;DR: It is argued that a test has low replicability if its outcome strongly depends on the particular random partitioning of the data that is used to perform it.
Abstract: Empirical research in learning algorithms for classification tasks generally requires the use of significance tests. The quality of a test is typically judged on Type I error (how often the test indicates a difference when it should not) and Type II error (how often it indicates no difference when it should). In this paper we argue that the replicability of a test is also of importance. We say that a test has low replicability if its outcome strongly depends on the particular random partitioning of the data that is used to perform it. We present empirical measures of replicability and use them to compare the performance of several popular tests in a realistic setting involving standard learning algorithms and benchmark datasets. Based on our results we give recommendations on which test to use.
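An empirical replicability measure of this kind can be sketched as follows: repeat a random train/test partitioning, record which of two classifiers wins on each repetition, and take the fraction of repetitions that agree with the majority outcome. The classifiers, dataset, and 20-repetition count here are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

outcomes = []
for seed in range(20):
    # each repetition uses a different random partitioning of the data
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.5, random_state=seed)
    acc_knn = KNeighborsClassifier().fit(Xtr, ytr).score(Xte, yte)
    acc_tree = DecisionTreeClassifier(random_state=0).fit(Xtr, ytr).score(Xte, yte)
    outcomes.append(acc_knn > acc_tree)

# replicability: fraction of repetitions agreeing with the majority outcome
rep = max(np.mean(outcomes), 1 - np.mean(outcomes))
```

A value near 1.0 means the comparison's outcome barely depends on the partitioning; a value near 0.5 is the low-replicability situation the paper warns about.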

345 citations

Performance Metrics
No. of papers from the Conference in previous years
Year    Papers
2023    131
2022    124
2021    174
2020    154
2019    167
2018    196