Open Access Book
Data Mining: Practical Machine Learning Tools and Techniques
TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
Abstract:
Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining. Thorough updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including new material on Data Transformations, Ensemble Learning, Massive Data Sets, and Multi-instance Learning, plus a new version of the popular Weka machine learning software developed by the authors. Witten, Frank, and Hall include both tried-and-true techniques of today and methods at the leading edge of contemporary research.
* Provides a thorough grounding in machine learning concepts as well as practical advice on applying the tools and techniques to your data mining projects
* Offers concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods
* Includes a downloadable Weka software toolkit, a collection of machine learning algorithms for data mining tasks, in an updated, interactive interface. Algorithms in the toolkit cover data pre-processing, classification, regression, clustering, association rules, and visualization.
Citations
Book
Data Mining: Concepts and Techniques
TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Journal Article
The WEKA data mining software: an update
TL;DR: This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.
Journal Article
Classification and regression trees
TL;DR: This article gives an introduction to the subject of classification and regression trees by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples.
Journal Article
A review of feature selection techniques in bioinformatics
TL;DR: A basic taxonomy of feature selection techniques is provided, discussing their use, variety and potential in a number of both common and upcoming bioinformatics applications.
Book Chapter
Activity recognition from user-annotated acceleration data
Ling Bao, Stephen S. Intille
TL;DR: This is the first work to investigate performance of recognition algorithms with multiple, wire-free accelerometers on 20 activities using datasets annotated by the subjects themselves, and suggests that multiple accelerometers aid in recognition.
References
Journal Article
Efficient algorithms for agglomerative hierarchical clustering methods
TL;DR: A centroid SAHN clustering algorithm that requires O(n^2) time, in the worst case, for fixed k and for a family of dissimilarity measures including the Manhattan, Euclidean, Chebychev and all other Minkowski metrics is described.
Book Chapter
SLIQ: A Fast Scalable Classifier for Data Mining
TL;DR: Issues in building a scalable classifier are discussed and the design of SLIQ, a new classifier that uses a novel pre-sorting technique in the tree-growth phase to enable classification of disk-resident datasets is presented.
Book Chapter
The power of decision tables
TL;DR: Experimental results show that on artificial and real-world domains containing only discrete features, IDTM, an algorithm inducing decision tables, can sometimes outperform state-of-the-art algorithms such as C4.5.
Proceedings Article
The Alternating Decision Tree Learning Algorithm
Yoav Freund, Llew Mason
TL;DR: A new type of classification rule, the alternating decision tree, which is a generalization of decision trees, voted decision trees and voted decision stumps and generates rules that are usually smaller in size and thus easier to interpret.
Proceedings Article
Analysis and visualization of classifier performance: comparison under imprecise class and cost distributions
Foster Provost, Tom Fawcett
TL;DR: The ROC convex hull method combines techniques from ROC analysis, decision analysis and computational geometry, and adapts them to the particulars of analyzing learned classifiers to present a method for the comparison of classifier performance that is robust to imprecise class distributions and misclassification costs.