Classification and Regression Trees.

doi:10.2307/2288003

Journal Article•DOI•

John Van Ryzin, Leo Breiman, Jerome H. Friedman, Richard A. Olshen, Charles J. Stone

01 Mar 1986-Journal of the American Statistical Association-Vol. 81, Iss: 393, pp 253

About: This article is published in Journal of the American Statistical Association. The article was published on 1986-03-01. It has received 21,694 citations to date. The article focuses on the topics: Classification Tree Method & Logistic model tree.

Citations

Journal Article•DOI•

Regression Shrinkage and Selection via the Lasso

Robert Tibshirani

01 Jan 1996-Journal of the royal statistical society series b-methodological

TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.

Abstract: SUMMARY We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
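The constrained problem described in this abstract can be written out as follows. This is a sketch only: the symbols (responses y_i, predictors x_ij, intercept alpha, coefficients beta_j, bound t, sample size N, number of predictors p) are notation assumed here for illustration and do not appear in the abstract itself.

```latex
\[
(\hat{\alpha}, \hat{\beta})
  = \arg\min_{\alpha,\, \beta}
    \sum_{i=1}^{N} \Bigl( y_i - \alpha - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^{2}
  \quad \text{subject to} \quad
  \sum_{j=1}^{p} \lvert \beta_j \rvert \le t .
\]
```

Under this notation, a smaller bound t tightens the constraint, which is what tends to push some coefficients to exactly 0.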

40,785 citations

Book•

Data Mining: Concepts and Techniques

Jiawei Han¹, Micheline Kamber², Jian Pei²•Institutions (2)

University of Illinois at Urbana–Champaign¹, Simon Fraser University²

08 Sep 2000

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

Abstract: The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, it's still always evolving and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge. Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining stream, mining social networks, and mining spatial, multimedia and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges. * Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects. * Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields. * Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

23,600 citations

Book•

Data Mining: Practical Machine Learning Tools and Techniques

Ian H. Witten, Eibe Frank, Mark Hall

25 Oct 1999

TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.

Abstract: Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining. Thorough updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including new material on Data Transformations, Ensemble Learning, Massive Data Sets, Multi-instance Learning, plus a new version of the popular Weka machine learning software developed by the authors. Witten, Frank, and Hall include both tried-and-true techniques of today as well as methods at the leading edge of contemporary research. * Provides a thorough grounding in machine learning concepts as well as practical advice on applying the tools and techniques to your data mining projects. * Offers concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods. * Includes downloadable Weka software toolkit, a collection of machine learning algorithms for data mining tasks, in an updated, interactive interface. Algorithms in toolkit cover: data pre-processing, classification, regression, clustering, association rules, visualization.

20,196 citations

Journal Article•DOI•

The WEKA data mining software: an update

Mark Hall, Eibe Frank¹, Geoffrey Holmes¹, Bernhard Pfahringer¹, Peter Reutemann¹, Ian H. Witten¹•Institutions (1)

University of Waikato¹

16 Nov 2009-Sigkdd Explorations

TL;DR: This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.

Abstract: More than twelve years have elapsed since the first public release of WEKA. In that time, the software has been rewritten entirely from scratch, evolved substantially and now accompanies a text on data mining [35]. These days, WEKA enjoys widespread acceptance in both academia and business, has an active community, and has been downloaded more than 1.4 million times since being placed on SourceForge in April 2000. This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.

19,603 citations

Journal Article•DOI•

The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis.

Frank C. Arnett¹, Steven M. Edworthy², Daniel A. Bloch², Dennis J. McShane², James F. Fries², Norman S. Cooper³, L. A. Healey, Stephen R. Kaplan⁴, Matthew H. Liang⁵, Harvinder S. Luthra⁶, Thomas A. Medsger⁷, Donald M. Mitchell⁸, David H. Neustadt⁹, Robert S. Pinals¹⁰, Jane G. Schaller¹¹, John T. Sharp, Ronald L. Wilder, Gene G. Hunder⁶•Institutions (11)

University of Texas Health Science Center at Houston¹, Stanford University², New York University³, Brown University⁴, Harvard University⁵, Mayo Clinic⁶, University of Pittsburgh⁷, University of Saskatchewan⁸, University of Louisville⁹, Rutgers University¹⁰, Tufts Medical Center¹¹

01 Mar 1988-Arthritis & Rheumatism

TL;DR: The revised criteria for the classification of rheumatoid arthritis (RA) were formulated from a computerized analysis of 262 contemporary, consecutively studied patients with RA and 262 control subjects with rheumatic diseases other than RA (non-RA).

Abstract: The revised criteria for the classification of rheumatoid arthritis (RA) were formulated from a computerized analysis of 262 contemporary, consecutively studied patients with RA and 262 control subjects with rheumatic diseases other than RA (non-RA). The new criteria are as follows: 1) morning stiffness in and around joints lasting at least 1 hour before maximal improvement; 2) soft tissue swelling (arthritis) of 3 or more joint areas observed by a physician; 3) swelling (arthritis) of the proximal interphalangeal, metacarpophalangeal, or wrist joints; 4) symmetric swelling (arthritis); 5) rheumatoid nodules; 6) the presence of rheumatoid factor; and 7) radiographic erosions and/or periarticular osteopenia in hand and/or wrist joints. Criteria 1 through 4 must have been present for at least 6 weeks. Rheumatoid arthritis is defined by the presence of 4 or more criteria, and no further qualifications (classic, definite, or probable) or list of exclusions are required. In addition, a "classification tree" schema is presented which performs equally as well as the traditional (4 of 7) format. The new criteria demonstrated 91-94% sensitivity and 89% specificity for RA when compared with non-RA rheumatic disease control subjects.
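The traditional "4 of 7" format in this abstract is a simple counting rule, and a minimal sketch of it might look like the following. The criterion labels, function names, and input shapes are hypothetical and chosen only for illustration; the sketch assumes each criterion has already been assessed by a clinician (for example, that morning stiffness lasted at least 1 hour) and only applies the 6-week and 4-or-more conditions stated above. It does not implement the "classification tree" schema, whose structure the abstract does not give.

```python
# Hypothetical labels for the seven criteria listed in the abstract.
# Criteria 1 through 4 must have been present for at least 6 weeks.
DURATION_CRITERIA = (
    "morning_stiffness",
    "arthritis_3_or_more_areas",
    "hand_joint_arthritis",
    "symmetric_arthritis",
)
# Criteria 5 through 7 carry no duration requirement in the abstract.
OTHER_CRITERIA = (
    "rheumatoid_nodules",
    "rheumatoid_factor",
    "radiographic_changes",
)
MIN_WEEKS = 6      # minimum duration for criteria 1 through 4
MIN_CRITERIA = 4   # "4 or more criteria"


def criteria_met(weeks_present, other_present):
    """Return the set of criteria that count toward the total.

    weeks_present: dict mapping a duration criterion to weeks observed.
    other_present: set of criteria 5 through 7 that were observed.
    """
    met = {
        name for name in DURATION_CRITERIA
        if weeks_present.get(name, 0) >= MIN_WEEKS
    }
    met |= {name for name in OTHER_CRITERIA if name in other_present}
    return met


def classify_ra(weeks_present, other_present):
    """Apply the 4-of-7 rule: True when at least four criteria are met."""
    return len(criteria_met(weeks_present, other_present)) >= MIN_CRITERIA
```

For instance, under these assumptions a record with three of the first four criteria present for 7 to 8 weeks plus a positive rheumatoid factor meets four criteria, while the same record with one of those three present for only 3 weeks meets three.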

19,409 citations
