Journal ArticleDOI
Unbiased Recursive Partitioning: A Conditional Inference Framework
TLDR
A unified framework for recursive partitioning is proposed which embeds tree-structured regression models into a well defined theory of conditional inference procedures, and it is shown that the prediction accuracy of trees with early stopping is equivalent to the prediction accuracy of pruned trees with unbiased variable selection.
Abstract
Recursive binary partitioning is a popular tool for regression analysis. Two fundamental problems of exhaustive search procedures usually applied to fit such models have been known for a long time: overfitting and a selection bias towards covariates with many possible splits or missing values. While pruning procedures are able to solve the overfitting problem, the variable selection bias still seriously affects the interpretability of tree-structured regression models. For some special cases unbiased procedures have been suggested, however lacking a common theoretical foundation. We propose a unified framework for recursive partitioning which embeds tree-structured regression models into a well defined theory of conditional inference procedures. Stopping criteria based on multiple test procedures are implemented and it is shown that the predictive performance of the resulting trees is as good as the performance of established exhaustive search procedures. It turns out that the partitions and therefore the…
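The core idea in the abstract — separating variable selection from split selection via association tests with a multiplicity-adjusted stopping rule — can be sketched in a few lines. The following is a minimal illustration only, not the paper's actual algorithm (which works with conditional inference distributions and quadratic-form test statistics); `perm_pvalue` and `select_split_variable` are hypothetical helper names, and a simple Bonferroni adjustment stands in for the paper's multiple test procedures:

```python
import random

def perm_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value for association between covariate x and response y,
    using the absolute centered cross-product sum as the test statistic."""
    rng = random.Random(seed)
    xm, ym = sum(x) / len(x), sum(y) / len(y)
    xc = [v - xm for v in x]
    yc = [v - ym for v in y]
    obs = abs(sum(a * b for a, b in zip(xc, yc)))
    hits = 0
    yp = yc[:]
    for _ in range(n_perm):
        rng.shuffle(yp)  # break the x-y association, keep both marginals
        if abs(sum(a * b for a, b in zip(xc, yp))) >= obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def select_split_variable(X_cols, y, alpha=0.05):
    """Unbiased variable selection: test each covariate's association with y,
    Bonferroni-adjust the p-values, and stop (return None) if none is
    significant; otherwise return the index of the most significant covariate."""
    pvals = [perm_pvalue(col, y, seed=i) for i, col in enumerate(X_cols)]
    m = len(pvals)
    adj = [min(1.0, p * m) for p in pvals]
    best = min(range(m), key=lambda j: adj[j])
    return best if adj[best] < alpha else None
```

Because the variable is chosen by a p-value rather than by the best achievable split criterion, a covariate with many possible splits gains no advantage, and the adjusted significance threshold doubles as an early-stopping rule.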
Citations
Journal ArticleDOI
Classification and regression trees
TL;DR: This article gives an introduction to the subject of classification and regression trees by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples.
Journal ArticleDOI
Building Predictive Models in R Using the caret Package
TL;DR: The caret package, short for classification and regression training, contains numerous tools for developing predictive models using the rich set of models available in R to simplify model training and tuning across a wide variety of modeling techniques.
Book
Applied Predictive Modeling
Max Kuhn, Kjell Johnson +1 more
TL;DR: A practical guide to the predictive modeling process, covering data pre-processing, model training and tuning, and performance estimation, with worked examples in R.
Journal ArticleDOI
Bias in random forest variable importance measures: Illustrations, sources and a solution
TL;DR: An alternative implementation of random forests is proposed, that provides unbiased variable selection in the individual classification trees, that can be used reliably for variable selection even in situations where the potential predictor variables vary in their scale of measurement or their number of categories.
Journal ArticleDOI
Conditional variable importance for random forests
TL;DR: A new, conditional permutation scheme is developed for the computation of the variable importance measure that reflects the true impact of each predictor variable more reliably than the original marginal approach.
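The marginal permutation scheme that this conditional approach refines can be sketched generically: permute one predictor column, then measure how much the prediction error increases. This is an illustrative sketch under stated assumptions — it implements the original marginal scheme, not the conditional permutation scheme proposed in the paper, and `permutation_importance` and the toy `predict` are hypothetical names:

```python
import random

def permutation_importance(predict, X, y, col, n_rep=50, seed=0):
    """Mean increase in squared error when column `col` of X (a list of
    row lists) is permuted, breaking its marginal association with y."""
    rng = random.Random(seed)

    def mse(pred):
        return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)

    base = mse(predict(X))
    vals = [row[col] for row in X]
    total = 0.0
    for _ in range(n_rep):
        perm = vals[:]
        rng.shuffle(perm)  # permute only the column under test
        Xp = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, perm)]
        total += mse(predict(Xp))
    return total / n_rep - base
```

A predictor the model ignores scores zero under this scheme; the conditional variant additionally permutes within strata of correlated covariates so that a merely correlated predictor is not credited with the importance of the truly influential one.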
References
Book
Applied Logistic Regression
David W. Hosmer, Stanley Lemeshow +1 more
TL;DR: Hosmer and Lemeshow provide an accessible introduction to the logistic regression model while incorporating advances of the last decade, including a variety of software packages for the analysis of data sets.
Journal ArticleDOI
Applied Logistic Regression.
TL;DR: Applied Logistic Regression, Third Edition provides an easily accessible introduction to the logistic regression model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables.
Journal ArticleDOI
Classification and Regression Trees.
Book
C4.5: Programs for Machine Learning
TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting.