
Showing papers on "AdaBoost" published in 1998


Proceedings ArticleDOI
24 Jul 1998
TL;DR: Several improvements to Freund and Schapire’s AdaBoost boosting algorithm are described, particularly in a setting in which hypotheses may assign confidences to each of their predictions.
Abstract: We describe several improvements to Freund and Schapire’s AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a simplified analysis of AdaBoost in this setting, and we show how this analysis can be used to find improved parameter settings as well as a refined criterion for training weak hypotheses. We give a specific method for assigning confidences to the predictions of decision trees, a method closely related to one used by Quinlan. This method also suggests a technique for growing decision trees which turns out to be identical to one proposed by Kearns and Mansour. We focus next on how to apply the new boosting algorithms to multiclass classification problems, particularly to the multi-label case in which each example may belong to more than one class. We give two boosting methods for this problem, plus a third method based on output coding. One of these leads to a new method for handling the single-label case which is simpler but as effective as techniques suggested by Freund and Schapire. Finally, we give some experimental results comparing a few of the algorithms discussed in this paper.
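A minimal sketch of boosting with confidence-rated (real-valued) weak hypotheses may make the setting concrete. It is an illustration rather than the paper's exact algorithm: the stump leaves predict smoothed weighted log-odds, example weights are multiplied by exp(-y h(x)), and the helper names (fit_confidence_stump, boost) and the smoothing constant eps are assumptions.

```python
import numpy as np

def fit_confidence_stump(X, y, D, eps=1e-6):
    """Threshold stump whose two leaves output real-valued confidences
    0.5 * ln(W+/W-), the smoothed log-odds of the weighted labels in each leaf.
    The split minimizing Z = 2 * sum_leaf sqrt(W+ * W-) is chosen."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            left = X[:, j] <= thr
            leaf_pred, Z = [], 0.0
            for mask in (left, ~left):
                wp = D[mask & (y == 1)].sum() + eps   # weighted positives in leaf
                wn = D[mask & (y == -1)].sum() + eps  # weighted negatives in leaf
                leaf_pred.append(0.5 * np.log(wp / wn))
                Z += 2.0 * np.sqrt(wp * wn)
            if best is None or Z < best[0]:
                best = (Z, j, thr, leaf_pred)
    _, j, thr, (c_left, c_right) = best
    return lambda A: np.where(A[:, j] <= thr, c_left, c_right)

def boost(X, y, rounds=20):
    """Confidence-rated boosting: example weights are scaled by exp(-y * h(x))."""
    D = np.full(len(y), 1.0 / len(y))
    hyps = []
    for _ in range(rounds):
        h = fit_confidence_stump(X, y, D)
        hyps.append(h)
        D *= np.exp(-y * h(X))
        D /= D.sum()
    return lambda A: np.sign(sum(h(A) for h in hyps))

# toy usage on separable synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
clf = boost(X, y)
print("training error:", np.mean(clf(X) != y))
```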

2,900 citations


Proceedings ArticleDOI
01 Aug 1998
TL;DR: This paper discusses two learning algorithms for text filtering: modified Rocchio and a boosting algorithm called AdaBoost, and shows how both algorithms can be adapted to maximize any general utility matrix that associates cost for each pair of machine prediction and correct label.
Abstract: We discuss two learning algorithms for text filtering: modified Rocchio and a boosting algorithm called AdaBoost. We show how both algorithms can be adapted to maximize any general utility matrix that associates cost (or gain) for each pair of machine prediction and correct label. We first show that AdaBoost significantly outperforms another highly effective text filtering algorithm. We then compare AdaBoost and Rocchio over three large text filtering tasks. Overall both algorithms are comparable and are quite effective. AdaBoost produces better classifiers than Rocchio when the training collection contains a very large number of relevant documents. However, on these tasks, Rocchio runs much faster than AdaBoost.
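As a hedged illustration of adapting a filter to a general utility matrix, one simple post-processing step is to score documents with the learned classifier and pick the decision threshold that maximizes empirical utility on held-out data. The utility values and helper names below are placeholders, not those used in the paper.

```python
import numpy as np

# utility[(decision, true_label)]: gains for good decisions, costs for mistakes.
# These numbers are made-up placeholders; any such matrix can be plugged in.
UTILITY = {("keep", "relevant"): 3.0, ("keep", "irrelevant"): -1.0,
           ("discard", "relevant"): -2.0, ("discard", "irrelevant"): 0.0}

def empirical_utility(scores, labels, threshold):
    """Average utility obtained by keeping every document scoring >= threshold."""
    total = 0.0
    for s, lab in zip(scores, labels):
        decision = "keep" if s >= threshold else "discard"
        total += UTILITY[(decision, lab)]
    return total / len(labels)

def best_threshold(scores, labels):
    """Choose the threshold (among observed scores, plus -inf) with highest utility."""
    candidates = np.concatenate(([-np.inf], np.sort(scores)))
    return max(candidates, key=lambda t: empirical_utility(scores, labels, t))

# toy usage with synthetic classifier scores
rng = np.random.default_rng(1)
labels = ["relevant" if rng.random() < 0.3 else "irrelevant" for _ in range(500)]
scores = np.array([rng.normal(1.0 if lab == "relevant" else -1.0) for lab in labels])
t = best_threshold(scores, labels)
print("chosen threshold:", round(float(t), 3),
      "utility:", round(empirical_utility(scores, labels, t), 3))
```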

363 citations


Proceedings Article
Adam J. Grove, Dale Schuurmans
01 Jul 1998
TL;DR: The crucial question as to why boosting works so well in practice, and how to further improve upon it, remains mostly open, and it is concluded that no simple version of the minimum-margin story can be complete.
Abstract: The "minimum margin" of an ensemble classifier on a given training set is, roughly speaking, the smallest vote it gives to any correct training label. Recent work has shown that the Adaboost algorithm is particularly effective at producing ensembles with large minimum margins, and theory suggests that this may account for its success at reducing generalization error. We note, however, that the problem of finding good margins is closely related to linear programming, and we use this connection to derive and test new "LPboosting" algorithms that achieve better minimum margins than Adaboost. However, these algorithms do not always yield better generalization performance. In fact, more often the opposite is true. We report on a series of controlled experiments which show that no simple version of the minimum-margin story can be complete. We conclude that the crucial question as to why boosting works so well in practice, and how to further improve upon it, remains mostly open. Some of our experiments are interesting for another reason: we show that Adaboost sometimes does overfit, eventually. This may take a very long time to occur, however, which is perhaps why this phenomenon has gone largely unnoticed.
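The linear-programming connection can be sketched concretely: for a fixed set of weak hypotheses, the convex combination of them that maximizes the minimum margin is the solution of a small LP. The snippet below is only a sketch of that LP via scipy.optimize.linprog, not the authors' LP-boosting algorithms (which also choose the hypotheses); the matrix H with H[i, t] = y_i * h_t(x_i) is assumed to be precomputed.

```python
import numpy as np
from scipy.optimize import linprog

def max_min_margin_weights(H):
    """Given H[i, t] = y_i * h_t(x_i) for fixed weak hypotheses h_1..h_T, find
    convex weights alpha maximizing the minimum margin
        rho = min_i sum_t alpha_t * y_i * h_t(x_i).
    Decision variables are (alpha_1..alpha_T, rho); linprog minimizes -rho."""
    n, T = H.shape
    c = np.zeros(T + 1)
    c[-1] = -1.0                                      # maximize rho
    A_ub = np.hstack([-H, np.ones((n, 1))])           # rho - H[i] . alpha <= 0
    b_ub = np.zeros(n)
    A_eq = np.append(np.ones(T), 0.0).reshape(1, -1)  # sum(alpha) == 1
    b_eq = np.array([1.0])
    bounds = [(0, None)] * T + [(None, None)]         # alpha >= 0, rho free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:T], res.x[T]                        # weights, minimum margin

# toy usage: 5 examples, 3 weak hypotheses; entries are y_i * h_t(x_i) in {-1, +1}
H = np.array([[+1, -1, +1],
              [+1, +1, -1],
              [-1, +1, +1],
              [+1, +1, +1],
              [+1, -1, +1]], dtype=float)
alpha, rho = max_min_margin_weights(H)
print("weights:", alpha.round(3), "minimum margin:", round(float(rho), 3))
```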

275 citations


Proceedings Article
01 Jan 1998
TL;DR: The paper shows asymptotic experimental results with RBF networks for the binary classification case and proposes a regularized improved version of AdaBoost, called AdaBoostreg, whose usefulness is shown in numerical simulations.
Abstract: Recent work has shown that combining multiple versions of weak classifiers such as decision trees or neural networks results in reduced test set error. To study this in greater detail, we analyze the asymptotic behavior of AdaBoost. The theoretical analysis establishes the relation between the distribution of margins of the training examples and the generated voting classification rule. The paper shows asymptotic experimental results with RBF networks for the binary classification case underlining the theoretical findings. Our experiments show that AdaBoost does overfit, indeed. In order to avoid this and to get better generalization performance, we propose a regularized improved version of AdaBoost, which is called AdaBoostreg. We show the usefulness of this improvement in numerical simulations.
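For reference, the margin quantity analyzed in this line of work can be written as follows (standard notation, which may differ in detail from the paper's):

```latex
% Margin of training example (x_i, y_i) under the voting classifier
% f(x) = sum_t alpha_t h_t(x), with h_t(x) in {-1, +1} and alpha_t >= 0:
\[
  \operatorname{margin}(x_i, y_i)
    = \frac{y_i \sum_{t} \alpha_t \, h_t(x_i)}{\sum_{t} \alpha_t} \in [-1, 1],
\]
% and the cumulative margin distribution studied in such analyses is the
% fraction of training examples whose margin does not exceed a threshold:
\[
  F(\theta) = \frac{1}{N} \,\bigl\lvert \{\, i : \operatorname{margin}(x_i, y_i) \le \theta \,\} \bigr\rvert .
\]
```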

56 citations


Proceedings Article
01 Dec 1998
TL;DR: Three algorithms that allow for soft margin classification by introducing regularization with slack variables into the boosting concept are proposed: AdaBoostreg and regularized versions of linear and quadratic programming AdaBoost, which are compared experimentally with the support vector machine.
Abstract: Boosting methods maximize a hard classification margin and are known as powerful techniques that do not exhibit overfitting for low noise cases. For noisy data, however, boosting will still try to enforce a hard margin and thereby give too much weight to outliers, which leads to the dilemma of non-smooth fits and overfitting. Therefore we propose three algorithms to allow for soft margin classification by introducing regularization with slack variables into the boosting concept: (1) AdaBoostreg and regularized versions of (2) linear and (3) quadratic programming AdaBoost. Experiments show the usefulness of the proposed algorithms in comparison to another soft margin classifier: the support vector machine.
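A soft-margin linear program of the kind described here can be sketched in standard notation; the regularization constant C and the exact constraint set are assumptions and may differ from the paper's formulation:

```latex
% Soft-margin variant: maximize the margin rho while allowing each training
% example a slack xi_i, penalized by a regularization constant C.
\[
  \max_{\alpha,\,\xi,\,\rho} \;\; \rho - C \sum_{i=1}^{N} \xi_i
  \quad \text{subject to} \quad
  y_i \sum_{t} \alpha_t \, h_t(x_i) \ge \rho - \xi_i, \qquad
  \xi_i \ge 0, \qquad \alpha_t \ge 0, \qquad \sum_{t} \alpha_t = 1 .
\]
```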

52 citations


Proceedings Article
01 Dec 1998
TL;DR: Cumulative training margin distributions are compared for AdaBoost versus the "Direct Optimization Of Margins" (DOOM) algorithm, which sacrifices significant training error for improved test error.
Abstract: [Figure caption] Cumulative training margin distributions for AdaBoost versus our "Direct Optimization Of Margins" (DOOM) algorithm: the dark curve is AdaBoost, the light curve is DOOM. DOOM sacrifices significant training error for improved test error (horizontal marks on the margin = 0 line).
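A cumulative training margin distribution of the kind shown in that figure plots, for each threshold, the fraction of training examples whose normalized margin is at most that threshold. The sketch below uses synthetic margins and hypothetical names purely to illustrate what such curves display.

```python
import numpy as np
import matplotlib.pyplot as plt

def cumulative_margin_curve(margins, grid):
    """Fraction of training examples whose normalized margin is <= each grid value."""
    margins = np.sort(margins)
    return np.searchsorted(margins, grid, side="right") / len(margins)

# synthetic margins standing in for two ensembles (e.g. AdaBoost vs. DOOM)
rng = np.random.default_rng(2)
margins_a = np.clip(rng.normal(0.4, 0.3, 300), -1, 1)
margins_b = np.clip(rng.normal(0.3, 0.2, 300), -1, 1)
grid = np.linspace(-1, 1, 201)
plt.plot(grid, cumulative_margin_curve(margins_a, grid), "k-", label="ensemble A")
plt.plot(grid, cumulative_margin_curve(margins_b, grid), "k--", label="ensemble B")
plt.xlabel("margin")
plt.ylabel("cumulative fraction of training examples")
plt.legend()
plt.show()
```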

44 citations


Book ChapterDOI
02 Sep 1998
TL;DR: The paper shows asymptotic experimental results for the binary classification case; the relation between model complexity and noise in the training data, and how to improve AdaBoost-type algorithms in practice, are also discussed.
Abstract: Recent work has shown that combining multiple versions of weak classifiers such as decision trees or neural networks results in reduced test set error. To study this in greater detail, we analyze the asymptotic behavior of AdaBoost type algorithms. The theoretical analysis establishes the relation between the distribution of margins of the training examples and the generated voting classification rule. The paper shows asymptotic experimental results for the binary classification case underlining the theoretical findings. Finally, the relation between the model complexity and noise in the training data, and how to improve AdaBoost type algorithms in practice are discussed.
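One generic way to act on the complexity/noise trade-off discussed here, offered only as an illustration and not as the authors' proposed improvement, is to monitor held-out error over boosting rounds and stop at the best round. The sketch below does this with scikit-learn's AdaBoostClassifier on synthetic data with injected label noise.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# flip_y injects label noise, the regime in which boosting tends to overfit.
X, y = make_classification(n_samples=2000, n_features=20, flip_y=0.2, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.5, random_state=0)

clf = AdaBoostClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
val_err = [np.mean(pred != y_val) for pred in clf.staged_predict(X_val)]
best = int(np.argmin(val_err)) + 1
print(f"lowest validation error {min(val_err):.3f} at round {best} of {len(val_err)}")
```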

22 citations