
Showing papers on "AdaBoost" published in 1998


Proceedings ArticleDOI
24 Jul 1998
TL;DR: Several improvements to Freund and Schapire’s AdaBoost boosting algorithm are described, particularly in a setting in which hypotheses may assign confidences to each of their predictions.
Abstract: We describe several improvements to Freund and Schapire’s AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a simplified analysis of AdaBoost in this setting, and we show how this analysis can be used to find improved parameter settings as well as a refined criterion for training weak hypotheses. We give a specific method for assigning confidences to the predictions of decision trees, a method closely related to one used by Quinlan. This method also suggests a technique for growing decision trees which turns out to be identical to one proposed by Kearns and Mansour. We focus next on how to apply the new boosting algorithms to multiclass classification problems, particularly to the multi-label case in which each example may belong to more than one class. We give two boosting methods for this problem, plus a third method based on output coding. One of these leads to a new method for handling the single-label case which is simpler but as effective as techniques suggested by Freund and Schapire. Finally, we give some experimental results comparing a few of the algorithms discussed in this paper.
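A minimal sketch of boosting with confidence-rated (real-valued) weak hypotheses may make the setting concrete. It is an illustration rather than the paper's exact algorithm: the stump leaves predict smoothed weighted log-odds, example weights are multiplied by exp(-y h(x)), and the helper names (fit_confidence_stump, boost) and the smoothing constant eps are assumptions.

```python
import numpy as np

def fit_confidence_stump(X, y, D, eps=1e-6):
    """Threshold stump whose two leaves output real-valued confidences
    0.5 * ln(W+/W-), the smoothed log-odds of the weighted labels in each leaf.
    The split minimizing Z = 2 * sum_leaf sqrt(W+ * W-) is chosen."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            left = X[:, j] <= thr
            leaf_pred, Z = [], 0.0
            for mask in (left, ~left):
                wp = D[mask & (y == 1)].sum() + eps   # weighted positives in leaf
                wn = D[mask & (y == -1)].sum() + eps  # weighted negatives in leaf
                leaf_pred.append(0.5 * np.log(wp / wn))
                Z += 2.0 * np.sqrt(wp * wn)
            if best is None or Z < best[0]:
                best = (Z, j, thr, leaf_pred)
    _, j, thr, (c_left, c_right) = best
    return lambda A: np.where(A[:, j] <= thr, c_left, c_right)

def boost(X, y, rounds=20):
    """Confidence-rated boosting: example weights are scaled by exp(-y * h(x))."""
    D = np.full(len(y), 1.0 / len(y))
    hyps = []
    for _ in range(rounds):
        h = fit_confidence_stump(X, y, D)
        hyps.append(h)
        D *= np.exp(-y * h(X))
        D /= D.sum()
    return lambda A: np.sign(sum(h(A) for h in hyps))

# toy usage on separable synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
clf = boost(X, y)
print("training error:", np.mean(clf(X) != y))
```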

2,900 citations


Proceedings ArticleDOI
01 Aug 1998
TL;DR: This paper discusses two learning algorithms for text filtering: modified Rocchio and a boosting algorithm called AdaBoost, and shows how both algorithms can be adapted to maximize any general utility matrix that associates cost for each pair of machine prediction and correct label.
Abstract: We discuss two learning algorithms for text filtering: modified Rocchio and a boosting algorithm called AdaBoost. We show how both algorithms can be adapted to maximize any general utility matrix that associates cost (or gain) for each pair of machine prediction and correct label. We first show that AdaBoost significantly outperforms another highly effective text filtering algorithm. We then compare AdaBoost and Rocchio over three large text filtering tasks. Overall both algorithms are comparable and are quite effective. AdaBoost produces better classifiers than Rocchio when the training collection contains a very large number of relevant documents. However, on these tasks, Rocchio runs much faster than AdaBoost.
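As a hedged illustration of adapting a filter to a general utility matrix, one simple post-processing step is to score documents with the learned classifier and pick the decision threshold that maximizes empirical utility on held-out data. The utility values and helper names below are placeholders, not those used in the paper.

```python
import numpy as np

# utility[(decision, true_label)]: gains for good decisions, costs for mistakes.
# These numbers are made-up placeholders; any such matrix can be plugged in.
UTILITY = {("keep", "relevant"): 3.0, ("keep", "irrelevant"): -1.0,
           ("discard", "relevant"): -2.0, ("discard", "irrelevant"): 0.0}

def empirical_utility(scores, labels, threshold):
    """Average utility obtained by keeping every document scoring >= threshold."""
    total = 0.0
    for s, lab in zip(scores, labels):
        decision = "keep" if s >= threshold else "discard"
        total += UTILITY[(decision, lab)]
    return total / len(labels)

def best_threshold(scores, labels):
    """Choose the threshold (among observed scores, plus -inf) with highest utility."""
    candidates = np.concatenate(([-np.inf], np.sort(scores)))
    return max(candidates, key=lambda t: empirical_utility(scores, labels, t))

# toy usage with synthetic classifier scores
rng = np.random.default_rng(1)
labels = ["relevant" if rng.random() < 0.3 else "irrelevant" for _ in range(500)]
scores = np.array([rng.normal(1.0 if lab == "relevant" else -1.0) for lab in labels])
t = best_threshold(scores, labels)
print("chosen threshold:", round(float(t), 3),
      "utility:", round(empirical_utility(scores, labels, t), 3))
```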

363 citations


Proceedings Article
Adam J. Grove, Dale Schuurmans
01 Jul 1998
TL;DR: The crucial question as to why boosting works so well in practice, and how to further improve upon it, remains mostly open, and it is concluded that no simple version of the minimum-margin story can be complete.
Abstract: The "minimum margin" of an ensemble classifier on a given training set is, roughly speaking, the smallest vote it gives to any correct training label. Recent work has shown that the Adaboost algorithm is particularly effective at producing ensembles with large minimum margins, and theory suggests that this may account for its success at reducing generalization error. We note, however, that the problem of finding good margins is closely related to linear programming, and we use this connection to derive and test new "LPboosting" algorithms that achieve better minimum margins than Adaboost. However, these algorithms do not always yield better generalization performance. In fact, more often the opposite is true. We report on a series of controlled experiments which show that no simple version of the minimum-margin story can be complete. We conclude that the crucial question as to why boosting works so well in practice, and how to further improve upon it, remains mostly open. Some of our experiments are interesting for another reason: we show that Adaboost sometimes does overfit, eventually. This may take a very long time to occur, however, which is perhaps why this phenomenon has gone largely unnoticed.
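The linear-programming connection can be sketched concretely: for a fixed set of weak hypotheses, the convex combination of them that maximizes the minimum margin is the solution of a small LP. The snippet below is only a sketch of that LP via scipy.optimize.linprog, not the authors' LP-boosting algorithms (which also choose the hypotheses); the matrix H with H[i, t] = y_i * h_t(x_i) is assumed to be precomputed.

```python
import numpy as np
from scipy.optimize import linprog

def max_min_margin_weights(H):
    """Given H[i, t] = y_i * h_t(x_i) for fixed weak hypotheses h_1..h_T, find
    convex weights alpha maximizing the minimum margin
        rho = min_i sum_t alpha_t * y_i * h_t(x_i).
    Decision variables are (alpha_1..alpha_T, rho); linprog minimizes -rho."""
    n, T = H.shape
    c = np.zeros(T + 1)
    c[-1] = -1.0                                      # maximize rho
    A_ub = np.hstack([-H, np.ones((n, 1))])           # rho - H[i] . alpha <= 0
    b_ub = np.zeros(n)
    A_eq = np.append(np.ones(T), 0.0).reshape(1, -1)  # sum(alpha) == 1
    b_eq = np.array([1.0])
    bounds = [(0, None)] * T + [(None, None)]         # alpha >= 0, rho free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:T], res.x[T]                        # weights, minimum margin

# toy usage: 5 examples, 3 weak hypotheses; entries are y_i * h_t(x_i) in {-1, +1}
H = np.array([[+1, -1, +1],
              [+1, +1, -1],
              [-1, +1, +1],
              [+1, +1, +1],
              [+1, -1, +1]], dtype=float)
alpha, rho = max_min_margin_weights(H)
print("weights:", alpha.round(3), "minimum margin:", round(float(rho), 3))
```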

275 citations


Proceedings Article
01 Jan 1998
TL;DR: The paper shows asymptotic experimental results with RBF networks for the binary classification case and proposes a regularized improved version of AdaBoost, called AdaBoostreg, whose usefulness is shown in numerical simulations.
Abstract: Recent work has shown that combining multiple versions of weak classifiers such as decision trees or neural networks results in reduced test set error. To study this in greater detail, we analyze the asymptotic behavior of AdaBoost. The theoretical analysis establishes the relation between the distribution of margins of the training examples and the generated voting classification rule. The paper shows asymptotic experimental results with RBF networks for the binary classification case underlining the theoretical findings. Our experiments show that AdaBoost does overfit, indeed. In order to avoid this and to get better generalization performance, we propose a regularized improved version of AdaBoost, which is called AdaBoostreg. We show the usefulness of this improvement in numerical simulations.
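For reference, the margin quantity analyzed in this line of work can be written as follows (standard notation, which may differ in detail from the paper's):

```latex
% Margin of training example (x_i, y_i) under the voting classifier
% f(x) = sum_t alpha_t h_t(x), with h_t(x) in {-1, +1} and alpha_t >= 0:
\[
  \operatorname{margin}(x_i, y_i)
    = \frac{y_i \sum_{t} \alpha_t \, h_t(x_i)}{\sum_{t} \alpha_t} \in [-1, 1],
\]
% and the cumulative margin distribution studied in such analyses is the
% fraction of training examples whose margin does not exceed a threshold:
\[
  F(\theta) = \frac{1}{N} \,\bigl\lvert \{\, i : \operatorname{margin}(x_i, y_i) \le \theta \,\} \bigr\rvert .
\]
```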

56 citations


Proceedings Article
01 Dec 1998
TL;DR: Three algorithms that allow for soft margin classification by introducing regularization with slack variables into the boosting concept are proposed: AdaBoostreg and regularized versions of linear and quadratic programming AdaBoost, which are compared experimentally with the support vector machine.
Abstract: Boosting methods maximize a hard classification margin and are known as powerful techniques that do not exhibit overfitting for low noise cases. For noisy data, however, boosting will still try to enforce a hard margin and thereby give too much weight to outliers, which leads to the dilemma of non-smooth fits and overfitting. Therefore we propose three algorithms to allow for soft margin classification by introducing regularization with slack variables into the boosting concept: (1) AdaBoostreg and regularized versions of (2) linear and (3) quadratic programming AdaBoost. Experiments show the usefulness of the proposed algorithms in comparison to another soft margin classifier: the support vector machine.
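A soft-margin linear program of the kind described here can be sketched in standard notation; the regularization constant C and the exact constraint set are assumptions and may differ from the paper's formulation:

```latex
% Soft-margin variant: maximize the margin rho while allowing each training
% example a slack xi_i, penalized by a regularization constant C.
\[
  \max_{\alpha,\,\xi,\,\rho} \;\; \rho - C \sum_{i=1}^{N} \xi_i
  \quad \text{subject to} \quad
  y_i \sum_{t} \alpha_t \, h_t(x_i) \ge \rho - \xi_i, \qquad
  \xi_i \ge 0, \qquad \alpha_t \ge 0, \qquad \sum_{t} \alpha_t = 1 .
\]
```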

52 citations


Proceedings Article
01 Dec 1998
TL;DR: Cumulative training margin distributions are compared for AdaBoost versus the "Direct Optimization Of Margins" (DOOM) algorithm, which sacrifices significant training error for improved test error.
Abstract: [Figure caption] Cumulative training margin distributions for AdaBoost versus our "Direct Optimization Of Margins" (DOOM) algorithm: the dark curve is AdaBoost, the light curve is DOOM. DOOM sacrifices significant training error for improved test error (horizontal marks on the margin = 0 line).
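A cumulative training margin distribution of the kind shown in that figure plots, for each threshold, the fraction of training examples whose normalized margin is at most that threshold. The sketch below uses synthetic margins and hypothetical names purely to illustrate what such curves display.

```python
import numpy as np
import matplotlib.pyplot as plt

def cumulative_margin_curve(margins, grid):
    """Fraction of training examples whose normalized margin is <= each grid value."""
    margins = np.sort(margins)
    return np.searchsorted(margins, grid, side="right") / len(margins)

# synthetic margins standing in for two ensembles (e.g. AdaBoost vs. DOOM)
rng = np.random.default_rng(2)
margins_a = np.clip(rng.normal(0.4, 0.3, 300), -1, 1)
margins_b = np.clip(rng.normal(0.3, 0.2, 300), -1, 1)
grid = np.linspace(-1, 1, 201)
plt.plot(grid, cumulative_margin_curve(margins_a, grid), "k-", label="ensemble A")
plt.plot(grid, cumulative_margin_curve(margins_b, grid), "k--", label="ensemble B")
plt.xlabel("margin")
plt.ylabel("cumulative fraction of training examples")
plt.legend()
plt.show()
```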

44 citations


Book ChapterDOI
02 Sep 1998
TL;DR: The paper shows asymptotic experimental results for the binary classification case; the relation between model complexity and noise in the training data, and how to improve AdaBoost-type algorithms in practice, are also discussed.
Abstract: Recent work has shown that combining multiple versions of weak classifiers such as decision trees or neural networks results in reduced test set error. To study this in greater detail, we analyze the asymptotic behavior of AdaBoost type algorithms. The theoretical analysis establishes the relation between the distribution of margins of the training examples and the generated voting classification rule. The paper shows asymptotic experimental results for the binary classification case underlining the theoretical findings. Finally, the relation between the model complexity and noise in the training data, and how to improve AdaBoost type algorithms in practice are discussed.
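One generic way to act on the complexity/noise trade-off discussed here, offered only as an illustration and not as the authors' proposed improvement, is to monitor held-out error over boosting rounds and stop at the best round. The sketch below does this with scikit-learn's AdaBoostClassifier on synthetic data with injected label noise.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# flip_y injects label noise, the regime in which boosting tends to overfit.
X, y = make_classification(n_samples=2000, n_features=20, flip_y=0.2, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.5, random_state=0)

clf = AdaBoostClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
val_err = [np.mean(pred != y_val) for pred in clf.staged_predict(X_val)]
best = int(np.argmin(val_err)) + 1
print(f"lowest validation error {min(val_err):.3f} at round {best} of {len(val_err)}")
```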

22 citations