Large Margin DAGs for Multiclass Classification
Citations
40,826 citations
Cites background from "Large Margin DAGs for Multiclass Cl..."
...(Weston and Watkins, 1998; Platt et al., 2000)) have shown that it does not perform as good as “one-against-one”. In addition, though we have to train as many as k(k − 1)/2 classifiers, as each problem is smaller (only data from two classes), the total training time may not be more than the...
[...]
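The counting argument in the excerpt above can be illustrated with a minimal sketch. The class labels and the balanced counts are hypothetical, and `pairwise_problems` is an illustrative helper name, not something from the cited papers.

```python
from itertools import combinations


def pairwise_problems(class_counts):
    """Map each unordered class pair to the number of training examples it uses."""
    return {
        (a, b): class_counts[a] + class_counts[b]
        for a, b in combinations(sorted(class_counts), 2)
    }


# Hypothetical balanced data set: k = 4 classes, 100 examples each.
counts = {"a": 100, "b": 100, "c": 100, "d": 100}
problems = pairwise_problems(counts)

# k(k - 1)/2 = 6 pairwise problems, each trained on 200 of the 400 examples.
num_problems = len(problems)
```

Each pairwise problem sees only the data of its two classes, which is why the per-problem cost is smaller even though there are more problems.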
10,141 citations
8,059 citations
Cites methods from "Large Margin DAGs for Multiclass Cl..."
...We can reduce the test time by structuring the classes into a DAG (directed acyclic graph), and performing O(C) pairwise comparisons (Platt et al. 2000)....
[...]
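One way to picture the O(C) evaluation mentioned above is list-based elimination: compare the first and last remaining classes, drop the loser, and repeat until one class is left. The sketch below is an illustration under stated assumptions, not the authors' implementation. `ddag_predict`, `decide`, and the one-dimensional class centers are hypothetical stand-ins for trained pairwise classifiers.

```python
def ddag_predict(classes, decide, x):
    """Evaluate a decision DAG over `classes` by successive elimination.

    `decide(a, b, x)` is assumed to return whichever of a or b the
    pairwise classifier for (a, b) prefers on input x.
    Returns the predicted class and the number of pairwise comparisons.
    """
    remaining = list(classes)
    comparisons = 0
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        winner = decide(first, last, x)
        comparisons += 1
        if winner == first:
            remaining.pop()    # eliminate the last class
        else:
            remaining.pop(0)   # eliminate the first class
    return remaining[0], comparisons


# Toy stand-in for pairwise classifiers: prefer the class whose center is closer.
centers = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}


def decide(a, b, x):
    return a if abs(x - centers[a]) <= abs(x - centers[b]) else b


label, n_comparisons = ddag_predict(sorted(centers), decide, 2.2)
```

With C classes the loop always runs C − 1 times, compared with the C(C − 1)/2 evaluations needed when every pairwise classifier votes.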
2,228 citations
Cites background from "Large Margin DAGs for Multiclass Cl..."
...By making similar classes share large parts of their decision paths, fewer classification functions need to be learned, thereby increasing the system’s performance [26]....
[...]
2,214 citations
Cites background from "Large Margin DAGs for Multiclass Cl..."
...Furthermore, previous work on the generalization properties of large margin DAGs (Platt et al., 2000) for multiclass problems showed that the generalization properties depend on the l2-norm of M (see also Crammer and Singer, 2000)....
[...]
References
26,531 citations
"Large Margin DAGs for Multiclass Cl..." refers to methods in this paper
...The DAGSVM algorithm was evaluated on two different test sets: the USPS handwritten digit data set [9] and the UCI Letter data set [2]....
[...]
...The standard method for N-class SVMs [9] is to construct N SVMs....
[...]
12,940 citations
"Large Margin DAGs for Multiclass Cl..." refers to methods in this paper
...The DAGSVM algorithm was evaluated on two different test sets: the USPS handwritten digit data set [9] and the UCI Letter data set [2]....
[...]
5,350 citations
"Large Margin DAGs for Multiclass Cl..." refers to methods in this paper
...Empirically, SVM training is observed to scale super-linearly with the training set size m [7], according to a power law: $T = c m^{\gamma}$, where $\gamma \approx 2$ for algorithms based on the decomposition method, with some proportionality constant c....
[...]
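As a worked illustration of the power law $T = c m^{\gamma}$ quoted above, the total training cost of the two schemes can be compared. This is a sketch under two assumptions that the excerpt does not state: the N classes are balanced, so that each 1-v-1 problem uses 2m/N examples, and each 1-v-r problem uses all m examples.

```latex
\begin{align*}
T_{\text{1-v-r}} &= N \, c \, m^{\gamma}, \\
T_{\text{1-v-1}} &= \frac{N(N-1)}{2} \, c \left(\frac{2m}{N}\right)^{\gamma}
  = 2^{\gamma-1} \, c \, \frac{N-1}{N^{\gamma-1}} \, m^{\gamma}.
\end{align*}
```

For $\gamma = 2$ the second expression becomes $2c\,\frac{N-1}{N}\,m^{2} < 2 c m^{2}$, which does not grow with N, whereas the 1-v-r cost grows linearly in N.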
...On each data set, we trained N 1-v-r SVMs and K 1-v-1 SVMs, using SMO [7], with soft margins....
[...]
5,019 citations
3,356 citations