Pegasos: primal estimated sub-gradient solver for SVM
Citations
Online Multiple Kernel Learning for Structured Prediction
A discriminative approach for appearance based loop closing
Machine Learning with Abstention for Automated Liver Disease Diagnosis
Learning with dataset bias in latent subcategory models
Learning to Rank Based on Subsequences
References
Convex Optimization
Statistical Learning Theory
Pattern Classification and Scene Analysis
Advances in Kernel Methods: Support Vector Learning
Frequently Asked Questions (9)
Q2. What is the criterion for an ε-accurate optimization?
(For instance, if full optimization of the SVM yields a test classification error of 1%, then the authors chose ε such that an ε-accurate optimization would guarantee a test classification error of at most 1.1%.)
Q3. What is the time complexity of the algorithms in this family?
While algorithms in this family are fairly simple to implement and enjoy general asymptotic convergence guarantees [8], the time complexity of most of them is typically superlinear in the training set size m.
Q4. How can one obtain a sub-gradient of a pointwise maximum of functions?
If f(w) = max_i f_i(w) for r differentiable functions f_1, ..., f_r, and j = arg max_i f_i(w_0), then the gradient of f_j at w_0 is a sub-gradient of f at w_0.
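As an illustration, the hinge loss f(w) = max(0, 1 − y⟨w, x⟩) is the pointwise maximum of two differentiable functions, so this rule yields a sub-gradient directly. A minimal numpy sketch (the function name is ours, for illustration only):

```python
import numpy as np

def hinge_subgradient(w, x, y):
    """Sub-gradient of f(w) = max(0, 1 - y*<w, x>) via the max rule.

    f is the max of f1(w) = 0 and f2(w) = 1 - y*<w, x>; the gradient
    of whichever branch attains the max is a sub-gradient of f.
    """
    if 1.0 - y * np.dot(w, x) > 0.0:   # f2 is the active branch
        return -y * x                  # gradient of f2
    return np.zeros_like(x)            # gradient of f1 = 0
```

At w = 0 the margin is violated, so the sub-gradient is −y·x; once the margin exceeds 1 the zero branch is active and the sub-gradient is 0.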
Q5. What is the way to find a solution to the SVM objective?
Cutting Planes Approach: Recently, Joachims [21] proposed SVM-Perf, which uses a cutting-planes method to find a solution with accuracy ε in time O(md/(λε²)).
Q6. What is the performance of the kernelized Pegasos variant?
As the authors show in the sequel, the kernelized Pegasos variant described in Section 4 gives good performance on a range of kernel SVM problems, provided that these problems have sufficient regularization.
Q7. What is the traditional form of the SVM learning problem?
In its more traditional form, the SVM learning problem was described as the following constrained optimization problem: min_{w,ξ} ½‖w‖² + C Σ_{i=1}^{m} ξ_i s.t. ∀i ∈ [m]: ξ_i ≥ 0, ξ_i ≥ 1 − y_i⟨w, x_i⟩.
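At the optimum of this constrained problem each slack variable takes its smallest feasible value ξ_i = max(0, 1 − y_i⟨w, x_i⟩), so the objective reduces to the regularizer plus C times the hinge losses. A minimal numpy sketch of that equivalent unconstrained objective (the function name is ours):

```python
import numpy as np

def svm_primal_objective(w, X, y, C):
    """Equivalent unconstrained SVM objective.

    Substituting the optimal slacks xi_i = max(0, 1 - y_i*<w, x_i>)
    turns the constrained program into regularizer + C * hinge losses.
    """
    slacks = np.maximum(0.0, 1.0 - y * (X @ w))
    return 0.5 * float(np.dot(w, w)) + C * float(slacks.sum())
```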
Q8. How can the authors implement the Pegasos algorithm?
The authors now show that the Pegasos algorithm can be implemented using only kernel evaluations, without direct access to the feature vectors φ(x) or explicit access to the weight vector w.
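The key observation is that the weight vector can be kept implicit: it suffices to count, for each example, how many times it was sampled while violating the margin, and to evaluate the decision function through kernel evaluations alone. A minimal numpy sketch under that representation (function names and the toy linear kernel are ours, for illustration):

```python
import numpy as np

def kernelized_pegasos(X, y, lam=0.1, T=500, kernel=None, seed=0):
    """Kernel Pegasos sketch: returns the update counts alpha.

    alpha[j] counts how often example j was sampled while violating
    the margin; w is represented only implicitly as
    (1/(lam*t)) * sum_j alpha[j] * y[j] * phi(x_j).
    """
    if kernel is None:
        kernel = lambda a, b: float(np.dot(a, b))  # linear kernel, illustration only
    rng = np.random.default_rng(seed)
    m = len(y)
    alpha = np.zeros(m)
    for t in range(1, T + 1):
        i = rng.integers(m)
        # f(x_i) via kernel evaluations only -- no explicit w or phi(x)
        s = sum(alpha[j] * y[j] * kernel(X[j], X[i]) for j in range(m) if alpha[j])
        if y[i] * s / (lam * t) < 1.0:
            alpha[i] += 1.0
    return alpha

def kernel_predict(alpha, X, y, lam, T, x_new, kernel=None):
    """Predict the label of x_new from the implicit representation."""
    if kernel is None:
        kernel = lambda a, b: float(np.dot(a, b))
    s = sum(alpha[j] * y[j] * kernel(X[j], x_new) for j in range(len(y)) if alpha[j])
    return 1 if s / (lam * T) >= 0 else -1
```

Swapping the default linear kernel for any Mercer kernel changes nothing else in the loop, which is the point of the implicit representation.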
Q9. What is the primal suboptimality threshold for the Pegasos variant?
As in the linear experiments, the authors chose a primal suboptimality threshold for each dataset which guarantees a testing classification error within 10% of that at the optimum.