
Showing papers by "John Platt published in 1998"


Journal ArticleDOI
TL;DR: This issue's collection of essays should help familiarize readers with this interesting new racehorse in the Machine Learning stable, including a practical guide and a new technique for implementing the algorithm efficiently.
Abstract: My first exposure to Support Vector Machines came this spring when I heard Sue Dumais present impressive results on text categorization using this analysis technique. This issue's collection of essays should help familiarize our readers with this interesting new racehorse in the Machine Learning stable. Bernhard Scholkopf, in an introductory overview, points out that a particular advantage of SVMs over other learning algorithms is that they can be analyzed theoretically using concepts from computational learning theory, while at the same time achieving good performance when applied to real problems. Examples of these real-world applications are provided by Sue Dumais, who describes the aforementioned text-categorization problem, yielding the best results to date on the Reuters collection, and by Edgar Osuna, who presents strong results on application to face detection. Our fourth author, John Platt, gives us a practical guide and a new technique for implementing the algorithm efficiently.

4,319 citations


Journal Article
John Platt1
TL;DR: The sequential minimal optimization (SMO) algorithm as mentioned in this paper uses a series of smallest possible QP problems to solve a large QP problem, which avoids using a time-consuming numerical QP optimization as an inner loop.
Abstract: This paper proposes a new algorithm for training support vector machines: Sequential Minimal Optimization, or SMO. Training a support vector machine requires the solution of a very large quadratic programming (QP) optimization problem. SMO breaks this large QP problem into a series of smallest possible QP problems. These small QP problems are solved analytically, which avoids using a time-consuming numerical QP optimization as an inner loop. The amount of memory required for SMO is linear in the training set size, which allows SMO to handle very large training sets. Because matrix computation is avoided, SMO scales somewhere between linear and quadratic in the training set size for various test problems, while the standard chunking SVM algorithm scales somewhere between linear and cubic in the training set size. SMO's computation time is dominated by SVM evaluation, hence SMO is fastest for linear SVMs and sparse data sets. On real-world sparse data sets, SMO can be more than 1000 times faster than the chunking algorithm.
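The analytic core of SMO described above, the joint solve of a two-multiplier QP, can be sketched in a few lines. The sketch below uses standard SVM notation (Lagrange multipliers alpha, labels y, prediction errors E, kernel entries K, box constraint C) and omits Platt's pair-selection heuristics; it is an illustration, not his implementation.

```python
# Minimal sketch of SMO's innermost step: jointly optimizing two
# Lagrange multipliers analytically, which is what lets SMO avoid
# a numerical QP inner loop.

def smo_pair_update(alpha1, alpha2, y1, y2, E1, E2, K11, K12, K22, C):
    """Return updated (alpha1, alpha2) for one two-variable QP step."""
    # Clipping bounds L, H keep both multipliers in [0, C] while
    # preserving the linear equality constraint.
    if y1 != y2:
        L = max(0.0, alpha2 - alpha1)
        H = min(C, C + alpha2 - alpha1)
    else:
        L = max(0.0, alpha1 + alpha2 - C)
        H = min(C, alpha1 + alpha2)
    eta = K11 + K22 - 2.0 * K12          # second derivative of the objective
    if eta <= 0 or L == H:
        return alpha1, alpha2            # skip degenerate cases in this sketch
    a2_new = alpha2 + y2 * (E1 - E2) / eta        # unconstrained optimum
    a2_new = min(H, max(L, a2_new))               # clip to the feasible segment
    a1_new = alpha1 + y1 * y2 * (alpha2 - a2_new) # restore the constraint
    return a1_new, a2_new
```

The full algorithm repeats this update over heuristically chosen pairs until every example satisfies the optimality conditions.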

2,856 citations


Proceedings ArticleDOI
01 Nov 1998
TL;DR: Compares the effectiveness of five different automatic learning algorithms for text categorization in terms of learning speed, real-time classification speed, and classification accuracy.
Abstract: Text categorization – the assignment of natural language texts to one or more predefined categories based on their content – is an important component in many information organization and management tasks. We compare the effectiveness of five different automatic learning algorithms for text categorization in terms of learning speed, real-time classification speed, and classification accuracy. We also examine training set size and alternative document representations. Very accurate text classifiers can be learned automatically from training examples. Linear Support Vector Machines (SVMs) are particularly promising because they are very accurate, quick to train, and quick to evaluate.
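One reason linear SVMs are quick to evaluate, as the abstract notes, is that classifying a sparse bag-of-words document reduces to a single sparse dot product plus a bias. A minimal sketch, with made-up weights and feature indices:

```python
# Sketch of linear SVM evaluation on a sparse document. The weight
# vector, bias, and feature indices are hypothetical illustrations,
# not values from the paper.

def linear_svm_score(weights, bias, doc_tfidf):
    """Score a document given as a {feature_index: tf-idf value} dict."""
    return sum(weights.get(i, 0.0) * v for i, v in doc_tfidf.items()) + bias

weights = {0: 1.2, 3: -0.7, 7: 0.4}   # learned weights (hypothetical)
bias = -0.1
doc = {0: 0.5, 7: 1.0}                # sparse document representation
label = 1 if linear_svm_score(weights, bias, doc) > 0 else -1
```

Evaluation cost is proportional to the number of nonzero features in the document, independent of vocabulary size.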

1,606 citations


Patent
Eric Horvitz1, David Heckerman1, Susan T. Dumais1, Mehran Sahami1, John Platt1 
23 Jun 1998
TL;DR: In this patent, a probabilistic classifier (e.g., a support vector machine) trained on prior content classifications is used to classify a message into one of a number of different folders, to be depicted in a pre-defined visually distinctive manner, or to be discarded entirely.
Abstract: A technique, specifically a method and apparatus that implements the method, which, through a probabilistic classifier (370) and for a given recipient, detects electronic mail (e-mail) messages in an incoming message stream which that recipient is likely to consider "junk". Specifically, the invention discriminates message content for that recipient through a probabilistic classifier (e.g., a support vector machine) trained on prior content classifications. Through a resulting quantitative probability measure, i.e., an output confidence level, produced by the classifier for each message and subsequently compared against a predefined threshold, that message is classified as, e.g., either spam or legitimate mail, and then, e.g., stored in a corresponding folder (223, 227) for subsequent retrieval by and display to the recipient. Based on the probability measure, the message can alternatively be classified into one of a number of different folders, depicted in a pre-defined visually distinctive manner, or simply discarded in its entirety.
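The routing step the abstract describes, comparing the classifier's confidence for a message against a predefined threshold, can be sketched as follows; the threshold value and folder names are illustrative assumptions, not the patent's:

```python
# Sketch of threshold-based message routing. The classifier itself
# (e.g., a support vector machine) is assumed to have already produced
# a junk probability for the message; the 0.9 cutoff is hypothetical.

SPAM_THRESHOLD = 0.9

def route_message(p_junk, threshold=SPAM_THRESHOLD):
    """Route a message by its classifier-produced junk probability."""
    return "junk_folder" if p_junk >= threshold else "inbox"
```

With multiple thresholds, the same comparison generalizes to routing into several folders by confidence band, as the abstract's alternative embodiment suggests.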

700 citations


Proceedings Article
John Platt1
01 Dec 1998
TL;DR: An algorithm for training SVMs: Sequential Minimal Optimization, or SMO, which breaks the large QP problem into a series of smallest possible QP problems which are analytically solvable and does not require a numerical QP library.
Abstract: Training a Support Vector Machine (SVM) requires the solution of a very large quadratic programming (QP) problem. This paper proposes an algorithm for training SVMs: Sequential Minimal Optimization, or SMO. SMO breaks the large QP problem into a series of smallest possible QP problems which are analytically solvable. Thus, SMO does not require a numerical QP library. SMO's computation time is dominated by evaluation of the kernel, hence kernel optimizations substantially quicken SMO. For the MNIST database, SMO is 1.7 times as fast as PCG chunking; while for the UCI Adult database and linear SVMs, SMO can be 1500 times faster than the PCG chunking algorithm.

327 citations


Patent
John Platt1
06 Apr 1998
TL;DR: In this patent, the quadratic programming problem involved in training support vector machines is solved by sweeping through a set of training examples and solving small sub-problems of the problem, each of which has an analytic solution.
Abstract: Solving a quadratic programming problem involved in training support vector machines by sweeping through a set of training examples and solving small sub-problems of the quadratic programming problem. Each of these sub-problems has an analytic solution, which is faster than the numerical quadratic programming solutions used in the prior art. In one embodiment, training examples with non-optimal Lagrange multipliers are adjusted, one at a time, until all are optimal (e.g. until all examples fulfill the Kuhn-Tucker conditions). In another embodiment, training examples with non-optimal Lagrange multipliers are paired and then adjusted, until all are optimal.
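The optimality test implied by "until all examples fulfill the Kuhn-Tucker conditions" can be sketched as a per-example check. Here `f_i` denotes the SVM output on example i and `y_i` its label; the tolerance and exact structure are illustrative assumptions, not the patent's claims:

```python
# Sketch of the Kuhn-Tucker (KKT) optimality check used to decide
# whether an example's Lagrange multiplier still needs adjusting.

def violates_kkt(alpha_i, y_i, f_i, C, tol=1e-3):
    """True if example i's multiplier is non-optimal within tolerance."""
    r = y_i * f_i - 1.0            # margin residual
    if alpha_i < tol:              # alpha == 0   requires  y_i * f_i >= 1
        return r < -tol
    if alpha_i > C - tol:          # alpha == C   requires  y_i * f_i <= 1
        return r > tol
    return abs(r) > tol            # 0 < alpha < C requires y_i * f_i == 1
```

A training sweep would repeatedly adjust (singly or in pairs, per the two embodiments) any example for which this check returns True.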

126 citations


Patent
31 Jul 1998
TL;DR: In this patent, a low-profile force-detecting touch pad is provided with a substantially rigid touch face 18, a substantially rigid frame 12, and plural spring structures 20 integrally formed with the touch face and mechanically connected to the frame 12.
Abstract: PROBLEM TO BE SOLVED: To provide an inexpensive, highly precise force-detecting touch pad with a low-profile outline. SOLUTION: A force-detecting touch pad 10 is provided with a substantially rigid and strong touch face 18, a substantially rigid frame 12, plural spring structures 20 integrally formed with the touch face 18 and mechanically connected to the frame 12, and a circuit 30 for extracting force information from a capacitance that varies with the distance between a predetermined part of the touch face 18 and a part of the frame 12 as a force is applied to the touch face 18.
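Under a standard parallel-plate model (an assumption on our part; the machine-translated abstract is ambiguous about the exact capacitance-distance relation), the circuit's force extraction can be sketched as inferring the plate gap from capacitance and applying a linear spring law:

```python
# Sketch of capacitive force sensing under assumed ideal conditions:
# parallel-plate capacitance C = eps0 * area / gap, and a linear spring
# F = k * (rest_gap - gap). All constants here are illustrative.

EPS0 = 8.854e-12   # vacuum permittivity, F/m

def force_from_capacitance(C_measured, C_rest, area, k_spring):
    """Infer applied force from a measured capacitance change."""
    d_rest = EPS0 * area / C_rest        # plate gap with no force applied
    d = EPS0 * area / C_measured         # plate gap under load
    return k_spring * (d_rest - d)       # spring force for that compression
```

Pressing harder closes the gap, raising the capacitance, so force increases monotonically with the measured capacitance in this model.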

14 citations


Patent
16 Apr 1998
TL;DR: This patent discusses the use of an electronic key (100) and security software to provide security for a computer coupled to a touchpad (201), covering application security, data security, computer theft security, and network security.
Abstract: Use of an electronic key (100) and security software to provide security for a computer that is coupled to a touchpad (201). The computer security includes application security, data security, computer theft security and network security. In a preferred embodiment, the electronic key (100) generates a security signal and converts it to a binary signal. The security code is extracted from the binary signal, and its validity is verified. If the security code is valid, a computer security operation is enabled.
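The verification flow the abstract outlines, extracting a code from the key's binary signal and checking its validity, can be sketched minimally; the 8-bit framing and direct comparison below are illustrative assumptions, not the patent's encoding:

```python
# Sketch of the extract-then-verify flow: take a code from the key's
# binary signal and compare it against a stored valid code before
# enabling a security operation. Framing is a hypothetical 8-bit prefix.

def extract_code(binary_signal, n_bits=8):
    """Interpret the first n_bits of the signal as the security code."""
    return int(binary_signal[:n_bits], 2)

def security_enabled(binary_signal, valid_code):
    """True if the extracted code matches, enabling the operation."""
    return extract_code(binary_signal) == valid_code
```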

1 citation