Author

Olvi L. Mangasarian

Bio: Olvi L. Mangasarian is an academic researcher from the University of California, San Diego. He has contributed to research topics including linear programming and support vector machines, has an h-index of 77, and has co-authored 208 publications receiving 25,677 citations. His previous affiliations include the University of Wisconsin-Madison and the University of Oxford.


Papers
Book
01 Jan 1969
TL;DR: It is shown that if x_k → x and y_k → y, where y_k ∈ A(x_k), then y ∈ A(x); that is, the map A is closed.
Abstract: Part 1 (if): Assume that Z is closed. We must show that if x_k → x and y_k → y, where y_k ∈ A(x_k), then y ∈ A(x). By the definition of Z being closed, we know that all points arbitrarily close to Z are in Z. Let x_k → x, y_k → y, and y_k ∈ A(x_k). Now, for any ε > 0, there exists an N such that for all k ≥ N we have ||x_k − x|| < ε and ||y_k − y|| < ε, which implies that (x, y) is arbitrarily close to Z, so (x, y) ∈ Z and y ∈ A(x). Thus, A is closed.
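
The property being proved can be restated compactly (a paraphrase of the definition above, not a formula taken from the book):

    % Closedness of the point-to-set map A at x:
    \[
      x_k \to x, \quad y_k \to y, \quad y_k \in A(x_k)
      \;\Longrightarrow\; y \in A(x).
    \]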

2,146 citations

Proceedings Article
24 Jul 1998
TL;DR: Numerical tests on 6 public data sets show that classifiers trained by the concave minimization approach and those trained by a support vector machine have comparable 10-fold cross-validation correctness.
Abstract: Computational comparison is made between two feature selection approaches for finding a separating plane that discriminates between two point sets in an n-dimensional feature space while utilizing as few of the n features (dimensions) as possible. In the concave minimization approach [19, 5], a separating plane is generated by minimizing a weighted sum of distances of misclassified points to two parallel planes that bound the sets and that determine the separating plane midway between them; in addition, the number of dimensions of the space used to determine the plane is minimized. In the support vector machine approach [27, 7, 1, 10, 24, 28], besides minimizing the weighted sum of distances of misclassified points to the bounding planes, we also maximize the distance between the two bounding planes that generate the separating plane. Computational results show that feature suppression is an indirect consequence of the support vector machine approach when an appropriate norm is used. Numerical tests on 6 public data sets show that classifiers trained by the concave minimization approach and those trained by a support vector machine have comparable 10-fold cross-validation correctness. However, in all data sets tested, the classifiers obtained by the concave minimization approach selected fewer problem features than those trained by a support vector machine.
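
The feature suppression discussed above is typically obtained by penalizing the 1-norm of the separating plane's normal vector, which a linear program can minimize directly. The sketch below is illustrative only, not the authors' formulation: it assumes a standard 1-norm soft-margin SVM solved with scipy, and all names and parameter values are placeholders.

    import numpy as np
    from scipy.optimize import linprog

    def l1_svm(X, y, C=1.0):
        """Train w, b for the plane x.w + b = 0 by linear programming.
        X: (m, n) points; y: (m,) labels in {-1, +1}.
        Minimizes ||w||_1 + C * sum(xi) s.t. y_i (x_i.w + b) >= 1 - xi_i."""
        m, n = X.shape
        Yx = y[:, None] * X
        # Variables: [p (n), q (n), b (1), xi (m)] with w = p - q; p, q, xi >= 0.
        c = np.concatenate([np.ones(2 * n), [0.0], C * np.ones(m)])
        # y_i (x_i.w + b) + xi_i >= 1 rewritten as A_ub @ z <= -1.
        A_ub = np.hstack([-Yx, Yx, -y[:, None], -np.eye(m)])
        b_ub = -np.ones(m)
        bounds = [(0, None)] * (2 * n) + [(None, None)] + [(0, None)] * m
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        w = res.x[:n] - res.x[n:2 * n]
        return w, res.x[2 * n]

Because the 1-norm of w is minimized, many components of w come out exactly zero, which is the indirect feature suppression the abstract reports when an appropriate norm is used.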

1,074 citations

Journal ArticleDOI
TL;DR: The diagnosis of breast cytology is used to demonstrate the applicability of multisurface pattern separation to medical diagnosis and decision making; the method is found to be applicable to other medical diagnostic and decision-making problems as well.
Abstract: Multisurface pattern separation is a mathematical method for distinguishing between elements of two pattern sets. Each element of the pattern sets consists of various scalar observations. In this paper, we use the diagnosis of breast cytology to demonstrate the applicability of this method to medical diagnosis and decision making. Each of 11 cytological characteristics of breast fine-needle aspirates reported to differ between benign and malignant samples was graded 1 to 10 at the time of sample collection. Nine characteristics were found to differ significantly between benign and malignant samples. Mathematically, these values for each sample were represented by a point in a nine-dimensional space of real variables. Benign points were separated from malignant ones by planes determined by linear programming. Correct separation was accomplished in 369 of 370 samples (201 benign and 169 malignant). In the one misclassified malignant case, the fine-needle aspirate cytology was so definitely benign and the cytology of the excised cancer so definitely malignant that we believe the tumor was missed on aspiration. Our mathematical method is applicable to other medical diagnostic and decision-making problems.

1,021 citations

Proceedings ArticleDOI
26 Aug 2001
TL;DR: Computational results on publicly available datasets indicate that the proposed proximal SVM classifier has test set correctness comparable to that of standard SVM classifiers, with computation times that can be an order of magnitude faster.
Abstract: Instead of a standard support vector machine (SVM) that classifies points by assigning them to one of two disjoint half-spaces, points are classified by assigning them to the closest of two parallel planes (in input or feature space) that are pushed apart as far as possible. This formulation, which can also be interpreted as regularized least squares and considered in the much more general context of regularized networks [8, 9], leads to an extremely fast and simple algorithm for generating a linear or nonlinear classifier that merely requires the solution of a single system of linear equations. In contrast, standard SVMs solve a quadratic or linear program, which requires considerably more computation time. Computational results on publicly available datasets indicate that the proposed proximal SVM classifier has test set correctness comparable to that of standard SVM classifiers, with computation times that can be an order of magnitude faster. The linear proximal SVM easily handles large datasets, as indicated by the classification of a 2-million-point, 10-attribute set in 20.8 seconds. All computational results are based on 6 lines of MATLAB code.
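
Since the abstract reduces training to a single linear system, the idea can be sketched in a few lines. This is an assumed reconstruction of the proximal, regularized least-squares formulation, not the paper's MATLAB code; nu and the variable names are placeholders.

    import numpy as np

    def proximal_svm(A, d, nu=1.0):
        """A: (m, n) points; d: (m,) labels in {-1, +1}.
        Returns (w, gamma) for the two proximal planes x.w = gamma +/- 1."""
        m, n = A.shape
        E = np.hstack([A, -np.ones((m, 1))])   # E = [A  -e]
        # Solve the (n+1) x (n+1) system (I/nu + E'E) z = E'd for z = [w; gamma].
        H = np.eye(n + 1) / nu + E.T @ E
        z = np.linalg.solve(H, E.T @ d)
        return z[:n], z[n]

    # A new point x is classified by sign(x @ w - gamma), i.e. by the closer plane.

For a linear classifier the system stays (n+1) x (n+1) regardless of the number of points, which is consistent with the reported 20.8-second classification of a 2-million-point set.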

846 citations

Journal ArticleDOI
TL;DR: Linear programming-based machine learning techniques are used to increase the accuracy and objectivity of breast cancer diagnosis and prognosis; two such medical applications are described.
Abstract: Two medical applications of linear programming are described in this paper. Specifically, linear programming-based machine learning techniques are used to increase the accuracy and objectivity of breast cancer diagnosis and prognosis. The first application, to breast cancer diagnosis, utilizes characteristics of individual cells, obtained from a minimally invasive fine needle aspirate, to discriminate benign from malignant breast lumps. This allows an accurate diagnosis without the need for a surgical biopsy. The diagnostic system in current operation at University of Wisconsin Hospitals was trained on samples from 569 patients and has had 100% chronological correctness in diagnosing 131 subsequent patients. The second application, recently put into clinical practice, is a method that constructs a surface that predicts when breast cancer is likely to recur in patients who have had their cancers excised. This gives the physician and the patient better information with which to plan treatment, and may eliminate …

815 citations


Cited by
Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
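
As a usage illustration (not part of the article): scikit-learn's SVC wraps LIBSVM, so the issues listed above, multiclass classification, probability estimates, and parameter selection, all surface in an ordinary call. The dataset and parameter values here are placeholders.

    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    # Toy data stands in for a real application.
    X, y = make_classification(n_samples=200, n_features=10, random_state=0)
    # probability=True turns on LIBSVM's probability estimates; C and gamma
    # are the hyperparameters one would tune by cross-validation.
    clf = SVC(kernel="rbf", C=1.0, gamma="scale", probability=True).fit(X, y)
    print(clf.predict(X[:5]))
    print(clf.predict_proba(X[:5]).round(2))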

40,826 citations

Journal ArticleDOI
08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i, the square root of minus one: an odd beast, an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Journal Article
TL;DR: A new technique called t-SNE visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map; it is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.
Abstract: We present a new technique called “t-SNE” that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map. The technique is a variation of Stochastic Neighbor Embedding (Hinton and Roweis, 2002) that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map. t-SNE is better than existing techniques at creating a single map that reveals structure at many different scales. This is particularly important for high-dimensional data that lie on several different, but related, low-dimensional manifolds, such as images of objects from multiple classes seen from multiple viewpoints. For visualizing the structure of very large datasets, we show how t-SNE can use random walks on neighborhood graphs to allow the implicit structure of all of the data to influence the way in which a subset of the data is displayed. We illustrate the performance of t-SNE on a wide variety of datasets and compare it with many other non-parametric visualization techniques, including Sammon mapping, Isomap, and Locally Linear Embedding. The visualizations produced by t-SNE are significantly better than those produced by the other techniques on almost all of the datasets.
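
An illustrative call (not the paper's code) using the scikit-learn implementation of t-SNE; the dataset and parameter values are placeholders:

    from sklearn.datasets import load_digits
    from sklearn.manifold import TSNE

    X, _ = load_digits(return_X_y=True)        # 1797 images, 64 features each
    # perplexity sets the effective neighborhood size; init="pca" is a common
    # stabilizing choice, not something the paper itself prescribes.
    emb = TSNE(n_components=2, perplexity=30.0, init="pca",
               random_state=0).fit_transform(X)
    print(emb.shape)                           # (1797, 2): one map location per datapoint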

30,124 citations

Journal ArticleDOI
TL;DR: Several arguments that support the observed high accuracy of SVMs are reviewed, and numerous examples and proofs of most of the key theorems are given.
Abstract: The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
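
The kernel mapping technique the tutorial covers replaces inner products with a kernel function evaluated in input space. A minimal sketch of the Gaussian radial basis function kernel mentioned above (illustrative code, not from the tutorial):

    import numpy as np

    def rbf_kernel(X, Z, gamma=1.0):
        """K[i, j] = exp(-gamma * ||x_i - z_j||^2): the inner product of the
        mapped points in an infinite-dimensional feature space."""
        sq = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * sq)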

15,696 citations

Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations