Author

Pedro J. García-Laencina

Other affiliations: Universidad Politécnica de Cartagena

Bio: Pedro J. García-Laencina is an academic researcher from the United States Air Force Academy. The author has contributed to research on topics including missing data and artificial neural networks. The author has an h-index of 11 and has co-authored 25 publications receiving 1381 citations. Previous affiliations of Pedro J. García-Laencina include Universidad Politécnica de Cartagena.

Papers

Journal Article

Pattern classification with missing data: a review

Pedro J. García-Laencina¹, José-Luis Sancho-Gómez¹, Aníbal R. Figueiras-Vidal²

Universidad Politécnica de Cartagena¹, Charles III University of Madrid²

01 Mar 2010, Neural Computing and Applications

TL;DR: The aim of this work is to analyze the missing data problem in pattern classification tasks, and to summarize and compare some of the well-known methods used for handling missing values.

Abstract: Pattern classification has been successfully applied in many problem domains, such as biometric recognition, document classification or medical diagnosis. Missing or unknown data are a common drawback that pattern recognition techniques need to deal with when solving real-life classification tasks. Machine learning approaches and methods imported from statistical learning theory have been most intensively studied and used in this subject. The aim of this work is to analyze the missing data problem in pattern classification tasks, and to summarize and compare some of the well-known methods used for handling missing values.

625 citations

Journal Article

Missing data imputation using statistical and machine learning methods in a real breast cancer problem

José M. Jerez¹, Ignacio Molina¹, Pedro J. García-Laencina², Emilio Alba, Nuria Ribelles, Miguel Martín³, Leonardo Franco¹

University of Málaga¹, Universidad Politécnica de Cartagena², Hospital Clínico San Carlos³

01 Oct 2010, Artificial Intelligence in Medicine

TL;DR: The methods based on machine learning techniques were the best suited for the imputation of missing values and led to a significant improvement in prognosis accuracy compared with imputation methods based on statistical procedures.

401 citations

Journal Article

K nearest neighbours with mutual information for simultaneous classification and missing data imputation

Pedro J. García-Laencina¹, José-Luis Sancho-Gómez¹, Aníbal R. Figueiras-Vidal², Michel Verleysen³

Universidad Politécnica de Cartagena¹, Charles III University of Madrid², Université catholique de Louvain³

01 Mar 2009, Neurocomputing

TL;DR: This article proposes a novel KNN imputation procedure using a feature-weighted distance metric based on mutual information (MI), which provides a missing data estimation aimed at solving the classification task.

193 citations
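
The idea summarized in the TL;DR above, KNN imputation under a feature-weighted distance, can be illustrated with a minimal sketch. This is not the authors' published procedure: the function name `weighted_knn_impute`, the normalisation of the distance, and the example weights are all assumptions made for illustration. The weights stand in for mutual-information estimates between each feature and the class label, which would be computed separately.

```python
import numpy as np


def weighted_knn_impute(X, weights, k=2):
    """Fill NaNs in X using k nearest neighbours under a feature-weighted distance.

    weights: one non-negative relevance score per feature. They are assumed
    here to be mutual-information estimates computed elsewhere (hypothetical
    input, not part of the original paper's code).
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()
    for i in np.where(missing.any(axis=1))[0]:
        for j in np.where(missing[i])[0]:
            candidates = []
            for n in range(X.shape[0]):
                if n == i or missing[n, j]:
                    continue  # a neighbour must have feature j observed
                shared = ~missing[i] & ~missing[n]  # features observed in both rows
                w = weights[shared]
                if w.sum() == 0:
                    continue  # no informative shared feature to compare on
                diff = X[i, shared] - X[n, shared]
                # weighted Euclidean distance, normalised by the shared weight mass
                dist = np.sqrt(np.sum(w * diff ** 2) / w.sum())
                candidates.append((dist, X[n, j]))
            if candidates:
                candidates.sort(key=lambda c: c[0])
                # impute with the mean of feature j over the k closest rows
                filled[i, j] = np.mean([value for _, value in candidates[:k]])
    return filled


# Toy data: row 0 is missing its third feature.
demo = np.array([
    [1.0, 2.0, np.nan],
    [1.1, 2.1, 3.0],
    [5.0, 6.0, 9.0],
    [0.9, 1.9, 3.2],
])
demo_weights = [0.5, 0.5, 0.1]  # illustrative relevance scores, not real MI values
print(weighted_knn_impute(demo, demo_weights, k=2))
```

In the toy data, rows 1 and 3 are closest to row 0 on the two observed features, so the missing value is filled with the mean of their third-feature values (approximately 3.1). Features with larger weights influence the neighbour search more, which is how a relevance measure such as mutual information can steer imputation toward the classification task.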

Journal Article

Missing data imputation on the 5-year survival prediction of breast cancer patients with unknown discrete values

Pedro J. García-Laencina¹, Pedro Henriques Abreu², Miguel Henriques Abreu, Noémia Afonso

United States Air Force Academy¹, University of Coimbra²

01 Apr 2015, Computers in Biology and Medicine

TL;DR: This research work analyzes a real breast cancer dataset from the Portuguese Institute of Oncology of Porto with a high percentage of unknown categorical information, and constructs prediction models for breast cancer survivability using K-Nearest Neighbors, Classification Trees, Logistic Regression and Support Vector Machines.

120 citations

Journal Article

Efficient feature selection and linear discrimination of EEG signals

Germán Rodríguez-Bermúdez¹, Pedro J. García-Laencina¹, Joaquín Roca-González², Joaquín Roca-Dorda¹

United States Air Force Academy¹, Universidad Politécnica de Cartagena²

04 Sep 2013, Neurocomputing

TL;DR: An embedded approach for feature selection and linear discrimination of EEG signals is presented, which efficiently selects and combines the most useful features for classification with lower computational requirements.

64 citations

Cited by

Journal Article

Machine learning

Thomas G. Dietterich¹

Oregon State University¹

01 Dec 1996, ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Pattern Recognition and Machine Learning

Christopher M. Bishop¹

Microsoft¹

01 Jan 2006

TL;DR: The book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, sequential data, and combining models.

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Book

Applied Predictive Modeling

Max Kuhn, Kjell Johnson

17 May 2013

TL;DR: The book covers general strategies for predictive modeling, regression models, classification models, and other practical considerations.

Abstract: General Strategies.- Regression Models.- Classification Models.- Other Considerations.- Appendix.- References.- Indices.

3,672 citations

Journal Article

Multiple Imputation for Nonresponse in Surveys

Roger A. Sugden¹

Goldsmiths, University of London¹

01 May 1988, Journal of the Royal Statistical Society, Series A (Statistics in Society)

TL;DR: A review of D. B. Rubin's book Multiple Imputation for Nonresponse in Surveys (Wiley, 1987).

Abstract: 25. Multiple Imputation for Nonresponse in Surveys. By D. B. Rubin. ISBN 0 471 08705 X. Wiley, Chichester, 1987. 258 pp. £30.25.

3,216 citations

Book

Flexible Imputation of Missing Data

Stef van Buuren

29 Mar 2012

TL;DR: The book covers the problem of missing data; the concepts of MCAR, MAR and MNAR; simple solutions that do not (always) work; multiple imputation in a nutshell; and some dangers, some do's and some don'ts.

Abstract (table of contents):

Basics
Introduction: The problem of missing data; Concepts of MCAR, MAR and MNAR; Simple solutions that do not (always) work; Multiple imputation in a nutshell; Goal of the book; What the book does not cover; Structure of the book; Exercises
Multiple imputation: Historic overview; Incomplete data concepts; Why and when multiple imputation works; Statistical intervals and tests; Evaluation criteria; When to use multiple imputation; How many imputations?; Exercises
Univariate missing data: How to generate multiple imputations; Imputation under the normal linear normal; Imputation under non-normal distributions; Predictive mean matching; Categorical data; Other data types; Classification and regression trees; Multilevel data; Non-ignorable methods; Exercises
Multivariate missing data: Missing data pattern; Issues in multivariate imputation; Monotone data imputation; Joint Modeling; Fully Conditional Specification; FCS and JM; Conclusion; Exercises
Imputation in practice: Overview of modeling choices; Ignorable or non-ignorable?; Model form and predictors; Derived variables; Algorithmic options; Diagnostics; Conclusion; Exercises
Analysis of imputed data: What to do with the imputed data?; Parameter pooling; Statistical tests for multiple imputation; Stepwise model selection; Conclusion; Exercises

Case studies
Measurement issues: Too many columns; Sensitivity analysis; Correct prevalence estimates from self-reported data; Enhancing comparability; Exercises
Selection issues: Correcting for selective drop-out; Correcting for non-response; Exercises
Longitudinal data: Long and wide format; SE Fireworks Disaster Study; Time raster imputation; Conclusion; Exercises

Extensions
Conclusion: Some dangers, some do's and some don'ts; Reporting; Other applications; Future developments; Exercises

Appendices: Software (R; S-Plus; Stata; SAS; SPSS; Other software)
References; Author Index; Subject Index

2,156 citations
