# Using New Models to Analyze Complex Regularities of the World

28 Apr 2014, Vol. 2, Iss. 1, pp. 78-82

TL;DR: In this paper, the authors discuss issues related to model fitting, comparison of classification accuracy of generative and discriminative models, and two (or more) cultures of data modeling.

Abstract: This commentary on the recent article by Musso et al. (2013) discusses issues related to model fitting, the comparison of classification accuracy between generative and discriminative models, and the two (or more) cultures of data modeling. We start by questioning the extremely high classification accuracy obtained with empirical data from a complex domain. There is a risk that we model perfect nonsense perfectly. Our second concern is the relevance of comparing the classification accuracy indices of multilayer perceptron (MLP) neural networks and linear discriminant analysis (LDA). We find this problematic, as it is like comparing apples and oranges. It would have been easier to interpret the model and the variable (group) importances if the authors had compared MLP to some discriminative classifier, such as group lasso logistic regression. Finally, we conclude our commentary with a discussion about the predictive properties of the adopted data modeling approach.

##### Citations


TL;DR: In this paper, the authors apply the Developmental Model of Vocational Excellence (DMVE) in the domain of air traffic control and describe the characteristics and predictors related to air traffic controllers' vocational expertise and excellence.

Abstract: Purpose – The purpose of this paper is to apply the Developmental Model of Vocational Excellence (DMVE) in the domain of air traffic control and to describe the characteristics and predictors related to air traffic controllers’ (ATCO) vocational expertise and excellence. Based on DMVE, the study analyses the role of natural abilities (gifts), intrinsic characteristics (self-regulatory abilities) and extrinsic conditions (domain and non-domain specific factors) in ATCOs’ vocational development. Design/methodology/approach – The target population of the multiple case study consisted of ATCOs in Finland (N = 300), of which 28 were interviewed. The interviewees represented four different airports. Initially, three key personnel interviews were conducted to validate the structured theme interview instrument that was subsequently used for the 28 interviews. The data set also included the ATCOs’ aptitude test scores and training records. Employee assessments were used to determine their on-the-job performance le...

6 citations

##### References


TL;DR: A set of simple, yet safe and robust non-parametric tests for statistical comparisons of classifiers is recommended: the Wilcoxon signed ranks test for comparison of two classifiers and the Friedman test with the corresponding post-hoc tests for comparisons of more classifiers over multiple data sets.

Abstract: While methods for comparing two learning algorithms on a single data set have been scrutinized for quite some time already, the issue of statistical tests for comparisons of more algorithms on multiple data sets, which is even more essential to typical machine learning studies, has been all but ignored. This article reviews the current practice and then theoretically and empirically examines several suitable tests. Based on that, we recommend a set of simple, yet safe and robust non-parametric tests for statistical comparisons of classifiers: the Wilcoxon signed ranks test for comparison of two classifiers and the Friedman test with the corresponding post-hoc tests for comparison of more classifiers over multiple data sets. Results of the latter can also be neatly presented with the newly introduced CD (critical difference) diagrams.

10,306 citations
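The Friedman test recommended above ranks the classifiers within each data set and then checks whether the average ranks differ more than chance would suggest. The following is a minimal sketch of that ranking step and of the uncorrected Friedman chi-square statistic, using only the Python standard library. The function names and the accuracy values are hypothetical illustrations, not taken from the article, and the sketch applies no tie correction to the statistic and computes no p-value or post-hoc test.

```python
from statistics import mean


def average_ranks(scores):
    """Rank classifiers within each data set (1 = best accuracy, ties share
    the mean rank) and return each classifier's average rank."""
    n_classifiers = len(scores[0])
    rank_rows = []
    for row in scores:
        # Classifier indices ordered from highest to lowest accuracy.
        order = sorted(range(n_classifiers), key=lambda j: -row[j])
        ranks = [0.0] * n_classifiers
        i = 0
        while i < n_classifiers:
            # Extend the tie group while accuracies are equal.
            k = i
            while k + 1 < n_classifiers and row[order[k + 1]] == row[order[i]]:
                k += 1
            shared = (i + k) / 2 + 1  # mean of positions i+1 .. k+1
            for pos in range(i, k + 1):
                ranks[order[pos]] = shared
            i = k + 1
        rank_rows.append(ranks)
    return [mean(col) for col in zip(*rank_rows)]


def friedman_statistic(scores):
    """Friedman chi-square statistic (no tie correction) for N data sets
    (rows) and k classifiers (columns)."""
    n = len(scores)
    k = len(scores[0])
    avg = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in avg) - k * (k + 1) ** 2 / 4)


# Hypothetical accuracies: 4 data sets (rows) x 3 classifiers (columns).
accuracies = [
    [0.90, 0.85, 0.80],
    [0.88, 0.86, 0.70],
    [0.75, 0.70, 0.72],
    [0.95, 0.90, 0.85],
]

if __name__ == "__main__":
    print("average ranks:", average_ranks(accuracies))
    print("Friedman chi-square:", friedman_statistic(accuracies))
```

In practice the statistic would be compared against a chi-square distribution with k - 1 degrees of freedom, and a library routine that handles ties and p-values is usually preferable to a hand-rolled version like this one.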


TL;DR: Algorithmic modeling has developed rapidly in fields outside statistics, both in theory and practice, and can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets.

Abstract: There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.

2,948 citations


03 Jan 2001

TL;DR: It is shown, contrary to a widely-held belief that discriminative classifiers are almost always to be preferred, that there can often be two distinct regimes of performance as the training set size is increased, one in which each algorithm does better.

Abstract: We compare discriminative and generative learning as typified by logistic regression and naive Bayes. We show, contrary to a widely-held belief that discriminative classifiers are almost always to be preferred, that there can often be two distinct regimes of performance as the training set size is increased, one in which each algorithm does better. This stems from the observation—which is borne out in repeated experiments—that while discriminative learning has lower asymptotic error, a generative classifier may also approach its (higher) asymptotic error much faster.

2,226 citations
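One way to explore the two regimes described above is to fit a generative and a discriminative classifier on training sets of increasing size and compare their test errors. The sketch below does this in pure Python with a Gaussian naive Bayes classifier and a logistic regression fitted by batch gradient descent. Everything in it is an assumption made for illustration: the synthetic data generator, the sample sizes, the learning rate, and all function names are hypothetical, and whether a crossover between the two curves appears will depend on the seed and settings.

```python
import math
import random


def make_data(n, d, rng):
    """Balanced synthetic data: class y has independent unit-variance
    Gaussian features centred at +0.5 (y = 1) or -0.5 (y = 0)."""
    data = []
    for i in range(n):
        y = i % 2
        mu = 0.5 if y == 1 else -0.5
        data.append(([rng.gauss(mu, 1.0) for _ in range(d)], y))
    return data


def fit_naive_bayes(train):
    """Generative model: per-class feature means and variances."""
    model = {}
    for c in (0, 1):
        rows = [x for x, y in train if y == c]
        cols = list(zip(*rows))
        means = [sum(col) / len(col) for col in cols]
        # Floor the variance so tiny samples cannot produce a zero variance.
        variances = [max(sum((v - m) ** 2 for v in col) / len(col), 1e-3)
                     for col, m in zip(cols, means)]
        model[c] = (math.log(len(rows) / len(train)), means, variances)
    return model


def predict_naive_bayes(model, x):
    def log_joint(c):
        log_prior, means, variances = model[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
            for v, m, s2 in zip(x, means, variances))
    return 1 if log_joint(1) > log_joint(0) else 0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(train, epochs=200, lr=0.1):
    """Discriminative model: logistic regression via batch gradient descent."""
    d = len(train[0][0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in train:
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        b -= lr * grad_b / len(train)
        w = [wj - lr * gj / len(train) for wj, gj in zip(w, grad_w)]
    return w, b


def predict_logistic(params, x):
    w, b = params
    return 1 if b + sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0


def error_rate(predict, model, test):
    return sum(predict(model, x) != y for x, y in test) / len(test)


def learning_curve(sizes=(10, 40, 400), d=10, seed=0):
    """Return (n, naive Bayes test error, logistic test error) per size."""
    rng = random.Random(seed)
    test = make_data(1000, d, rng)
    curve = []
    for n in sizes:
        train = make_data(n, d, rng)
        nb_err = error_rate(predict_naive_bayes, fit_naive_bayes(train), test)
        lr_err = error_rate(predict_logistic, fit_logistic(train), test)
        curve.append((n, nb_err, lr_err))
    return curve


if __name__ == "__main__":
    for n, nb_err, lr_err in learning_curve():
        print(f"n={n:4d}  naive Bayes error={nb_err:.3f}  logistic error={lr_err:.3f}")
```

Because the synthetic features here really are conditionally independent Gaussians, the naive Bayes assumptions hold by construction; real data that violates them may show a different picture.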


TL;DR: If the goal as a field is to use data to solve problems, then the statistical community needs to move away from exclusive dependence on data models and adopt a more diverse set of tools.

Abstract: There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.

1,735 citations


TL;DR: The group lasso is extended to logistic regression models, and an efficient algorithm is presented that is especially suitable for high-dimensional problems and can also be applied to generalized linear models to solve the corresponding convex optimization problem.

Abstract: The group lasso is an extension of the lasso to do variable selection on (predefined) groups of variables in linear regression models. The estimates have the attractive property of being invariant under groupwise orthogonal reparameterizations. We extend the group lasso to logistic regression models and present an efficient algorithm, that is especially suitable for high dimensional problems, which can also be applied to generalized linear models to solve the corresponding convex optimization problem. The group lasso estimator for logistic regression is shown to be statistically consistent even if the number of predictors is much larger than sample size but with sparse true underlying structure. We further use a two-stage procedure which aims for sparser models than the group lasso, leading to improved prediction performance for some cases. Moreover, owing to the two-stage nature, the estimates can be constructed to be hierarchical. The methods are used on simulated and real data sets about splice site detection in DNA sequences.

1,709 citations
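For orientation, the convex optimization problem mentioned above is commonly written as a penalized negative log-likelihood. The notation below is an assumed, conventional one and is not quoted from this page: the predictors are split into G predefined groups, β_g is the coefficient block of group g, df_g is its number of parameters, and λ ≥ 0 is the tuning parameter.

```latex
\hat{\beta}_\lambda
  = \arg\min_{\beta}\;
    \Bigl\{ -\ell(\beta)
      + \lambda \sum_{g=1}^{G} \sqrt{df_g}\,\lVert \beta_g \rVert_2 \Bigr\},
\qquad
\ell(\beta)
  = \sum_{i=1}^{n} \Bigl[ y_i\,\eta_\beta(x_i)
      - \log\bigl(1 + e^{\eta_\beta(x_i)}\bigr) \Bigr],
\qquad
\eta_\beta(x_i) = \beta_0 + \sum_{g=1}^{G} x_{i,g}^{\top} \beta_g .
```

Because the penalty uses the unsquared Euclidean norm of each block, whole groups of coefficients are set to zero together, which is what makes group-level variable importance easier to read off than from an unpenalized model.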