
Showing papers on "Ordinal regression published in 2008"


Journal ArticleDOI
TL;DR: Distinguishing necessary and possible consequences of preference information on the complete set of actions, UTAGMS answers questions of robustness analysis and can support the decision maker when his/her preference statements cannot be represented in terms of an additive value function.

448 citations


Book
01 Jan 2008
TL;DR: An Introduction to Generalized Linear Models, Third Edition provides a cohesive framework for statistical modeling and includes examples and exercises with complete data sets for nearly all the models covered.
Abstract: Introduces GLMs in a way that enables readers to understand the unifying structure that underpins them. Discusses common concepts and principles of advanced GLMs, including nominal and ordinal regression, survival analysis, and longitudinal analysis. Connects Bayesian analysis and MCMC methods to fit GLMs. Contains numerous examples from business, medicine, engineering, and the social sciences. Provides the example code for R, Stata, and WinBUGS to encourage implementation of the methods. Offers the data sets and solutions to the exercises online. Continuing to emphasize numerical and graphical methods, An Introduction to Generalized Linear Models, Third Edition provides a cohesive framework for statistical modeling. This new edition of a bestseller has been updated with Stata, R, and WinBUGS code as well as three new chapters on Bayesian analysis. Like its predecessor, this edition presents the theoretical background of generalized linear models (GLMs) before focusing on methods for analyzing particular kinds of data. It covers normal, Poisson, and binomial distributions; linear regression models; classical estimation and model fitting methods; and frequentist methods of statistical inference. After forming this foundation, the authors explore multiple linear regression, analysis of variance (ANOVA), logistic regression, log-linear models, survival analysis, multilevel modeling, Bayesian models, and Markov chain Monte Carlo (MCMC) methods. Using popular statistical software programs, this concise and accessible text illustrates practical approaches to estimation, model fitting, and model comparisons. It includes examples and exercises with complete data sets for nearly all the models covered.
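
The book's worked examples are in R, Stata, and WinBUGS; as a rough Python analogue of the same workflow (an assumption on our part, not the book's own code), a Poisson GLM can be fit with statsmodels:

```python
# Minimal sketch (assumption: not from the book, which uses R/Stata/WinBUGS).
# Fits a Poisson GLM, one of the model families the text covers.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = rng.poisson(np.exp(0.5 + 0.8 * x))           # simulated count outcomes

X = sm.add_constant(x)                           # intercept + covariate
result = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(result.summary())
```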

238 citations


Proceedings ArticleDOI
01 Jun 2008
TL;DR: An effective approach that adapts a traditional neural network to learn ordinal categories is described; the method, a generalization of the perceptron approach to ordinal regression, outperforms a neural network classification method.
Abstract: Ordinal regression is an important type of learning, which has properties of both classification and regression. Here we describe an effective approach to adapt a traditional neural network to learn ordinal categories. Our approach is a generalization of the perceptron method for ordinal regression. On several benchmark datasets, our method (NNRank) outperforms a neural network classification method. Compared with the ordinal regression methods using Gaussian processes and support vector machines, NNRank achieves comparable performance. Moreover, NNRank has the advantages of traditional neural networks: learning in both online and batch modes, handling very large training datasets, and making rapid predictions. These features make NNRank a useful and complementary tool for large-scale data mining tasks such as information retrieval, Web page ranking, collaborative filtering, and protein ranking in bioinformatics. The neural network software is available at: http://www.cs.missouri.edu/~chengji/cheng_software.html.
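
NNRank itself is a neural network, but the perceptron method it generalizes (PRank-style threshold updates) is compact enough to sketch; the following is a minimal stand-in under that reading, not the authors' code:

```python
# Minimal PRank-style perceptron for ordinal regression: a shared weight
# vector plus ordered thresholds, updated whenever a threshold is violated.
import numpy as np

def prank_fit(X, y, n_ranks, epochs=10):
    """X: (n, d) features; y: integer ranks in {0, ..., n_ranks - 1}."""
    n, d = X.shape
    w = np.zeros(d)
    b = np.zeros(n_ranks - 1)                  # thresholds between ranks
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            score = w @ x_i
            # Desired side of each threshold: +1 if true rank lies above it.
            side = np.where(np.arange(n_ranks - 1) < y_i, 1, -1)
            wrong = side * (score - b) <= 0    # violated thresholds
            tau = side * wrong
            w += tau.sum() * x_i
            b -= tau
    return w, b

def prank_predict(X, w, b):
    # Predicted rank = number of thresholds the score exceeds.
    return (X @ w)[:, None].__gt__(b[None, :]).sum(axis=1)
```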

160 citations


Journal ArticleDOI
TL;DR: A probabilistic model for ordinal classification problems with monotonicity constraints is introduced and the equivalence of the variable consistency rough sets to the specific empirical risk-minimizing decision rule in the statistical decision theory is shown.

146 citations


Proceedings ArticleDOI
19 Jun 2008
TL;DR: This article used a combination of lexical features and grammatical features derived from subtrees of syntactic parses to measure readability and found that a model for ordinal regression such as the proportional odds model is most effective at predicting reading difficulty.
Abstract: A reading difficulty measure can be described as a function or model that maps a text to a numerical value corresponding to a difficulty or grade level. We describe a measure of readability that uses a combination of lexical features and grammatical features that are derived from subtrees of syntactic parses. We also tested statistical models for nominal, ordinal, and interval scales of measurement. The results indicate that a model for ordinal regression, such as the proportional odds model, using a combination of grammatical and lexical features is most effective at predicting reading difficulty.

123 citations


Book ChapterDOI
01 Jan 2008
TL;DR: Generalizations of the multilevel logistic regression model, a very popular choice for analysis of dichotomous data, to ordinal response data are reviewed in this chapter, with developments mainly in terms of logistic and probit regression formulations.
Abstract: Reflecting the usefulness of multilevel analysis and the importance of categorical outcomes in many areas of research, generalization of multilevel models for categorical outcomes has been an active area of statistical research. For dichotomous response data, several approaches adopting either a logistic or probit regression model, together with various methods for incorporating and estimating the influence of the random effects, have been developed, and several authors have discussed and compared these models and their estimation procedures. Also, Snijders and Bosker [99, chap. 14] provide a practical summary of the multilevel logistic regression model and the various procedures for estimating its parameters. As these sources indicate, the multilevel logistic regression model is a very popular choice for analysis of dichotomous data. Extending the methods for dichotomous responses to ordinal response data has also been an active area of research. Again, developments have been mainly in terms of logistic and probit regression models, and many of these are reviewed in Agresti and Natarajan [5]. Because the proportional odds model described by McCullagh [71], which is based on the logistic regression formulation, is a common choice for analysis of ordinal data, many of the multilevel models for ordinal data are generalizations of this model. The proportional odds model characterizes the ordinal responses in C categories in terms of C−1 cumulative category comparisons, specifically, C−1 cumulative logits (i.e., log odds) of the ordinal responses. In the proportional odds model, the covariate effects are assumed to be the same across these cumulative logits, or proportional across the cumulative odds. As noted by Peterson and Harrell [77], however, examples of non-proportional odds are not uncommon.
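
For reference, the cumulative logits described above can be written as follows (a standard formulation of the proportional odds model; sign conventions for the covariate term vary across texts):

```latex
% Proportional odds (cumulative logit) model for C ordered categories:
% the covariate effect \beta is constant across the C-1 cumulative logits.
\log \frac{P(Y \le c \mid x)}{P(Y > c \mid x)} = \alpha_c - x^{\top}\beta,
\qquad c = 1, \dots, C - 1
```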

119 citations


Journal ArticleDOI
TL;DR: A natural generalization of the Wilcoxon-Mann-Whitney statistic, which now corresponds to the volume under an r-dimensional surface (VUS) for r ordered categories and differs from extensions recently proposed for multi-class classification.

106 citations


Book
10 Feb 2008
TL;DR: The SPSS 16.0: Advanced Statistical Procedures Companion, as discussed by the authors, contains valuable tips, warnings, and examples that will help readers take advantage of advanced statistical procedures and better analyze data.
Abstract: Key Message: SPSS 16.0: Advanced Statistical Procedures Companion contains valuable tips, warnings, and examples that will help you take advantage of SPSS and better analyze data. This book offers clear and concise explanations and examples of advanced statistical procedures in the SPSS Advanced and Regression modules. Key Topics: Model Selection Loglinear Analysis; Logit Loglinear Analysis; Multinomial Logistic Regression; Ordinal Regression; Probit Regression; Kaplan-Meier Survival Analysis; Life Tables; Cox Regression; Variance Components; Linear Mixed Models; Generalized Linear Models; Generalized Estimating Equations; Nonlinear Regression; Two-Stage Least-Squares Regression; Weighted Least-Squares Regression; Multidimensional Scaling Market: for all readers interested in SPSS.

87 citations


Book
08 Feb 2008
TL;DR: The Statistical Procedures Companion, as mentioned in this paper, contains tips, warnings, and examples that will help readers take advantage of SPSS to analyze real data more effectively, with an emphasis on the practice of analyzing data.
Abstract: Key Message: SPSS 16.0: Statistical Procedures Companion contains tips, warnings, and examples that will help you take advantage of SPSS to analyze data better. This book is a basic review of the underlying statistical concepts, with an emphasis on the practice of analyzing data. Ideal for both new and experienced users, this companion offers suggestions and strategies for handling the issues that arise when analyzing real data. Key Topics: Introduction; Getting to Know SPSS; Introducing Data; Preparing Your Data; Transforming Your Data; Describing Your Data; Testing Hypotheses; T Tests; One-Way Analysis of Variance; Crosstabulation; Correlation; Bivariate Linear Regression; Multiple Linear Regression; Discriminant Analysis; Logistic Regression Analysis; Cluster Analysis; Factor Analysis; Reliability Analysis; Nonparametric Tests; Ordinal Regression; General Loglinear Analysis; GLM Univariate; GLM Multivariate; GLM Repeated Measures Market: for all readers interested in SPSS.

84 citations


Journal ArticleDOI
TL;DR: All tested models showed good fit, but the proportional odds or partial proportional odds models proved to be the best choice due to the nature of the data and ease of interpretation of the results.
Abstract: Quality of life has been increasingly emphasized in public health research in recent years. Typically, the results of quality of life are measured by means of ordinal scales. In these situations, specific statistical methods are necessary because procedures such as either dichotomization or misinformation on the distribution of the outcome variable may complicate the inferential process. Ordinal logistic regression models are appropriate in many of these situations. This article presents a review of the proportional odds model, partial proportional odds model, continuation ratio model, and stereotype model. The fit, statistical inference, and comparisons between models are illustrated with data from a study on quality of life in 273 patients with schizophrenia. All tested models showed good fit, but the proportional odds or partial proportional odds models proved to be the best choice due to the nature of the data and ease of interpretation of the results. Ordinal logistic models perform differently depending on categorization of outcome, adequacy in relation to assumptions, goodness-of-fit, and parsimony.
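
A minimal sketch of fitting the proportional odds model reviewed here in Python, using statsmodels' OrderedModel (available in statsmodels ≥ 0.13); the simulated data below is hypothetical and merely stands in for the article's quality-of-life outcomes:

```python
# Sketch: proportional odds (cumulative logit) fit with statsmodels.
# Simulated data stands in for the article's schizophrenia study outcomes.
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(1)
n = 273
x = rng.normal(size=n)
latent = 0.9 * x + rng.logistic(size=n)          # latent-variable view
y = pd.Series(pd.cut(latent, bins=[-np.inf, -1, 0, 1, np.inf],
                     labels=["poor", "fair", "good", "excellent"]))

model = OrderedModel(y, x[:, None], distr="logit")
result = model.fit(method="bfgs", disp=False)
print(result.summary())
```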

73 citations


Journal ArticleDOI
TL;DR: Experiments on public benchmarks for ordinal regression and collaborative filtering indicate that the proposed algorithm is as accurate as the best available methods in terms of ranking accuracy, when the algorithms are trained on the same data.
Abstract: We consider the problem of learning a ranking function that maximizes a generalization of the Wilcoxon-Mann-Whitney statistic on the training data. Relying on an ε-accurate approximation for the error function, we reduce the computational complexity of each iteration of a conjugate gradient algorithm for learning ranking functions from O(m²) to O(m), where m is the number of training samples. Experiments on public benchmarks for ordinal regression and collaborative filtering indicate that the proposed algorithm is as accurate as the best available methods in terms of ranking accuracy, when the algorithms are trained on the same data. However, since it is several orders of magnitude faster than the current state-of-the-art approaches, it is able to leverage much larger training data sets.
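
For orientation, the Wilcoxon-Mann-Whitney statistic being generalized is, in the bipartite case, the fraction of correctly ordered positive-negative score pairs (ties counted half); the direct O(m²) computation that the paper's ε-accurate approximation avoids looks like this (illustrative only, with made-up scores):

```python
# Naive O(m^2) Wilcoxon-Mann-Whitney statistic (pairwise AUC), the quantity
# whose optimization the paper accelerates to O(m) per iteration.
import numpy as np

def wmw_statistic(scores_pos, scores_neg):
    diff = scores_pos[:, None] - scores_neg[None, :]   # all pairs
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

pos = np.array([2.0, 1.5, 0.7])
neg = np.array([0.3, 1.0])
print(wmw_statistic(pos, neg))     # 1.0 would mean a perfect ranking
```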

Book ChapterDOI
18 Oct 2008
TL;DR: A new interactive procedure for multiobjective optimization, based on the use of a set of value functions as a preference model built by an ordinal regression method, which results in possible and necessary rankings of Pareto optimal solutions.
Abstract: In this chapter, we present a new interactive procedure for multiobjective optimization, which is based on the use of a set of value functions as a preference model built by an ordinal regression method. The procedure is composed of two alternating stages. In the first stage, a representative sample of solutions from the Pareto optimal set (or from its approximation) is generated. In the second stage, the Decision Maker (DM) is asked to make pairwise comparisons of some solutions from the generated sample. Besides pairwise comparisons, the DM may compare selected pairs from the viewpoint of the intensity of preference, both comprehensively and with respect to a single criterion. This preference information is used to build a preference model composed of all general additive value functions compatible with the obtained information. The set of compatible value functions is then applied on the whole Pareto optimal set, which results in possible and necessary rankings of Pareto optimal solutions. These rankings are used to select a new sample of solutions, which is presented to the DM, and the procedure cycles until a satisfactory solution is selected from the sample or the DM comes to the conclusion that there is no satisfactory solution for the current problem setting. Construction of the set of compatible value functions is done using the ordinal regression methods UTAGMS and GRIP. These two methods generalize UTA-like methods and are competitive with the AHP and MACBETH methods. The interactive procedure will be illustrated through an example.
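
To make the ordinal-regression idea concrete, here is a heavily simplified sketch: a linear program that checks for an additive value function compatible with the DM's pairwise comparisons. The data, and the use of linear (rather than UTAGMS/GRIP's general piecewise-linear) marginal value functions, are assumptions made purely for illustration:

```python
# Hedged sketch of the ordinal-regression step behind UTA-like methods:
# find weights of an additive (here linear) value function that reproduce
# the DM's pairwise comparisons with maximal margin epsilon.
import numpy as np
from scipy.optimize import linprog

X = np.array([[0.8, 0.2, 0.5],      # hypothetical Pareto-optimal solutions,
              [0.4, 0.9, 0.3],      # criteria scaled to [0, 1], larger better
              [0.6, 0.5, 0.7]])
prefs = [(0, 1), (2, 1)]            # DM: solution 0 > 1 and solution 2 > 1

d = X.shape[1]
c = np.zeros(d + 1); c[-1] = -1.0   # variables: d weights, then epsilon
# For each (a, b): U(b) - U(a) + eps <= 0, i.e. U(a) >= U(b) + eps.
A_ub = np.array([np.append(X[b] - X[a], 1.0) for a, b in prefs])
b_ub = np.zeros(len(prefs))
A_eq = np.array([np.append(np.ones(d), 0.0)])    # weights sum to one
b_eq = np.array([1.0])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * d + [(None, None)])
print("compatible value function exists:", res.status == 0 and res.x[-1] > 0)
print("weights:", res.x[:d])
```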

Journal ArticleDOI
TL;DR: Simulation studies show that with independent multinomial samples, confidence intervals based on inverting the score test and a pseudo-score-type test perform well and this score method also seems to work well with fully-ranked data, but for dependent samples a simple Wald interval on the logit scale can be better with small samples.
Abstract: An ordinal measure of effect size is a simple and useful way to describe the difference between two ordered categorical distributions. This measure summarizes the probability that an outcome from one distribution falls above an outcome from the other, adjusted for ties. We develop and compare confidence interval methods for the measure. Simulation studies show that with independent multinomial samples, confidence intervals based on inverting the score test and a pseudo-score-type test perform well. This score method also seems to work well with fully-ranked data, but for dependent samples a simple Wald interval on the logit scale can be better with small samples. We also explore how the ordinal effect size measure relates to an effect measure commonly used for normal distributions, and we consider a logit model for describing how it depends on explanatory variables. The methods are illustrated for a study comparing treatments for shoulder-tip pain.
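
The effect-size measure itself is simple to compute from two vectors of category counts; a sketch (the pain-score counts below are hypothetical, not the study's data):

```python
# Sketch of the ordinal effect-size measure: the probability that an outcome
# from group 1 falls above one from group 2, with ties split evenly.
import numpy as np

def ordinal_effect_size(counts1, counts2):
    p = np.asarray(counts1, float) / np.sum(counts1)
    q = np.asarray(counts2, float) / np.sum(counts2)
    Q = np.cumsum(q)                                 # P(Y <= c)
    Q_below = np.concatenate(([0.0], Q[:-1]))        # P(Y <  c)
    # P(X > Y) + 0.5 * P(X = Y)
    return float(np.sum(p * (Q_below + 0.5 * q)))

# Hypothetical shoulder-tip pain scores (1-5) in two treatment arms.
print(ordinal_effect_size([2, 5, 10, 8, 3], [6, 9, 8, 4, 1]))
```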

Book ChapterDOI
20 May 2008
TL;DR: The results indicate that ordinal regression could be an alternative technique to survival analysis, the state-of-the-art method for tenure prediction, for churn time prediction of mobile customers.
Abstract: Customer churn is considered to be a core issue in telecommunication customer relationship management (CRM). Accurate prediction of churn time or customer tenure is important for developing appropriate retention strategies. In this paper, we discuss a method based on ordinal regression to predict churn time or tenure of mobile telecommunication customers. Customer tenure is treated as an ordinal outcome variable and ordinal regression is used for tenure modeling. We compare ordinal regression with survival analysis, the state-of-the-art method for tenure prediction. We notice from our results that ordinal regression could be an alternative technique to survival analysis for churn time prediction of mobile customers. To the best knowledge of the authors, this is the first time ordinal regression has been attempted as a technique for modeling customer tenure.

Journal ArticleDOI
TL;DR: New criteria to obtain classification trees for ordinal response variables are introduced, and the proposed methods are compared with the ordered twoing criterion via simulations.
Abstract: We introduce new criteria to obtain classification trees for ordinal response variables. To this aim, Breiman et al. (Classification and regression trees. Wadsworth, Belmont, 1984) extended their twoing criterion to the ordinal case. Following the CART procedure, we extend the well-known Gini–Simpson criterion to the ordinal case. Referring to the exclusivity preference property (introduced by Taylor and Silverman in Stat Comput 3:147–161, 1993, for the nominal case), suitably modified for the ordinal case, a second criterion is introduced. The proposed methods are compared with the ordered twoing criterion via simulations.
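
The paper's exact criterion is not reproduced here, but a standard ordinal generalization of the Gini-Simpson index, built on cumulative class probabilities, conveys the idea (the form below is an assumption for illustration, not necessarily the authors' definition):

```python
# Hedged sketch: an ordinal Gini-Simpson-style impurity based on cumulative
# class probabilities; nodes whose mass sits in adjacent ordered categories
# score as purer than nodes with mass at opposite ends of the scale.
import numpy as np

def ordinal_gini(counts):
    """counts: class frequencies over ordered categories at a tree node."""
    p = np.asarray(counts, float) / np.sum(counts)
    F = np.cumsum(p)[:-1]                 # cumulative probabilities F_1..F_{C-1}
    return float(np.sum(F * (1.0 - F)))

print(ordinal_gini([10, 0, 0, 10]))      # extreme categories: impurity 0.75
print(ordinal_gini([10, 10, 0, 0]))      # adjacent categories: impurity 0.25
```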

Book ChapterDOI
15 Oct 2008
TL;DR: In this article, the applicability of stochastic dominance to ordinal data such as self-reported health status was investigated, and it was shown that for ordinal distributions, stochastic dominance has limited applicability in ranking social welfare, while it has no applicability in ranking inequality.
Abstract: This note formally investigates the applicability of stochastic dominance (Lorenz dominance) to ordinal data such as self-reported health status. We confirm that for ordinal data distributions, stochastic dominance has limited applicability in ranking social welfare, while it has no applicability in ranking inequality.
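
First-order stochastic dominance between two ordinal distributions reduces to comparing cumulative distribution functions; a minimal check (the health-status counts are hypothetical):

```python
# Sketch: first-order stochastic dominance check for two distributions over
# ordered categories (e.g. self-reported health status).
import numpy as np

def fosd(counts_a, counts_b):
    """True if distribution A first-order stochastically dominates B."""
    Fa = np.cumsum(np.asarray(counts_a, float) / np.sum(counts_a))
    Fb = np.cumsum(np.asarray(counts_b, float) / np.sum(counts_b))
    return bool(np.all(Fa <= Fb))        # A puts less mass on low categories

# Hypothetical counts from "poor" to "excellent" in two populations.
print(fosd([1, 2, 5, 9], [4, 5, 5, 3]))   # True: A dominates B
```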

Journal ArticleDOI
TL;DR: The essential idea is collapsing ordinal levels to binary ones and converting correlated binary outcomes to multivariate normal outcomes in a sensible way so that re‐conversion to the binary and then ordinal scale, after conducting multiple imputation, yields the original marginal distributions and correlations.
Abstract: A new quasi-imputation strategy for correlated ordinal responses is proposed by borrowing ideas from random number generation. The essential idea is collapsing ordinal levels to binary ones and converting correlated binary outcomes to multivariate normal outcomes in a sensible way so that re-conversion to the binary and then ordinal scale, after conducting multiple imputation, yields the original marginal distributions and correlations. This conversion process ensures that the correlations are transformed reasonably, which in turn allows us to take advantage of well-developed imputation techniques for Gaussian outcomes. We use the phrase 'quasi' because the original observations are not guaranteed to be preserved. We present an application using a data set from psychiatric research. We conclude that the proposed method may be a promising tool for handling incomplete longitudinal or clustered ordinal outcomes.

Journal ArticleDOI
TL;DR: Different conditional independence specifications for ordinal categorical data are compared by calculating a posterior distribution over classes of graphical models by parameterising the precision matrix of the associated multivariate normal in Cholesky form.

Journal ArticleDOI
TL;DR: A new method for supervised discretization based on interval distances by using a novel concept of neighbourhood in the target's space that takes into consideration the order of the class attribute, when this exists, so that it can be used with ordinal discrete classes as well as continuous classes in the case of regression problems.
Abstract: This article introduces a new method for supervised discretization based on interval distances by using a novel concept of neighbourhood in the target's space. The method proposed takes into consideration the order of the class attribute, when this exists, so that it can be used with ordinal discrete classes as well as continuous classes, in the case of regression problems. The method has proved to be very efficient in terms of accuracy and faster than the supervised discretization methods most commonly used in the literature. It is illustrated through several examples, and a comparison with other standard discretization methods is performed for three public data sets by using two different learning tasks: a decision tree algorithm and SVM for regression.

Journal ArticleDOI
TL;DR: A mixture model for ordinal data with a built‐in probability of non‐response is proposed, which allows modeling of the range of the scale, while simultaneously modeling the presence/absence of the symptom.
Abstract: The aim of this paper is to produce a methodology that will allow users of ordinal scale data to more accurately model the distribution of ordinal outcomes in which some subjects are susceptible to exhibiting the response and some are not (i.e. the dependent variable exhibits zero inflation). This situation occurs with ordinal scales in which there is an anchor that represents the absence of the symptom or activity, such as 'none', 'never' or 'normal,' and is particularly common when measuring abnormal behavior, symptoms, and side effects. Due to the unusually large number of zeros, traditional statistical tests of association can be non-informative. We propose a mixture model for ordinal data with a built-in probability of non-response, which allows modeling of the range (e.g. severity) of the scale, while simultaneously modeling the presence/absence of the symptom. Simulations show that the model is well behaved and a likelihood ratio test can be used to choose between the zero-inflated and the traditional proportional odds model. The model, however, does have minor restrictions on the nature of the covariates that must be satisfied in order for the model to be identifiable. The method is particularly relevant for public health research such as large epidemiological surveys where more careful documentation of the reasons for response may be difficult.
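
A sketch of such a mixture, consistent with the description though not necessarily the paper's exact notation: let π be the probability of structural non-response (the "none"/zero anchor) and let P_po be a proportional odds model over the full scale; then

```latex
% Zero-inflated ordinal mixture: a structural zero with probability \pi,
% otherwise an ordinal response drawn from a proportional odds model.
P(Y = 0 \mid x) = \pi + (1 - \pi)\, P_{\mathrm{po}}(Y = 0 \mid x), \qquad
P(Y = c \mid x) = (1 - \pi)\, P_{\mathrm{po}}(Y = c \mid x), \quad c \ge 1
```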

Proceedings ArticleDOI
12 Jul 2008
TL;DR: The pros and cons of recently proposed frameworks and algorithms for ranking are analyzed, and the relationships among them are discussed; the promising directions in practice are also pointed out.
Abstract: Ranking is the key problem for information retrieval and other text applications. Recently, ranking methods based on machine learning approaches, called learning to rank, have become the focus for researchers and practitioners. The main idea of these methods is to apply various existing and effective machine learning algorithms to ranking. However, as a learning problem, ranking is different from classical ones such as classification and regression. In this paper, we investigate the important papers in this direction; the pros and cons of the recently proposed frameworks and algorithms for ranking are analyzed, and the relationships among them are discussed. Finally, the promising directions in practice are also pointed out.

Book ChapterDOI
12 Oct 2008
TL;DR: This work starts with the most basic algorithms that work by learning a real-valued function in a regression framework and then rounding off a predicted real value to the closest discrete label and ends with a margin-based bound for the state-of-the-art ordinal regression algorithm of Chu & Keerthi (2007).
Abstract: The problem of ordinal regression, in which the goal is to learn a rule to predict labels from a discrete but ordered set, has gained considerable attention in machine learning in recent years. We study generalization properties of algorithms for this problem. We start with the most basic algorithms that work by learning a real-valued function in a regression framework and then rounding off a predicted real value to the closest discrete label; our most basic bounds for such algorithms are derived by relating the ordinal regression error of the resulting prediction rule to the regression error of the learned real-valued function. We end with a margin-based bound for the state-of-the-art ordinal regression algorithm of Chu & Keerthi (2007).
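
The most basic reduction the chapter analyzes is easy to sketch: fit a real-valued regressor to the ordinal labels, then round predictions to the closest discrete label (the choice of Ridge regression and the simulated data are illustrative assumptions):

```python
# Sketch of the regression-then-round reduction for ordinal regression.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = np.clip(np.round(X @ np.array([1.0, 0.5, -0.5]) + 2), 0, 4).astype(int)

reg = Ridge().fit(X, y.astype(float))     # real-valued fit to ordinal labels
y_hat = np.clip(np.round(reg.predict(X)), 0, 4).astype(int)
print("mean absolute rank error:", np.abs(y_hat - y).mean())
```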

Journal ArticleDOI
TL;DR: This work considers the analysis of longitudinal ordinal data, meaning regression-like analysis when the response variable is categorical with ordered categories, and is measured repeatedly over time (or space) on the experimental or sampling units.

Journal ArticleDOI
TL;DR: This paper considers a decision maker who shows his/her preferences for different alternatives through a finite set of ordinal values and analyzes the problem of consistency taking into account some transitivity properties within this framework.

Journal ArticleDOI
TL;DR: An unbiased tree-based regression model, generalized unbiased interaction detection and estimation (GUIDE), is introduced for its robustness against variable selection bias, and it is anticipated that the GUIDE model will provide a new perspective for users of tree-based models and will offer an advantage over existing methods.
Abstract: Recently, there has been increasing interest in the use of classification and regression tree (CART) analysis. A tree-based regression model can be constructed by recursively partitioning the data with such criteria as to yield the maximum reduction in the variability of the response. Unfortunately, the exhaustive search may yield a bias in variable selection, and it tends to choose as a splitter a categorical variable that has many distinct values. In this study, an unbiased tree-based regression model, generalized unbiased interaction detection and estimation (GUIDE), is introduced for its robustness against variable selection bias. Not only are the underlying theoretical differences between CART and GUIDE in variable selection presented, but the outcomes of the two different tree-based regression models are also compared and analyzed by utilizing intersection inventory and crash data. The results underscore GUIDE's strength in selecting variables without bias. A simulation shed additional light on the results.

Posted Content
TL;DR: In this article, bioprobit fits maximum-likelihood two-equation ordered probit models of the ordinal variables depvar1 and depvar2 on the independent variables indepvars1 and indepvars2.
Abstract: bioprobit fits maximum-likelihood two-equation ordered probit models of ordinal variables depvar1 and depvar2 on the independent variables indepvars1 and indepvars2. The actual values taken on by dependent variables are irrelevant, except that larger values are assumed to correspond to "higher" outcomes.

Journal ArticleDOI
TL;DR: In this paper, the authors examined the linkage between national satisfaction indices and several macroeconomic development data using an explorative ordinal regression method, in the context of goal programming modelling, using a post-optimality analysis approach.
Abstract: The national customer satisfaction barometers are aggregated measures of customer satisfaction, which have the potential to provide broad-based benchmarks for business organisations. These barometers may also explain changes in national economic returns and stability, and thus they are able to give a measure of the economic welfare and the quality of a national economic output. The aim of this paper is to examine the linkage between national satisfaction indices and several macroeconomic development data. The presented approach is an explorative ordinal regression method, in the context of goal programming modelling. Stability evaluation of the provided results is also discussed, using a post-optimality analysis approach. The application presented concerns the comparative analysis of the American Customer Satisfaction Index (ACSI), the Swedish Customer Satisfaction Barometer (SCSB), and the German Customer Satisfaction Barometer (GCSB), while the results are mainly focused on measuring the contribution of...

01 Jan 2008
TL;DR: This paper proposes a procedure for “monotonizing” the data by relabeling objects, based on minimization of the empirical risk in the class of all monotone functions, and uses this procedure as a preprocessing tool, improving the accuracy of the classifiers.
Abstract: In the ordinal classification with monotonicity constraints, it is assumed that the class label of an object does not decrease when evaluations of this object on considered attributes increase. In this paper, we formulate the problem of ordinal classification with monotonicity constraints from a statistical point of view, by imposing constraints both on the probability distribution and on the loss function. We propose a procedure for "monotonizing" the data by relabeling objects, based on minimization of the empirical risk in the class of all monotone functions. The procedure is then used as a preprocessing tool, improving the accuracy of the classifiers. We verify these claims in a computational experiment.

Journal ArticleDOI
TL;DR: This work proposes a tree-based method for analyzing a multivariate ordinal response and assumes a within-node parametric distribution on the adaptive nonparametric tree framework to demonstrate the ability of the method to identify underlying structures in the data.
Abstract: Motivated by a real example of understanding the so-called “building related occupant complaint syndromes” (BROCS), we propose a tree-based method for analyzing a multivariate ordinal response. Our method is semiparametric by assuming a within-node parametric distribution on the adaptive nonparametric tree framework. We use simulation experiments to demonstrate the ability of our method to identify underlying structures in the data and the fact that analyzing ordinal response data with proper methods that take ordinality into account is considerably more powerful than dichotomization. The reanalysis of the BROCS data also suggests new insights that go beyond a previous analysis based on the dichotomization.

Book ChapterDOI
15 Sep 2008
TL;DR: After defining a family of loss functions inspired by Information Retrieval, an algorithm is derived, based on posterior probabilities of ranks given an entry, that keeps the set of predicted ranks as small as possible while still containing the true rank.
Abstract: We present nondeterministic hypotheses learned from an ordinal regression task. They try to predict the true rank for an entry, but when the classification is uncertain the hypotheses predict a set of consecutive ranks (an interval). The aim is to keep the set of ranks as small as possible, while still containing the true rank. The justification for learning such a hypothesis is based on a real-world problem arising in breeding beef cattle. After defining a family of loss functions inspired by Information Retrieval, we derive an algorithm for minimizing them. The algorithm is based on posterior probabilities of ranks given an entry. A couple of implementations are compared: one based on a multiclass SVM and the other based on Gaussian processes designed to minimize the linear loss in ordinal regression tasks.
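
One simple way to realize such interval predictions, sketched under the assumption of a plain probability-mass threshold (the paper instead minimizes its IR-inspired loss functions): pick the shortest run of consecutive ranks whose posterior mass reaches a confidence level.

```python
# Sketch: given posterior probabilities over ranks, predict the shortest
# interval of consecutive ranks whose total mass reaches a threshold.
import numpy as np

def rank_interval(posterior, threshold=0.9):
    posterior = np.asarray(posterior, float)
    k = len(posterior)
    for width in range(1, k + 1):              # try short intervals first
        for start in range(k - width + 1):
            if posterior[start:start + width].sum() >= threshold:
                return (start, start + width - 1)
    return (0, k - 1)                          # fall back to the full scale

print(rank_interval([0.05, 0.1, 0.5, 0.3, 0.05]))   # -> (1, 3)
```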