Showing papers on "Ordinal regression" published in 2016



Proceedings ArticleDOI
01 Jun 2016
TL;DR: This paper proposes an End-to-End learning approach to address ordinal regression problems using deep Convolutional Neural Network, which could simultaneously conduct feature learning and regression modeling, and achieves the state-of-the-art performance on both the MORPH and AFAD datasets.
Abstract: To address the non-stationary property of aging patterns, age estimation can be cast as an ordinal regression problem. However, the processes of extracting features and learning a regression model are often separated and optimized independently in previous work. In this paper, we propose an End-to-End learning approach to address ordinal regression problems using deep Convolutional Neural Network, which could simultaneously conduct feature learning and regression modeling. In particular, an ordinal regression problem is transformed into a series of binary classification sub-problems. And we propose a multiple output CNN learning algorithm to collectively solve these classification sub-problems, so that the correlation between these tasks could be explored. In addition, we publish an Asian Face Age Dataset (AFAD) containing more than 160K facial images with precise age ground-truths, which is the largest public age dataset to date. To the best of our knowledge, this is the first work to address ordinal regression problems by using CNN, and achieves the state-of-the-art performance on both the MORPH and AFAD datasets.

562 citations
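
The reduction of an ordinal problem to binary sub-problems described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' CNN: it assumes ranks are coded 0..K-1 and that sub-problem k asks "is the rank greater than k?". The function names `encode_ordinal` and `decode_ordinal` and the 0.5 cutoff are hypothetical choices made for this example.

```python
import numpy as np

def encode_ordinal(y, num_ranks):
    """Map rank labels 0..num_ranks-1 to num_ranks-1 binary targets.

    Column k holds the answer to "is the rank greater than k?".
    """
    y = np.asarray(y)
    thresholds = np.arange(num_ranks - 1)
    return (y[:, None] > thresholds[None, :]).astype(int)

def decode_ordinal(probs, cutoff=0.5):
    """Predict a rank by counting the binary outputs above the cutoff."""
    probs = np.asarray(probs)
    return (probs > cutoff).sum(axis=1)

labels = np.array([0, 2, 3])
targets = encode_ordinal(labels, num_ranks=4)   # one row of 3 binary targets per label
recovered = decode_ordinal(targets)             # perfect binary outputs give back the labels
print(targets)
print(recovered)
```

In a multiple-output network, each column of `targets` would be the training target of one binary output, so all sub-problems share the same learned features.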


Journal ArticleDOI
TL;DR: The results confirm that ordering information benefits ordinal models improving their accuracy and the closeness of the predictions to actual targets in the ordinal scale.
Abstract: Ordinal regression problems are those machine learning problems where the objective is to classify patterns using a categorical scale which shows a natural order between the labels. Many real-world applications present this labelling structure and that has increased the number of methods and algorithms developed over the last years in this field. Although ordinal regression can be faced using standard nominal classification techniques, there are several algorithms which can specifically benefit from the ordering information. Therefore, this paper is aimed at reviewing the state of the art on these techniques and proposing a taxonomy based on how the models are constructed to take the order into account. Furthermore, a thorough experimental study is proposed to check if the use of the order information improves the performance of the models obtained, considering some of the approaches within the taxonomy. The results confirm that ordering information benefits ordinal models improving their accuracy and the closeness of the predictions to actual targets in the ordinal scale.

332 citations


Journal ArticleDOI
TL;DR: An overview of current practice in the analysis of VAS scores is provided, an extension of current ordinal regression methodology is proposed, which is appropriate for VAS at an ordinal level of measurement, and best practice recommendations are provided.

266 citations


Journal ArticleDOI
TL;DR: Results based on simulated and real data suggest that predictor rankings can be improved in some settings by using new permutation importance measures that explicitly use the ordering in the response levels in combination with ordinal regression trees.

126 citations


Journal ArticleDOI
TL;DR: To handle interactions between criteria and hierarchical structure of criteria, the Choquet integral is applied as a preference model and the recently proposed methodology called Multiple Criteria Hierarchy Process is applied.
Abstract: The paper deals with two important issues of Multiple Criteria Decision Aiding: interaction between criteria and hierarchical structure of criteria. To handle interactions, we apply the Choquet integral as a preference model, and to handle the hierarchy of criteria, we apply the recently proposed methodology called Multiple Criteria Hierarchy Process. In addition to dealing with the above issues, we suppose that the preference information provided by the Decision Maker is indirect and has the form of pairwise comparisons of criteria with respect to their importance and pairwise preference comparisons of some pairs of alternatives with respect to some criteria. In consequence, many instances of the Choquet integral are usually compatible with this preference information. These instances are identified and exploited by Robust Ordinal Regression and Stochastic Multiobjective Acceptability Analysis. To illustrate the whole approach, we show its application to a real world decision problem concerning the ranking of universities for a hypothetical Decision Maker.

89 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider multiple criteria decision aiding in the case of interaction between criteria and propose to use AHP on a set of reference points in the scale of each criterion and to use an interpolation to obtain the other values.
Abstract: We consider multiple criteria decision aiding in the case of interaction between criteria. In this case the usual weighted sum cannot be used to aggregate evaluations on different criteria and other value functions with a more complex formulation have to be considered. The Choquet integral is the most used technique and also the most widespread in the literature. However, the application of the Choquet integral presents two main problems being the necessity to determine the capacity, which is the function that assigns a weight not only to all single criteria but also to all subset of criteria, and the necessity to express on the same scale evaluations on different criteria. While with respect to the first problem we adopt the recently introduced Non-Additive Robust Ordinal Regression (NAROR) taking into account all the capacities compatible with the preference information provided by the DM, with respect to the second one we build the common scale for the considered criteria using the Analytic Hierarchy Process (AHP). We propose to use AHP on a set of reference points in the scale of each criterion and to use an interpolation to obtain the other values. This permits to reduce considerably the number of pairwise comparisons usually required by the DM when applying AHP. An illustrative example details the application of the proposed methodology.

68 citations


Journal ArticleDOI
TL;DR: The authors examined three approaches for testing goodness of fit in ordinal logistic regression models: an ordinal version of the Hosmer-Lemeshow test (Cg), the Lipsitz test, and the Pulkstenis-Robinson test.
Abstract: We examine three approaches for testing goodness of fit in ordinal logistic regression models: an ordinal version of the Hosmer–Lemeshow test (Cg), the Lipsitz test, and the Pulkstenis–Robinson (PR) tests. The properties of these tests have previously been investigated for the proportional odds model. Here, we extend the tests to two other commonly used models: the adjacent-category and the constrained continuation-ratio models. We use a simulation study to assess null distributions and power. All three tests work well and can detect several types of lack of fit under both the adjacent-category and constrained continuation-ratio models. The Cg and Lipsitz tests are best suited to detect lack of fit associated with continuous covariates, whereas the PR tests excel at detecting lack of fit associated with categorical covariates. We illustrate the use of the tests with data from a study of aftercare placement of psychiatrically hospitalized adolescents. Based on the results here and previous research...

59 citations


Journal ArticleDOI
TL;DR: An lp distance-based method is proposed to formulate the underlying optimization problems as goal programming (GP) models for ordinal and additive consistency problems respectively, and the proposed model can preserve the initial preference information as much as possible.

55 citations


Journal ArticleDOI
27 Jul 2016-Chaos
TL;DR: A statistically significant difference between healthy patients and several groups of unhealthy patients with varying heart conditions is found for the distributions of the mean degrees, unlike for any of the distributions of the entropies or NFPs.
Abstract: Electrocardiogram (ECG) data from patients with a variety of heart conditions are studied using ordinal pattern partition networks. The ordinal pattern partition networks are formed from the ECG time series by symbolizing the data into ordinal patterns. The ordinal patterns form the nodes of the network and edges are defined through the time ordering of the ordinal patterns in the symbolized time series. A network measure, called the mean degree, is computed from each time series-generated network. In addition, the entropy and number of non-occurring ordinal patterns (NFP) is computed for each series. The distribution of mean degrees, entropies, and NFPs for each heart condition studied is compared. A statistically significant difference between healthy patients and several groups of unhealthy patients with varying heart conditions is found for the distributions of the mean degrees, unlike for any of the distributions of the entropies or NFPs.

55 citations
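
The construction described in the abstract above (symbolize the series into ordinal patterns, use the patterns as nodes, link patterns that follow each other in time) can be sketched roughly as below. This is an illustration under stated assumptions, not the paper's exact procedure: the pattern length `order=3`, the omission of self-loops, and the definition of mean degree as edges per node in an unweighted directed graph are all choices made for this example, and the toy series is invented.

```python
import math
import numpy as np

def ordinal_patterns(series, order=3):
    """Replace each window of length `order` by the permutation that sorts it.

    A stable sort is used so that tied values are ordered by position.
    """
    series = np.asarray(series)
    count = len(series) - order + 1
    return [tuple(np.argsort(series[i:i + order], kind="stable").tolist())
            for i in range(count)]

def transition_edges(patterns):
    """Directed edges between consecutive patterns (self-loops skipped here)."""
    return {(a, b) for a, b in zip(patterns[:-1], patterns[1:]) if a != b}

series = [1, 2, 3, 2, 1, 2, 3]                  # toy data, not ECG
patterns = ordinal_patterns(series, order=3)
nodes = set(patterns)
edges = transition_edges(patterns)
mean_degree = len(edges) / len(nodes)           # out-degree averaged over nodes
missing = math.factorial(3) - len(nodes)        # patterns that never occur
print(mean_degree, missing)
```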


Proceedings ArticleDOI
26 Jun 2016
TL;DR: A novel modeling framework is introduced that leverages the power of copula functions and CRFs to detangle the probabilistic modeling of AU dependencies from the marginal modeling of the AU intensity, and it is shown that the proposed approach consistently outperforms independent modeling of AU intensities and the state-of-the-art approach for the target task.
Abstract: Joint modeling of the intensity of facial action units (AUs) from face images is challenging due to the large number of AUs (30+) and their intensity levels (6). This is in part due to the lack of suitable models that can efficiently handle such a large number of outputs/classes simultaneously, but also due to the lack of labelled target data. For this reason, the majority of the methods proposed so far resort to independent classifiers for the AU intensity. This is suboptimal for at least two reasons: the facial appearance of some AUs changes depending on the intensity of other AUs, and some AUs co-occur more often than others. Encoding this is expected to improve the estimation of target AU intensities, especially in the case of noisy image features, head-pose variations and imbalanced training data. To this end, we introduce a novel modeling framework, Copula Ordinal Regression (COR), that leverages the power of copula functions and CRFs to detangle the probabilistic modeling of AU dependencies from the marginal modeling of the AU intensity. Consequently, the COR model achieves the joint learning and inference of intensities of multiple AUs, while being computationally tractable. We show on two challenging datasets of naturalistic facial expressions that the proposed approach consistently outperforms (i) independent modeling of AU intensities, and (ii) the state-of-the-art approach for the target task.

Journal ArticleDOI
TL;DR: In this paper, the authors deal with an urban and territorial planning problem by applying the Non Additive Robust Ordinal Regression (NAROR) to the Choquet integral preference model which permits to represent interaction between considered criteria through the use of a set of non-additive weights called capacity or fuzzy measure.
Abstract: In this paper we deal with an urban and territorial planning problem by applying the Non Additive Robust Ordinal Regression (NAROR). NAROR is a recent extension of the Robust Ordinal Regression family of Multiple Criteria Decision Aiding methods to the Choquet integral preference model which permits to represent interaction between considered criteria through the use of a set of non-additive weights called capacity or fuzzy measure. The use of NAROR permits the Decision Maker (DM) to give preference information in terms of preferences between pairs of alternatives with which she is familiar, and relative importance and interaction of considered criteria. The basic idea of NAROR is to consider the whole set of capacities that are compatible with the preference information given by the DM. In fact, the recommendation supplied by NAROR is expressed in terms of necessary preferences, in case an alternative is preferred to another for all compatible capacities, and of possible preferences, in case an alternative is preferred to another for at least one compatible capacity. In the considered case study, several sites for the location of a landfill are analyzed and compared through the use of the NAROR on the basis of different criteria, such as presence of population, hydrogeological risk, interferences on transport infrastructures and economic cost. This paper is the first application of NAROR to a real-world problem, even if not already with real DMs, but with a panel of experts simulating the decision process.
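
The notions of necessary and possible preference in the abstract above can be illustrated with a small sketch. It assumes a finite, hand-picked list of capacities standing in for the whole compatible set (the method itself considers every compatible capacity, which this enumeration does not reproduce), scores where higher is better, and hypothetical criteria names and site scores.

```python
def choquet(values, capacity):
    """Choquet integral of `values` (criterion -> non-negative score)
    with respect to `capacity` (frozenset of criteria -> weight)."""
    order = sorted(values, key=values.get)      # criteria by ascending score
    total, previous = 0.0, 0.0
    for i, criterion in enumerate(order):
        coalition = frozenset(order[i:])        # criteria scoring at least this much
        total += (values[criterion] - previous) * capacity[coalition]
        previous = values[criterion]
    return total

def necessary(a, b, capacities):
    """a is at least as good as b for every capacity in the list."""
    return all(choquet(a, mu) >= choquet(b, mu) for mu in capacities)

def possible(a, b, capacities):
    """a is at least as good as b for at least one capacity in the list."""
    return any(choquet(a, mu) >= choquet(b, mu) for mu in capacities)

both = frozenset({"cost", "risk"})
additive = {frozenset({"cost"}): 0.5, frozenset({"risk"}): 0.5, both: 1.0}
synergy = {frozenset({"cost"}): 0.2, frozenset({"risk"}): 0.2, both: 1.0}
capacities = [additive, synergy]                # hypothetical compatible set

site_a = {"cost": 0.9, "risk": 0.1}             # invented scores
site_b = {"cost": 0.4, "risk": 0.4}
site_c = {"cost": 0.05, "risk": 0.05}

print(necessary(site_a, site_b, capacities), possible(site_a, site_b, capacities))
print(necessary(site_a, site_c, capacities))
```

Here site A beats site B under the additive capacity but not under the one that rewards balanced scores, so the preference is possible but not necessary; A is necessarily preferred to C because it wins under both.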

Journal ArticleDOI
TL;DR: This work proposes several scoring procedures for transforming the results of robustness analysis to a univocal recommendation using a preference model in form of an additive value function, and assumes the Decision Maker to provide pairwise comparisons of reference alternatives.

Journal ArticleDOI
TL;DR: Application of the vignettes by the two approaches removed scaling biases, thereby improving the accuracy of the analyses of the associations between travel mode and quality of life.
Abstract: Purpose: Likert scales are frequently used in public health research, but are subject to scale perception bias. This study sought to explore scale perception bias in quality-of-life (QoL) self-assessment and assess its relationships with commuting mode in the Sydney Travel and Health Study.

Journal ArticleDOI
TL;DR: This paper defines and relates Ordinal classification and monotonic classification in a common framework, providing proper descriptions, characteristics, and a categorization of existing approaches in the state-of-the-art.
Abstract: Ordinal classification covers those classification tasks where the different labels show an ordering relation, which is related to the nature of the target variable. In addition, if a set of monotonicity constraints between independent and dependent variables has to be satisfied, then the problem is known as monotonic classification. Both issues are of great practical importance in machine learning. Ordinal classification has been widely studied in specialized literature, but monotonic classification has received relatively low attention. In this paper, we define and relate both tasks in a common framework, providing proper descriptions, characteristics, and a categorization of existing approaches in the state-of-the-art. Moreover, research challenges and open issues are discussed, with focus on frequent experimental behaviours and pitfalls, commonly used evaluation measures and the encouragement in devoting substantial research efforts in specific learning paradigms.

Posted Content
TL;DR: This paper explores ordinal classification (in the context of deep neural networks) through a simple modification of the squared error loss which not only allows it to be sensitive to class ordering, but also allows the possibility of having a discrete probability distribution over the classes.
Abstract: In this paper, we explore ordinal classification (in the context of deep neural networks) through a simple modification of the squared error loss which not only allows it to be sensitive to class ordering, but also allows the possibility of having a discrete probability distribution over the classes. Our formulation is based on the use of a softmax hidden layer, which has received relatively little attention in the literature. We empirically evaluate its performance on the Kaggle diabetic retinopathy dataset, an ordinal and high-resolution dataset and show that it outperforms all of the baselines employed.
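
One way to read the idea in the abstract above is sketched below. It is an assumption about the general shape of such a loss, not the paper's exact formulation: a softmax layer yields class probabilities, the prediction is the expected class index under those probabilities, and the squared error is taken between that expectation and the integer label. The name `expected_rank_loss` and the toy logits are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def expected_rank_loss(logits, labels):
    """Squared error between the expected class index and the label."""
    probs = softmax(logits)                     # discrete distribution over classes
    ranks = np.arange(logits.shape[-1])
    expected = probs @ ranks                    # scalar prediction per example
    return float(np.mean((expected - labels) ** 2))

labels = np.array([4.0])
near = np.array([[0.0, 0.0, 0.0, 10.0, 0.0]])   # mass on class 3, one step from the label
far = np.array([[10.0, 0.0, 0.0, 0.0, 0.0]])    # mass on class 0, four steps away
loss_near = expected_rank_loss(near, labels)
loss_far = expected_rank_loss(far, labels)
print(loss_near < loss_far)
```

Unlike plain cross-entropy, which penalizes both wrong predictions about equally, this loss grows with the distance between the predicted and the true class.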

Journal ArticleDOI
TL;DR: A probability distribution for ordinal data is designed by modeling the process generating data, which is assumed to rely only on order comparisons between categories, and the previous univariate ordinal model is straightforwardly extended to model-based clustering for multivariate ordinal data.
Abstract: We design a probability distribution for ordinal data by modeling the process generating data, which is assumed to rely only on order comparisons between categories. Contrariwise, most competitors often either forget the order information or add a non-existent distance information. The data generating process is assumed, from optimality arguments, to be a stochastic binary search algorithm in a sorted table. The resulting distribution is natively governed by two meaningful parameters (position and precision) and has very appealing properties: decrease around the mode, shape tuning from uniformity to a Dirac, identifiability. Moreover, it is easily estimated by an EM algorithm since the path in the stochastic binary search algorithm can be considered as missing values. Using then the classical latent class assumption, the previous univariate ordinal model is straightforwardly extended to model-based clustering for multivariate ordinal data. Parameters of this mixture model are estimated by an AECM algorithm. Both simulated and real data sets illustrate the great potential of this model by its ability to parsimoniously identify particularly relevant clusters which were unsuspected by some traditional competitors.

Journal ArticleDOI
01 Jul 2016
TL;DR: In this article, a comprehensive framework to regression models is proposed in case ordinal data come out from a discrete choice, and the added value of this unifying perspective is the possibility to introduce further generalizations and also to deepen similarities and differences among the proposed models.
Abstract: Literature on the models for ordinal variables grew very fast in the last decades and several proposals have been advanced when ordered data are expression of ratings, preferences, judgments, opinions, etc. A dichotomy has been emphasized between methods based on a latent variable which is behind the ordered selection and methods anchored to a probability distribution with a well defined pattern. In this paper, a comprehensive framework to regression models is proposed in case ordinal data come out from a discrete choice. The added value of this unifying perspective is the possibility to introduce further generalizations and also to deepen similarities and differences among the proposed models. A case study confirms the usefulness of this general framework. Some concluding remarks end the paper.

Journal ArticleDOI
TL;DR: An integrated framework for robustness analysis with the rule-based preference model is proposed, and a set of indicators and outcomes giving an insight into the spaces of consensus and disagreement between the DMs are presented.

Journal ArticleDOI
TL;DR: A Bayesian stochastic search variable selection (BSSVS) method is presented for variable selection in quantile regression (QReg) for ordinal models and a Markov Chain Monte Carlo method is adopted to draw the unknown quantities from the full posteriors.


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a multicriteria assessment system to support top management of a real multinational pharmaceutical company over the investment on new pharmaceutical products, based on three points of view: current market situation, development of the sector over the recent years, and comparison with other European countries.
Abstract: The size of the pharmaceutical market and its contribution to the regional, national and international economic development is widely recognized. This fact indicates that supported and efficient decision making in the sector is a matter of paramount importance. This paper proposes a multicriteria assessment system to support top management of a real multinational pharmaceutical company over the investment on new pharmaceutical products. The evaluation criteria are extracted from three points of view, namely: (i) current market situation, (ii) development of the sector over the recent years, and (iii) comparison with other European countries. This research work results in the evaluation and ranking of 192 therapeutic categories for investment purposes in the Greek pharmaceutical market. The ranking of these categories is obtained through the application of an additive value model, which is assessed by the ordinal regression method UTASTAR. In the first phase, the decision makers are asked to rank a sample of these alternatives, inferring therefore implicitly a personal additive value system. In the second phase, all the alternatives are evaluated and a complete ranking is obtained. Finally, the robustness of the results is analysed and measured, given the imperfect determination of the model parameters. For this purpose, an extreme ranking analysis is implemented, calculating each alternative’s best and worst possible position in the ranking.

Journal ArticleDOI
TL;DR: A latent Gaussian mixture model to classify ordinal data is proposed that allows us to overcome the computational problems arising in the full maximum likelihood approach due to the evaluation of multidimensional integrals that cannot be written in closed form.
Abstract: A latent Gaussian mixture model to classify ordinal data is proposed. The observed categorical variables are considered as a discretization of an underlying finite mixture of Gaussians. The model is estimated within the expectation-maximization (EM) framework maximizing a pairwise likelihood. This allows us to overcome the computational problems arising in the full maximum likelihood approach due to the evaluation of multidimensional integrals that cannot be written in closed form. Moreover, a method to cluster the observations on the basis of the posterior probabilities in output of the pairwise EM algorithm is suggested. The effectiveness of the proposal is shown comparing the pairwise likelihood approach with the full maximum likelihood and the maximum likelihood for continuous data ignoring the ordinal nature of the variables. The comparison is made by means of a simulation study; applications to real data are provided.

Journal ArticleDOI
TL;DR: In this article, an ordinal beta hurdle model is proposed to directly model ordinal category probabilities with a biologically realistic beta-distributed latent variable, which allows ecologists to explore distribution (absence) and abundance processes in an integrated framework.
Abstract: Ecological abundance data are often recorded on an ordinal scale in which the lowest category represents species absence. One common example is when plant species cover is visually assessed within bounded quadrats and then assigned to pre-defined cover class categories. We present an ordinal beta hurdle model that directly models ordinal category probabilities with a biologically realistic beta-distributed latent variable. A hurdle-at-zero model allows ecologists to explore distribution (absence) and abundance processes in an integrated framework. This provides an alternative to cumulative link models when data are inconsistent with the assumption that the odds of moving into a higher category are the same for all categories (proportional odds). Graphical tools and a deviance information criterion were developed to assess whether a hurdle-at-zero model should be used for inferences rather than standard ordinal methods. Hurdle-at-zero and non-hurdle ordinal models fit to vegetation cover class data produced substantially different conclusions. The ordinal beta hurdle model yielded more precise parameter estimates than cumulative logit models, although out-of-sample predictions were similar. The ordinal beta hurdle model provides inferences directly on the latent biological variable of interest, percent cover, and supports exploration of more realistic ecological patterns and processes through the hurdle-at-zero or two-part specification. We provide JAGS code as an on-line supplement. Supplementary materials accompanying this paper appear on-line.

Journal ArticleDOI
TL;DR: This study proposes conditional joint random-effects models, which take into account the inherent association between the continuous, binary and ordinal outcomes, and simulation studies show that, by jointly modelling the trivariate outcomes, standard deviations of the estimates of parameters in the models are smaller and much more stable, leading to more efficient parameter estimates and reliable statistical inferences.
Abstract: In medical studies, repeated measurements of continuous, binary and ordinal outcomes are routinely collected from the same patient. Instead of modelling each outcome separately, in this study we propose to jointly model the trivariate longitudinal responses, so as to take account of the inherent association between the different outcomes and thus improve statistical inferences. This work is motivated by a large cohort study in the North West of England, involving trivariate responses from each patient: Body Mass Index, Depression (Yes/No) ascertained with cut-off score not less than 8 at the Hospital Anxiety and Depression Scale, and Pain Interference generated from the Medical Outcomes Study 36-item short-form health survey with values returned on an ordinal scale 1-5. There are some well-established methods for combined continuous and binary, or even continuous and ordinal responses, but little work was done on the joint analysis of continuous, binary and ordinal responses. We propose conditional joint random-effects models, which take into account the inherent association between the continuous, binary and ordinal outcomes. Bayesian analysis methods are used to make statistical inferences. Simulation studies show that, by jointly modelling the trivariate outcomes, standard deviations of the estimates of parameters in the models are smaller and much more stable, leading to more efficient parameter estimates and reliable statistical inferences. In the real data analysis, the proposed joint analysis yields a much smaller deviance information criterion value than the separate analysis, and shows other good statistical properties too.

Journal ArticleDOI
TL;DR: A new approach is proposed that builds a probability distribution over the space of all value functions compatible with the DM's certain holistic judgments, in order to handle both certain and uncertain preference information.
Abstract: Multiple criteria ranking problem is approached using Subjective Stochastic Ordinal Regression (SSOR). Preferences of the decision maker are expressed through pairwise comparisons of some reference alternatives. A part of pairwise comparisons is certain, and another part is uncertain. Uncertain pairwise comparisons are used to build a probability distribution over the space of all preference models compatible with certain pairwise comparisons. From sampling of this distribution, one learns a probability with which a is ranked on the rth position (rank acceptability index), and probability that a is preferred to b (pairwise winning index). Ordinal regression methods of Multiple Criteria Decision Aiding (MCDA) take into account one, several, or all value functions compatible with the indirect preference information provided by the Decision Maker (DM). When dealing with multiple criteria ranking problems, typically, this information is a series of holistic and certain judgments having the form of pairwise comparisons of some reference alternatives, indicating that alternative a is certainly either preferred to or indifferent with alternative b. In some decision situations, it might be useful, however, to additionally account for uncertain pairwise comparisons interpreted in the following way: although the preference of a over b is not certain, it is more credible than preference of b over a. To handle certain and uncertain preference information, we propose a new approach that builds a probability distribution over the space of all value functions compatible with the DM's certain holistic judgments. A didactic example shows the applicability of the proposed approach.
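
The two indices named in the abstract above can be estimated from samples as in the following sketch. It assumes the sampling has already been done: `scores` is an invented matrix whose rows are sampled preference models and whose columns are the values they assign to each alternative (higher is better). How the distribution is built from uncertain comparisons is not reproduced here, and the function name `stochastic_indices` is hypothetical.

```python
import numpy as np

def stochastic_indices(scores):
    """Return (rai, pwi) from sampled scores of shape (n_samples, n_alternatives).

    rai[a, r] is the share of samples ranking alternative a at position r (0 = best).
    pwi[a, b] is the share of samples in which a scores strictly higher than b.
    """
    n_alt = scores.shape[1]
    # argsort of argsort turns scores into rank positions within each sample
    ranks = np.argsort(np.argsort(-scores, axis=1), axis=1)
    rai = np.stack([(ranks == r).mean(axis=0) for r in range(n_alt)], axis=1)
    pwi = (scores[:, :, None] > scores[:, None, :]).mean(axis=0)
    return rai, pwi

scores = np.array([
    [0.9, 0.5, 0.1],
    [0.6, 0.7, 0.2],
    [0.8, 0.3, 0.4],
    [0.4, 0.6, 0.5],
])
rai, pwi = stochastic_indices(scores)
print(rai)
print(pwi)
```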

Journal ArticleDOI
TL;DR: This work states that, based on simulation conditions, the Maximum Likelihood (ML) method is better than the Penalized Quasilikelihood (PQL) method for a three-category ordinal outcome variable and, for a five-category ordinal response variable model, the power of the PQL method is slightly higher than the power of the ML method.
Abstract: For most of the time, biomedical researchers have been dealing with ordinal outcome variable in multilevel models where patients are nested in doctors. We can justifiably apply multilevel cumulative logit model, where the outcome variable represents the mild, severe, and extremely severe intensity of diseases like malaria and typhoid in the form of ordered categories. Based on our simulation conditions, Maximum Likelihood (ML) method is better than Penalized Quasilikelihood (PQL) method in three-category ordinal outcome variable. PQL method, however, performs equally well as ML method where five-category ordinal outcome variable is used. Further, to achieve power more than 0.80, at least 50 groups are required for both ML and PQL methods of estimation. It may be pointed out that, for five-category ordinal response variable model, the power of PQL method is slightly higher than the power of ML method.

Journal ArticleDOI
TL;DR: A Bayesian multivariate probit framework to capture the latent disease status leading to a natural clustering of tooth sites and subjects with similar PD status (beyond spatial clustering), and improved parameter estimation through sharing of information is developed.
Abstract: Clinical attachment level (CAL) is regarded as the most popular measure to assess periodontal disease (PD). These probed tooth-site level measures are usually rounded and recorded as whole numbers (in mm) producing clustered (site measures within a mouth) error-prone ordinal responses representing some ordering of the underlying PD progression. In addition, it is hypothesized that PD progression can be spatially-referenced, i.e., proximal tooth-sites share similar PD status in comparison to sites that are distantly located. In this paper, we develop a Bayesian multivariate probit framework for these ordinal responses where the cut-point parameters linking the observed ordinal CAL levels to the latent underlying disease process can be fixed in advance. The latent spatial association characterizing conditional independence under Gaussian graphs is introduced via a nonparametric Bayesian approach motivated by the probit stick-breaking process, where the components of the stick-breaking weights follow a multivariate Gaussian density with the precision matrix distributed as G-Wishart. This yields a computationally simple, yet robust and flexible framework to capture the latent disease status leading to a natural clustering of tooth-sites and subjects with similar PD status (beyond spatial clustering), and improved parameter estimation through sharing of information. Both simulation studies and application to a motivating PD dataset reveal the advantages of considering this flexible nonparametric ordinal framework over other alternatives.

17 Jul 2016
TL;DR: A model-based coclustering algorithm for ordinal data that relies on the latent block model using the BOS model and a SEM-Gibbs algorithm for inference is presented.
Abstract: A model-based coclustering algorithm for ordinal data is presented. This algorithm relies on the latent block model using the BOS model (Biernacki and Jacques, 2015, Stat. Comput.) for ordinal data and a SEM-Gibbs algorithm for inference. Numerical experiments on simulated data illustrate the efficiency of the inference strategy.

Journal ArticleDOI
TL;DR: The authors introduced a concept of inequality comparisons with ordinal bivariate categorical data, where one population is more unequal than another when they have common arithmetic median outcomes and the first can be obtained from the second by correlation-increasing switches and/or median-preserving spreads.
Abstract: This paper introduces a concept of inequality comparisons with ordinal bivariate categorical data. In our model, one population is more unequal than another when they have common arithmetic median outcomes and the first can be obtained from the second by correlation-increasing switches and/or median-preserving spreads. For the canonical 2 × 2 case (with two binary indicators), we derive a simple operational procedure for checking ordinal inequality relations in practice. As an illustration, we apply the model to childhood deprivation in Mozambique.