
Ordinal regression

About: Ordinal regression is a research topic. Over its lifetime, 1,879 publications have been published within this topic, receiving 65,431 citations.


Papers
Journal Article
TL;DR: This paper investigates the data mining support available from ordinal data and illustrates how accounting for ordinal dependencies in a data set affects the results of constructing decision trees and induction rules.
Abstract: Many classification tasks can be viewed as ordinal. Use of numeric information usually provides possibilities for more powerful analysis than ordinal data. On the other hand, ordinal data allows more powerful analysis when compared to nominal data. It is therefore important not to overlook knowledge about ordinal dependencies in data sets used in data mining. This paper investigates data mining support available from ordinal data. The effect of considering ordinal dependencies in the data set on the overall results of constructing decision trees and induction rules is illustrated. The degree of improved prediction of ordinal over nominal data is demonstrated. When data was very representative and consistent, use of ordinal information reduced the number of final rules with a lower error rate. Data treatment alternatives are presented to deal with data sets having greater imperfections.

32 citations

Journal Article
TL;DR: A multidimensional isotonic regression-based estimator of target quantiles on an ordinal toxicity scale is compared with the parametric maximum likelihood estimator from a proportional odds model on extremely small data sets, under several non-parametric sequential designs; the isotonic estimator far exceeds the others in accuracy and efficiency.
Abstract: A non-parametric multi-dimensional isotonic regression estimator is developed for use in estimating a set of target quantiles from an ordinal toxicity scale. We compare this estimator to the standard parametric maximum likelihood estimator from a proportional odds model for extremely small data sets. A motivating example is from phase I oncology clinical trials, where various non-parametric designs have been proposed that lead to very small data sets, often with ordinal toxicity response data. Our comparison of estimators is performed in conjunction with three of these non-parametric sequential designs for ordinal response data, two from the literature and a new design based on a random walk rule. We also compare with a non-parametric design for binary response trials, by keeping track of ordinal data for estimation purposes, but dichotomizing the data in the design phase. We find that a multidimensional isotonic regression-based estimator far exceeds the others in terms of accuracy and efficiency. A rule by Simon et al. (J. Natl. Cancer Inst. 1997; 89:1138-1147) yields particularly efficient estimators, more so than the random walk rule, but has higher numbers of dose-limiting toxicity. A small data set from a leukemia clinical trial is analysed using our multidimensional isotonic regression-based estimator.

31 citations
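
The abstract above relies on isotonic (monotone) regression. As a rough illustration only, the following is a minimal sketch of the one-dimensional pool-adjacent-violators algorithm, not the multidimensional estimator the paper develops. The function name `pava`, the dose-level toxicity rates, and the patient counts are all hypothetical and chosen purely for illustration.

```python
def pava(values, weights=None):
    """Pool-adjacent-violators: weighted least-squares non-decreasing fit.

    One-dimensional illustration only; the paper's estimator is
    multidimensional and is NOT reproduced here.
    """
    if weights is None:
        weights = [1.0] * len(values)
    # Each block stores [weighted mean, total weight, number of pooled points].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while adjacent blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand the pooled blocks back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical observed toxicity rates at five increasing dose levels,
# with a hypothetical number of patients per dose used as weights.
observed = [0.10, 0.30, 0.20, 0.50, 0.40]
patients = [3, 3, 3, 3, 3]
fitted = pava(observed, patients)
print(fitted)
```

With these made-up inputs, the two out-of-order pairs are pooled to their weighted means, so the fitted rates never decrease as the dose increases.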

Journal Article
TL;DR: This work develops a penalized latent class model to facilitate analysis of high-dimensional ordinal data; by stabilizing maximum likelihood estimation, it can fit an ordinal latent class model that would otherwise not be identifiable without strict constraints.
Abstract: Latent class models provide a useful framework for clustering observations based on several features. Application of latent class methodology to correlated, high-dimensional ordinal data poses many challenges. Unconstrained analyses may not result in an estimable model. Thus, information contained in ordinal variables may not be fully exploited by researchers. We develop a penalized latent class model to facilitate analysis of high-dimensional ordinal data. By stabilizing maximum likelihood estimation, we are able to fit an ordinal latent class model that would otherwise not be identifiable without application of strict constraints. We illustrate our methodology in a study of schwannoma, a peripheral nerve sheath tumor, that included 3 clinical subtypes and 23 ordinal histological measures.

31 citations

Journal Article
TL;DR: This work proposes a classifier based on a deep convolutional neural network that outperforms average human interrater as well as intrarater reliability and surpasses state-of-the-art machine learning solutions for automatically grading disc degeneration.
Abstract:
OBJECTIVES: Although magnetic resonance imaging-based formalized grading schemes for intervertebral disc degeneration offer improved reproducibility compared with purely subjective ratings, their intrarater and interrater reliability are not nearly good enough to detect small to medium effects in clinical longitudinal studies. The aim of this study thus was to develop a method that enables automatic and therefore reproducible and reliable evaluation of disc degeneration based on conventional clinical image data and Pfirrmann's grading scheme.
MATERIALS AND METHODS: We propose a classifier based on a deep convolutional neural network that we trained on a large, manually evaluated data set of 1599 patients (7948 intervertebral discs). To improve upon the status quo, we focused on the quality of the training data and performed extensive hyperparameter optimization. We assessed the potential benefits of optimizing loss functions beyond common cross-entropy loss, such as soft kappa loss, ordinal cross-entropy loss, or regression losses. We furthermore experimented with ways to mitigate class imbalance by pooling classes or using class-weighted loss functions. During model development and hyperparameter optimization, we used a fixed 90%/10% training/validation set split. To estimate real-world prediction performance, we performed 10-fold cross-validation.
RESULTS: The evaluated image data results in a Gaussian degeneration grade distribution, and thus grades 1 and 5 are slightly underrepresented in the training set. Our default cross-entropy-based classifier achieves a reliability of κ = 0.92 (Cohen κ), an average sensitivity of 90.2%, and an average precision of 92.5%. In 99.2% of validation cases, the network's prediction deviates by at most 1 Pfirrmann grade from the ground truth. Framed as an ordinal regression problem, the mean absolute error between the ground truth and the prediction is 0.08 Pfirrmann grade with a correlation of r = 0.96. The results of the 10-fold cross-validation confirm those performance estimates, indicating no substantial overfitting. More sophisticated loss functions, class-based loss weighting, or class pooling did not lead to improved classification performance overall.
CONCLUSIONS: With a reliability of κ > 0.9, our system clearly outperforms average human interrater as well as intrarater reliability. With an average sensitivity of more than 90%, our classifier also surpasses state-of-the-art machine learning solutions for automatically grading disc degeneration.

31 citations
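
The abstract above reports Cohen's κ, the mean absolute error in grades, and the share of predictions within one grade of the ground truth. The sketch below shows how such figures could be computed for ordinal labels. It is an assumption-laden illustration: the abstract does not say whether κ was weighted, so the unweighted form is used here, and the `truth` and `pred` grade lists are invented toy data, not results from the study.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa (assumption: the paper's variant is unspecified)."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the product of the marginal label frequencies.
    expected = sum(true_counts[k] * pred_counts[k] for k in true_counts) / (n * n)
    return (observed - expected) / (1.0 - expected)


def mean_absolute_error(y_true, y_pred):
    """Average absolute distance between true and predicted grades."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def within_one_grade(y_true, y_pred):
    """Fraction of predictions that deviate by at most one grade."""
    return sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical grades on a 1-5 scale (toy data, not from the study).
truth = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
pred = [1, 2, 3, 3, 3, 3, 4, 4, 5, 3]

kappa = cohen_kappa(truth, pred)
mae = mean_absolute_error(truth, pred)
close = within_one_grade(truth, pred)
print(round(kappa, 3), mae, close)
```

The mean absolute error and the within-one-grade fraction use the ordering of the grades, whereas unweighted κ treats every disagreement as equally severe.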

Journal Article
TL;DR: Motivated by an application in credit risk, where multiple credit rating agencies assess the creditworthiness of a firm on an ordinal scale, multivariate ordinal regression models with a latent variable specification and correlated error terms are considered.
Abstract: Correlated ordinal data typically arises from multiple measurements on a collection of subjects. Motivated by an application in credit risk, where multiple credit rating agencies assess the creditworthiness of a firm on an ordinal scale, we consider multivariate ordinal regression models with a latent variable specification and correlated error terms. Two different link functions are employed, by assuming a multivariate normal and a multivariate logistic distribution for the latent variables underlying the ordinal outcomes. Composite likelihood methods, more specifically the pairwise and tripletwise likelihood approach, are applied for estimating the model parameters. Using simulated data sets with varying number of subjects, we investigate the performance of the pairwise likelihood estimates and find them to be robust for both link functions and reasonable sample size. The empirical application consists of an analysis of corporate credit ratings from the big three credit rating agencies (Standard & Poor’s, Moody’s and Fitch). Firm-level and stock price data for publicly traded US firms as well as an unbalanced panel of issuer credit ratings are collected and analyzed to illustrate the proposed framework.

31 citations
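
The abstract above describes ordinal regression through a latent variable with normal or logistic errors. As a simplified illustration, the sketch below computes category probabilities for a single ordinal outcome under each link; it does not cover the multivariate model, correlated errors, or composite likelihood estimation from the paper. The threshold values, the linear predictor `eta`, and all function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal CDF (probit link)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logistic_cdf(x):
    """Standard logistic CDF (logit link)."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probs(eta, thresholds, cdf):
    """P(Y = k) = F(theta_k - eta) - F(theta_{k-1} - eta).

    The latent variable is eta + error; the outcome falls in category k
    when the latent variable lies between consecutive thresholds.
    The outer thresholds are implicitly -inf and +inf.
    """
    cumulative = [0.0] + [cdf(t - eta) for t in thresholds] + [1.0]
    return [cumulative[k + 1] - cumulative[k] for k in range(len(cumulative) - 1)]


# Hypothetical increasing thresholds for a four-category rating scale
# and a hypothetical linear predictor value for one firm.
thresholds = [-1.0, 0.0, 1.5]
eta = 0.4

probit_probs = category_probs(eta, thresholds, normal_cdf)
logit_probs = category_probs(eta, thresholds, logistic_cdf)
print(probit_probs)
print(logit_probs)
```

Both links use the same threshold structure and differ only in the assumed error distribution, which is why swapping the CDF argument is enough to move between them in this sketch.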


Network Information
Related Topics (5)
Regression analysis: 31K papers, 1.7M citations, 84% related
Linear regression: 21.3K papers, 1.2M citations, 79% related
Inference: 36.8K papers, 1.3M citations, 78% related
Empirical research: 51.3K papers, 1.9M citations, 78% related
Social media: 76K papers, 1.1M citations, 77% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    102
2022    191
2021    88
2020    93
2019    79
2018    73