Proceedings Article

Evaluating the categorisation of the public hospitals in Chile according to case-mix complexity: a genetic algorithm approach

TL;DR: An ad-hoc genetic algorithm is proposed that combines filter feature selection and clustering strategies to determine whether a set of case-mix-related features can reproduce the categorisation proposed by the MINSAL.
Abstract: Healthcare services must provide quality care while safeguarding the efficient use of resources. To evaluate technical efficiency through fair comparisons, hospitals must be grouped according to the type of patients treated, known as the case-mix. Generally, this evaluation is performed using the Diagnosis-Related Groups (DRG) system. Since only a few hospitals in Chile have implemented this system, the analysis of technical efficiency is limited. The Ministry of Health of Chile (MINSAL) has proposed an administrative categorisation of the public hospitals: high, medium and low complexity. However, it has not been studied whether this definition is associated with the case-mix and whether it can be used to study technical efficiency. In this work, we propose an ad-hoc genetic algorithm which combines filter feature selection and clustering strategies to determine whether there is a set of features related to the case-mix that reproduces the categorisation proposed by the MINSAL. The results show that, although a small set of features is able to reach this categorisation by year, there is not enough evidence to establish a relationship with the case-mix. It is recommended that future technical efficiency analyses use new categorisations based on case-mix instead of the MINSAL categorisation.
References
Journal Article
TL;DR: This article proposes over-sampling the minority class by creating synthetic minority class examples; the method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
Abstract: An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class. This paper also shows that a combination of our method of over-sampling the minority class and under-sampling the majority class can achieve better classifier performance (in ROC space) than varying the loss ratios in Ripper or class priors in Naive Bayes. Our method of over-sampling the minority class involves creating synthetic minority class examples. Experiments are performed using C4.5, Ripper and a Naive Bayes classifier. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.

17,313 citations

Journal Article
TL;DR: This article proposes over-sampling the minority class by creating synthetic minority class examples; the method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
Abstract: An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class. This paper also shows that a combination of our method of over-sampling the minority class and under-sampling the majority class can achieve better classifier performance (in ROC space) than varying the loss ratios in Ripper or class priors in Naive Bayes. Our method of over-sampling the minority class involves creating synthetic minority class examples. Experiments are performed using C4.5, Ripper and a Naive Bayes classifier. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.

11,512 citations
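The abstract above describes over-sampling by "creating synthetic minority class examples". A minimal sketch of one common way to read that idea is shown below: each synthetic point is placed at a random position on the segment between a minority example and one of its nearest minority neighbours. The function name `smote_samples`, the parameters `n_new`, `k` and `seed`, and the random choice of base example are assumptions made for illustration; this is a simplified sketch, not the authors' implementation.

```python
import random


def smote_samples(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority examples.

    minority: list of equal-length numeric tuples (at least two points).
    k: number of nearest minority neighbours to choose from (assumed default).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority examples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a base minority example at random (a simplification).
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority examples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Place the new point somewhere on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for p in smote_samples(points, n_new=3, k=2):
        print(p)
```

Because every synthetic point is a convex combination of two existing minority examples, it always lies inside the region already spanned by the minority class.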

Journal Article
TL;DR: This paper presents alternatives to the Rand Index and to the Adjusted Rand Index (the chance-corrected version by Hubert and Arabie); the proposed indices do not assume one set of units for two partitions.
Abstract: Rand (1971) proposed the Rand Index to measure the stability of two partitions of one set of units. Hubert and Arabie (1985) corrected the Rand Index for chance (Adjusted Rand Index). In this paper, we present some alternative indices. The proposed indices do not assume one set of units for two partitions. Here, one set of units can be a subset of the other set of units. Depending on the purpose of the comparison of two partitions, the merging and splitting of clusters in two partitions can have a different impact on the value of the indices. Therefore, we propose different modified Rand Indices.

2,417 citations
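For reference, the chance-corrected index attributed to Hubert and Arabie (1985) in the abstract above can be computed from pair counts of the contingency table. The sketch below uses the standard formulation (index minus expected index, divided by maximum index minus expected index); the function name `adjusted_rand_index` and the convention of returning 1.0 in degenerate cases are assumptions, and it does not implement the modified indices proposed in the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two partitions of the same units."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same units")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # assumed convention for fewer than two units
    # Pairs placed together in both partitions (contingency-table cells).
    together_both = sum(
        comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values()
    )
    # Pairs placed together in each partition separately (table margins).
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # assumed convention when both partitions are trivial
    return (together_both - expected) / (maximum - expected)


if __name__ == "__main__":
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
    print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

The index depends only on which units are grouped together, so renaming the cluster labels does not change its value.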


"Evaluating the categorisation of th..." refers methods in this paper

  • ...2) Adjusted Rand index: it is an improved version of the Rand index proposed by [36]....


Journal Article
TL;DR: A measure of similarity between two hierarchical clusterings, B_k, is derived from the matching matrix, [m_ij], formed by cutting the two hierarchical trees and counting the number of matching entries in the k clusters in each tree.
Abstract: This article concerns the derivation and use of a measure of similarity between two hierarchical clusterings. The measure, B_k, is derived from the matching matrix, [m_ij], formed by cutting the two hierarchical trees and counting the number of matching entries in the k clusters in each tree. The mean and variance of B_k are determined under the assumption that the margins of [m_ij] are fixed. Thus, B_k represents a collection of measures for k = 2, …, n – 1. (k, B_k) plots are found to be useful in portraying the similarity of two clusterings. B_k is compared to other measures of similarity proposed respectively by Baker (1974) and Rand (1971). The use of (k, B_k) plots for studying clustering methods is explored by a series of Monte Carlo sampling experiments. An example of the use of (k, B_k) on real data is given.

1,376 citations
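As an illustration of the measure described in the abstract above, the sketch below computes the index for a single pair of flat partitions using the widely used pair-counting form: the number of pairs grouped together in both partitions, divided by the geometric mean of the numbers of pairs grouped together in each. The function name `fowlkes_mallows` and the choice to return 0.0 when either partition groups no pairs are assumptions; the notation may differ from the definition used in the citing paper.

```python
from collections import Counter
from math import comb, sqrt


def fowlkes_mallows(labels_a, labels_b):
    """Fowlkes-Mallows similarity between two partitions of the same units."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same units")
    # Pairs grouped together in both partitions (matching-matrix cells).
    together_both = sum(
        comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values()
    )
    # Pairs grouped together in each partition (matching-matrix margins).
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    if together_a == 0 or together_b == 0:
        return 0.0  # assumed convention: no grouped pairs to compare
    return together_both / sqrt(together_a * together_b)


if __name__ == "__main__":
    print(fowlkes_mallows([0, 0, 1, 1], [1, 1, 0, 0]))
    print(fowlkes_mallows([0, 0, 0, 1], [0, 0, 1, 1]))
```

Unlike the Adjusted Rand Index, this value is not corrected for chance and ranges from 0 to 1.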


"Evaluating the categorisation of th..." refers background in this paper

  • ...3) Fowlkes-Mallows index: this index was proposed by [37] and it is defined as:...


Journal Article
TL;DR: The rationale underlying the iterated racing procedures in irace is described and a number of recent extensions are introduced, including a restart mechanism to avoid premature convergence, the use of truncated sampling distributions to correctly handle parameter bounds, and an elitist racing procedure for ensuring that the best configurations returned are also those evaluated in the highest number of training instances.

1,280 citations