A Clustering-Based Patient Grouper for Burn Care

Chimdimma Noelyn Onah¹, Richard Allmendinger¹, Julia Handl¹, Paraskevas Yiapanis, Kenneth W. Dunn

University of Manchester¹

14 Nov 2019, pp. 123-131

TL;DR: The authors argue that a data-driven approach minimises bias in feature selection, and demonstrate a reduction of within-cluster cost variation in the identified patient groups compared with the original casemix.

Abstract: Patient casemix is a system of defining groups of patients. For reimbursement purposes, these groups should be clinically meaningful and share similar resource usage during their hospital stay. In the UK National Health Service (NHS) these groups are known as health resource groups (HRGs), and are predominantly derived based on expert advice and checked for homogeneity afterwards, typically using length of stay (LOS) to assess similarity in resource consumption. LOS does not fully capture the actual resource usage of patients, and assurances on the accuracy of HRG as a basis of payment rate derivation are therefore difficult to give. Also, with complex patient groups such as those encountered in burn care, expert advice will often reflect average patients only, therefore not capturing the complexity and severity of many patients’ injury profile. The data-driven development of a grouper may support the identification of features and segments that more accurately account for patient complexity and resource use. In this paper, we describe the development of such a grouper using established techniques for dimensionality reduction and cluster analysis. We argue that a data-driven approach minimises bias in feature selection. Using a registry of patients from 23 burn services in England and Wales, we demonstrate a reduction of within cluster cost-variation in the identified groups, when compared to the original casemix.

Summary (2 min read)

Introduction

Patient casemix is a system of defining groups of patients. For reimbursement purposes, these groups should be clinically meaningful and share similar resource usage during their hospital stay.
In the UK National Health Service (NHS) these groups are known as health resource groups (HRGs), and are predominantly derived based on expert advice and checked for homogeneity afterwards, typically using length of stay (LOS) to assess similarity in resource consumption.
Also, with complex patient groups such as those encountered in burn care, expert advice will often reflect average patients only, therefore not capturing the complexity and severity of many patients’ injury profile.
The data-driven development of a grouper may support the identification of features and segments that more accurately account for patient complexity and resource use.
The authors argue that a data-driven approach minimises bias in feature selection.

1 Motivation

The NHS serves a wide population with varied demographic and medical histories, with the aim of providing health interventions to the population who need them.
In contrast, prospective payment systems (PPSs) determine the provider's payment rates ex ante without any link to the real costs of the individual provider [2].
HRGs are generated using nationally mandated patient-level data, which primarily includes age, complications and comorbidities, diagnosis and procedures.
The authors' core hypothesis is that in-depth analysis of the available data should be used in conjunction with expert input to develop an evidence-based model that comprehensively captures the complexity of care provided by such services, and accurately classifies patients into homogeneous groups with respect to costs and patient characteristics.
Burn services are to be open regardless of the number of patients admitted, with a minimum number of staff, and they rely on the use of highly specialist equipment and interventions.

2.1 Data

This study uses comprehensive anonymized patient-level data that is nationally mandated for all burn units in England and Wales.
This includes features such as demographic characteristics (age, gender), burn characteristics (depth, total burn surface area, burn site, locality, type, source, category and injury group), pre-existing conditions (self-harm, alcohol usage, asthma, clotting disorder etc.), time from injury to admission, patient-level cost, LOS and index of multiple deprivation (IMD).
To highlight current variation in HRGs and as a benchmark for model performance, the authors use the 2017/18 average patient-level cost by HRG open data released by NHS Improvement.
This is limited to one year, as PLICS was only introduced in the 2017/18 data collection cycle.

2.2 Analysis Pipeline

Step 1 selects the relevant features and cases.
Linear discriminant analysis (LDA), a supervised approach to dimensionality reduction, is adopted.
The target feature is then generated using the k-means clustering algorithm (k = 38, the same as the number of HRGs) to partition the two-dimensional target space defined by adjusted LOS and patient-level cost.
The current grouper splits the data into young patients (<16 years old) and older patients (>=16 years old).
This reflects the burn care pathway, designed to treat pediatrics separately from adults as young age is identified as a significant complicator.

3 Results and Analysis

The authors explore the patient-level cost by HRG, as generated by the National Casemix office.
The wider the boxplot, the more variable are the costs within that group.
Clusters Adult3 and Adult12 have very similar average age, but Adult3 has more severe burns (higher TBSA), higher LOS and higher cost, which justifies keeping them as separate groups.
Child5 and Child10 have similar adjusted LOS, but Child5 has a higher TBSA and a higher score for the severity of existing disorders, and thus a higher average patient-level cost.
These results highlight the effectiveness of the data-driven HAC grouper in generating groups with homogeneous patient characteristics.
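
As a rough illustration of how within-cluster cost variation might be compared between two groupings, the sketch below computes a coefficient of variation per group. The choice of the coefficient of variation, the function name `within_group_cv` and the toy cost values are all assumptions made for illustration; the text above does not specify the exact measure the authors use.

```python
from collections import defaultdict
from statistics import mean, pstdev


def within_group_cv(costs, labels):
    """Return the coefficient of variation of cost within each group.

    Assumption: variation is measured as population std / mean per group.
    """
    by_group = defaultdict(list)
    for cost, label in zip(costs, labels):
        by_group[label].append(cost)
    return {group: pstdev(values) / mean(values) for group, values in by_group.items()}


# Toy patient-level costs (hypothetical values, not registry data).
costs = [100.0, 120.0, 900.0, 1100.0, 110.0, 1000.0]
coarse = ["A", "A", "A", "A", "A", "A"]               # one broad group
fine = ["low", "low", "high", "high", "low", "high"]  # two narrower groups

cv_coarse = within_group_cv(costs, coarse)
cv_fine = within_group_cv(costs, fine)
```

A grouping whose per-group values are smaller has more homogeneous costs within its groups, which is the property the comparison of HRG and HAC groups is concerned with.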

Figures (7)

Fig. 3. Factor loadings on 1st linear discriminant: Child vs Adult Segment

Fig. 2. 2017/18 HRG by patient-level cost. Ordered in decreasing order of injury complexity

Fig. 4. Identified groups by patient-level cost ordered by decreasing average cost

Table 2. A sample* of HAC groups by patient characteristics (average).

Fig. 5. Within cluster variation of patient-level cost in HRG vs HAC

The University of Manchester Research

DOI: 10.1007/978-3-030-33617-2_14

Document Version: Accepted author manuscript

Citation for published version (APA):

Onah, C., Allmendinger, R., Handl, J., Yiapanis, P., & Dunn, K. W. (2019). A Clustering-Based Patient Grouper for Burn Care. In Intelligent Data Engineering and Automated Learning - IDEAL 2019 https://doi.org/10.1007/978-3-030-33617-2_14

Published in: Intelligent Data Engineering and Automated Learning - IDEAL 2019

A Clustering-Based Patient Grouper for Burn Care

Chimdimma Noelyn Onah, Richard Allmendinger, Julia Handl, Paraskevas Yiapanis, Ken W. Dunn

University of Manchester

Medical Data Solutions and Services

University Hospital South Manchester

Abstract. Patient casemix is a system of defining groups of patients. For reimbursement purposes, these groups should be clinically meaningful and share similar resource usage during their hospital stay. In the UK National Health Service (NHS) these groups are known as health resource groups (HRGs), and are predominantly derived based on expert advice and checked for homogeneity afterwards, typically using length of stay (LOS) to assess similarity in resource consumption. LOS does not fully capture the actual resource usage of patients, and assurances on the accuracy of HRG as a basis of payment rate derivation are therefore difficult to give. Also, with complex patient groups such as those encountered in burn care, expert advice will often reflect average patients only, therefore not capturing the complexity and severity of many patients’ injury profile. The data-driven development of a grouper may support the identification of features and segments that more accurately account for patient complexity and resource use. In this paper, we describe the development of such a grouper using established techniques for dimensionality reduction and cluster analysis. We argue that a data-driven approach minimises bias in feature selection. Using a registry of patients from 23 burn services in England and Wales, we demonstrate a reduction of within cluster cost-variation in the identified groups, when compared to the original casemix.

Keywords: Patient Casemix, Clustering, Data Driven.

1 Motivation

The NHS serves a wide population with varied demographic and medical histories, with the aim of providing health interventions to the population who need them. The provision and maintenance of these interventions is constrained by scarce resources and cost containment [1]. The pressure from binding budget constraints, and thus the need to control costs, has induced a shift in favor of prospective payments over retrospective payment systems.

The use of a patient-level payment system transfers the entire cost burden to the payer, since the reimbursement is based on the real costs. In the context of such a system, even profit-maximizing providers may be insufficiently motivated to decrease costs. In contrast, prospective payment systems (PPSs) determine the provider's payment rates ex ante without any link to the real costs of the individual provider [2]. This payment system is increasingly being adopted over retrospective systems, as it encourages cost containment and a shared burden with the providers. There is wide adoption of PPS globally, with approximately 70% of all OECD countries and more than 25 low- and middle-income countries having adopted some sort of casemix system for reimbursement purposes [3, 4].

Here, a casemix is a system of defining cohorts of related patients, which comprise cases that are homogeneous by resource consumption pattern and, at the same time, clinically similar. In the NHS, the National Casemix Office (NCO) is commissioned to develop and maintain a set of casemix groupings, called health resource groups (HRGs). This is a type of PPS where the payment rate is determined as the average patient cost in each HRG. HRGs are generated using nationally mandated patient-level data, which primarily includes age, complications and comorbidities, diagnosis and procedures. Adopted in acute care, the groups are generated by transcribing expert advice into if-else rules, with the aim of capturing differing patient severity and length of stay (LOS).

Any reimbursement methodology based on generalizations across patient groups (i.e. determining the payment rate as an average of cost in each HRG) will have weaknesses regarding its ability to work fairly across a variety of settings, and HRGs are no exception to this. The use of LOS as an (imperfect) indicator of resource use contributes further to this weakness: it is known to be unreliable, particularly for surgical patients [5]. Finally, the identification of relevant factors based on expert advice alone carries the risk of ignoring other unknown (or less well established) factors that may account for the case complexity of certain patient sub-groups.

Our core hypothesis is that in-depth analysis of the available data should be used in conjunction with expert input to develop an evidence-based model that comprehensively captures the complexity of care provided by such services, and accurately classifies patients into homogeneous groups with respect to costs and patient characteristics. This dual approach was previously not possible due to a lack of availability of extensive patient-level cost data, and the resulting primary dependence on expert advice.

Our research aims to provide evidence for this hypothesis. First, we explore the accuracy of current HRGs in terms of actual resource usage. Second, we describe an analytical approach to the development of an alternative, data-driven grouper. Throughout our analysis, we use burn care as a base case. Burn services are selected as an example of a specialized service, which deals with rare and complex conditions and by necessity operates at high expenditure. Burn services are to be open regardless of the number of patients admitted, with a minimum number of staff, and they rely on the use of highly specialist equipment and interventions. We expect that the complex characteristics of this setting make them particularly sensitive to the impact of weaknesses in the current HRG classification.

The remainder of this paper is structured as follows. The next section introduces the data sets used to explore HRGs and generate the data-driven groups. We then introduce the analysis pipeline adopted, which includes data pre-processing, dimensionality reduction and the deployment of clustering approaches in two separate steps. In Section 3, we discuss the results, using visualizations and within cluster variation of costs to identify improvements. The final section includes a conclusion and discussion of future work.

2 Methodology

2.1 Data

This study uses comprehensive anonymized patient-level data that is nationally mandated for all burn units in England and Wales. The data covers a time period from 2003 to 2019 and captures 164 features for just over 100,000 patients. This includes features such as demographic characteristics (age, gender), burn characteristics (depth, total burn surface area, burn site, locality, type, source, category and injury group), pre-existing conditions (self-harm, alcohol usage, asthma, clotting disorder etc.), time from injury to admission, patient-level cost, LOS and index of multiple deprivation (IMD).

To highlight current variation in HRGs and as a benchmark for model performance, we use the 2017/18 average patient-level cost by HRG open data released by NHS Improvement. This is limited to one year, as PLICS was only introduced in the 2017/18 data collection cycle. This data is at the burn service level and so represents the average patient-level cost in each service.

2.2 Analysis Pipeline

Step 1: Selecting relevant features and cases. To ensure the use of quality features that reflect the clinical and cost differences of patients, the features selected for clustering were those identified as statistically significant in predicting patient-level cost and patient outcome. Cost prediction accuracy was improved with the removal of non-survivors, whose LOS and cost are lower than those of survivors with similar burn characteristics. Thus, in line with the current grouper, the following analysis focuses on survival cases only. All cases with missing data were deleted, leaving just over 80,000 cases and 24 features after feature selection. Table 1 lists these features.

Table 1. Selected Features

Feature type (count) | Feature
Demographic (3) | Gender, Age, Index of Multiple Deprivation (IMD)
Burn characteristics (17) | Total burn surface area (TBSA); Presence of inhalation; Site of burn (leg; upper limb (UL); torso and thorax; face, hands, feet and perineum (FHPP); head and hand (HH); face, hands and feet (FHF)); Type of injury (contact, cold, flame, electrical, scald, chemical, friction, flash, radiation)
Comorbidity (2) | Number of existing disorders, significance of existing disorder
Cost features (2) | Adjusted LOS, Patient-level cost

We implement further dimensionality reduction to reduce noise, data complexity and redundancy. Dimensionality reduction also helps reduce processing time and mitigates the curse of dimensionality [6]. Linear discriminant analysis (LDA), a supervised approach to dimensionality reduction, is adopted. Here, this method is preferred over unsupervised dimension reduction models such as principal component analysis (PCA), as we wish to identify components that maximise cost separation rather than percent of variance alone.

Step 2: Deriving target feature for LDA. We derive a set of target classes for the LDA using a cluster analysis on multiple cost features, to reduce sensitivity to a single cost measure. This is achieved by using two cost features, patient-level cost and adjusted LOS, as the target space. The target feature is then generated using the k-means clustering algorithm (k = 38, the same as the number of HRGs) to partition the two-dimensional target space defined by adjusted LOS and patient-level cost.
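
The target construction in Step 2 can be sketched as follows. This is a minimal illustration using scikit-learn's `KMeans` on synthetic data, not the authors' implementation: the standardisation step, the random seed, the function name `derive_cost_classes` and the synthetic LOS and cost values are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def derive_cost_classes(adjusted_los, patient_cost, k=38, seed=0):
    """Partition the 2-D cost space (adjusted LOS, patient-level cost) into k classes."""
    cost_space = np.column_stack([adjusted_los, patient_cost])
    # Assumption: standardise both axes so that cost does not dominate LOS.
    scaled = StandardScaler().fit_transform(cost_space)
    model = KMeans(n_clusters=k, n_init=10, random_state=seed)
    return model.fit_predict(scaled)


# Synthetic stand-in for the registry (hypothetical values, not real data).
rng = np.random.default_rng(0)
los = rng.gamma(shape=2.0, scale=5.0, size=2000)
cost = 500.0 * los + rng.normal(0.0, 300.0, size=2000)

# One class label per patient, later used as the supervised target for LDA.
target = derive_cost_classes(los, cost)
```

Clustering on both cost features at once is what lets the resulting labels reflect LOS and patient-level cost jointly, rather than either measure alone.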

Step 3: Segmentation by age. The current grouper splits the data into young patients (<16 years old) and older patients (>=16 years old). This reflects the burn care pathway, designed to treat pediatrics separately from adults as young age is identified as a significant complicator. The 2001 National Burn Care Review Report [8] highlights the unpredictable complication of seemingly simple burn injuries, especially for pediatric patients. It argues for and mandates separate burn units for children and adults, due to the particular needs of children such as play specialists, teachers, family counselors and intensive psychosocial support. In line with the current grouper, we therefore further split the data by age group.

Step 4: Dimensionality reduction using LDA. The comorbidity details, demographic and burn characteristics listed in Table 1 are used as the input features for the LDA. We retain the first two LDA components. Therefore, the output of this analysis is a projection from the original feature space into a two-dimensional manifold spanned by orthogonal components that maximise separation by the target feature constructed in Step 2. This is done on each segment derived in Step 3.
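
A hedged sketch of Steps 3 and 4 together: split the cases at age 16 and fit a two-component LDA on each segment. It uses scikit-learn's `LinearDiscriminantAnalysis` with default settings; the function name `project_segment`, the synthetic feature matrix, and the use of 5 target classes (instead of the 38 from Step 2) are assumptions chosen to keep the example small.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis


def project_segment(features, target_classes, n_components=2):
    """Fit LDA on one age segment and return (projection, fitted model)."""
    lda = LinearDiscriminantAnalysis(n_components=n_components)
    projection = lda.fit_transform(features, target_classes)
    return projection, lda


# Synthetic stand-in: 22 input features (the 24 of Table 1 minus the 2 cost features).
rng = np.random.default_rng(1)
n_patients, n_features, n_classes = 600, 22, 5
y = rng.integers(0, n_classes, size=n_patients)       # hypothetical Step 2 labels
X = rng.normal(size=(n_patients, n_features)) + 0.5 * y[:, None]
age = rng.integers(0, 90, size=n_patients)

# Step 3: segment by age, then Step 4: project each segment separately.
segments = {"child": age < 16, "adult": age >= 16}
projections = {}
for name, mask in segments.items():
    projections[name], _ = project_segment(X[mask], y[mask])
```

Note that LDA can return at most one fewer component than the number of target classes present in a segment, so retaining two components requires at least three classes per segment.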

Step 5: Segmentation into homogeneous patient groups. With these pre-processing and dimensionality reduction steps completed, an unsupervised clustering method is deployed to derive homogeneous patient groups. This paper uses an unsupervised clustering method, as we assume that the true classes of patients are unknown. The use of a supervised method, for example using cost labels, may create groups that are homogeneous in terms of cost only, and would therefore not meet the clinical relevance criterion.

In particular, we deploy an agglomerative hierarchical clustering (HAC) algorithm, using the LDA components generated on each age segment (<16 years old and >=16 years old) as input data, to generate 13 and 25 patient groups respectively. The group numbers reflect the number of segments generated by the current grouper, to facilitate comparison.
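
The final grouping step might look like the sketch below, which applies scikit-learn's `AgglomerativeClustering` to two-dimensional inputs and requests 13 child groups and 25 adult groups, as stated above. The Ward linkage, the function name `group_patients` and the random stand-in components are assumptions; the text does not state which linkage criterion the authors use.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering


def group_patients(lda_components, n_groups, linkage="ward"):
    """Cut an agglomerative hierarchy into n_groups patient groups."""
    # Assumption: Ward linkage; other linkage criteria are equally plausible.
    model = AgglomerativeClustering(n_clusters=n_groups, linkage=linkage)
    return model.fit_predict(lda_components)


# Random stand-ins for the two LDA components of each age segment.
rng = np.random.default_rng(2)
child_components = rng.normal(size=(300, 2))
adult_components = rng.normal(size=(700, 2))

child_groups = group_patients(child_components, n_groups=13)
adult_groups = group_patients(adult_components, n_groups=25)
```

Because cost labels enter only indirectly, through the LDA projection, the groups are formed on patient characteristics rather than on cost alone.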

Frequently Asked Questions (2)

Q1. What have the authors stated for future works in "A clustering-based patient grouper for burn care" ?

The collection of patient-level cost, at a national scale, has created the possibility of generating improved data-driven groups. Future work will be aimed at exploring changes to their analytical model, including the consideration of different approaches to dimensionality reduction and cluster analysis, as well as the inclusion of expert opinion in feature selection and group validation. The authors have been able to highlight that improvements can be made in identifying a patient casemix suitable for payment rate derivation. There could be further reduction in within-cluster variance with the use of state-of-the-art clustering algorithms that simultaneously consider Steps 2, 3 and 4 of their analysis.

Q2. What are the contributions in "A clustering-based patient grouper for burn care" ?

In this paper, the authors describe the development of such a grouper using established techniques for dimensionality reduction and cluster analysis. Using a registry of patients from 23 burn services in England and Wales, the authors demonstrate a reduction of within cluster cost-variation in the identified groups, when compared to the original casemix.
