rates ex ante without any link to the real costs of the individual provider [2]. This
payment system is increasingly being adopted over retrospective systems, as it
encourages cost containment and shares the financial burden with providers. PPS is
widely adopted globally: approximately 70% of all OECD countries and more than 25
low- and middle-income countries have adopted some form of casemix system for
reimbursement purposes [3, 4].
Here, a casemix is a system for defining cohorts of related patients, each comprising
cases that are homogeneous in their resource consumption pattern and, at the same
time, clinically similar. In the NHS, the National Casemix Office (NCO) is commissioned
to develop and maintain a set of casemix groupings called Healthcare Resource Groups
(HRGs). This is a type of PPS in which the payment rate is set to the average patient
cost in each HRG. HRGs are generated using nationally mandated patient-level data,
which primarily include age, complications and comorbidities, diagnoses and
procedures. Adopted in acute care, the groups are generated by transcribing expert
advice into if-else rules, with the aim of capturing differences in patient severity and
length of stay (LOS).
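As a minimal sketch of the mechanism just described, the following Python fragment assigns patients to groups with if-else rules and then sets each group's payment rate to the mean cost of its cases. The group labels, field names, rule thresholds and cost figures are all hypothetical and chosen only for illustration; they are not the NCO's actual grouper logic or real tariff data.

```python
from collections import defaultdict
from statistics import mean


def assign_group(patient):
    """Toy if-else grouper. Rules and thresholds are invented for illustration."""
    if patient["procedure"] == "skin_graft":
        # Hypothetical severity split on age and number of comorbidities.
        if patient["age"] < 16 or patient["comorbidities"] >= 2:
            return "BURN_GRAFT_HIGH"
        return "BURN_GRAFT_LOW"
    return "BURN_NO_GRAFT"


def tariff_by_group(patients):
    """Payment rate per group = average patient cost within that group."""
    costs = defaultdict(list)
    for p in patients:
        costs[assign_group(p)].append(p["cost"])
    return {group: mean(values) for group, values in costs.items()}


# Made-up patient records (costs in arbitrary currency units).
patients = [
    {"age": 45, "comorbidities": 0, "procedure": "skin_graft", "cost": 8000.0},
    {"age": 52, "comorbidities": 1, "procedure": "skin_graft", "cost": 12000.0},
    {"age": 9, "comorbidities": 0, "procedure": "skin_graft", "cost": 20000.0},
    {"age": 70, "comorbidities": 3, "procedure": "skin_graft", "cost": 30000.0},
    {"age": 30, "comorbidities": 0, "procedure": "dressing", "cost": 1500.0},
]

tariffs = tariff_by_group(patients)
for group, rate in sorted(tariffs.items()):
    print(group, rate)
```

Every case in a group is reimbursed at that group's rate, regardless of what the individual case actually cost the provider.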
Any reimbursement methodology based on generalizations across patient groups
(i.e. determining the payment rate as the average cost in each HRG) will be limited in
its ability to work fairly across a variety of settings, and HRGs are no exception.
The use of LOS as an (imperfect) indicator of resource use contributes further to this
weakness; it is known to be unreliable, particularly for surgical patients [5]. Finally,
identifying relevant factors on the basis of expert advice alone carries the risk of
ignoring other unknown (or less well established) factors that may account for the
case complexity of certain patient sub-groups.
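To make the averaging weakness concrete, the toy calculation below compares what two providers would be paid under a single average-cost rate with what their cases actually cost. The provider names and cost figures are hypothetical; the point is only that a provider whose cases within a group are systematically more complex is under-reimbursed, while one with simpler cases is over-reimbursed.

```python
from statistics import mean


def payment_gaps(costs_by_provider):
    """Return the group rate (mean cost over all cases) and, per provider,
    total payment minus total actual cost (negative = shortfall)."""
    all_costs = [c for costs in costs_by_provider.values() for c in costs]
    rate = mean(all_costs)
    gaps = {
        provider: rate * len(costs) - sum(costs)
        for provider, costs in costs_by_provider.items()
    }
    return rate, gaps


# Made-up costs for cases that all fall into the same group.
costs = {
    "general_unit": [4000.0, 5000.0, 6000.0],
    "specialist_unit": [9000.0, 16000.0],
}

rate, gaps = payment_gaps(costs)
print("rate:", rate)
for provider, gap in gaps.items():
    print(provider, gap)
```

Across all providers the gaps sum to zero by construction, so the average-based rate redistributes funding between providers rather than matching it to individual case complexity.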
Our core hypothesis is that in-depth analysis of the available data should be used in
conjunction with expert input to develop an evidence-based model that
comprehensively captures the complexity of care provided by such services and
accurately classifies patients into groups that are homogeneous with respect to costs
and patient characteristics. This dual approach was not previously possible because
extensive patient-level cost data were unavailable, which left grouper design
primarily dependent on expert advice.
Our research aims to provide evidence for this hypothesis. First, we explore the
accuracy of current HRGs in terms of actual resource usage. Second, we describe an
analytical approach to the development of an alternative, data-driven grouper.
Throughout our analysis, we use burn care as a base case. Burn services are selected
as an example of a specialized service that deals with rare and complex conditions
and by necessity operates at high expenditure. Burn services must remain open
regardless of the number of patients admitted, with a minimum number of staff, and
they rely on highly specialist equipment and interventions. We expect that the complex
characteristics of this setting make it particularly sensitive to the impact of
weaknesses in the current HRG classification.
The remainder of this paper is structured as follows. The next section introduces
the data sets used to explore HRGs and generate the data-driven groups. We then
introduce the analysis pipeline adopted, which includes data pre-processing,
dimensionality reduction and the deployment of clustering approaches in two separate steps. In