# A Bayesian Optimisation Algorithm for the Nurse Scheduling Problem

TL;DR: In this paper, a Bayesian optimization algorithm for nurse scheduling is presented, which involves choosing a suitable scheduling rule from a set for each nurse's assignment, and the conditional probability of each variable in the network is computed according to an initial set of promising solutions.

Abstract: A Bayesian optimization algorithm for the nurse scheduling problem is presented, which involves choosing a suitable scheduling rule from a set for each nurse’s assignment. Unlike our previous work that used GAs to implement implicit learning, the learning in the proposed algorithm is explicit, i.e. eventually, we will be able to identify and mix building blocks directly. The Bayesian optimization algorithm is applied to implement such explicit learning by building a Bayesian network of the joint distribution of solutions. The conditional probability of each variable in the network is computed according to an initial set of promising solutions. Subsequently, each new instance for each variable is generated by using the corresponding conditional probabilities, until all variables have been generated, i.e. in our case, a new rule string has been obtained. Another set of rule strings will be generated in this way, some of which will replace previous strings based on fitness selection. If stopping conditions are not met, the conditional probabilities for all nodes in the Bayesian network are updated again using the current set of promising rule strings. Computational results from 52 real data instances demonstrate the success of this approach. It is also suggested that the learning mechanism in the proposed approach might be suitable for other scheduling problems.

## Summary (3 min read)

Jump to: [1 Introduction] – [2.1 General Problem] – [2.2 Integer Programming] – [3 Graphical Models and Bayesian Networks] – [4.1 The Construction of a Bayesian Network] – [4.2 Learning based on the Bayesian Network] – [4.3 A Bayesian Optimization Algorithm] – [4.4 Four Building Rules] – [4.5 Fitness Function] – [5 Computational Results] – [5.1 Details of Algorithms] – [5.2 Analysis of Results] and [6 Conclusions]

### 1 Introduction

- After much practice, the scheduler gradually masters the knowledge of which solution parts go well with others.
- As a model of the selected strings, a Bayesian network (Pearl 1998) is used in the proposed Bayesian optimization algorithm to solve the nurse scheduling problem.
- The conditional probabilities are computed according to an initial set of promising solutions.
- Subsequently, each new instance for each node is generated by using the corresponding conditional probabilities, until values for all nodes have been generated, i.e. a new rule string has been generated.

### 2.1 General Problem

- These schedules have to satisfy working contracts and meet the demand for a given number of nurses of different grades on each shift, while being seen to be fair by the staff concerned.
- Thus scheduling the different grades independently is not possible.
- Furthermore, the problem has a special day-night structure as most of the nurses are contracted to work either days or nights in one week but not both.
- The latter two characteristics make this problem challenging for any local search algorithm, because finding and maintaining feasible solutions is extremely difficult.
- For each nurse i and each shift pattern j all the information concerning the desirability of the pattern for this nurse is captured in a single numeric preference cost pij.

### 2.2 Integer Programming

- Note that the definition of qis is such that higher graded nurses can substitute those at lower grades if necessary.
- Thus, the Integer Programming formulation has about 12000 binary variables and 100 constraints.
- Some problem cases remain unsolved after overnight computation using professional software.

### 3 Graphical Models and Bayesian Networks

- The authors introduce concepts from graphical models in general and Bayesian networks in particular.
- They provide a natural tool for dealing with uncertainty and complexity that occur throughout applied mathematics and engineering.
- There are two main kinds of graphical models: undirected and directed.
- The number of such events is exponential.
- To achieve compactness, Bayesian networks factor the joint distribution into local conditional distributions for each variable given its parents.

### 4.1 The Construction of a Bayesian Network

- In their nurse scheduling problem, the number of the nurse is fixed (up to 30), and the target is to create a weekly schedule by assigning each nurse one shift pattern in the most efficient way.
- Due to human limitations, these rules are typically simple.
- Nevertheless, human generated schedules are of high quality due to the ability of the scheduler to switch between the rules, based on the state of the current solution.
- The authors envisage the Bayesian optimisation algorithm to perform this role.
- In their particular implementation, an edge denotes a construction unit (or rule sub-string) for nurse i where the previous rule is j and the current rule is j’.

### 4.2 Learning based on the Bayesian Network

- According to whether the structure of the model is known or unknown, and whether all variables are fully observed or some of them are hidden, there are four kinds of learning (Heckerman 1998).
- In the proposed approach, learning amounts to counting and hence the authors use the symbol ‘#’ meaning ‘the number of’ in the following equations.
- Since the first rule in a solution has no parents, it will be chosen from nodes N1j according to their probabilities.
- Since all the probability values are normalized, the roulette-wheel method is good strategy for rule selection.
- Because pure low-cost or random allocation produces low quality solutions, either rule 1 is used for the first 2-3 nurses and rule 2 on remainder or vice versa.

### 4.3 A Bayesian Optimization Algorithm

- Based on the estimation of conditional probabilities, this section introduces a Bayesian optimization algorithm for the nurse scheduling problem.
- It uses techniques from the field of modelling data by Bayesian networks to estimate the joint distribution of promising solutions.
- New rule strings are generated by using these conditional probability values, and are added into the old population, replacing some of the old rule strings.
- If the termination conditions are not met (the authors use 2000 generations), go to step 2.

### 4.4 Four Building Rules

- As far as the domain knowledge of nurse scheduling is concerned, the following four ru les are currently investigated.
- Due to the nature of this approach, nurses’ preference costs pij are not taken into account by this rule.
- The fourth rule, called ‘Contribution’ rule, is biased towards solution quality but includes some aspects of feasibility by computing an overall score for each feasible pattern for the nurse currently being scheduled.
- This is achieved by going through the entire set of feasible shift patterns for a nurse and assigning each one a score.
- The one with the highest (i.e. best) score is chosen.

### 4.5 Fitness Function

- Independent of the rules used, the fitness of completed solutions has to be calculated.
- Therefore, the authors still need a penalty function approach.
- Since the chosen encoding automatically satisfies constraint set (3) of the integer programming formulation, the authors can use the following formula, where wdemand is the penalty weight, to calculate the fitness of solutions.
- Note that the penalty is proportional to the number of uncovered shifts. (9).

### 5 Computational Results

- The authors present the results of extensive computer experiments and compare them to results of the same data instances found previously by other algorithms.
- Table 1 lists the full and detailed computational results of 20 runs with different random seeds, where N/A indicates no feasible solution was found.
- Figures 2 summarises this information, Figure 3 shows a single typical run and finally Figure 4 gives an overall comparison between various algorithms.

### 5.1 Details of Algorithms

- The value of three units was chosen as it corresponds to the penalty cost of violating the least important level of requests in the original formulation.
- Thus, these solutions are still acceptable to the hospital.
- The executing time of the algorithm is approx.
- At this stage, there are based on their experience and intuition.
- When computing the mean a censored cost value of 255 has been used when an algorithm failed to find a feasible solution (N/A).

### 5.2 Analysis of Results

- Comparing the computational results on various test instances , one can see that using the random rule alone does not yield a single feasible solution.
- This underlines the difficulty of this problem.
- In addition, without learning the conditional probabilities, the results are much weaker, as the CP column shows.
- Thus, it is not simply enough to use the four rules to build solutions.
- Overall, the Bayesian results found rival those found by the complex multi-population GA.

### 6 Conclusions

- The approach is novel because it is the first time that Bayesian networks have been applied to the field of personnel scheduling.
- An effective method is proposed to solve the problem about how to implement explicit learning from past solutions.
- Unlike most existing approaches, the new approach has the ability to build schedules by using flexible, rather than fixed rules.
- Experimental results from real-world nurse scheduling problems have demonstrated the strength of the proposed Bayesian optimization algorithm.
- The proposed approach mimics human behaviour much more strongly than a standard GA based scheduling system.

Did you find this useful? Give us your feedback

A Bayesian Optimization Algorithm for the Nurse Scheduling Problem

Proceedings of 2003 Congress on Evolutionary Computation (CEC2003), pp. 2149-2156, IEEE Press, Canberra, Australia, 2003.

Jingpeng Li

School of Computer Science

University of Nottingham

NG8 1BB UK

jpl@cs.nott.ac.uk

Uwe Aickelin

School of Computer Science

University of Nottingham

NG8 1BB UK

uxa@cs.nott.ac.uk

Abstract- A Bayesian optimization algorithm for the

nurse scheduling problem is presented, which involves

choosing a suitable scheduling rule from a set for each

nurse’s assignment. Unlike our previous work that

used GAs to implement implicit learning, the learning

in the proposed algorithm is explicit, i.e. eventually, we

will be able to identify and mix building blocks

directly. The Bayesian optimization algorithm is

applied to implement such explicit learning by

building a Bayesian network of the joint distribution

of solutions. The conditional probability of each

variable in the network is computed according to an

initial set of promising solutions. Subsequently, each

new instance for each variable is generated by using

the corresponding conditional probabilities, until all

variables have been generated, i.e. in our case, a new

rule string has been obtained. Another set of rule

strings will be generated in this way, some of which

will replace previous strings based on fitness selection.

If stopping conditions are not met, the conditional

probabilities for all nodes in the Bayesian network are

updated again using the current set of promising rule

strings. Computational results from 52 real data

instances demonstrate the success of this approach. It

is also suggested that the learning mechanism in the

proposed approach might be suitable for other

scheduling problems.

1 Introduction

Scheduling problems are generally NP-hard combinatorial

problems, and a lot of research has been done to solve

these problems heuristically (Aickelin and Dowsland,

2002 and 2003; Li and Kwan, 2001a and 2003). However,

most previous approaches are problem-specific and

research into the development of a general scheduling

algorithm is still in its infancy.

Genetic Algorithms (GAs) (Holland 1975; Goldberg

1989), mimicking the natural evolutionary process of the

survival of the fittest, have attracted much attention in

solving difficult scheduling problems in recent years.

Some obstacles exist when using GAs: there is no

canonical mechanism to deal with constraints, which are

commonly met in most real-world scheduling problems,

and small changes to a solution are difficult. To overcome

both difficulties, indirect approaches have been presented

(Aickelin and Dowsland, 2003; Li and Kwan, 2001b and

2003) for nurse and driver scheduling. In these indirect

GAs, the solution space is mapped and then a separate

decoding routine builds solutions to the original problem.

In our previous indirect GAs, learning was implicit

(‘black-box’) and restricted to the efficient adjustment of

weights for a set of rules that are used to construct

schedules. The major limitation of those approaches is

that they learn in a non-human way. Like most existing

construction algorithms, once the best weight combination

is found, the rules used in the construction process are

fixed at each iteration. However, normally a long

sequence of moves is needed to construct a schedule and

using fixed rules at each move is thus unreasonable and

not coherent with the human learning processes.

When a human scheduler works, he normally builds a

schedule systematically following a set of rules. After

much practice, the scheduler gradually masters the

knowledge of which solution parts go well with others. He

can identify good parts and is aware of the solution

quality even if the scheduling process is not completed

yet, thus having the ability to finish a schedule by using

flexible, rather than fixed, rules. In this paper, we design a

more human-like scheduling algorithm, by using a

Bayesian optimization algorithm to implement explicit

learning from past solutions. A nurse scheduling problem

with 52 real data instances gathered from a UK hospital is

used as the test problem.

Nurse scheduling has been widely studied in recent

years, and an extensive summary of the approaches can be

found in Hung (1995) and Sitompul and Randhawa

(1990). This problem is highly constrained, making it

extremely difficult for most local search algorithms to

find feasible solutions, let alone optimal ones. In our

nurse scheduling problem, the number of the nurses is

fixed (up to 30), and the target is to create a weekly

schedule by assigning each nurse one out of up to 411

shift patterns in the most efficient way. The proposed

Bayesian approach achieves this by choosing a suitable

rule, from a rule set containing a number of available

rules, for each nurse. A potential solution is therefore

represented as a rule string, or a sequence of rules

corresponding to nurses from the first one to the last.

As a model of the selected strings, a Bayesian network

(Pearl 1998) is used in the proposed Bayesian

optimization algorithm to solve the nurse scheduling

problem. A Bayesian network is a directed acyclic graph

with each node corresponding to one variable, and each

variable corresponding to the individual rule by which a

schedule will be constructed step by step. The causal

relationship between two variables is represented by a

directed edge between the two corresponding nodes.

The Bayesian optimization algorithm is applied to

learn to identify good partial solutions and to complete

them by building a Bayesian network of the joint

distribution of solutions (Pelikan et al, 1999; Pelikan and

Goldberg, 2000). The conditional probabilities are

computed according to an initial set of promising

solutions. Subsequently, each new instance for each node

is generated by using the corresponding conditional

probabilities, until values for all nodes have been

generated, i.e. a new rule string has been generated.

Another set of rule strings will be generated in the

same way, some of which will replace previous strings

based on roulette-wheel fitness selection. If stopping

conditions are not met, the conditional probabilities for all

nodes in the Bayesian network are updated again using

the current set of rule strings. The algorithm thereby tries

to explicitly identify and mix promising building blocks.

It should be noted that for most scheduling problems,

the structure of the network model is known and all

variables are fully observed. In this case, the goal of

learning is to find the rule values that maximize the

likelihood of the training data. Thus, learning can amount

to ‘counting’ in the case of multinomial distributions.

The rest of this paper is organized as follows. Section 2

gives an overview on the nurse scheduling problem, and

the following section 3 introduces the general concepts

about graphical models and Bayesian networks. Section 4

discuses the proposed Bayesian optimization algorithm,

describing the construction of a Bayesian network,

learning based on the Bayesian network, and the four

building rules in detail. Computational results using 52

data instances gathered from a UK hospital are presented

in section 5. Concluding remarks are in section 6.

2 The Nurse Scheduling Problem

2.1 General Problem

Our nurse scheduling problem is to create weekly

schedules for wards of nurses by assigning one of a

number of possible shift patterns to each nurse. These

schedules have to satisfy working contracts and meet the

demand for a given number of nurses of different grades

on each shift, while being seen to be fair by the staff

concerned. The latter objective is achieved by meeting as

many of the nurses’ requests as possible and considering

historical information to ensure that unsatisfied requests

and unpopular shifts are evenly distributed.

The problem is complicated by the fact that higher

qualified nurses can substitute less qualified nurses but

not vice versa. Thus scheduling the different grades

independently is not possible. Furthermore, the problem

has a special day-night structure as most of the nurses are

contracted to work either days or nights in one week but

not both. However due to working contracts, the number

of days worked is not usually the same as the number of

nights. Therefore, it becomes important to schedule the

‘correct’ nurses onto days and nights respectively. The

latter two characteristics make this problem challenging

for any local search algorithm, because finding and

maintaining feasible solutions is extremely difficult.

The numbers of days or nights to be worked by each

nurse defines the set of feasible weekly work patterns for

that nurse. These will be referred to as shift patterns or

shift pattern vectors in the following. For each nurse i and

each shift pattern j all the information concerning the

desirability of the pattern for this nurse is captured in a

single numeric preference cost p

ij

. These costs were

determined in close consultation with the hospital and are

a weighted sum of the following factors: basic shift-

pattern cost, general day/night preferences, specific

requests, continuity problems, number of successive

working day, rotating nights/weekends and other working

history information. Patterns that violate mandatory

contractual requirements are marked as infeasible for a

particular nurse and week by giving them a suitably high

p

ij

value.

2.2 Integer Programming

The problem can be formulated as an integer linear

program as follows.

Indices:

i = 1...n nurse index;

j = 1...m shift pattern index;

k = 1...14 day and night index (1...7 are days and 8...14

are nights);

s = 1...p grade index.

Decision variables:

=

else ,0

pattern shift works nurse 1, ji

x

ij

.

Parameters:

m = Number of shift patterns;

n = Number of nurses;

p = Number of grades;

=

else ,0

day/night covers pattern shift 1, kj

a

jk

;

=

else ,0

higheror grade of is nurse 1, si

q

is

;

p

ij

= Preference cost of nurse i working shift pattern j;

R

ks

= Demand of nurses with grade s on day/night k;

N

i

= Working shifts per week of nurse i if night shifts are

worked;

D

i

= Working shifts per week of nurse i if day shifts are

worked;

B

i

= Working shifts per week of nurse i if both day and

night shifts are worked {for special nurses};

F(i) = Set of feasible shift patterns for nurse i, where F(i)

is defined as

.,

shifts combined ,

shiftsnight ,

shiftsday ,

)(

14

1

14

8

7

1

i

jBa

jNa

jDa

iF

k

ijk

i

k

jk

i

k

jk

∀

∈∀=

∈∀=

∈∀=

=

∑

∑

∑

=

=

=

(1)

Target function:

Minimize total preference cost of all nurses, denoted as

min!

1 )(

→

∑ ∑

= ∈

n

i

m

iFj

ijij

xp

. (2)

Subject to:

1. Every nurse works exactly one feasible shift pattern:

ix

iFj

ij

∀=

∑

∈

,1

)(

; (3)

2. The demand for nurses is fulfilled for every grade on

every day and night:

∑ ∑

∈ =

∀≥

)( 1

,,

iFj

n

i

ksijjkis

skRxaq

(4)

Constraint set (3) ensures that every nurse works

exactly one shift pattern from his/her feasible set, and

constraint set (4) ensures that the demand for nurses is

covered for every grade on every day and night. Note that

the definition of q

is

is such that higher graded nurses can

substitute those at lower grades if necessary.

Typical problem dimensions are 30 nurses of three

grades and 400 shift patterns. Thus, the Integer

Programming formulation has about 12000 binary

variables and 100 constraints. This is a moderately sized

problem. However, some problem cases remain unsolved

after overnight computation using professional software.

3 Graphical Models and Bayesian Networks

In this section, we introduce concepts from graphical

models in general and Bayesian networks in particular.

Section 4 will then explain how we applied these concepts

to our nurse scheduling problem.

Graphical models are graphs in which nodes represent

random variables, and the lack of edges represents

conditional independence assumptions (Edwards 2000).

They have important applications in many multivariate

probabilistic systems in fields such as statistics, systems

engineering, information theory and pattern recognition.

In particular, they are playing an increasingly important

role in the design and analysis of machine learning

algorithms.

As described by Jordon (1999), graphical models are a

marriage between probability theory and graph theory.

They provide a natural tool for dealing with uncertainty

and complexity that occur throughout applied

mathematics and engineering. In a graphical model, the

fundamental notion of modularity is used to build a

complex system by combining simpler parts. Probability

theory provides the glue to combine the parts, ensuring

that the whole system is consistent, and providing ways to

interface models to data. The graph theory provides an

intuitively appealing interface by which humans can

model highly interacting sets of variables, and a data

structure that leads itself naturally to the design of

general-purpose algorithms.

There are two main kinds of graphical models:

undirected and directed. Undirected graphical models are

more popular with the physics and vision communities.

Directed graphical model, also called Bayesian networks,

are more popular with the artificial intelligence and

machine learning communities. Bayesian networks are

often used to model multinomial data with both discrete

and continuous variables by encoding the relationship

between the variables contained in the modelled data,

which represents the structure of a problem.

Moreover, Bayesian networks can be used to generate

new instances of the variables with similar properties as

those of given data. Each node in the network corresponds

to one variable, and each variable corresponds to one

position in the strings representing the solutions. The

relationship between two variables is represented by a

directed edge between the two corresponding nodes.

Any complete probabilistic model of a domain must

represent the joint distribution, the probability of every

possible event as defined by the values of all the variables.

The number of such events is exponential. To achieve

compactness, Bayesian networks factor the joint

distribution into local conditional distributions for each

variable given its parents.

Mathematically, an acyclic Bayesian network encodes

a full joint probability distribution by the product

))(|(),...,(

1

1 ∏

=

=

n

i

iin

XpaxPxxP

, (5)

where x

i

denotes some values of the variable X

i

, pa(X

i

)

denotes a set of values for parents of X

i

in the network

(the set of nodes from which there exists an individual

edge to X

i

), and P(x

i

| pa(X

i

)) denotes the conditional

probability of X

i

conditioned on variables pa(X

i

). This

distribution can be used to generate new instances using

the marginal and conditional probabilities.

4 A Bayesian Optimization Algorithm for

Nurse Scheduling

This section discusses the proposed Bayesian optimization

algorithm for the nurse scheduling problem, including the

construction of a Bayesian network, learning based on the

Bayesian network and the four building rules used.

4.1 The Construction of a Bayesian Network

In our nurse scheduling problem, the number of the nurse

is fixed (up to 30), and the target is to create a weekly

schedule by assigning each nurse one shift pattern in the

most efficient way. The proposed approach achieves this

by using one suitable rule, from a rule set that contains a

number of available rules, for each nurse’s assignment.

Thus, a potential solution is represented as a rule string, or

a sequence of rules corresponding to nurses from the first

one to the last one individually.

We chose this approach, as the longer-term aim of our

research is to model the explicit learning of a human

scheduler. Human schedulers can provide high quality

solutions, but the task is tedious and often requires a large

amount of time. Typically, they construct schedules based

on rules learnt during scheduling. Due to human

limitations, these rules are typically simple. Hence, our

rules will be relatively simple, too. Nevertheless, human

generated schedules are of high quality due to the ability

of the scheduler to switch between the rules, based on the

state of the current solution. We envisage the Bayesian

optimisation algorithm to perform this role.

Figure 1: A Bayesian network for nurse scheduling

Figure 1 is the Bayesian network constructed for the

nurse scheduling problem, which is a hierarchical and

acyclic directed graph representing the solution structure

of the problem.

The node

}),...,2,1{};,...,2,1{( njmiN

ij

∈∈

in the network

denotes that nurse i is assigned using rule j, where m is the

number of nurses to be scheduled and n is the number of

rules to be used in the building process. The directed edge

from node N

ij

to node N

i+1,j’

denotes a causal relationship

of “N

ij

causing N

i+1,j’

”. In our particular implementation,

an edge denotes a construction unit (or rule sub-string) for

nurse i where the previous rule is j and the current rule is

j’. In this network, a possible solution (a complete rule

string) is represented as a directed path from nurse 1 to

nurse m connecting m nodes.

4.2 Learning based on the Bayesian Network

According to whether the structure (topology) of the

model is known or unknown, and whether all variables are

fully observed or some of them are hidden, there are four

kinds of learning (Heckerman 1998). According to

Heckerman, the learning process for the proposed

approach belongs to the category of “known structure and

full observation,” and the learning goal is to find the

variable values of all nodes N

ij

that maximize the

likelihood of the training date containing T independent

cases.

In the proposed approach, learning amounts to

counting and hence we use the symbol ‘#’ meaning ‘the

number of’ in the following equations. It calculates the

conditional probabilities of each possible value for each

node given all possible values of its parents. For example,

for node N

i+1,j’

with a parent node N

ij

, its conditional

probability is

),(#),(#

),(#

)(

),(

)|(

,1,1

,1

,1

,1

trueNfalseNtrueNtrueN

trueNtrueN

NP

NNP

NNP

ijjiijji

ijji

ij

ijji

ijji

==+==

==

=

=

′+′+

′+

′+

′+

.(6)

Note that nodes N

1j

have no parents. In this

circumstance, their probabilities are computed as

T

trueN

falseNtrueN

trueN

NP

j

jj

j

j

)(#

)(#)(#

)(#

)(

1

11

1

1

=

=

=+=

=

=

.(7)

These probability values can be used to generate new

rule strings, or new solutions. Since the first rule in a

solution has no parents, it will be chosen from nodes N

1j

according to their probabilities. The next rule will be

chosen from nodes N

ij

according to their probabilities

conditioned on the previous nodes. This building process

is repeated until the last node has been chosen from nodes

N

mj

, where m is number of the nurses. A link from nurse 1

to nurse m is thus created, representing a new possible

solution. Since all the probability values are normalized,

the roulette-wheel method is good strategy for rule

selection.

For clarity, consider the following toy example of

scheduling five nurses with two rules (1: random

allocation, 2: allocate nurse to low-cost shifts). In the

beginning of the search, the probabilities of choosing rule

1 or 2 for each nurse is equal, i.e. 50%. After a few

iterations, due to the selection pressure and reinforcement

learning, we experience two solution pathways: Because

pure low-cost or random allocation produces low quality

solutions, either rule 1 is used for the first 2-3 nurses and

rule 2 on remainder or vice versa. In essence, BOA learns

‘use rule 2 after 2-3x using rule 1’ or vice versa.

4.3 A Bayesian Optimization Algorithm

Based on the estimation of conditional probabilities, this

section introduces a Bayesian optimization algorithm for

the nurse scheduling problem. It uses techniques from the

field of modelling data by Bayesian networks to estimate

the joint distribution of promising solutions. The nodes, or

variables, in the Bayesian network correspond to the

individual rules by which a schedule will be built step by

step.

In the proposed Bayesian optimization algorithm, the

first population of rule strings is generated at random.

From the current population, a set of better rule strings is

. . .

. . .

. . .

. . .

. . .

. . .

. . . . . .

. . .

N

11

N

12

N

1,n

N

21

N

22

N

2,n

N

31

N

32

N

3,n

N

m-1,1

N

m-1,2

N

m-1,n

N

m,1

N

m,2

N

m,n

selected. Any selection method biased towards better

fitness can be used, and in this paper, the traditional

roulette-wheel Selection is applied. The conditional

probabilities of each node in the Bayesian network are

computed. New rule strings are generated by using these

conditional probability values, and are added into the old

population, replacing some of the old rule strings. In more

detail, the steps of the Bayesian optimization algorithm

for nurse scheduling are:

1. Set t = 0, and generate an initial population P(0) at

random;

2. Use roulette-wheel to select a set of promising

rule strings S(t) from P(t);

3. Compute the conditional probabilities of each

node according to this set of promising solutions ;

4. For the assignment of each nurse, the roulette-

wheel method is used to select one rule according

to the conditional probabilities of all available

nodes, thus obtaining a new rule string. A set of

new rule strings O(t) will be generated in this

way;

5. Create a new population P(t+1) by replacing some

rule strings from P(t) with O(t), and set t = t+1;

6. If the termination conditions are not met (we use

2000 generations), go to step 2.

4.4 Four Building Rules

Similar to the working pattern of a human scheduler, the

proposed schedule-constructing process uses a set of rules

to build a schedule step by step. As far as the domain

knowledge of nurse scheduling is concerned, the

following four rules are currently investigated.

4.4.1 Random Rule

The first rule, called ‘Random’ rule, is used to select a

nurse’s shift pattern at random. Its purpose is to introduce

randomness into the search thus enlarging the search

space, and most importantly to ensure that the proposed

algorithm has the ability to escape from local optimum.

This rule mirrors much of a scheduler’s creativeness to

come up with different solutions if required.

4.4.2 k-Cheapest Rule

The second rule is the ‘k-Cheapest’ rule. Disregarding the

feasibility of the schedule, it randomly selects a shift

pattern from a k-length list containing patterns with k-

cheapest cost p

ij

, in an effort to reduce the total cost of a

schedule as more as possible.

4.4.3 Cover Rule

Compared with the first two rules, the ‘Cover’ rule and

last 'Contribution’ rule are relatively more complicated.

The third ‘Cover’ rule is designed to consider only the

feasibility of the schedule. It schedules one nurse at a time

in such a way as to cover those days and nights with the

highest number of uncovered shifts.

The ‘Cover’ rule constructs solutions as follows. For

each shift pattern in a nurse’s feasible set, calculate the

total number of uncovered shifts and would be covered if

the nurse worked that shift pattern. For simplicity, this

calculation does not take into account how many nurses

are still required in a particular shift. For instance, assume

that a shift pattern covers Monday to Friday nights.

Further assume that the current requirements for the

nights from Monday to Sunday are as follows: (-3, 0, +1, -

2, -1, -2, 0), where a negative number means undercover

and a positive over cover. The Monday to Friday shift

pattern hence has a cover value of 3, as the most negative

value it covers is -3. In this example, a Tuesday to

Saturday pattern would have a value of 2.

In order to ensure that high-grade nurses are not

‘wasted’ covering unnecessarily for nurses of lower

grades, for nurses of grade s, only the shifts requiring

grade s nurses are counted as long as there is a single

uncovered shift for this grade. If all these are covered,

shifts of the next lower grade are considered and once

these are filled those of the next lower grade. Due to the

nature of this approach, nurses’ preference costs p

ij

are not

taken into account by this rule. However, they will

influence decisions indirectly via the fitness function.

Hence, the ‘Cover’ rule can be summarised as finding

those shift patterns with corresponding largest amount of

undercover.

4.4.4 Contribution Rule

The fourth rule, called ‘Contribution’ rule, is biased

towards solution quality but includes some aspects of

feasibility by computing an overall score for each feasible

pattern for the nurse currently being scheduled.

The ‘Contribution’ rule is designed to take into account

the nurses’ preferences. It therefore works with shift

patterns rather than individual shifts. It also takes into

account some of the covering constraints in which it gives

preference to patterns that cover shifts that have not yet

been allocated sufficient nurses to meet their total

requirements. This is achieved by going through the entire

set of feasible shift patterns for a nurse and assigning each

one a score. The one with the highest (i.e. best) score is

chosen. If there is more than one shift pattern with the

best score, the first such shift pattern is chosen.

The score of a shift pattern is calculated as the

weighted sum of the nurse’s p

ij

value for that particular

shift pattern and its contribution to the cover of all three

grades. The latter is measured as a weighted sum of grade

one, two and three uncovered shifts that would be covered

if the nurse worked this shift pattern, i.e. the reduction in

shortfall. Obviously, nurses can only contribute to

uncovered demand of their own grade or below. More

precisely and using the same notation as before, the score

p

ij

of shift pattern j for nurse i is calculated with the

following parameters:

• d

ks

= 1 if there are still nurses needed on day k of

grade s otherwise d

ks

= 0;

• a

jk

= 1 if shift pattern j covers day k otherwise a

jk

= 0;

• w

s

is the weight of covering an uncovered shift of

grade s;

• w

p

is the weight of the nurse’s p

ij

value for the shift

pattern.

Finally, (100- p

ij

) must be used in the score, as higher p

ij

values are worse and the maximum for p

ij

is 100. Note

that (- w

p

p

ij

) could also have been used, but would have

led to some scores being negative. Thus, the scores are

calculated as follows:

##### Citations

More filters

••

TL;DR: In this paper, an intelligent decision support methodologies for nurse rostering problems in large modern hospital environments is presented. But the amount of computational time that is allowed plays a significant role and the authors analyse and discuss this.

172 citations

•

01 Jan 2005TL;DR: This chapter describes hBOA and its predecessor, the Bayesian optimization algorithm (BOA), and outlines some of the most important theoretical and empirical results in this line of research.

109 citations

••

TL;DR: In this paper, a meta-heuristic procedure for nurse scheduling problem based on the framework proposed by Birbil and Fang (J. Glob. Opt. 25, 263---282, 2003) is presented.

Abstract: In this paper, we present a novel meta-heuristic technique for the nurse scheduling problem (NSP). This well-known scheduling problem assigns nurses to shifts per day maximizing the overall quality of the roster while taking various constraints into account. The problem is known to be NP-hard.
Due to its complexity and relevance, many algorithms have been developed to solve practical and often case-specific models of the NSP. The huge variety of constraints and the several objective function possibilities have led to exact and meta-heuristic procedures in various guises, and hence comparison and state-of-the-art reporting of standard results seem to be a utopian idea.
We present a meta-heuristic procedure for the NSP based on the framework proposed by Birbil and Fang (J. Glob. Opt. 25, 263---282, 2003). The Electromagnetic (EM) approach is based on the theory of physics, and simulates attraction and repulsion of sample points in order to move towards a promising solution. Moreover, we present computational experiments on a standard benchmark dataset, and solve problem instances under different assumptions. We show that the proposed procedure performs consistently well under many different circumstances, and hence, can be considered as robust against case-specific constraints.

80 citations

••

TL;DR: A novel memetic algorithm is proposed which evolves good quality sequences of repairs generated by CABAROST, which was tested on instances of the real-world nurse rostering problem at the Queens Medical Centre NHS Trust in Nottingham.

Abstract: In this paper we present a novel Case-Based Reasoning (CBR) system called CABAROST (CAsed-BAsed ROSTering) which was developed for personnel scheduling problems. CBR is used to capture and store examples of personnel manager behaviour which are then used to solve future problems. Previous examples of constraint violations in schedules and the repairs that were used to solve the violations are stored as cases. The sequence in which violations are repaired can have a great impact on schedule quality. A novel memetic algorithm is proposed which evolves good quality sequences of repairs generated by CABAROST. The algorithm was tested on instances of the real-world nurse rostering problem at the Queens Medical Centre NHS Trust in Nottingham.

79 citations

••

TL;DR: The problem of constructing duty schedules for nurses at large hospitals is solved using a tabu search approach as a case study at Stikland Hospital, a large psychiatric hospital in the South African Western Cape, for which a computerized decision support system with respect to nurse scheduling was developed.

Abstract: Constructing duty schedules for nurses at large hospitals is a difficult problem. The objective is usually to ensure that there is always sufficient staff on duty, while taking into account individual preferences with respect to work patterns, requests for leave and financial restrictions, in such a way that all employees are treated fairly. The problem is typically solved via mixed integer programming or heuristic (local) search methods in the operations research literature. In this paper the problem is solved using a tabu search approach as a case study at Stikland Hospital, a large psychiatric hospital in the South African Western Cape, for which a computerized decision support system with respect to nurse scheduling was developed. This decision support system, called NuRoDSS (short for Nurse Rostering Decision Support System) is described in some detail.

69 citations

##### References

More filters

•

01 Jan 1988TL;DR: Probabilistic Reasoning in Intelligent Systems as mentioned in this paper is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty, and provides a coherent explication of probability as a language for reasoning with partial belief.

Abstract: From the Publisher:
Probabilistic Reasoning in Intelligent Systems is a complete andaccessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty. The author provides a coherent explication of probability as a language for reasoning with partial belief and offers a unifying perspective on other AI approaches to uncertainty, such as the Dempster-Shafer formalism, truth maintenance systems, and nonmonotonic logic. The author distinguishes syntactic and semantic approaches to uncertaintyand offers techniques, based on belief networks, that provide a mechanism for making semantics-based systems operational. Specifically, network-propagation techniques serve as a mechanism for combining the theoretical coherence of probability theory with modern demands of reasoning-systems technology: modular declarative inputs, conceptually meaningful inferences, and parallel distributed computation. Application areas include diagnosis, forecasting, image interpretation, multi-sensor fusion, decision support systems, plan recognition, planning, speech recognitionin short, almost every task requiring that conclusions be drawn from uncertain clues and incomplete information.
Probabilistic Reasoning in Intelligent Systems will be of special interest to scholars and researchers in AI, decision theory, statistics, logic, philosophy, cognitive psychology, and the management sciences. Professionals in the areas of knowledge-based systems, operations research, engineering, and statistics will find theoretical and computational tools of immediate practical use. The book can also be used as an excellent text for graduate-level courses in AI, operations research, or applied probability.

15,671 citations

••

01 Feb 1999

TL;DR: This paper presents an introduction to inference for Bayesian networks and a view of the EM algorithm that justifies incremental, sparse and other variants, as well as an information-theoretic analysis of hard and soft assignment methods for clustering.

Abstract: Part 1 Inference: introduction to inference for Bayesian networks, Robert Cowell advanced inference in Bayesian networks, Robert Cowell inference in Bayesian networks using nested junction trees, Uffe Kjoerulff bucket elimination - a unifying framework for probabilistic inference, R. Dechter an introduction to variational methods for graphical models, Michael I. Jordan et al improving the mean field approximation via the use of mixture distributions, Tommi S. Jaakkola and Michael I. Jordan introduction to Monte Carlo methods, D.J.C. MacKay suppressing random walls in Markov chain Monte Carlo using ordered overrelaxation, Radford M. Neal. Part 2 Independence: chain graphs and symmetric associations, Thomas S. Richardson the multiinformation function as a tool for measuring stochastic dependence, M. Studeny and J. Vejnarova. Part 3 Foundations for learning: a tutorial on learning with Bayesian networks, David Heckerman a view of the EM algorithm that justifies incremental, sparse and other variants, Radford M. Neal and Geoffrey E. Hinton. Part 4 Learning from data: latent variable models, Christopher M. Bishop stochastic algorithms for exploratory data analysis - data clustering and data visualization, Joachim M. Buhmann learning Bayesian networks with local structure, Nir Friedman and Moises Goldszmidt asymptotic model selection for directed networks with hidden variables, Dan Geiger et al a hierarchical community of experts, Geoffrey E. Hinton et al an information-theoretic analysis of hard and soft assignment methods for clustering, Michael J. Kearns et al learning hybrid Bayesian networks from data, Stefano Monti and Gregory F. Cooper a mean field learning algorithm for unsupervised neural networks, Lawrence Saul and Michael Jordan edge exclusion tests for graphical Gaussian models, Peter W.F. Smith and Joe Whittaker hepatitis B - a case study in MCMC, D.J. Spiegelhalter et al prediction with Gaussian processes - from linear regression to linear prediction and beyond, C.K.I. Williams.

1,885 citations