scispace - formally typeset
Search or ask a question
Journal ArticleDOI

A Bayesian Optimisation Algorithm for the Nurse Scheduling Problem

TL;DR: In this paper, a Bayesian optimization algorithm for nurse scheduling is presented, which involves choosing a suitable scheduling rule from a set for each nurse's assignment, and the conditional probability of each variable in the network is computed according to an initial set of promising solutions.
Abstract: A Bayesian optimization algorithm for the nurse scheduling problem is presented, which involves choosing a suitable scheduling rule from a set for each nurse’s assignment. Unlike our previous work that used GAs to implement implicit learning, the learning in the proposed algorithm is explicit, i.e. eventually, we will be able to identify and mix building blocks directly. The Bayesian optimization algorithm is applied to implement such explicit learning by building a Bayesian network of the joint distribution of solutions. The conditional probability of each variable in the network is computed according to an initial set of promising solutions. Subsequently, each new instance for each variable is generated by using the corresponding conditional probabilities, until all variables have been generated, i.e. in our case, a new rule string has been obtained. Another set of rule strings will be generated in this way, some of which will replace previous strings based on fitness selection. If stopping conditions are not met, the conditional probabilities for all nodes in the Bayesian network are updated again using the current set of promising rule strings. Computational results from 52 real data instances demonstrate the success of this approach. It is also suggested that the learning mechanism in the proposed approach might be suitable for other scheduling problems.

Summary (3 min read)

1 Introduction

  • After much practice, the scheduler gradually masters the knowledge of which solution parts go well with others.
  • As a model of the selected strings, a Bayesian network (Pearl 1998) is used in the proposed Bayesian optimization algorithm to solve the nurse scheduling problem.
  • The conditional probabilities are computed according to an initial set of promising solutions.
  • Subsequently, each new instance for each node is generated by using the corresponding conditional probabilities, until values for all nodes have been generated, i.e. a new rule string has been generated.

2.1 General Problem

  • These schedules have to satisfy working contracts and meet the demand for a given number of nurses of different grades on each shift, while being seen to be fair by the staff concerned.
  • Thus scheduling the different grades independently is not possible.
  • Furthermore, the problem has a special day-night structure as most of the nurses are contracted to work either days or nights in one week but not both.
  • The latter two characteristics make this problem challenging for any local search algorithm, because finding and maintaining feasible solutions is extremely difficult.
  • For each nurse i and each shift pattern j all the information concerning the desirability of the pattern for this nurse is captured in a single numeric preference cost pij.

2.2 Integer Programming

  • Note that the definition of qis is such that higher graded nurses can substitute those at lower grades if necessary.
  • Thus, the Integer Programming formulation has about 12000 binary variables and 100 constraints.
  • Some problem cases remain unsolved after overnight computation using professional software.

3 Graphical Models and Bayesian Networks

  • The authors introduce concepts from graphical models in general and Bayesian networks in particular.
  • They provide a natural tool for dealing with uncertainty and complexity that occur throughout applied mathematics and engineering.
  • There are two main kinds of graphical models: undirected and directed.
  • The number of such events is exponential.
  • To achieve compactness, Bayesian networks factor the joint distribution into local conditional distributions for each variable given its parents.

4.1 The Construction of a Bayesian Network

  • In their nurse scheduling problem, the number of the nurse is fixed (up to 30), and the target is to create a weekly schedule by assigning each nurse one shift pattern in the most efficient way.
  • Due to human limitations, these rules are typically simple.
  • Nevertheless, human generated schedules are of high quality due to the ability of the scheduler to switch between the rules, based on the state of the current solution.
  • The authors envisage the Bayesian optimisation algorithm to perform this role.
  • In their particular implementation, an edge denotes a construction unit (or rule sub-string) for nurse i where the previous rule is j and the current rule is j’.

4.2 Learning based on the Bayesian Network

  • According to whether the structure of the model is known or unknown, and whether all variables are fully observed or some of them are hidden, there are four kinds of learning (Heckerman 1998).
  • In the proposed approach, learning amounts to counting and hence the authors use the symbol ‘#’ meaning ‘the number of’ in the following equations.
  • Since the first rule in a solution has no parents, it will be chosen from nodes N1j according to their probabilities.
  • Since all the probability values are normalized, the roulette-wheel method is good strategy for rule selection.
  • Because pure low-cost or random allocation produces low quality solutions, either rule 1 is used for the first 2-3 nurses and rule 2 on remainder or vice versa.

4.3 A Bayesian Optimization Algorithm

  • Based on the estimation of conditional probabilities, this section introduces a Bayesian optimization algorithm for the nurse scheduling problem.
  • It uses techniques from the field of modelling data by Bayesian networks to estimate the joint distribution of promising solutions.
  • New rule strings are generated by using these conditional probability values, and are added into the old population, replacing some of the old rule strings.
  • If the termination conditions are not met (the authors use 2000 generations), go to step 2.

4.4 Four Building Rules

  • As far as the domain knowledge of nurse scheduling is concerned, the following four ru les are currently investigated.
  • Due to the nature of this approach, nurses’ preference costs pij are not taken into account by this rule.
  • The fourth rule, called ‘Contribution’ rule, is biased towards solution quality but includes some aspects of feasibility by computing an overall score for each feasible pattern for the nurse currently being scheduled.
  • This is achieved by going through the entire set of feasible shift patterns for a nurse and assigning each one a score.
  • The one with the highest (i.e. best) score is chosen.

4.5 Fitness Function

  • Independent of the rules used, the fitness of completed solutions has to be calculated.
  • Therefore, the authors still need a penalty function approach.
  • Since the chosen encoding automatically satisfies constraint set (3) of the integer programming formulation, the authors can use the following formula, where wdemand is the penalty weight, to calculate the fitness of solutions.
  • Note that the penalty is proportional to the number of uncovered shifts. (9).

5 Computational Results

  • The authors present the results of extensive computer experiments and compare them to results of the same data instances found previously by other algorithms.
  • Table 1 lists the full and detailed computational results of 20 runs with different random seeds, where N/A indicates no feasible solution was found.
  • Figures 2 summarises this information, Figure 3 shows a single typical run and finally Figure 4 gives an overall comparison between various algorithms.

5.1 Details of Algorithms

  • The value of three units was chosen as it corresponds to the penalty cost of violating the least important level of requests in the original formulation.
  • Thus, these solutions are still acceptable to the hospital.
  • The executing time of the algorithm is approx.
  • At this stage, there are based on their experience and intuition.
  • When computing the mean a censored cost value of 255 has been used when an algorithm failed to find a feasible solution (N/A).

5.2 Analysis of Results

  • Comparing the computational results on various test instances , one can see that using the random rule alone does not yield a single feasible solution.
  • This underlines the difficulty of this problem.
  • In addition, without learning the conditional probabilities, the results are much weaker, as the CP column shows.
  • Thus, it is not simply enough to use the four rules to build solutions.
  • Overall, the Bayesian results found rival those found by the complex multi-population GA.

6 Conclusions

  • The approach is novel because it is the first time that Bayesian networks have been applied to the field of personnel scheduling.
  • An effective method is proposed to solve the problem about how to implement explicit learning from past solutions.
  • Unlike most existing approaches, the new approach has the ability to build schedules by using flexible, rather than fixed rules.
  • Experimental results from real-world nurse scheduling problems have demonstrated the strength of the proposed Bayesian optimization algorithm.
  • The proposed approach mimics human behaviour much more strongly than a standard GA based scheduling system.

Did you find this useful? Give us your feedback

Content maybe subject to copyright    Report

A Bayesian Optimization Algorithm for the Nurse Scheduling Problem
Proceedings of 2003 Congress on Evolutionary Computation (CEC2003), pp. 2149-2156, IEEE Press, Canberra, Australia, 2003.
Jingpeng Li
School of Computer Science
University of Nottingham
NG8 1BB UK
jpl@cs.nott.ac.uk
Uwe Aickelin
School of Computer Science
University of Nottingham
NG8 1BB UK
uxa@cs.nott.ac.uk
Abstract- A Bayesian optimization algorithm for the
nurse scheduling problem is presented, which involves
choosing a suitable scheduling rule from a set for each
nurses assignment. Unlike our previous work that
used GAs to implement implicit learning, the learning
in the proposed algorithm is explicit, i.e. eventually, we
will be able to identify and mix building blocks
directly. The Bayesian optimization algorithm is
applied to implement such explicit learning by
building a Bayesian network of the joint distribution
of solutions. The conditional probability of each
variable in the network is computed according to an
initial set of promising solutions. Subsequently, each
new instance for each variable is generated by using
the corresponding conditional probabilities, until all
variables have been generated, i.e. in our case, a new
rule string has been obtained. Another set of rule
strings will be generated in this way, some of which
will replace previous strings based on fitness selection.
If stopping conditions are not met, the conditional
probabilities for all nodes in the Bayesian network are
updated again using the current set of promising rule
strings. Computational results from 52 real data
instances demonstrate the success of this approach. It
is also suggested that the learning mechanism in the
proposed approach might be suitable for other
scheduling problems.
1 Introduction
Scheduling problems are generally NP-hard combinatorial
problems, and a lot of research has been done to solve
these problems heuristically (Aickelin and Dowsland,
2002 and 2003; Li and Kwan, 2001a and 2003). However,
most previous approaches are problem-specific and
research into the development of a general scheduling
algorithm is still in its infancy.
Genetic Algorithms (GAs) (Holland 1975; Goldberg
1989), mimicking the natural evolutionary process of the
survival of the fittest, have attracted much attention in
solving difficult scheduling problems in recent years.
Some obstacles exist when using GAs: there is no
canonical mechanism to deal with constraints, which are
commonly met in most real-world scheduling problems,
and small changes to a solution are difficult. To overcome
both difficulties, indirect approaches have been presented
(Aickelin and Dowsland, 2003; Li and Kwan, 2001b and
2003) for nurse and driver scheduling. In these indirect
GAs, the solution space is mapped and then a separate
decoding routine builds solutions to the original problem.
In our previous indirect GAs, learning was implicit
(‘black-box’) and restricted to the efficient adjustment of
weights for a set of rules that are used to construct
schedules. The major limitation of those approaches is
that they learn in a non-human way. Like most existing
construction algorithms, once the best weight combination
is found, the rules used in the construction process are
fixed at each iteration. However, normally a long
sequence of moves is needed to construct a schedule and
using fixed rules at each move is thus unreasonable and
not coherent with the human learning processes.
When a human scheduler works, he normally builds a
schedule systematically following a set of rules. After
much practice, the scheduler gradually masters the
knowledge of which solution parts go well with others. He
can identify good parts and is aware of the solution
quality even if the scheduling process is not completed
yet, thus having the ability to finish a schedule by using
flexible, rather than fixed, rules. In this paper, we design a
more human-like scheduling algorithm, by using a
Bayesian optimization algorithm to implement explicit
learning from past solutions. A nurse scheduling problem
with 52 real data instances gathered from a UK hospital is
used as the test problem.
Nurse scheduling has been widely studied in recent
years, and an extensive summary of the approaches can be
found in Hung (1995) and Sitompul and Randhawa
(1990). This problem is highly constrained, making it
extremely difficult for most local search algorithms to
find feasible solutions, let alone optimal ones. In our
nurse scheduling problem, the number of the nurses is
fixed (up to 30), and the target is to create a weekly
schedule by assigning each nurse one out of up to 411
shift patterns in the most efficient way. The proposed
Bayesian approach achieves this by choosing a suitable
rule, from a rule set containing a number of available
rules, for each nurse. A potential solution is therefore
represented as a rule string, or a sequence of rules
corresponding to nurses from the first one to the last.
As a model of the selected strings, a Bayesian network
(Pearl 1998) is used in the proposed Bayesian
optimization algorithm to solve the nurse scheduling
problem. A Bayesian network is a directed acyclic graph

with each node corresponding to one variable, and each
variable corresponding to the individual rule by which a
schedule will be constructed step by step. The causal
relationship between two variables is represented by a
directed edge between the two corresponding nodes.
The Bayesian optimization algorithm is applied to
learn to identify good partial solutions and to complete
them by building a Bayesian network of the joint
distribution of solutions (Pelikan et al, 1999; Pelikan and
Goldberg, 2000). The conditional probabilities are
computed according to an initial set of promising
solutions. Subsequently, each new instance for each node
is generated by using the corresponding conditional
probabilities, until values for all nodes have been
generated, i.e. a new rule string has been generated.
Another set of rule strings will be generated in the
same way, some of which will replace previous strings
based on roulette-wheel fitness selection. If stopping
conditions are not met, the conditional probabilities for all
nodes in the Bayesian network are updated again using
the current set of rule strings. The algorithm thereby tries
to explicitly identify and mix promising building blocks.
It should be noted that for most scheduling problems,
the structure of the network model is known and all
variables are fully observed. In this case, the goal of
learning is to find the rule values that maximize the
likelihood of the training data. Thus, learning can amount
to counting in the case of multinomial distributions.
The rest of this paper is organized as follows. Section 2
gives an overview on the nurse scheduling problem, and
the following section 3 introduces the general concepts
about graphical models and Bayesian networks. Section 4
discuses the proposed Bayesian optimization algorithm,
describing the construction of a Bayesian network,
learning based on the Bayesian network, and the four
building rules in detail. Computational results using 52
data instances gathered from a UK hospital are presented
in section 5. Concluding remarks are in section 6.
2 The Nurse Scheduling Problem
2.1 General Problem
Our nurse scheduling problem is to create weekly
schedules for wards of nurses by assigning one of a
number of possible shift patterns to each nurse. These
schedules have to satisfy working contracts and meet the
demand for a given number of nurses of different grades
on each shift, while being seen to be fair by the staff
concerned. The latter objective is achieved by meeting as
many of the nurses’ requests as possible and considering
historical information to ensure that unsatisfied requests
and unpopular shifts are evenly distributed.
The problem is complicated by the fact that higher
qualified nurses can substitute less qualified nurses but
not vice versa. Thus scheduling the different grades
independently is not possible. Furthermore, the problem
has a special day-night structure as most of the nurses are
contracted to work either days or nights in one week but
not both. However due to working contracts, the number
of days worked is not usually the same as the number of
nights. Therefore, it becomes important to schedule the
‘correct’ nurses onto days and nights respectively. The
latter two characteristics make this problem challenging
for any local search algorithm, because finding and
maintaining feasible solutions is extremely difficult.
The numbers of days or nights to be worked by each
nurse defines the set of feasible weekly work patterns for
that nurse. These will be referred to as shift patterns or
shift pattern vectors in the following. For each nurse i and
each shift pattern j all the information concerning the
desirability of the pattern for this nurse is captured in a
single numeric preference cost p
ij
. These costs were
determined in close consultation with the hospital and are
a weighted sum of the following factors: basic shift-
pattern cost, general day/night preferences, specific
requests, continuity problems, number of successive
working day, rotating nights/weekends and other working
history information. Patterns that violate mandatory
contractual requirements are marked as infeasible for a
particular nurse and week by giving them a suitably high
p
ij
value.
2.2 Integer Programming
The problem can be formulated as an integer linear
program as follows.
Indices:
i = 1...n nurse index;
j = 1...m shift pattern index;
k = 1...14 day and night index (1...7 are days and 8...14
are nights);
s = 1...p grade index.
Decision variables:
=
else ,0
pattern shift works nurse 1, ji
x
ij
.
Parameters:
m = Number of shift patterns;
n = Number of nurses;
p = Number of grades;
=
else ,0
day/night covers pattern shift 1, kj
a
jk
;
=
else ,0
higheror grade of is nurse 1, si
q
is
;
p
ij
= Preference cost of nurse i working shift pattern j;
R
ks
= Demand of nurses with grade s on day/night k;
N
i
= Working shifts per week of nurse i if night shifts are
worked;
D
i
= Working shifts per week of nurse i if day shifts are
worked;
B
i
= Working shifts per week of nurse i if both day and
night shifts are worked {for special nurses};
F(i) = Set of feasible shift patterns for nurse i, where F(i)
is defined as

.,
shifts combined ,
shiftsnight ,
shiftsday ,
)(
14
1
14
8
7
1
i
jBa
jNa
jDa
iF
k
ijk
i
k
jk
i
k
jk
=
=
=
=
=
=
=
(1)
Target function:
Minimize total preference cost of all nurses, denoted as
min!
1 )(
=
n
i
m
iFj
ijij
xp
. (2)
Subject to:
1. Every nurse works exactly one feasible shift pattern:
ix
iFj
ij
=
,1
)(
; (3)
2. The demand for nurses is fulfilled for every grade on
every day and night:
=
)( 1
,,
iFj
n
i
ksijjkis
skRxaq
(4)
Constraint set (3) ensures that every nurse works
exactly one shift pattern from his/her feasible set, and
constraint set (4) ensures that the demand for nurses is
covered for every grade on every day and night. Note that
the definition of q
is
is such that higher graded nurses can
substitute those at lower grades if necessary.
Typical problem dimensions are 30 nurses of three
grades and 400 shift patterns. Thus, the Integer
Programming formulation has about 12000 binary
variables and 100 constraints. This is a moderately sized
problem. However, some problem cases remain unsolved
after overnight computation using professional software.
3 Graphical Models and Bayesian Networks
In this section, we introduce concepts from graphical
models in general and Bayesian networks in particular.
Section 4 will then explain how we applied these concepts
to our nurse scheduling problem.
Graphical models are graphs in which nodes represent
random variables, and the lack of edges represents
conditional independence assumptions (Edwards 2000).
They have important applications in many multivariate
probabilistic systems in fields such as statistics, systems
engineering, information theory and pattern recognition.
In particular, they are playing an increasingly important
role in the design and analysis of machine learning
algorithms.
As described by Jordon (1999), graphical models are a
marriage between probability theory and graph theory.
They provide a natural tool for dealing with uncertainty
and complexity that occur throughout applied
mathematics and engineering. In a graphical model, the
fundamental notion of modularity is used to build a
complex system by combining simpler parts. Probability
theory provides the glue to combine the parts, ensuring
that the whole system is consistent, and providing ways to
interface models to data. The graph theory provides an
intuitively appealing interface by which humans can
model highly interacting sets of variables, and a data
structure that leads itself naturally to the design of
general-purpose algorithms.
There are two main kinds of graphical models:
undirected and directed. Undirected graphical models are
more popular with the physics and vision communities.
Directed graphical model, also called Bayesian networks,
are more popular with the artificial intelligence and
machine learning communities. Bayesian networks are
often used to model multinomial data with both discrete
and continuous variables by encoding the relationship
between the variables contained in the modelled data,
which represents the structure of a problem.
Moreover, Bayesian networks can be used to generate
new instances of the variables with similar properties as
those of given data. Each node in the network corresponds
to one variable, and each variable corresponds to one
position in the strings representing the solutions. The
relationship between two variables is represented by a
directed edge between the two corresponding nodes.
Any complete probabilistic model of a domain must
represent the joint distribution, the probability of every
possible event as defined by the values of all the variables.
The number of such events is exponential. To achieve
compactness, Bayesian networks factor the joint
distribution into local conditional distributions for each
variable given its parents.
Mathematically, an acyclic Bayesian network encodes
a full joint probability distribution by the product
))(|(),...,(
1
1
=
=
n
i
iin
XpaxPxxP
, (5)
where x
i
denotes some values of the variable X
i
, pa(X
i
)
denotes a set of values for parents of X
i
in the network
(the set of nodes from which there exists an individual
edge to X
i
), and P(x
i
| pa(X
i
)) denotes the conditional
probability of X
i
conditioned on variables pa(X
i
). This
distribution can be used to generate new instances using
the marginal and conditional probabilities.
4 A Bayesian Optimization Algorithm for
Nurse Scheduling
This section discusses the proposed Bayesian optimization
algorithm for the nurse scheduling problem, including the
construction of a Bayesian network, learning based on the
Bayesian network and the four building rules used.
4.1 The Construction of a Bayesian Network
In our nurse scheduling problem, the number of the nurse
is fixed (up to 30), and the target is to create a weekly
schedule by assigning each nurse one shift pattern in the
most efficient way. The proposed approach achieves this
by using one suitable rule, from a rule set that contains a
number of available rules, for each nurse’s assignment.
Thus, a potential solution is represented as a rule string, or

a sequence of rules corresponding to nurses from the first
one to the last one individually.
We chose this approach, as the longer-term aim of our
research is to model the explicit learning of a human
scheduler. Human schedulers can provide high quality
solutions, but the task is tedious and often requires a large
amount of time. Typically, they construct schedules based
on rules learnt during scheduling. Due to human
limitations, these rules are typically simple. Hence, our
rules will be relatively simple, too. Nevertheless, human
generated schedules are of high quality due to the ability
of the scheduler to switch between the rules, based on the
state of the current solution. We envisage the Bayesian
optimisation algorithm to perform this role.
Figure 1: A Bayesian network for nurse scheduling
Figure 1 is the Bayesian network constructed for the
nurse scheduling problem, which is a hierarchical and
acyclic directed graph representing the solution structure
of the problem.
The node
}),...,2,1{};,...,2,1{( njmiN
ij
in the network
denotes that nurse i is assigned using rule j, where m is the
number of nurses to be scheduled and n is the number of
rules to be used in the building process. The directed edge
from node N
ij
to node N
i+1,j’
denotes a causal relationship
of “N
ij
causing N
i+1,j’
”. In our particular implementation,
an edge denotes a construction unit (or rule sub-string) for
nurse i where the previous rule is j and the current rule is
j’. In this network, a possible solution (a complete rule
string) is represented as a directed path from nurse 1 to
nurse m connecting m nodes.
4.2 Learning based on the Bayesian Network
According to whether the structure (topology) of the
model is known or unknown, and whether all variables are
fully observed or some of them are hidden, there are four
kinds of learning (Heckerman 1998). According to
Heckerman, the learning process for the proposed
approach belongs to the category of “known structure and
full observation,” and the learning goal is to find the
variable values of all nodes N
ij
that maximize the
likelihood of the training date containing T independent
cases.
In the proposed approach, learning amounts to
counting and hence we use the symbol ‘#’ meaning ‘the
number of’ in the following equations. It calculates the
conditional probabilities of each possible value for each
node given all possible values of its parents. For example,
for node N
i+1,j’
with a parent node N
ij
, its conditional
probability is
),(#),(#
),(#
)(
),(
)|(
,1,1
,1
,1
,1
trueNfalseNtrueNtrueN
trueNtrueN
NP
NNP
NNP
ijjiijji
ijji
ij
ijji
ijji
==+==
==
=
=
++
+
+
+
.(6)
Note that nodes N
1j
have no parents. In this
circumstance, their probabilities are computed as
T
trueN
falseNtrueN
trueN
NP
j
jj
j
j
)(#
)(#)(#
)(#
)(
1
11
1
1
=
=
=+=
=
=
.(7)
These probability values can be used to generate new
rule strings, or new solutions. Since the first rule in a
solution has no parents, it will be chosen from nodes N
1j
according to their probabilities. The next rule will be
chosen from nodes N
ij
according to their probabilities
conditioned on the previous nodes. This building process
is repeated until the last node has been chosen from nodes
N
mj
, where m is number of the nurses. A link from nurse 1
to nurse m is thus created, representing a new possible
solution. Since all the probability values are normalized,
the roulette-wheel method is good strategy for rule
selection.
For clarity, consider the following toy example of
scheduling five nurses with two rules (1: random
allocation, 2: allocate nurse to low-cost shifts). In the
beginning of the search, the probabilities of choosing rule
1 or 2 for each nurse is equal, i.e. 50%. After a few
iterations, due to the selection pressure and reinforcement
learning, we experience two solution pathways: Because
pure low-cost or random allocation produces low quality
solutions, either rule 1 is used for the first 2-3 nurses and
rule 2 on remainder or vice versa. In essence, BOA learns
‘use rule 2 after 2-3x using rule 1’ or vice versa.
4.3 A Bayesian Optimization Algorithm
Based on the estimation of conditional probabilities, this
section introduces a Bayesian optimization algorithm for
the nurse scheduling problem. It uses techniques from the
field of modelling data by Bayesian networks to estimate
the joint distribution of promising solutions. The nodes, or
variables, in the Bayesian network correspond to the
individual rules by which a schedule will be built step by
step.
In the proposed Bayesian optimization algorithm, the
first population of rule strings is generated at random.
From the current population, a set of better rule strings is
. . .
. . .
. . .
. . .
. . .
. . .
. . . . . .
. . .
N
11
N
12
N
1,n
N
21
N
22
N
2,n
N
31
N
32
N
3,n
N
m-1,1
N
m-1,2
N
m-1,n
N
m,1
N
m,2
N
m,n

selected. Any selection method biased towards better
fitness can be used, and in this paper, the traditional
roulette-wheel Selection is applied. The conditional
probabilities of each node in the Bayesian network are
computed. New rule strings are generated by using these
conditional probability values, and are added into the old
population, replacing some of the old rule strings. In more
detail, the steps of the Bayesian optimization algorithm
for nurse scheduling are:
1. Set t = 0, and generate an initial population P(0) at
random;
2. Use roulette-wheel to select a set of promising
rule strings S(t) from P(t);
3. Compute the conditional probabilities of each
node according to this set of promising solutions ;
4. For the assignment of each nurse, the roulette-
wheel method is used to select one rule according
to the conditional probabilities of all available
nodes, thus obtaining a new rule string. A set of
new rule strings O(t) will be generated in this
way;
5. Create a new population P(t+1) by replacing some
rule strings from P(t) with O(t), and set t = t+1;
6. If the termination conditions are not met (we use
2000 generations), go to step 2.
4.4 Four Building Rules
Similar to the working pattern of a human scheduler, the
proposed schedule-constructing process uses a set of rules
to build a schedule step by step. As far as the domain
knowledge of nurse scheduling is concerned, the
following four rules are currently investigated.
4.4.1 Random Rule
The first rule, called ‘Random’ rule, is used to select a
nurse’s shift pattern at random. Its purpose is to introduce
randomness into the search thus enlarging the search
space, and most importantly to ensure that the proposed
algorithm has the ability to escape from local optimum.
This rule mirrors much of a scheduler’s creativeness to
come up with different solutions if required.
4.4.2 k-Cheapest Rule
The second rule is thek-Cheapest’ rule. Disregarding the
feasibility of the schedule, it randomly selects a shift
pattern from a k-length list containing patterns with k-
cheapest cost p
ij
, in an effort to reduce the total cost of a
schedule as more as possible.
4.4.3 Cover Rule
Compared with the first two rules, the ‘Cover’ rule and
last 'Contribution’ rule are relatively more complicated.
The third ‘Cover’ rule is designed to consider only the
feasibility of the schedule. It schedules one nurse at a time
in such a way as to cover those days and nights with the
highest number of uncovered shifts.
The ‘Cover’ rule constructs solutions as follows. For
each shift pattern in a nurse’s feasible set, calculate the
total number of uncovered shifts and would be covered if
the nurse worked that shift pattern. For simplicity, this
calculation does not take into account how many nurses
are still required in a particular shift. For instance, assume
that a shift pattern covers Monday to Friday nights.
Further assume that the current requirements for the
nights from Monday to Sunday are as follows: (-3, 0, +1, -
2, -1, -2, 0), where a negative number means undercover
and a positive over cover. The Monday to Friday shift
pattern hence has a cover value of 3, as the most negative
value it covers is -3. In this example, a Tuesday to
Saturday pattern would have a value of 2.
In order to ensure that high-grade nurses are not
‘wasted’ covering unnecessarily for nurses of lower
grades, for nurses of grade s, only the shifts requiring
grade s nurses are counted as long as there is a single
uncovered shift for this grade. If all these are covered,
shifts of the next lower grade are considered and once
these are filled those of the next lower grade. Due to the
nature of this approach, nurses’ preference costs p
ij
are not
taken into account by this rule. However, they will
influence decisions indirectly via the fitness function.
Hence, the ‘Cover’ rule can be summarised as finding
those shift patterns with corresponding largest amount of
undercover.
4.4.4 Contribution Rule
The fourth rule, called ‘Contribution’ rule, is biased
towards solution quality but includes some aspects of
feasibility by computing an overall score for each feasible
pattern for the nurse currently being scheduled.
The ‘Contribution’ rule is designed to take into account
the nurses’ preferences. It therefore works with shift
patterns rather than individual shifts. It also takes into
account some of the covering constraints in which it gives
preference to patterns that cover shifts that have not yet
been allocated sufficient nurses to meet their total
requirements. This is achieved by going through the entire
set of feasible shift patterns for a nurse and assigning each
one a score. The one with the highest (i.e. best) score is
chosen. If there is more than one shift pattern with the
best score, the first such shift pattern is chosen.
The score of a shift pattern is calculated as the
weighted sum of the nurse’s p
ij
value for that particular
shift pattern and its contribution to the cover of all three
grades. The latter is measured as a weighted sum of grade
one, two and three uncovered shifts that would be covered
if the nurse worked this shift pattern, i.e. the reduction in
shortfall. Obviously, nurses can only contribute to
uncovered demand of their own grade or below. More
precisely and using the same notation as before, the score
p
ij
of shift pattern j for nurse i is calculated with the
following parameters:
d
ks
= 1 if there are still nurses needed on day k of
grade s otherwise d
ks
= 0;
a
jk
= 1 if shift pattern j covers day k otherwise a
jk
= 0;
w
s
is the weight of covering an uncovered shift of
grade s;
w
p
is the weight of the nurse’s p
ij
value for the shift
pattern.
Finally, (100- p
ij
) must be used in the score, as higher p
ij
values are worse and the maximum for p
ij
is 100. Note
that (- w
p
p
ij
) could also have been used, but would have
led to some scores being negative. Thus, the scores are
calculated as follows:

Citations
More filters
Posted Content
TL;DR: In this article, the authors present a staff preference optimization framework which is solved with a greedy double swap heuristic, which yields good performance in speed at solving the problem and can demonstrate its performance by implementing it on open source spreadsheet software.
Abstract: One of the key challenges of nurse scheduling problem (NSP) is the number of constraints placed on preparing the timetable, both from the regulatory requirements as well as the patients' demand for the appropriate nursing care specialists. In addition, the preferences of the nursing staffs related to their work schedules add another dimension of complexity. Most solutions proposed for solving nurse scheduling involve the use of mathematical programming and generally considers only the hard constraints. However, the psychological needs of the nurses are ignored and this resulted in subsequent interventions by the nursing staffs to remedy any deficiency and often results in last minute changes to the schedule. In this paper, we present a staff preference optimization framework which is solved with a greedy double swap heuristic. The heuristic yields good performance in speed at solving the problem. The heuristic is simple and we will demonstrate its performance by implementing it on open source spreadsheet software.
References
More filters
Book
01 Jan 2002

17,039 citations

Book
01 Jan 1988
TL;DR: Probabilistic Reasoning in Intelligent Systems as mentioned in this paper is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty, and provides a coherent explication of probability as a language for reasoning with partial belief.
Abstract: From the Publisher: Probabilistic Reasoning in Intelligent Systems is a complete andaccessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty. The author provides a coherent explication of probability as a language for reasoning with partial belief and offers a unifying perspective on other AI approaches to uncertainty, such as the Dempster-Shafer formalism, truth maintenance systems, and nonmonotonic logic. The author distinguishes syntactic and semantic approaches to uncertainty—and offers techniques, based on belief networks, that provide a mechanism for making semantics-based systems operational. Specifically, network-propagation techniques serve as a mechanism for combining the theoretical coherence of probability theory with modern demands of reasoning-systems technology: modular declarative inputs, conceptually meaningful inferences, and parallel distributed computation. Application areas include diagnosis, forecasting, image interpretation, multi-sensor fusion, decision support systems, plan recognition, planning, speech recognition—in short, almost every task requiring that conclusions be drawn from uncertain clues and incomplete information. Probabilistic Reasoning in Intelligent Systems will be of special interest to scholars and researchers in AI, decision theory, statistics, logic, philosophy, cognitive psychology, and the management sciences. Professionals in the areas of knowledge-based systems, operations research, engineering, and statistics will find theoretical and computational tools of immediate practical use. The book can also be used as an excellent text for graduate-level courses in AI, operations research, or applied probability.

15,671 citations

BookDOI
01 Feb 1999
TL;DR: This paper presents an introduction to inference for Bayesian networks and a view of the EM algorithm that justifies incremental, sparse and other variants, as well as an information-theoretic analysis of hard and soft assignment methods for clustering.
Abstract: Part 1 Inference: introduction to inference for Bayesian networks, Robert Cowell advanced inference in Bayesian networks, Robert Cowell inference in Bayesian networks using nested junction trees, Uffe Kjoerulff bucket elimination - a unifying framework for probabilistic inference, R. Dechter an introduction to variational methods for graphical models, Michael I. Jordan et al improving the mean field approximation via the use of mixture distributions, Tommi S. Jaakkola and Michael I. Jordan introduction to Monte Carlo methods, D.J.C. MacKay suppressing random walls in Markov chain Monte Carlo using ordered overrelaxation, Radford M. Neal. Part 2 Independence: chain graphs and symmetric associations, Thomas S. Richardson the multiinformation function as a tool for measuring stochastic dependence, M. Studeny and J. Vejnarova. Part 3 Foundations for learning: a tutorial on learning with Bayesian networks, David Heckerman a view of the EM algorithm that justifies incremental, sparse and other variants, Radford M. Neal and Geoffrey E. Hinton. Part 4 Learning from data: latent variable models, Christopher M. Bishop stochastic algorithms for exploratory data analysis - data clustering and data visualization, Joachim M. Buhmann learning Bayesian networks with local structure, Nir Friedman and Moises Goldszmidt asymptotic model selection for directed networks with hidden variables, Dan Geiger et al a hierarchical community of experts, Geoffrey E. Hinton et al an information-theoretic analysis of hard and soft assignment methods for clustering, Michael J. Kearns et al learning hybrid Bayesian networks from data, Stefano Monti and Gregory F. Cooper a mean field learning algorithm for unsupervised neural networks, Lawrence Saul and Michael Jordan edge exclusion tests for graphical Gaussian models, Peter W.F. Smith and Joe Whittaker hepatitis B - a case study in MCMC, D.J. Spiegelhalter et al prediction with Gaussian processes - from linear regression to linear prediction and beyond, C.K.I. Williams.

1,885 citations

Frequently Asked Questions (1)
Q1. What are the contributions in "A bayesian optimization algorithm for the nurse scheduling problem" ?

A Bayesian optimization algorithm for the nurse scheduling problem is presented, which involves choosing a suitable scheduling rule from a set for each nurse ’ s assignment. The conditional probability of each variable in the network is computed according to an initial set of promising solutions. If stopping conditions are not met, the conditional probabilities for all nodes in the Bayesian network are updated again using the current set of promising rule strings. It is also suggested that the learning mechanism in the proposed approach might be suitable for other scheduling problems.