Particle Swarm Optimization

James Kennedy¹ and Russell Eberhart²

¹Washington, DC 20212
kennedyjim@bls.gov

²Purdue School of Engineering and Technology
Indianapolis, IN 46202-5160
eberhart@engr.iupui.edu
ABSTRACT

A concept for the optimization of nonlinear functions using particle swarm methodology is introduced. The evolution of several paradigms is outlined, and an implementation of one of the paradigms is discussed. Benchmark testing of the paradigm is described, and applications, including nonlinear function optimization and neural network training, are proposed. The relationships between particle swarm optimization and both artificial life and genetic algorithms are described.
1 INTRODUCTION
This paper introduces a method for optimization of continuous nonlinear functions. The method was discovered through simulation of a simplified social model; thus the social metaphor is discussed, though the algorithm stands without metaphorical support. This paper describes the particle swarm optimization concept in terms of its precursors, briefly reviewing the stages of its development from social simulation to optimizer. Discussed next are a few paradigms that implement the concept. Finally, the implementation of one paradigm is discussed in more detail, followed by results obtained from applications and tests upon which the paradigm has been shown to perform successfully.

Particle swarm optimization has roots in two main component methodologies. Perhaps more obvious are its ties to artificial life (A-life) in general, and to bird flocking, fish schooling, and swarming theory in particular. It is also related, however, to evolutionary computation, and has ties to both genetic algorithms and evolutionary programming. These relationships are briefly reviewed in the paper.

Particle swarm optimization as developed by the authors comprises a very simple concept, and paradigms can be implemented in a few lines of computer code. It requires only primitive mathematical operators, and is computationally inexpensive in terms of both memory requirements and speed. Early testing has found the implementation to be effective with several kinds of problems. This paper discusses application of the algorithm to the training of artificial neural network weights. Particle swarm optimization has also been demonstrated to perform well on genetic algorithm test functions. This paper discusses the performance on Schaffer's f6 function, as described in Davis [1].
2 SIMULATING SOCIAL BEHAVIOR
A number of scientists have created computer simulations of various interpretations of the movement of organisms in a bird flock or fish school. Notably, Reynolds [8] and Heppner and Grenander [4] presented simulations of bird flocking. Reynolds was intrigued by the aesthetics of bird flocking choreography, and Heppner, a zoologist, was interested in discovering the underlying rules that enabled large numbers of birds to flock synchronously, often changing direction suddenly, scattering and regrouping, etc. Both of these scientists had the insight that local processes, such as those modeled by cellular automata, might underlie the unpredictable group dynamics of bird social behavior. Both models relied heavily on manipulation of inter-individual distances; that is, the synchrony of flocking behavior was thought to be a function of birds' efforts to maintain an optimum distance between themselves and their neighbors.

0-7803-2768-3/95/$4.00 © 1995 IEEE
It does not seem a too-large leap of logic to suppose that the same rules underlie animal social behavior, including herds, schools, and flocks, and that of humans. As sociobiologist E. O. Wilson [9] has written, in reference to fish schooling, "In theory at least, individual members of the school can profit from the discoveries and previous experience of all other members of the school during the search for food. This advantage can become decisive, outweighing the disadvantages of competition for food items, whenever the resource is unpredictably distributed in patches" (p. 209). This statement suggests that social sharing of information among conspecifics offers an evolutionary advantage; this hypothesis was fundamental to the development of particle swarm optimization.
One motive for developing the simulation was to model human social behavior, which is of course not identical to fish schooling or bird flocking. One important difference is its abstractness. Birds and fish adjust their physical movement to avoid predators, seek food and mates, optimize environmental parameters such as temperature, etc. Humans adjust not only physical movement but cognitive or experiential variables as well. We do not usually walk in step and turn in unison (though some fascinating research in human conformity shows that we are capable of it); rather, we tend to adjust our beliefs and attitudes to conform with those of our social peers.
This is a major distinction in terms of contriving a computer simulation, for at least one obvious reason: collision. Two individuals can hold identical attitudes and beliefs without banging together, but two birds cannot occupy the same position in space without colliding. It seems reasonable, in discussing human social behavior, to map the concept of change into the bird/fish analog of movement. This is consistent with the classic Aristotelian view of qualitative and quantitative change as types of movement. Thus, besides moving through three-dimensional physical space, and avoiding collisions, humans change in abstract multidimensional space, collision-free. Physical space of course affects informational inputs, but it is arguably a trivial component of psychological experience. Humans learn to avoid physical collision by an early age, but navigation of n-dimensional psychosocial space requires decades of practice, and many of us never seem to acquire quite all the skills we need!
3 PRECURSORS: THE ETIOLOGY OF PARTICLE SWARM OPTIMIZATION
The particle swarm optimizer is probably best presented by explaining its conceptual development. As mentioned above, the algorithm began as a simulation of a simplified social milieu. Agents were thought of as collision-proof birds, and the original intent was to graphically simulate the graceful but unpredictable choreography of a bird flock.
3.1 Nearest Neighbor Velocity Matching and Craziness
A satisfying simulation was rather quickly written, which relied on two props: nearest-neighbor velocity matching and "craziness." A population of birds was randomly initialized with a position for each on a torus pixel grid and with X and Y velocities. At each iteration a loop in the program determined, for each agent (a more appropriate term than bird), which other agent was its nearest neighbor, then assigned that agent's X and Y velocities to the agent in focus. Essentially this simple rule created a synchrony of movement.
Unfortunately, the flock quickly settled on a unanimous, unchanging direction. Therefore, a stochastic variable called craziness was introduced. At each iteration some change was added to randomly chosen X and Y velocities. This introduced enough variation into the system to give the simulation an interesting and "lifelike" appearance, though of course the variation was wholly artificial.
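The two props can be sketched in a few lines. The following is an illustrative reconstruction, not the authors' original code; the array layout, the uniform craziness distribution, and its magnitude are assumptions:

```python
import numpy as np

def step_flock(pos, vel, craziness=1.0, rng=np.random.default_rng(0)):
    """One iteration: each agent copies its nearest neighbor's velocity,
    then a random 'craziness' perturbation is added to every velocity."""
    # pairwise distances; an agent is never its own nearest neighbor
    d = np.linalg.norm(pos[:, None] - pos[None, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nearest = d.argmin(axis=1)
    vel = vel[nearest]                                     # velocity matching
    vel = vel + rng.uniform(-craziness, craziness, vel.shape)  # craziness
    return pos + vel, vel
```

With craziness set to zero, the flock collapses onto a shared direction exactly as described above; the random term is what keeps the motion lifelike.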
3.2 The Cornfield Vector
Heppner's bird simulations had a feature which introduced a dynamic force into the simulation. His birds flocked around a "roost," a position on the pixel screen that attracted them until they finally landed there. This eliminated the need for a variable like craziness, as the simulation took on a life of its own. While the idea of a roost was intriguing, it led to another question which seemed even more stimulating. Heppner's birds knew where their roost was, but in real life birds land on any tree or telephone wire that meets their immediate needs. Even more importantly, bird flocks land where there is food. How do they find food? Anyone who has ever put out a bird feeder knows that within hours a great number of birds will likely find it, even though they had no previous knowledge of its location, appearance, etc. It seems possible that something about the flock dynamic enables members of the flock to capitalize on one another's knowledge, as in Wilson's quote above.
The second variation of the simulation defined a "cornfield vector," a two-dimensional vector of XY coordinates on the pixel plane. Each agent was programmed to evaluate its present position in terms of the equation:

Eval = sqrt((presentx - 100)^2) + sqrt((presenty - 100)^2)

so that at the (100,100) position the value was zero.
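Written out directly (a sketch, using the paper's presentx/presenty names; note each square root of a square reduces to an absolute value):

```python
import math

def eval_cornfield(presentx, presenty):
    # zero exactly at the simulated cornfield (100, 100), growing with distance
    return math.sqrt((presentx - 100) ** 2) + math.sqrt((presenty - 100) ** 2)
```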
Each agent "remembered" the best value and the XY position which had resulted in that value. The value was called pbest[] and the positions pbestx[] and pbesty[] (brackets indicate that these are arrays, with number of elements = number of agents). As each agent moved through the pixel space evaluating positions, its X and Y velocities were adjusted in a simple manner. If it was to the right of its pbestx, then its X velocity (call it vx) was adjusted negatively by a random amount weighted by a parameter of the system: vx[] = vx[] - rand()*p_increment. If it was to the left of pbestx, rand()*p_increment was added to vx[]. Similarly, Y velocities vy[] were adjusted up and down, depending on whether the agent was above or below pbesty.
Secondly, each agent "knew" the globally best position that one member of the flock had found, and its value. This was accomplished by simply assigning the array index of the agent with the best value to a variable called gbest, so that pbestx[gbest] was the group's best X position, and pbesty[gbest] its best Y position, and this information was available to all flock members. Again, each member's vx[] and vy[] were adjusted as follows, where g_increment is a system parameter:

if presentx[] > pbestx[gbest] then vx[] = vx[] - rand()*g_increment
if presentx[] < pbestx[gbest] then vx[] = vx[] + rand()*g_increment
if presenty[] > pbesty[gbest] then vy[] = vy[] - rand()*g_increment
if presenty[] < pbesty[gbest] then vy[] = vy[] + rand()*g_increment
In the simulation, a circle marked the (100,100) position on the pixel field, and agents were represented as colored points. Thus an observer could watch the flocking agents circle around until they found the simulated cornfield. The results were surprising. With p_increment and g_increment set relatively high, the flock seemed to be sucked violently into the cornfield. In a very few iterations the entire flock, usually 15 to 30 individuals, was seen to be clustered within the tiny circle surrounding the goal. With p_increment and g_increment set low, the flock swirled around the goal, realistically approaching it, swinging out rhythmically with subgroups synchronized, and finally "landing" on the target.
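The pbest and gbest rules of this section combine into a single per-iteration update. The sketch below is an illustrative reconstruction; the evaluation function, increment values, and list-based layout are assumptions, not the authors' code:

```python
import random

def f(x, y):
    """Cornfield-style evaluation (sketch): zero at (100, 100)."""
    return abs(x - 100) + abs(y - 100)

def step(px, py, vx, vy, pbx, pby, pbest, gbest, p_inc=0.5, g_inc=0.5):
    """One iteration of the Section 3.2 rules; returns the new gbest index."""
    for i in range(len(px)):
        # nudge velocity back toward the agent's own best position...
        vx[i] += -random.random() * p_inc if px[i] > pbx[i] else random.random() * p_inc
        vy[i] += -random.random() * p_inc if py[i] > pby[i] else random.random() * p_inc
        # ...and toward the best position any flock member has found
        vx[i] += -random.random() * g_inc if px[i] > pbx[gbest] else random.random() * g_inc
        vy[i] += -random.random() * g_inc if py[i] > pby[gbest] else random.random() * g_inc
        px[i] += vx[i]
        py[i] += vy[i]
        val = f(px[i], py[i])
        if val < pbest[i]:  # remember each agent's personal best
            pbest[i], pbx[i], pby[i] = val, px[i], py[i]
    return min(range(len(px)), key=lambda i: pbest[i])
```

Iterating step() and feeding the returned index back in as gbest reproduces the flock-to-cornfield behavior described above.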

3.3 Eliminating Ancillary Variables
Once it was clear that the paradigm could optimize simple, two-dimensional, linear functions, it was important to identify the parts of the paradigm that are necessary for the task. For instance, the authors quickly found that the algorithm works just as well, and looks just as realistic, without craziness, so it was removed. Next it was shown that optimization actually occurs slightly faster when nearest-neighbor velocity matching is removed, though the visual effect is changed. The flock is now a swarm, but it is well able to find the cornfield.
The variables pbest and gbest and their increments are both necessary. Conceptually pbest resembles autobiographical memory, as each individual remembers its own experience (though only one fact about it), and the velocity adjustment associated with pbest has been called "simple nostalgia" in that the individual tends to return to the place that most satisfied it in the past. On the other hand, gbest is conceptually similar to publicized knowledge, or a group norm or standard, which individuals seek to attain. In the simulations, a high value of p_increment relative to g_increment results in excessive wandering of isolated individuals through the problem space, while the reverse (relatively high g_increment) results in the flock rushing prematurely toward local minima. Approximately equal values of the two increments seem to result in the most effective search of the problem domain.
3.4 Multidimensional Search
While the algorithm seems to impressively model a flock searching for a cornfield, most interesting optimization problems are neither linear nor two-dimensional. Since one of the authors' objectives is to model social behavior, which is multidimensional and collision-free, it seemed a simple step to change presentx and presenty (and of course vx[] and vy[]) from one-dimensional arrays to D x N matrices, where D is any number of dimensions and N is the number of agents.
Multidimensional experiments were performed, using a nonlinear, multidimensional problem: adjusting weights to train a feedforward multilayer perceptron neural network (NN). One of the authors' first experiments involved training weights for a three-layer NN solving the exclusive-or (XOR) problem. This problem requires two input and one output processing elements (PEs), plus some number of hidden PEs. Besides connections from the previous layer, the hidden and output PE layers each has a bias PE associated with it. Thus a 2,3,1 NN requires optimization of 13 parameters. This problem was approached by flying the agents through 13-dimensional space until an average sum-squared error per PE criterion was met. The algorithm performed very well on this problem. The thirteen-dimensional XOR network was trained, to an e < 0.05 criterion, in an average of 30.7 iterations with 20 agents. More complex NN architectures took longer, of course, but results, discussed in Section 5: Results and Early Applications, were still very good.
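A hedged sketch of this experiment: the 13 weights of a 2,3,1 network (with bias PEs) are treated as a particle's position, and the swarm is flown through weight space. The network layout, initialization range, iteration budget, and velocity clamp below are illustrative assumptions, and the velocity rule used is the simplified one given later in Section 3.6, not necessarily the exact variant used in the original experiment:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0.0, 1.0, 1.0, 0.0])  # XOR truth table

def xor_error(w):
    """Average squared output error of a 2,3,1 network; w has 13 entries."""
    W1 = w[:9].reshape(3, 3)   # 3 hidden PEs x (2 inputs + 1 bias)
    W2 = w[9:]                 # 1 output PE x (3 hidden + 1 bias)
    h = sigmoid(X @ W1[:, :2].T + W1[:, 2])
    y = sigmoid(h @ W2[:3] + W2[3])
    return float(np.mean((y - T) ** 2))

rng = np.random.default_rng(7)
n_agents, dim = 20, 13
x = rng.uniform(-1, 1, (n_agents, dim))   # particle positions = weight vectors
v = np.zeros((n_agents, dim))
pbx = x.copy()
pb = np.array([xor_error(w) for w in x])
for _ in range(1000):
    g = int(pb.argmin())                  # index of the group's best particle
    v += (2 * rng.random((n_agents, dim)) * (pbx - x)
          + 2 * rng.random((n_agents, dim)) * (pbx[g] - x))
    v = np.clip(v, -4, 4)                 # Vmax clamp (an assumption) bounds velocities
    x += v
    errs = np.array([xor_error(w) for w in x])
    better = errs < pb
    pb[better] = errs[better]
    pbx[better] = x[better]
```

After the loop, pb.min() holds the best average squared error found; with this setup the swarm typically drives it well below the random-initialization level.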
3.5 Acceleration by Distance
Though the algorithm worked well, there was something aesthetically displeasing and hard to understand about it. Velocity adjustments were based on a crude inequality test: if presentx > bestx, make it smaller; if presentx < bestx, make it bigger. Some experimentation revealed that further revising the algorithm made it easier to understand and improved its performance. Rather than simply testing the sign of the inequality, velocities were adjusted according to their difference, per dimension, from best locations:

vx[][] = vx[][] + rand()*p_increment*(pbestx[][] - presentx[][])

(Note the parameters vx and presentx have two sets of brackets because they are now matrices of agents by dimensions; increment and bestx could also have a g instead of p at their beginnings.)
3.6 Current Simplified Version
It was soon realized that there is no good way to guess whether p_increment or g_increment should be larger. Thus, these terms were also stripped out of the algorithm. The stochastic factor was multiplied by 2 to give it a mean of 1, so that agents would "overfly" the target about half the time. This version outperforms the previous versions. Further research will show whether there is an optimum value for the constant currently set at 2, whether the value should be evolved for each problem, or whether the value can be determined from some knowledge of a particular problem. The current simplified particle swarm optimizer now adjusts velocities by the following formula:

vx[][] = vx[][] + 2*rand()*(pbestx[][] - presentx[][]) + 2*rand()*(pbestx[][gbest] - presentx[][])
3.7 Other Experiments

Other variations on the algorithm were tried, but none seemed to improve on the current simplified version. For instance, it is apparent that the agent is propelled toward a weighted average of the two "best" points in the problem space. One version of the algorithm reduced the two terms to one, which was the point on each dimension midway between pbest and gbest positions. This version had an unfortunate tendency, however, to converge on that point whether it was an optimum or not. Apparently the two stochastic "kicks" are a necessary part of the process.
Another version considered using two types of agents, conceived as "explorers" and "settlers." Explorers used the inequality test, which tended to cause them to overrun the target by a large distance, while settlers used the difference term. The hypothesis was that explorers would extrapolate outside the "known" region of the problem domain, and the settlers would hill-climb or micro-explore regions that had been found to be good. Again, this method showed no improvement over the current simplified version. Occam's razor slashed again.
Another version that was tested removed the momentum of vx[][]. The new adjustment was:

vx[][] = 2*rand()*(pbestx[][] - presentx[][]) + 2*rand()*(pbestx[][gbest] - presentx[][])
This version, though simplified, turned out to be quite ineffective at finding global optima.
4 SWARMS AND PARTICLES
As was described in Section 3.3, it became obvious during the simplification of the paradigm that the behavior of the population of agents is now more like a swarm than a flock. The term swarm has a basis in the literature. In particular, the authors use the term in accordance with a paper by Millonas [6], who developed his models for applications in artificial life, and articulated five basic principles of swarm intelligence. First is the proximity principle: the population should be able to carry out simple space and time computations. Second is the quality principle: the population should be able to respond to quality factors in the environment. Third is the principle of diverse response: the population should not commit its activities along excessively narrow channels. Fourth is the principle of stability: the population should not change its mode of behavior every time the environment changes. Fifth is the principle of adaptability: the population must be able to change behavior mode when it is worth the computational price.

REFERENCES

Holland, J. H., Adaptation in Natural and Artificial Systems.
Eberhart, R. and Kennedy, J., "A new optimizer using particle swarm theory."
Reynolds, C. W., "Flocks, herds and schools: A distributed behavioral model."
Davis, L. (ed.), Handbook of Genetic Algorithms.