526 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 13, NO. 3, JUNE 2009
Differential Evolution Using a Neighborhood-Based
Mutation Operator
Swagatam Das, Ajith Abraham, Senior Member, IEEE, Uday K. Chakraborty, and Amit Konar, Member, IEEE
Abstract—Differential evolution (DE) is well known as a simple
and efficient scheme for global optimization over continuous
spaces. It has reportedly outperformed a few evolutionary algo-
rithms (EAs) and other search heuristics like the particle swarm
optimization (PSO) when tested over both benchmark and real-
world problems. DE, however, is not completely free from the
problems of slow and/or premature convergence. This paper
describes a family of improved variants of the DE/target-to-
best/1/bin scheme, which utilizes the concept of the neighborhood
of each population member. The idea of small neighborhoods,
defined over the index-graph of parameter vectors, draws inspi-
ration from the community of the PSO algorithms. The proposed
schemes balance the exploration and exploitation abilities of DE
without imposing serious additional burdens in terms of function
evaluations. They are shown to be statistically significantly better
than or at least comparable to several existing DE variants as
well as a few other significant evolutionary computing techniques
over a test suite of 24 benchmark functions. The paper also
investigates the applications of the new DE variants to two real-
life problems concerning parameter estimation for frequency
modulated sound waves and spread spectrum radar poly-phase
code design.
Index Terms—Differential evolution, evolutionary algorithms, meta-heuristics, numerical optimization, particle swarm optimization.
I. INTRODUCTION
DIFFERENTIAL EVOLUTION (DE), proposed by Storn and Price [1]–[3], is a simple yet powerful algorithm
for real parameter optimization. Recently, the DE algorithm
has become quite popular in the machine intelligence and
cybernetics communities. It has successfully been applied to
diverse domains of science and engineering, such as mechani-
cal engineering design [4], [5], signal processing [6], chemical
engineering [7], [8], machine intelligence, and pattern recog-
nition [9], [10]. It has been shown to perform better than the
genetic algorithm (GA) [11] or the particle swarm optimization
(PSO) [12] over several numerical benchmarks [13]. Many of
Manuscript received November 6, 2007; revised March 23, 2008 and July
11, 2008; accepted September 2, 2008. Current version published June 10,
2009.
S. Das and A. Konar are with the Department of Electronics and Telecommunication Engineering, Jadavpur University, Kolkata 700032, India (e-mail: swagatamdas19@yahoo.co.in; konaramit@yahoo.co.in).
A. Abraham is with the Center of Excellence for Quantifiable Quality of
Service, Norwegian University of Science and Technology, Trondheim, NO-
7491, Norway and Machine Intelligence Research Labs (MIR Labs), USA
(e-mail: ajith.abraham@ieee.org).
U. K. Chakraborty is with the Department of Math and Computer
Science, University of Missouri, St. Louis, MO 63121 USA (e-mail:
chakrabortyu@umsl.edu).
Digital Object Identifier 10.1109/TEVC.2008.2009457
the most recent developments in DE algorithm design and
applications can be found in [14]. Like other evolutionary
algorithms, two fundamental processes drive the evolution of a
DE population: the variation process, which enables exploring
different regions of the search space, and the selection process,
which ensures the exploitation of previous knowledge about
the fitness landscape.
Practical experience, however, shows that DE may occasion-
ally stop proceeding toward the global optimum even though
the population has not converged to a local optimum or any
other point [15]. Occasionally, even new individuals may enter
the population, but the algorithm does not progress by finding
any better solutions. This situation is usually referred to as
stagnation. DE also suffers from the problem of premature
convergence, where the population converges to some local
optima of a multimodal objective function, losing its diversity.
The probability of stagnation depends on how many different
potential trial solutions are available and also on their ca-
pability to enter into the population of the subsequent gen-
erations [15]. Like other evolutionary computing algorithms,
the performance of DE deteriorates with the growth of the
dimensionality of the search space as well. There exists a
good volume of works (a review of which can be found in
Section III), attempting to improve the convergence speed and
robustness (ability to produce similar results over repeated
runs) of DE by tuning the parameters like population size
NP, the scale factor F, and the crossover rate Cr.
In the present work, we propose a family of variants of
the DE/target-to-best/1 scheme [3, p.140], which was also
referred to as “Scheme DE2” in the first technical paper on
DE [1]. In some DE literature this algorithm is referred to as
DE/current-to-best/1 [16], [17]. To combine the exploration
and exploitation capabilities of DE, we propose a new hybrid
mutation scheme that utilizes an explorative and an exploitive
mutation operator, with an objective of balancing their effects.
The explorative mutation operator (referred to as the local
mutation model) has a greater possibility of locating the
minima of the objective function, but generally needs more
iterations (generations). On the other hand, the exploitative
mutation operator (called by us the global mutation model)
rapidly converges to a minimum of the objective function. In
this case there exists the danger of premature convergence to a
suboptimal solution. In the hybrid model we linearly combine
the two mutation operators using a new parameter, called the
weight factor. Four different schemes have been proposed and
investigated for adjusting the weight factor, with a view to alle-
viating user intervention and hand tuning as much as possible.
1089-778X/$25.00 © 2009 IEEE
Authorized licensed use limited to: IEEE Xplore. Downloaded on July 16, 2009 at 15:52 from IEEE Xplore. Restrictions apply.

Here we would like to mention that although a preliminary
version of this paper appeared as a conference paper in [18],
the present version has been considerably enhanced and it
differs in many aspects from [18]. It critically examines
the effects of the global and local neighborhoods on the
performance of DE and explores a few different ways of tuning
of the weight factor (see Section IV) used for unification
of the neighborhood models. In addition, it compares the
performance of the proposed approaches with several state-of-
the-art DE variants as well as other evolutionary algorithms
over a testbed of 24 well-known numerical benchmarks and
one real-world optimization problem in contrast to [18], which
uses only six benchmarks and provides limited comparison
results.
The remainder of this paper is organized as follows. In
Section II, we provide a brief outline of the DE family of
algorithms. Section III provides a short survey of previous
research on improving the performance of DE. Section IV
introduces the proposed family of variants of the DE/target-
to-best/1 algorithm. Experimental settings for the benchmarks
and simulation strategies are explained in Section V. Results
are presented and discussed in Section VI. Finally, conclusions
are drawn in Section VII.
II. DE ALGORITHM
Like any other evolutionary algorithm, DE starts with a population of NP D-dimensional parameter vectors representing the candidate solutions. We shall denote subsequent generations in DE by G = 0, 1, ..., G_max. Since the parameter vectors are likely to be changed over different generations, we may adopt the following notation for representing the ith vector of the population at the current generation:

X_{i,G} = [x_{1,i,G}, x_{2,i,G}, x_{3,i,G}, ..., x_{D,i,G}].  (1)
For each parameter of the problem, there may be a certain range within which the value of the parameter should lie for better search results. The initial population (at G = 0) should cover the entire search space as much as possible by uniformly randomizing individuals within the search space constrained by the prescribed minimum and maximum bounds X_min = {x_{1,min}, x_{2,min}, ..., x_{D,min}} and X_max = {x_{1,max}, x_{2,max}, ..., x_{D,max}}. Hence we may initialize the jth component of the ith vector as

x_{j,i,0} = x_{j,min} + rand_{i,j}(0, 1) · (x_{j,max} − x_{j,min})  (2)

where rand_{i,j}(0, 1) is a uniformly distributed random number lying between 0 and 1, instantiated independently for each component of the ith vector. The following steps are taken next: mutation, crossover, and selection (in that order), which are explained in the following subsections.
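As an illustration, the initialization of (2) might be sketched in NumPy as follows; the function name and array layout (one row per individual) are our own assumptions, not the authors' reference code:

```python
import numpy as np

def initialize_population(np_size, x_min, x_max, rng=None):
    """Uniformly randomize NP individuals within [x_min, x_max], as in (2)."""
    rng = np.random.default_rng() if rng is None else rng
    x_min = np.asarray(x_min, dtype=float)
    x_max = np.asarray(x_max, dtype=float)
    # x_{j,i,0} = x_{j,min} + rand_{i,j}(0,1) * (x_{j,max} - x_{j,min})
    return x_min + rng.random((np_size, x_min.size)) * (x_max - x_min)
```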
A. Mutation
After initialization, DE creates a donor vector V_{i,G} corresponding to each population member or target vector X_{i,G} in the current generation through mutation and sometimes using arithmetic recombination too. It is the method of creating this donor vector that differentiates one DE scheme from another. The five most frequently referred strategies, implemented in the public-domain DE codes for producing the donor vectors (available online at http://www.icsi.berkeley.edu/~storn/code.html), are listed below.
“DE/rand/1”:
V_{i,G} = X_{r_1^i,G} + F · (X_{r_2^i,G} − X_{r_3^i,G})  (3)

“DE/best/1”:
V_{i,G} = X_{best,G} + F · (X_{r_1^i,G} − X_{r_2^i,G})  (4)

“DE/target-to-best/1”:
V_{i,G} = X_{i,G} + F · (X_{best,G} − X_{i,G}) + F · (X_{r_1^i,G} − X_{r_2^i,G})  (5)

“DE/best/2”:
V_{i,G} = X_{best,G} + F · (X_{r_1^i,G} − X_{r_2^i,G}) + F · (X_{r_3^i,G} − X_{r_4^i,G})  (6)

“DE/rand/2”:
V_{i,G} = X_{r_1^i,G} + F · (X_{r_2^i,G} − X_{r_3^i,G}) + F · (X_{r_4^i,G} − X_{r_5^i,G}).  (7)
The indices r_1^i, r_2^i, r_3^i, r_4^i, and r_5^i are mutually exclusive integers randomly chosen from the range [1, NP], all different from the base index i. These indices are randomly generated once for each donor vector. The scaling factor F is a positive control parameter for scaling the difference vectors. X_{best,G} is the individual vector with the best fitness (i.e., the lowest objective function value for a minimization problem) in the population at generation G. Note that some of the strategies for creating the donor vector mutate recombinants; for example, (5) listed above basically mutates a two-vector recombinant: X_{i,G} + F · (X_{best,G} − X_{i,G}). The general convention used for naming the various mutation strategies is DE/x/y/z, where DE stands for differential evolution, x represents a string denoting the vector to be perturbed, y is the number of difference vectors considered for perturbation of x, and z stands for the type of crossover being used (exp: exponential; bin: binomial). The following section discusses the crossover step in DE.
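The five donor-vector schemes (3)-(7) can be sketched in Python as follows. The function signature and vectorized layout are illustrative assumptions on our part (in particular, passing the current best vector in explicitly), not the authors' implementation:

```python
import numpy as np

def mutate(pop, i, best, f, strategy, rng=None):
    """Create a donor vector for target i using one of the schemes (3)-(7).

    pop is an (NP, D) array with NP >= 6; best is X_{best,G}."""
    rng = np.random.default_rng() if rng is None else rng
    # mutually exclusive indices, all different from the base index i
    choices = [r for r in range(len(pop)) if r != i]
    r1, r2, r3, r4, r5 = rng.choice(choices, size=5, replace=False)
    if strategy == "DE/rand/1":            # (3)
        return pop[r1] + f * (pop[r2] - pop[r3])
    if strategy == "DE/best/1":            # (4)
        return best + f * (pop[r1] - pop[r2])
    if strategy == "DE/target-to-best/1":  # (5)
        return pop[i] + f * (best - pop[i]) + f * (pop[r1] - pop[r2])
    if strategy == "DE/best/2":            # (6)
        return best + f * (pop[r1] - pop[r2]) + f * (pop[r3] - pop[r4])
    if strategy == "DE/rand/2":            # (7)
        return pop[r1] + f * (pop[r2] - pop[r3]) + f * (pop[r4] - pop[r5])
    raise ValueError(strategy)
```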
B. Crossover
To increase the potential diversity of the population, a crossover operation comes into play after generating the donor vector through mutation. The DE family of algorithms can use two kinds of crossover schemes—exponential and binomial [1]–[3]. The donor vector exchanges its components with the target vector X_{i,G} under this operation to form the trial vector U_{i,G} = [u_{1,i,G}, u_{2,i,G}, u_{3,i,G}, ..., u_{D,i,G}]. In exponential crossover, we first choose an integer n randomly among the numbers [1, D]. This integer acts as a starting point in the target vector, from where the crossover or exchange of components with the donor vector starts. We also choose another integer L from the interval [1, D]; L denotes the number of components the donor vector actually contributes
[Fig. 1. Change of the trial vectors generated through the crossover operation described in (9) due to rotation of the coordinate system. The figure shows the target vector X_{i,G}, the donor vector V_{i,G}, and the potential trial vectors U_{1_i,G}, ..., U_{4_i,G} in the original (x1, x2) and rotated (x′1, x′2) coordinate frames.]
to the target. After a choice of n and L, the trial vector is obtained as

u_{j,i,G} = v_{j,i,G},  for j = ⟨n⟩_D, ⟨n + 1⟩_D, ..., ⟨n + L − 1⟩_D
u_{j,i,G} = x_{j,i,G},  for all other j ∈ [1, D]  (8)

where the angular brackets ⟨·⟩_D denote a modulo function with modulus D. The integer L is drawn from [1, D] according to the following pseudo-code:

L = 0;
DO
{
L = L + 1;
} WHILE ((rand(0, 1) < Cr) AND (L < D));

Cr is called the crossover rate and appears as a control parameter of DE just like F. Hence, in effect, probability (L ≥ υ) = (Cr)^{υ−1} for any υ > 0. For each donor vector, a new set of n and L must be chosen randomly as shown above.
On the other hand, binomial crossover is performed on each of the D variables whenever a randomly picked number between 0 and 1 is less than or equal to the Cr value. In this case, the number of parameters inherited from the donor has a (nearly) binomial distribution. The scheme may be outlined as

u_{j,i,G} = v_{j,i,G},  if (rand_{i,j}(0, 1) ≤ Cr or j = j_rand)
u_{j,i,G} = x_{j,i,G},  otherwise  (9)

where rand_{i,j}(0, 1) ∈ [0, 1] is a uniformly distributed random number, which is called anew for each jth component of the ith parameter vector, and j_rand ∈ [1, 2, ..., D] is a randomly chosen index, which ensures that U_{i,G} gets at least one component from V_{i,G}.
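The two crossover schemes (8) and (9) can be sketched as follows; the function names and NumPy layout are our own illustrative assumptions:

```python
import numpy as np

def binomial_crossover(target, donor, cr, rng=None):
    """Binomial crossover of (9); j_rand guarantees at least one donor
    component in the trial vector."""
    rng = np.random.default_rng() if rng is None else rng
    d = target.size
    mask = rng.random(d) <= cr
    mask[rng.integers(d)] = True          # j_rand
    return np.where(mask, donor, target)

def exponential_crossover(target, donor, cr, rng=None):
    """Exponential crossover of (8): copy L consecutive donor components
    (modulo D) starting at a random index n."""
    rng = np.random.default_rng() if rng is None else rng
    d = target.size
    n = rng.integers(d)
    L = 1
    while rng.random() < cr and L < d:    # P(L >= v) = Cr^(v-1)
        L += 1
    trial = target.copy()
    idx = (n + np.arange(L)) % d
    trial[idx] = donor[idx]
    return trial
```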
The crossover operation described in (9) is basically a discrete recombination [3]. Fig. 1 illustrates a two-dimensional example of recombining the parameters of two vectors X_{i,G} and V_{i,G} according to this crossover operator, where the potential trial vectors are generated at the corners of a rectangle. Note that V_{i,G} can itself be the trial vector (i.e., U_{i,G} = V_{i,G}) when Cr = 1. As can be seen from Fig. 1, discrete recombination is a rotationally variant operation. Rotation transforms the coordinates of both vectors and thus changes the shape of the rectangle as shown in Fig. 1. Consequently, the potential location of the trial vector moves from the possible set (U_{1_i,G}, U_{2_i,G}) to (U_{3_i,G}, U_{4_i,G}). To overcome this limitation, a new trial vector generation strategy, “DE/current-to-rand/1,” is proposed in [19], which replaces the crossover operator prescribed in (9) with the rotationally invariant arithmetic crossover operator to generate the trial vector U_{i,G} by linearly combining the target vector X_{i,G} and the corresponding donor vector V_{i,G} as follows:

U_{i,G} = X_{i,G} + K · (V_{i,G} − X_{i,G}).

Now incorporating (3) we have

U_{i,G} = X_{i,G} + K · (X_{r_1,G} + F · (X_{r_2,G} − X_{r_3,G}) − X_{i,G})

which further simplifies to

U_{i,G} = X_{i,G} + K · (X_{r_1,G} − X_{i,G}) + F′ · (X_{r_2,G} − X_{r_3,G})  (10)

where K is the combination coefficient, which has been shown [19] to be effective when it is chosen with a uniform random distribution from [0, 1], and F′ = K · F is a new constant here.
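The arithmetic recombination of (10) could be sketched as below; the function name and signature are hypothetical, and K is passed in explicitly rather than sampled from [0, 1]:

```python
import numpy as np

def current_to_rand_1(pop, i, k, f, rng=None):
    """DE/current-to-rand/1 trial vector of (10): rotationally invariant
    arithmetic recombination of the target with a DE/rand/1 donor."""
    rng = np.random.default_rng() if rng is None else rng
    r1, r2, r3 = rng.choice([r for r in range(len(pop)) if r != i],
                            size=3, replace=False)
    f_prime = k * f   # F' = K * F
    return pop[i] + k * (pop[r1] - pop[i]) + f_prime * (pop[r2] - pop[r3])
```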
C. Selection
To keep the population size constant over subsequent generations, the next step of the algorithm calls for selection to determine whether the target or the trial vector survives to the next generation, i.e., at G = G + 1. The selection operation is described as

X_{i,G+1} = U_{i,G},  if f(U_{i,G}) ≤ f(X_{i,G})
X_{i,G+1} = X_{i,G},  if f(U_{i,G}) > f(X_{i,G})  (11)

where f(X) is the function to be minimized. So if the new trial vector yields an equal or lower value of the objective function, it replaces the corresponding target vector in the next generation; otherwise the target is retained in the population. Hence the population either gets better (with respect to the minimization of the objective function) or remains the same in fitness status, but never deteriorates. The complete pseudo-code of DE is given below:
1) Pseudo-Code for the DE Algorithm Family:
Step 1. Set the generation number G = 0 and randomly initialize a population of NP individuals P_G = {X_{1,G}, ..., X_{NP,G}}, with X_{i,G} = [x_{1,i,G}, x_{2,i,G}, x_{3,i,G}, ..., x_{D,i,G}] and i = 1, 2, ..., NP, each individual uniformly distributed in the range [X_min, X_max], where X_min = {x_{1,min}, x_{2,min}, ..., x_{D,min}} and X_max = {x_{1,max}, x_{2,max}, ..., x_{D,max}}.
Step 2. WHILE the stopping criterion is not satisfied
  DO
  FOR i = 1 to NP  // do for each individual sequentially
    Step 2.1 Mutation Step
      Generate a donor vector V_{i,G} = {v_{1,i,G}, ..., v_{D,i,G}} corresponding to the ith target vector X_{i,G} via one of the different mutation schemes of DE [(3) to (7)].
    Step 2.2 Crossover Step
      Generate a trial vector U_{i,G} = {u_{1,i,G}, ..., u_{D,i,G}} for the ith target vector X_{i,G} through binomial crossover (9), exponential crossover (8), or arithmetic crossover (10).
    Step 2.3 Selection Step
      Evaluate the trial vector U_{i,G}.
      IF f(U_{i,G}) ≤ f(X_{i,G})
        THEN X_{i,G+1} = U_{i,G}, f(X_{i,G+1}) = f(U_{i,G})
        IF f(U_{i,G}) < f(X_{best,G})
          THEN X_{best,G} = U_{i,G}, f(X_{best,G}) = f(U_{i,G})
        END IF
      ELSE X_{i,G+1} = X_{i,G}, f(X_{i,G+1}) = f(X_{i,G})
      END IF
  END FOR
  Step 2.4 Increase the Generation Count G = G + 1
END WHILE
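The steps above can be sketched as a minimal DE/rand/1/bin loop in Python. This is a compact illustration under our own naming assumptions, not the authors' reference implementation:

```python
import numpy as np

def de_rand_1_bin(f, x_min, x_max, np_size=50, f_scale=0.8, cr=0.9,
                  g_max=200, rng=None):
    """Minimal DE/rand/1/bin: init (2), mutation (3), binomial
    crossover (9), greedy selection (11)."""
    rng = np.random.default_rng() if rng is None else rng
    x_min, x_max = np.asarray(x_min, float), np.asarray(x_max, float)
    d = x_min.size
    pop = x_min + rng.random((np_size, d)) * (x_max - x_min)   # (2)
    fit = np.array([f(x) for x in pop])
    for _ in range(g_max):
        for i in range(np_size):
            r1, r2, r3 = rng.choice([r for r in range(np_size) if r != i],
                                    size=3, replace=False)
            donor = pop[r1] + f_scale * (pop[r2] - pop[r3])    # (3)
            mask = rng.random(d) <= cr                         # (9)
            mask[rng.integers(d)] = True                       # j_rand
            trial = np.where(mask, donor, pop[i])
            f_trial = f(trial)
            if f_trial <= fit[i]:                              # (11)
                pop[i], fit[i] = trial, f_trial
    best = np.argmin(fit)
    return pop[best], fit[best]
```

For example, minimizing the 5-D sphere function f(x) = Σ x_j² over [−5, 5]^5 with this sketch drives the best objective value close to zero within a few thousand function evaluations.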
III. A REVIEW OF PREVIOUS WORK ON IMPROVING THE DE ALGORITHM
Over the past few years researchers have been investigating ways of improving the ultimate performance of the DE algorithm by tuning its control parameters. Storn and Price in [1] have indicated that a reasonable value for NP could be between 5D and 10D (D being the dimensionality of the problem), and a good initial choice of F could be 0.5. The effective value of F usually lies in the range [0.4, 1].
Gamperle et al. [20] evaluated different parameter settings
for DE on the Sphere, Rosenbrock’s, and Rastrigin’s functions.
Their experimental results revealed that the global optimum
searching capability and the convergence speed are very sen-
sitive to the choice of control parameters NP, F, and Cr .
Furthermore, a plausible choice of the population size NP is
between 3D and 8D, with the scaling factor F = 0.6 and the
crossover rate Cr in [0.3, 0.9]. Recently, the authors in [16] claim that typically 0.4 < F < 0.95, with F = 0.9 being a good first choice. Cr typically lies in (0, 0.2) when the function is separable, and in (0.9, 1) when the function's parameters are dependent.
As can be seen from the literature, several claims and
counterclaims were reported concerning the rules for choosing
the control parameters, confusing engineers who try to solve
real-world optimization problems with DE. Further, many of
these claims lack sufficient experimental justification. There-
fore researchers consider techniques such as self-adaptation to
avoid manual tuning of the parameters of DE. Usually self-
adaptation is applied to tune the control parameters F and
Cr. Liu and Lampinen introduced fuzzy adaptive differential
evolution (FADE) [21] using fuzzy logic controllers, whose
inputs incorporate the relative function values and individuals
of successive generations to adapt the parameters for the
mutation and crossover operation. Based on the experimental
results over a set of benchmark functions, the FADE algorithm
outperformed the conventional DE algorithm. In this context,
Qin et al. proposed a self-adaptive DE (SaDE) [22] algorithm,
in which both the trial vector generation strategies and their
associated parameters are gradually self-adapted by learn-
ing from their previous experiences of generating promising
solutions.
Zaharie proposed a parameter adaptation strategy for DE (ADE) based on the idea of controlling the population diversity, and implemented a multipopulation approach [23]. Following the same line of thinking, Zaharie and Petcu designed an adaptive Pareto DE algorithm for multiobjective optimization and also analyzed its parallel implementation [24]. Abbass [25] self-adapted the crossover rate Cr for multiobjective optimization problems by encoding the value of Cr into each individual and simultaneously evolving it with other search variables. The scaling factor F was generated for each variable from a Gaussian distribution N(0, 1).
The authors of [26] introduced a self-adaptive scaling factor parameter F. They generated the value of Cr for each individual from a normal distribution N(0.5, 0.15). This approach (called SDE) was tested on four benchmark functions and performed better than other versions of DE. Besides adapting the control parameters F or Cr, some researchers also adapted the population size. Teo proposed DE with self-adapting populations (DESAP) [27], based on Abbass's self-adaptive Pareto DE [25]. Recently, the authors of [28] encoded the control parameters F and Cr into the individual and evolved their values by using two new probabilities τ1 and τ2. In their algorithm (called SADE), a set of F values was assigned to each individual in the population. With probability τ1, F is reinitialized to a new random value in the range [0.1, 1.0]; otherwise it is kept unchanged. The control parameter Cr, assigned to each individual, is adapted in an identical fashion, but with a different reinitialization range [0, 1] and with the probability τ2: with probability τ2, Cr takes a random value in [0, 1]; otherwise it retains its earlier value in the next generation.
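The SADE-style parameter update of [28] might be sketched as below. The per-individual update rule follows the description above; the function name, and the default τ1 = τ2 = 0.1 (commonly used settings for this scheme), are our assumptions:

```python
import numpy as np

def sade_update_params(f_i, cr_i, tau1=0.1, tau2=0.1, rng=None):
    """With probability tau1, reinitialize F in [0.1, 1.0]; with
    probability tau2, reinitialize Cr in [0, 1]; else keep old values."""
    rng = np.random.default_rng() if rng is None else rng
    if rng.random() < tau1:
        f_i = 0.1 + rng.random() * 0.9
    if rng.random() < tau2:
        cr_i = rng.random()
    return f_i, cr_i
```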
The authors of [29] introduced two schemes for adapting the scale factor
F in DE. In the first scheme (called DERSF: DE with random
scale factor) they varied F randomly between 0.5 and 1.0 in
successive iterations. They suggested decreasing F linearly
from 1.0 to 0.5 in their second scheme (called DETVSF: DE
with time varying scale factor). This encourages the individ-
uals to sample diverse zones of the search space during the
early stages of the search. During the later stages, a decaying
scale factor helps to adjust the movements of trial solutions
finely so that they can explore the interior of a relatively small
space in which the suspected global optimum lies.
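The linearly decaying scale factor of the DETVSF scheme reduces to a one-line helper; the function name and default bounds (taken from the 1.0-to-0.5 range stated above) are illustrative:

```python
def detvsf_scale_factor(iteration, max_iter, f_max=1.0, f_min=0.5):
    """Time-varying scale factor: decreases linearly from f_max at the
    start of the run to f_min at the end."""
    return f_max - (f_max - f_min) * (iteration / max_iter)
```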
DE/rand/1/either-or is a state-of-the-art DE variant described by Price et al. [3, p. 118]. In this algorithm, the trial vectors that are pure mutants occur with a probability p_F, and those that are pure recombinants occur with a probability 1 − p_F. The scheme for trial vector generation may be outlined as

U_{i,G} = X_{r_1^i,G} + F · (X_{r_2^i,G} − X_{r_3^i,G}),  if rand_i(0, 1) < p_F
U_{i,G} = X_{r_1^i,G} + K · (X_{r_2^i,G} + X_{r_3^i,G} − 2 · X_{r_1^i,G}),  otherwise  (12)

where, according to Price et al., K = 0.5 · (F + 1) serves as a good choice of the parameter K for a given F.
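The either-or rule of (12) could be sketched as follows, with K derived from F as recommended above; the function name and signature are hypothetical:

```python
import numpy as np

def either_or_trial(pop, i, f, p_f, rng=None):
    """DE/rand/1/either-or trial generation of (12): a pure mutant with
    probability p_f, else a pure three-vector recombinant."""
    rng = np.random.default_rng() if rng is None else rng
    r1, r2, r3 = rng.choice([r for r in range(len(pop)) if r != i],
                            size=3, replace=False)
    if rng.random() < p_f:                       # pure mutant
        return pop[r1] + f * (pop[r2] - pop[r3])
    k = 0.5 * (f + 1.0)                          # pure recombinant
    return pop[r1] + k * (pop[r2] + pop[r3] - 2.0 * pop[r1])
```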
Rahnamayan et al. have proposed an opposition-based DE
(ODE) [30] that is specially suited for noisy optimization
problems. The conventional DE algorithm was enhanced by
utilizing the opposition number-based optimization concept
in three levels, namely, population initialization, generation
jumping, and local improvement of the population’s best
member.
The authors of [31] proposed a hybridization of DE with the neighborhood search (NS), which appears as a main strategy underpinning evolutionary programming (EP) [32]. The resulting algorithm, known as NSDE, performs mutation by adding a normally distributed random value to each target-vector component in the following way:

V_{i,G} = X_{r_1^i,G} + d_{i,G} · N(0.5, 0.5),  if rand_i(0, 1) < 0.5
V_{i,G} = X_{r_1^i,G} + d_{i,G} · δ,  otherwise  (13)

where d_{i,G} = X_{r_2^i,G} − X_{r_3^i,G} is the usual difference vector, N(0.5, 0.5) denotes a Gaussian random number with mean 0.5 and standard deviation 0.5, and δ denotes a Cauchy random variable with scale parameter t = 1. Recently, the authors of [33] used a self-adaptive NSDE in the cooperative coevolution framework that is capable of optimizing large-scale nonseparable problems (up to 1000 dimensions). They proposed a random grouping scheme and adaptive weighting for problem decomposition and coevolution. Somewhat similar in spirit to the present paper is the study in [34] on self-adaptive differential evolution with neighborhood search (SaNSDE). SaNSDE incorporates self-adaptation ideas from SaDE [22] and proposes three self-adaptive strategies: self-adaptive choice of the mutation strategy between two alternatives, self-adaptation of the scale factor F, and self-adaptation of the crossover rate Cr. We would like to point out here that, in contrast to Yang et al.'s works on NSDE and SaNSDE, we keep the scale factor nonrandom and use a ring-shaped neighborhood topology (inspired by PSO [37]), defined on the index graph of the parameter vectors, in order to derive a local neighborhood-based mutation model. Also, instead of F and Cr, the weight factor that unifies the two kinds of mutation models has been made self-adaptive in one of the variants of the DE/target-to-best/1 scheme proposed by us. Section IV describes these issues in sufficient detail.
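The NSDE mutation of (13), where the difference vector is scaled by either a Gaussian or a Cauchy random sample, might be sketched as follows; the function name and layout are illustrative:

```python
import numpy as np

def nsde_donor(pop, i, rng=None):
    """NSDE-style mutation of (13): scale the difference vector by a
    Gaussian N(0.5, 0.5) sample or a standard Cauchy sample (t = 1),
    each branch taken with probability 0.5."""
    rng = np.random.default_rng() if rng is None else rng
    r1, r2, r3 = rng.choice([r for r in range(len(pop)) if r != i],
                            size=3, replace=False)
    d = pop[r2] - pop[r3]                  # d_{i,G}
    if rng.random() < 0.5:
        scale = rng.normal(0.5, 0.5)       # Gaussian branch
    else:
        scale = rng.standard_cauchy()      # Cauchy branch, t = 1
    return pop[r1] + scale * d
```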
Noman and Iba [35], [36] proposed the Fittest Individual
Refinement (FIR), a crossover-based local search method for
DE. The FIR scheme accelerates DE by enhancing its search
capability through exploration of the neighborhood of the best
solution in successive generations.
As will be evident from Section IV, the proposed method
differs significantly from the works described in the last couple
of paragraphs. It draws inspiration from the neighborhood
topologies used in PSO [37]. Similar to DE, PSO has also
emerged as a powerful real parameter optimization technique
during the late 1990s. It emulates the swarm behavior of
insects, animals herding, birds flocking, and fish schooling,
where these swarms search for food in a collaborative manner.
A number of significantly improved variants of basic PSO have
been proposed in the recent past to solve both benchmark and
real-world optimization problems, for example, see [38], [39].
Earlier attempts to hybridize DE with different operators of
the PSO algorithm may be traced to [40] and [41].
IV. DE WITH A NEIGHBORHOOD-BASED MUTATION OPERATOR
A. DE/target-to-best/1—A Few Drawbacks
Most of the population-based search algorithms try to bal-
ance between two contradictory aspects of their performance:
exploration and exploitation. The first one means the ability
of the algorithm to “explore” or search every region of the
feasible search space, while the second denotes the ability to
converge to the near-optimal solutions as quickly as possible.
The DE variant known as DE/target-to-best/1 (5) uses the best
vector of the population to generate donor vectors. By “best”
we mean the vector that corresponds to the best fitness (e.g.,
the lowest objective function value for a minimization prob-
lem) in the entire population at a particular generation. The
scheme promotes exploitation since all the vectors/genomes
are attracted towards the same best position (pointed to by
the “best” vector) on the fitness landscape through iterations,
thereby converging faster to that point. But as a result of
such exploitative tendency, in many cases, the population may
lose its global exploration abilities within a relatively small
number of generations, thereafter getting trapped at some
locally optimal point in the search space.
In addition, DE employs a greedy selection strategy (the better of the target and the trial vectors is selected) and uses a fixed scale factor F (typically in [0.4, 1]). Thus if the difference vector X_{r_1,G} − X_{r_2,G} used for perturbation is small (this is usually the case when the vectors come very close to each other and the population converges to a small domain), the vectors may not be able to explore any better region of the search space, thereby finding it difficult to escape large plateaus or suboptimal peaks/valleys. Mezura-Montes et al., while comparing the different variants of DE for global optimization in [17], have noted that DE/target-to-best/1 shows a poor performance and remains inefficient in exploring the search space, especially for multimodal functions. The same conclusions were reached by Price et al. [3, p. 156].
B. Motivations for the Neighborhood-Based Mutation
A proper tradeoff between exploration and exploitation
is necessary for the efficient and effective operation of a
population-based stochastic search technique like DE, PSO,
etc. The DE/target-to-best/1, in its present form, favors ex-
ploitation only, since all the vectors are attracted by the same
best position found so far by the entire population, thereby
converging faster towards the same point.
In this context we propose two kinds of neighborhood models for DE. The first one is called the local neighborhood model, where each vector is mutated using the best position found so far in a small neighborhood of it, and not in the entire population. On the other hand, the second one, referred to as the global mutation model, takes into account the globally best vector X_{best,G} of the entire population at the current generation G for mutating a population member. Note that DE/target-to-best/1 employs only the global mutation strategy.
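The ring-shaped neighborhood topology mentioned above, defined over the index graph of the parameter vectors, amounts to taking the k nearest indices on each side of a vector, wrapping around modulo NP. A small helper illustrates this; the function name and signature are our own:

```python
def ring_neighborhood(i, k, np_size):
    """Indices of the 2k ring-topology neighbors of vector i (k on each
    side of i on the index graph, wrapping modulo NP)."""
    return [(i + offset) % np_size
            for offset in range(-k, k + 1) if offset != 0]
```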
A vector’s neighborhood is the set of other parameter
vectors that it is connected to; it considers their experience
when updating its position. The graph of interconnections is

References

J. Kennedy and R. Eberhart, “Particle swarm optimization,” Proc. IEEE International Conference on Neural Networks, 1995.
J. H. Holland, Adaptation in Natural and Artificial Systems. University of Michigan Press, 1975.
R. Storn and K. Price, “Differential Evolution – A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces,” Journal of Global Optimization, vol. 11, 1997.
T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms. MIT Press.
Frequently Asked Questions (13)
Q1. What are the contributions mentioned in the paper "Differential evolution using a neighborhood-based mutation operator" ?

It has reportedly outperformed a few evolutionary algorithms (EAs) and other search heuristics like particle swarm optimization (PSO) when tested over both benchmark and real-world problems. This paper describes a family of improved variants of the DE/target-to-best/1/bin scheme, which utilizes the concept of the neighborhood of each population member. The paper also investigates the applications of the new DE variants to two real-life problems concerning parameter estimation for frequency modulated sound waves and spread spectrum radar poly-phase code design.

In addition, the performance of the competitor DE variants may also be improved by blending other mutation strategies with judicious parameter tuning, a topic of future research. Future research may focus on providing some empirical or theoretical guidelines for selecting the neighborhood size over different types of optimization problems. It would be interesting to study the performance of the DEGL family when the various control parameters (NP, F, and Cr) are self-adapted following the ideas presented in [22], [28]. The conclusion the authors can draw at this point is that DE with the suggested modifications can serve as an attractive alternative for optimizing a wide variety of objective functions.

The neighborhood-based DE mutation, equipped with a self-adaptive weight factor, attempts to make a balanced use of the exploration and exploitation abilities of the search mechanism and is therefore more likely to avoid false or premature convergence in many cases.

Although various neighborhood topologies (like star, wheel, pyramid, 4-clusters, and circular) have been proposed in the literature for the PSO algorithms [42], after some initial experimentation over numerical benchmarks, the authors find that in the case of DE (where the population size is usually larger than in the case of PSO) the circular or ring topology provides the best performance compared to other salient neighborhood structures.
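The ring topology over the index-graph is easy to make concrete; this small helper (the name ring_neighborhood and radius parameter k are assumptions for illustration) returns the indices that form a vector's circular neighborhood, wrapping around the population:

```python
def ring_neighborhood(i, NP, k):
    """Indices on the circular (ring) topology within radius k of vector i,
    wrapping around the index-graph of an NP-member population."""
    return [(i + j) % NP for j in range(-k, k + 1)]
```

For example, with NP = 10 and k = 2, vector 0's neighborhood is [8, 9, 0, 1, 2]: the topology is defined purely over indices, not over distances in the search space.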

The vector indices are only randomly ordered (as obtained during initialization), in order to preserve the diversity of each neighborhood.

This suggests that a judicious tradeoff between the explorative and the exploitative mutation operators is the key to the success of the search-dynamics of DEGL. 

X_best,G is the best individual vector, with the best fitness (i.e., lowest objective function value for a minimization problem), in the population at generation G. Note that some of the strategies for creating the donor vector may be mutated recombinants; for example, (5) listed above basically mutates a two-vector recombinant: X_i,G + F · (X_best,G − X_i,G).
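A minimal Python sketch of this DE/target-to-best/1 donor (the function name and the way the random pair r1, r2 is drawn are illustrative, not the paper's code):

```python
import numpy as np

def target_to_best_1(pop, fitness, i, F, rng):
    """DE/target-to-best/1 donor: mutates the two-vector recombinant
    X_i + F*(X_best - X_i) with one scaled random difference vector."""
    best = int(np.argmin(fitness))          # lowest objective value (minimization)
    r1, r2 = rng.choice([j for j in range(len(pop)) if j != i],
                        size=2, replace=False)
    return pop[i] + F * (pop[best] - pop[i]) + F * (pop[r1] - pop[r2])
```

Because every member uses the same attractor X_best,G, this donor is purely exploitative, which is exactly the behavior the neighborhood-based variants set out to balance.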

Now the authors combine the local and global donor vectors using a scalar weight w ∈ (0, 1) to form the actual donor vector of the proposed algorithm: V_i,G = w · g_i,G + (1 − w) · L_i,G.
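A minimal sketch of this weighted combination; the name combined_donor is hypothetical, and the assumption that the truncated second term above multiplies the local donor L_i,G follows the DEGL formulation:

```python
import numpy as np

def combined_donor(g_i, L_i, w):
    """DEGL donor vector: convex combination of the global donor g_i and
    the local donor L_i with a scalar weight w in (0, 1)."""
    assert 0.0 < w < 1.0, "w must lie strictly between 0 and 1"
    return w * g_i + (1.0 - w) * L_i
```

A w near 1 recovers the exploitative global behavior, while a w near 0 leans on the explorative local model, which is why the weight is a natural candidate for self-adaptation.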

For multimodal functions, the final results are much more important, since they reflect an algorithm's ability to escape from poor local optima and locate a good near-global optimum.

Usually in the community of stochastic search algorithms, robust search is weighted over the highest possible convergence rate [56], [57]. 

The advantage of measuring the runtime complexity by counting the number of FEs is that the correspondence between this measure and the processor time becomes stronger as the function complexity increases. 

To overcome this limitation, a new trial vector generation strategy, “DE/current-to-rand/1,” is proposed in [19], which replaces the crossover operator prescribed in (9) with the rotationally invariant arithmetic crossover operator to generate the trial vector U_i,G by linearly combining the target vector X_i,G and the corresponding donor vector V_i,G as follows: U_i,G = X_i,G + K · (V_i,G − X_i,G).
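A one-line sketch of this arithmetic crossover (variable names are assumed for illustration):

```python
import numpy as np

def arithmetic_crossover(x_i, v_i, K):
    """Rotationally invariant arithmetic crossover of DE/current-to-rand/1:
    U_i = X_i + K * (V_i - X_i), replacing the binomial crossover of (9)."""
    return x_i + K * (v_i - x_i)
```

Because the trial vector is a pure linear combination of the target and donor, the operation commutes with rotations of the coordinate system, unlike component-wise binomial crossover.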

The authors have used a test bed of 21 traditional numerical benchmarks (Table IV) [47] and three composition functions from the benchmark problems suggested in CEC 2005 [48] to evaluate the performance of the new DE variant.