Filomat 26:3 (2012), 607–613
DOI 10.2298/FIL1203607W
Published by Faculty of Sciences and Mathematics, University of Niš, Serbia
Available at: http://www.pmf.ni.ac.rs/filomat
The optimal convergence factor of the gradient based iterative algorithm for linear matrix equations

Xiang Wang^a, Dan Liao^a

^a Department of Mathematics, Nanchang University, Nanchang 330031, China
Abstract. A hierarchical gradient based iterative algorithm of [L. Xie et al., Computers and Mathematics with Applications 58 (2009) 1441-1448] has been presented for finding the numerical solution of general linear matrix equations, and its convergence factor has been discussed by numerical experiments. However, the authors pointed out that how to choose the best convergence factor is still a project to be studied. In this paper, we discuss the optimal convergence factor for the gradient based iterative algorithm and derive it explicitly. Moreover, the theoretical results of this paper can be extended to other gradient-type methods. The results of numerical experiments are consistent with the theoretical findings.
1. Introduction
Numerical methods for solving matrix equations are of interest because such equations play an important role in various fields, such as neural networks [18], model reduction [14] and image processing [1]. Recently, iterative approaches for solving matrix equations and recursive identification for parameter estimation have received much attention, e.g., ([6]-[10], [11], [15]-[16], [20]-[22]). In [17], Xie et al. presented an efficient gradient based iterative algorithm for solving a class of matrix equations by applying the hierarchical identification principle ([2]-[4]), and the convergence properties of the method were investigated. The convergence rate depends on the convergence factor µ: the larger the convergence factor µ, the faster the algorithm converges. However, when the convergence factor µ is too large, the algorithm diverges. Therefore, there exists a best convergence factor µ. In [17], the authors pointed out that how to choose the best convergence factor µ is still a project to be studied. In this paper, we derive the optimal convergence factor. Results of numerical experiments verify the theoretical findings.

The paper is organized as follows: In Section 2, we introduce the gradient based iterative algorithm for a class of general linear matrix equations proposed by Xie et al. in [17]. The main results are presented in Section 3. Finally, we present numerical experiments to verify the theoretical findings.
2010 Mathematics Subject Classification. Primary 15A24; Secondary 65F20, 65F22, 65K10
Keywords. Linear matrix equations, optimal convergence factor, gradient based iterative algorithm
Received: 11 April 2011; Accepted: 14 August 2011
Communicated by Dragana Cvetković-Ilić
This work is supported by the NNSF of China (No. 11101204), the NSF of Jiangxi, China (No. 20114BAB201004), the Science Funds of the Education Department of Jiangxi Province (No. GJJ12011), and the Scientific Research Foundation of the Graduate School of Nanchang University (No. YC2011-S007).
Email address: wangxiang49@ncu.edu.cn (Xiang Wang)

2. The gradient based iterative algorithm for a class of general linear matrix equations
Consider the following general linear matrix equation of the form
$$\sum_{i=1}^{p} A_i X B_i + \sum_{i=1}^{q} C_i X^T D_i = F, \qquad (1)$$
where $A_i \in \mathbb{R}^{r\times m}$, $B_i \in \mathbb{R}^{n\times s}$, $C_i \in \mathbb{R}^{r\times n}$, $D_i \in \mathbb{R}^{m\times s}$ and $F = [f_1, f_2, \cdots, f_s] \in \mathbb{R}^{r\times s}$ are given constant matrices, and $X \in \mathbb{R}^{m\times n}$ is the unknown matrix to be solved for.
Let us introduce some notations which are used in [17]. The symbol $I$ or $I_n$ stands for an identity matrix of appropriate size or of size $n \times n$. For two matrices $M$ and $N$, $M \otimes N$ is their Kronecker product; for an $m \times n$ matrix $X = [x_1, x_2, \cdots, x_n] \in \mathbb{R}^{m\times n}$, $x_k \in \mathbb{R}^m$, $\mathrm{col}[X]$ is the $mn$-dimensional vector formed by the columns of $X$, i.e. $\mathrm{col}[X] = [x_1^T, x_2^T, \ldots, x_n^T]^T \in \mathbb{R}^{mn}$.
Referring to Al Zhour and Kilicman’s work [19], the $mn \times mn$ square matrix is defined by
$$P_{mn} = \sum_{i=1}^{m}\sum_{j=1}^{n} E_{ij} \otimes E_{ij}^T, \qquad (2)$$
where $E_{ij} = e_i e_j^T$ is called an elementary matrix of order $m \times n$, and $e_i$ ($e_j$) is a column vector of order $m \times 1$ ($n \times 1$) with a one in the $i$th ($j$th) position and zeros elsewhere. According to the definition above, we have
$$P_{mn}\,\mathrm{col}[X^T] = \mathrm{col}[X], \quad P_{mn}P_{nm} = I_{mn}, \quad P_{mn}^T = P_{mn}^{-1} = P_{nm}.$$
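To make the notation concrete, the following NumPy sketch (ours, not from the paper; the helper names commutation_matrix and col are hypothetical) builds $P_{mn}$ directly from Eq. (2) and checks the identities above in the square case $m = n$, where the index conventions are unambiguous.

```python
import numpy as np

def commutation_matrix(m: int, n: int) -> np.ndarray:
    """P_mn of Eq. (2): sum over i, j of E_ij (Kronecker) E_ij^T,
    with E_ij = e_i e_j^T the m x n elementary matrix."""
    P = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            E = np.zeros((m, n))
            E[i, j] = 1.0
            P += np.kron(E, E.T)
    return P

def col(X: np.ndarray) -> np.ndarray:
    """col[X]: stack the columns of X into a single vector."""
    return X.flatten(order="F")

# Check the stated identities in the square case m = n = 3.
m = n = 3
X = np.random.default_rng(0).standard_normal((m, n))
P_mn, P_nm = commutation_matrix(m, n), commutation_matrix(n, m)
assert np.allclose(P_mn @ col(X.T), col(X))      # P_mn col[X^T] = col[X]
assert np.allclose(P_mn @ P_nm, np.eye(m * n))   # P_mn P_nm = I_mn
assert np.allclose(P_mn.T, P_nm)                 # P_mn^T = P_mn^{-1} = P_nm
```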
In [17], the authors presented the following lemma.

Lemma 2.1. Let $S = \sum_{i=1}^{p} B_i^T \otimes A_i + \sum_{i=1}^{q} (D_i^T \otimes C_i)P_{nm}$. Then (1) has a unique solution if and only if $\mathrm{rank}[S, \mathrm{col}[F]] = \mathrm{rank}[S] = mn$ (i.e., $S$ has full column rank). In this case, the unique solution is given by $\mathrm{col}[X] = (S^TS)^{-1}S^T\mathrm{col}[F]$, and the corresponding homogeneous matrix equation in (1) with $F = 0$ has the unique solution $X = 0$.
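Lemma 2.1 translates directly into a dense Kronecker-product solver. The sketch below (our illustration under the lemma's full-column-rank assumption, reusing commutation_matrix and col from the previous sketch) forms $S$ and solves the normal equations:

```python
import numpy as np

def solve_direct(As, Bs, Cs, Ds, F):
    """Lemma 2.1: col[X] = (S^T S)^{-1} S^T col[F] with
    S = sum_i B_i^T (x) A_i + (sum_i D_i^T (x) C_i) P_nm."""
    m, n = As[0].shape[1], Bs[0].shape[0]
    P_nm = commutation_matrix(n, m)
    S = sum(np.kron(B.T, A) for A, B in zip(As, Bs))
    S = S + sum(np.kron(D.T, C) for C, D in zip(Cs, Ds)) @ P_nm
    x = np.linalg.solve(S.T @ S, S.T @ col(F))  # assumes S has full column rank
    return x.reshape((m, n), order="F")

# Quick check on a random square instance with p = q = 1.
rng = np.random.default_rng(1)
A, B, C, D, X_true = (rng.standard_normal((3, 3)) for _ in range(5))
F = A @ X_true @ B + C @ X_true.T @ D
assert np.allclose(solve_direct([A], [B], [C], [D], F), X_true)
```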
By Lemma 2.1, we can obtain the solution of (1), but doing so requires excessive computer memory, since $S$ is an $rs \times mn$ matrix. Therefore, iterative methods are preferred.

In [17], the authors presented the following gradient based iterative algorithm for solving (1) by applying the hierarchical identification principle.
Algorithm 1: The gradient based iterative algorithm for Eq. (1)
1. $X(k) = \frac{1}{p+q}\Big[\sum_{j=1}^{p} X_j(k) + \sum_{l=1}^{q} X_{p+l}(k)\Big]$,
2. $X_j(k) = X(k-1) + \mu A_j^T \Big[F - \sum_{i=1}^{p} A_i X(k-1) B_i - \sum_{i=1}^{q} C_i X^T(k-1) D_i\Big] B_j^T$,
3. $X_{p+l}(k) = X(k-1) + \mu D_l \Big[F - \sum_{i=1}^{p} A_i X(k-1) B_i - \sum_{i=1}^{q} C_i X^T(k-1) D_i\Big]^T C_l$.
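A direct transcription of the three steps into NumPy might look as follows (a minimal sketch, ours; note that all $p+q$ sub-iterates share the same residual, so it is computed once per sweep):

```python
import numpy as np

def gradient_iteration(As, Bs, Cs, Ds, F, mu, X0, iters=100):
    """Algorithm 1 for sum_i A_i X B_i + sum_i C_i X^T D_i = F."""
    p, q = len(As), len(Cs)
    X = X0.copy()
    for _ in range(iters):
        # Common residual F - sum_i A_i X B_i - sum_i C_i X^T D_i.
        R = (F - sum(A @ X @ B for A, B in zip(As, Bs))
               - sum(C @ X.T @ D for C, D in zip(Cs, Ds)))
        Xs = [X + mu * A.T @ R @ B.T for A, B in zip(As, Bs)]  # step 2
        Xs += [X + mu * D @ R.T @ C for C, D in zip(Cs, Ds)]   # step 3
        X = sum(Xs) / (p + q)                                  # step 1
    return X
```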
The algorithm above has been proved to be convergent under certain conditions.
Lemma 2.2. ([17]) If the equation in (1) has a unique solution $X$ and the convergence factor $\mu$ satisfies the following condition
$$0 < \mu < 2\Big(\sum_{j=1}^{p} \lambda_{\max}[A_jA_j^T]\,\lambda_{\max}[B_j^TB_j] + \sum_{l=1}^{q} \lambda_{\max}[C_lC_l^T]\,\lambda_{\max}[D_l^TD_l]\Big)^{-1},$$
then the iterative solution $X(k)$ given by Algorithm 1 converges to $X$, i.e., $\lim_{k\to\infty} X(k) = X$; or, the error $X(k) - X$ converges to zero for any initial value $X(0)$.

Denoting the error by $\tilde{X}(k) = X(k) - X$, we can get the error equation of Eq. (1) (see [17]) as
$$\mathrm{col}[\tilde{X}(k)] = \Big(I_{mn} - \frac{\mu}{p+q}\Phi\Big)\,\mathrm{col}[\tilde{X}(k-1)], \qquad (3)$$
where
$$\Phi \triangleq \sum_{j=1}^{p}\sum_{i=1}^{p} B_jB_i^T \otimes A_j^TA_i + \sum_{l=1}^{q}\sum_{i=1}^{q} C_l^TC_i \otimes D_lD_i^T + \sum_{j=1}^{p}\sum_{i=1}^{q} (B_jD_i^T \otimes A_j^TC_i)P_{nm} + \sum_{l=1}^{q}\sum_{i=1}^{p} (C_l^TA_i \otimes D_lB_i^T)P_{nm}.$$
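For small problems $\Phi$ can be assembled explicitly; the following sketch (ours, reusing commutation_matrix from the earlier sketch, and assuming non-empty coefficient lists) mirrors the four Kronecker sums term by term:

```python
import numpy as np

def build_phi(As, Bs, Cs, Ds):
    """The iteration matrix Phi of Eq. (3), assembled term by term."""
    m, n = As[0].shape[1], Bs[0].shape[0]
    P_nm = commutation_matrix(n, m)
    Phi = sum(np.kron(Bj @ Bi.T, Aj.T @ Ai)
              for (Aj, Bj) in zip(As, Bs) for (Ai, Bi) in zip(As, Bs))
    Phi = Phi + sum(np.kron(Cl.T @ Ci, Dl @ Di.T)
                    for (Cl, Dl) in zip(Cs, Ds) for (Ci, Di) in zip(Cs, Ds))
    Phi = Phi + sum(np.kron(Bj @ Di.T, Aj.T @ Ci)
                    for (Aj, Bj) in zip(As, Bs) for (Ci, Di) in zip(Cs, Ds)) @ P_nm
    Phi = Phi + sum(np.kron(Cl.T @ Ai, Dl @ Bi.T)
                    for (Cl, Dl) in zip(Cs, Ds) for (Ai, Bi) in zip(As, Bs)) @ P_nm
    return Phi
```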
According to Eq. (3), we can easily get the following lemma.

Lemma 2.3. If the equation in (1) has a unique solution $X$, then Algorithm 1 converges for any initial value $X(0)$ if and only if the spectral radius of the matrix $I_{mn} - \frac{\mu}{p+q}\Phi$ is less than one, that is, $\rho(I_{mn} - \frac{\mu}{p+q}\Phi) < 1$. If the matrix $\Phi$ is positive definite, then Algorithm 1 converges if and only if the convergence factor satisfies the following condition
$$0 < \mu < \frac{2(p+q)}{\lambda_{\max}(\Phi)}.$$

In other words, the closer the spectral radius of $I_{mn} - \frac{\mu}{p+q}\Phi$ is to 0, the faster the error $\tilde{X}(k)$ converges to zero. However, in [17] the authors did not discuss how to choose the best convergence factor $\mu$, and pointed out that this is a project to be studied in the future.

In the next section, we will discuss this problem and derive the optimal convergence factor formula.
3. The selection of the optimal convergence factor
Firstly, we present the following lemma, which is Theorem 3.4 in [19].
Lemma 3.1. Let $P_{mn}$ be the $mn \times mn$ matrix defined by (2), $A \in \mathbb{R}^{n\times m}$, $B \in \mathbb{R}^{m\times n}$. Then we have $P_{mn}(A \otimes B)P_{mn} = B \otimes A$.
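A quick numerical check of Lemma 3.1 with the $P_{mn}$ built earlier (our sketch, using a non-square pair $A \in \mathbb{R}^{2\times 3}$, $B \in \mathbb{R}^{3\times 2}$):

```python
import numpy as np

# Lemma 3.1 with m = 3, n = 2: A is n x m, B is m x n.
m, n = 3, 2
rng = np.random.default_rng(2)
A, B = rng.standard_normal((n, m)), rng.standard_normal((m, n))
P_mn = commutation_matrix(m, n)
assert np.allclose(P_mn @ np.kron(A, B) @ P_mn, np.kron(B, A))
```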
According to Lemma 3.1, it is straightforward to verify that the matrix $\Phi$ is symmetric. The following lemma plays an important role in determining the optimal convergence factor $\mu$; its proof is similar to that of Lemma 5 in [12].
Lemma 3.2. Let $a, b \in \mathbb{R}$ with $b > a$, and let $\mu > 0$. Then:
(a) If $b > a > 0$, then $\min_{0<\mu<2/b}\{\max\{|1-\mu a|, |1-\mu b|\}\} = \frac{b-a}{b+a}$, and the minimum is attained at the point $\mu = \frac{2}{a+b}$;
(b) If $b > 0 > a$, then $\min_{\mu>0}\{\max\{|1-\mu a|, |1-\mu b|\}\} > 1$, that is, for any $\mu > 0$ we have $\max\{|1-\mu a|, |1-\mu b|\} > 1$;
(c) If $a < b < 0$, then $\min_{\mu>0}\{\max\{|1-\mu a|, |1-\mu b|\}\} > 1$, and for any $\mu > 0$ we have $\max\{|1-\mu a|, |1-\mu b|\} > 1$.

Proof. (a) If $b > a > 0$, then we have
$$\max\{|1-\mu a|, |1-\mu b|\} = \begin{cases} 1-\mu a, & \mu \le \frac{2}{a+b}, \\ \mu b - 1, & \mu \ge \frac{2}{a+b}, \end{cases}$$
which implies $1-\mu a \ge 1 - \frac{2}{a+b}a = \frac{b-a}{b+a}$ and $\mu b - 1 \ge \frac{2}{a+b}b - 1 = \frac{b-a}{b+a}$, i.e., $\min_{0<\mu<2/b}\{\max\{|1-\mu a|, |1-\mu b|\}\} = \frac{b-a}{b+a}$, and the minimum is attained at the point $\mu = \frac{2}{a+b}$. The proof of (b) and (c) is trivial.
The following lemma is an immediate consequence of Lemma 3.2.

Lemma 3.3. If Algorithm 1 converges to the solution of Eq. (1) for any initial value $X(0)$, then the matrix $\Phi$ must be positive definite, and hence symmetric positive definite.

Now we can present the main result of this paper.

Theorem 3.4. If the matrix $\Phi$ is negative definite or indefinite, then Algorithm 1 will diverge for some initial values; otherwise, if $0 < \mu < \frac{2(p+q)}{\lambda_{\max}(\Phi)}$, it will converge, and in this case the optimal convergence factor is
$$\mu_{\mathrm{optimal}} = \frac{2(p+q)}{\lambda_{\min}(\Phi) + \lambda_{\max}(\Phi)}. \qquad (4)$$

Proof. The first part follows from Lemma 3.2(b) and (c). According to (3), the optimal convergence factor $\mu$ should be chosen to minimize the spectral radius of the matrix $I_{mn} - \frac{\mu}{p+q}\Phi$. As $\Phi$ is symmetric positive definite, this spectral radius is given by
$$\max\Big\{\Big|1 - \frac{\mu}{p+q}\lambda_{\min}(\Phi)\Big|,\ \Big|1 - \frac{\mu}{p+q}\lambda_{\max}(\Phi)\Big|\Big\},$$
which is less than one when $0 < \mu < \frac{2(p+q)}{\lambda_{\max}(\Phi)}$, so Algorithm 1 converges. Then by Lemma 3.2(a), the optimal convergence factor is given by (4).
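Once $\Phi$ is available, Eq. (4) and the convergence threshold of Lemma 2.3 are a few lines of code (our sketch):

```python
import numpy as np

def optimal_mu(Phi, p, q):
    """Eq. (4), assuming Phi is symmetric positive definite (Lemma 3.3).
    Returns (mu_optimal, upper convergence threshold of Lemma 2.3)."""
    lam = np.linalg.eigvalsh(Phi)  # eigenvalues in ascending order
    if lam[0] <= 0:
        raise ValueError("Phi is not positive definite; Algorithm 1 can diverge")
    mu_opt = 2 * (p + q) / (lam[0] + lam[-1])
    mu_max = 2 * (p + q) / lam[-1]
    return mu_opt, mu_max
```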
Remark 1. F. Ding et al. presented the gradient based algorithm for Sylvester matrix equations in [5], and Q. Niu et al. presented the relaxed gradient based algorithm for the same matrix equations in [13]. The authors only discussed the convergence factor $\mu$ by numerical experiments, and pointed out that how to choose the best convergence factor in [6] and the best relaxation factor in [13] is very difficult and is a subject to be studied in the future. In fact, by a similar idea, we can easily obtain the optimal convergence factor of [6] and the optimal relaxation factor of [13]. As well, we can show that Ding's algorithm in [6] and Niu's algorithm in [13] are completely equivalent (both numerically and mathematically) if we take $\mu_d = \omega(1-\omega)\mu_n$, where $\mu_d$ and $\mu_n$ are the convergence factors of Ding's algorithm and Niu's algorithm, respectively.
Remark 2. According to Theorem 3.4, to achieve good convergence, eigenvalue estimates are required in order to obtain the optimal or a near-optimal $\mu$, and this may cause difficulties. In addition, when $\lambda_{\max}(\Phi)$ is very large, the curve $\rho(I_{mn} - \frac{\mu}{p+q}\Phi)$ can be extremely sensitive near the optimal value of $\mu$. These observations are common to many iterative approaches that depend on an acceleration parameter.
4. Numerical examples
This section gives two examples, which are the same as those in [17], to verify the theoretical findings.
Example 1. ([17]) Suppose that $AX + X^TB = F$, where
$$A = \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \quad F = \begin{pmatrix} 8 & 8 \\ 5 & 2 \end{pmatrix}.$$
The exact solution of the matrix equation above is $X = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$. The matrices
$$P_{22} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad \Phi = \begin{pmatrix} 9 & 1 & 1 & 1 \\ 1 & 6 & 0 & 0 \\ 1 & 0 & 3 & 2 \\ 1 & 0 & 2 & 2 \end{pmatrix}.$$
By Theorem 3.4, the optimal convergence factor is 0.3961, and $\frac{2(1+1)}{\lambda_{\max}(\Phi)} \approx 0.414$. Take $X(0) = 10^{-6}\,\mathbf{1}_{2\times 2}$ and $\mu$ = 0.3961, 0.0961, 0.1961, 0.2961, 0.412. Applying Algorithm 1 to compute $X(k)$, the iterative errors $\delta := \|X(k)-X\|_F / \|X\|_F$ versus $k$ are shown in Fig. 1 (plotted with the Matlab command semilogy).
According to Fig. 1 and Fig. 2, it is clear that the larger the convergence factor $\mu$, the faster the convergence rate, and when the convergence factor $\mu$ is taken to be 0.3961, the convergence rate is fastest. However, when the convergence factor $\mu$ is greater than 0.3961, the convergence rate slows down, and when $\mu$ is greater than 0.414, Algorithm 1 diverges (see Fig. 2), which verifies the theoretical findings.

Fig. 1. The comparison of convergence rates for different $\mu$ ($\mu$ = 0.3961, 0.1961, 0.2961, 0.0961, 0.412).

Fig. 2. The convergence curve for $\mu$ = 0.414.
Also, if we take the termination condition of Algorithm 1 to be the relative error $\delta \le 10^{-5}$, we can plot the number of iterative steps $k$ versus the convergence factor $\mu$ (see Fig. 3); a sketch of this experiment is given after Fig. 3 below.
Fig. 3. The iterative steps k versus the convergence factor µ.
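The Fig. 3 experiment can be sketched as follows (ours; it reuses A, B and F from the previous sketch and specializes Algorithm 1 to $AX + X^TB = F$, for which the averaged update is $X \leftarrow X + \frac{\mu}{2}(A^TR + BR^T)$):

```python
import numpy as np

X_exact = np.array([[1.0, 2.0], [3.0, 4.0]])

def steps_to_tol(mu, tol=1e-5, kmax=2000):
    """Iterations of Algorithm 1 (specialized to AX + X^T B = F)
    until the relative error delta drops below tol."""
    X = 1e-6 * np.ones((2, 2))
    for k in range(1, kmax + 1):
        R = F - A @ X - X.T @ B
        X = X + 0.5 * mu * (A.T @ R + B @ R.T)  # average of steps 2 and 3
        delta = np.linalg.norm(X - X_exact) / np.linalg.norm(X_exact)
        if delta <= tol:
            return k
    return kmax

for mu in (0.0961, 0.1961, 0.2961, 0.3961):
    print(mu, steps_to_tol(mu))  # fewer steps as mu approaches the optimum
```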
Fig. 4. The comparison of convergence rates for different µ (µ = 0.0207, 0.0100, 0.0050, 0.0030, 0.0210).
Example 2. ([17]) Suppose that $A_1XB_1 + A_2XB_2 + C_1X^TD_1 + C_2X^TD_2 = F$, where
$$A_1 = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 0 & 1 \\ 3 & 1 \end{pmatrix}, \quad B_1 = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \quad B_2 = \begin{pmatrix} 3 & 1 \\ 2 & 1 \end{pmatrix},$$
$$C_1 = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}, \quad C_2 = \begin{pmatrix} 1 & 3 \\ 1 & 2 \end{pmatrix}, \quad D_1 = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \quad D_2 = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}, \quad F = \begin{pmatrix} 35 & 9 \\ 20 & 7 \end{pmatrix}.$$
The exact solution of the above linear matrix equation is
$$X = \begin{pmatrix} 1 & 2 \\ 3 & 1 \end{pmatrix}.$$
The matrices
$$P_{22} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad \Phi = \begin{pmatrix} 143 & 38 & 133 & 52 \\ 38 & 51 & 28 & 25 \\ 133 & 28 & 289 & 14 \\ 52 & 25 & 14 & 39 \end{pmatrix}.$$
.

Citations

- "The accelerated gradient based iterative algorithm for solving a class of generalized Sylvester-transpose matrix equation" (journal article). TL;DR: An accelerated gradient based algorithm, obtained by minimizing a certain criterion quadratic function, for solving the generalized Sylvester-transpose matrix equation AXB + CX^TD = F; it converges to the exact solution for any initial value provided that some appropriate assumptions are made.
- "Gradient-based iterative algorithm for solving the generalized coupled Sylvester-transpose and conjugate matrix equations over reflexive (anti-reflexive) matrices" (journal article). TL;DR: The generalized coupled Sylvester-transpose and conjugate matrix equations are considered, and an iterative algorithm is proposed for solving these coupled linear matrix equations over the group of reflexive (anti-reflexive) matrices.
- "Gradient Based Iterative Algorithm to Solve General Coupled Discrete-Time Periodic Matrix Equations over Generalized Reflexive Matrices" (journal article). Abstract: Discrete-time periodic matrix equations are encountered in periodic state feedback problems and model reduction of periodic descriptor systems. The aim of this paper is to compute the generalized reflexive solutions of the general coupled discrete-time periodic matrix equations. We introduce a gradient-based iterative (GI) algorithm for finding the generalized reflexive solutions of the general coupled discrete-time periodic matrix equations. It is shown that the introduced GI algorithm always converges to the generalized reflexive solutions for any initial generalized reflexive matrices. Finally, two numerical examples are investigated to confirm the efficiency of the GI algorithm.
- "Alternating direction method for generalized Sylvester matrix equation AXB + CYD = E" (journal article). TL;DR: Numerical experiments show that the proposed algorithms tend to deliver higher quality solutions with fewer iteration steps and less computing time than recent algorithms on the tested problems.
- "Gradient-based iterative algorithms for generalized coupled Sylvester-conjugate matrix equations" (journal article). TL;DR: By applying the hierarchical identification principle, a gradient-based iterative algorithm is suggested to solve a class of complex matrix equations; the iterative solutions given by the proposed algorithm converge to the exact solution for any initial matrices.
References

- "Model Reduction for Control System Design" (book). TL;DR: Model and controller reduction based on coprime factorization (CF) is proposed for low-order controller design, where the model is reduced by multiplicative approximation and the controller is reduced based on coprime factorization.
- "A recurrent neural network for solving Sylvester equation with time-varying coefficients" (journal article). TL;DR: A recurrent neural network with implicit dynamics is deliberately developed so that its trajectory is guaranteed to converge exponentially to the time-varying solution of a given Sylvester equation.
- "Gradient based iterative algorithms for solving a class of matrix equations" (journal article). TL;DR: A hierarchical identification principle is applied to study solving the Sylvester and Lyapunov matrix equations, and it is proved that the iterative solution consistently converges to the true solution for any initial value.
- "Iterative least-squares solutions of coupled Sylvester matrix equations" (journal article). TL;DR: A general family of iterative methods to solve linear equations, which includes the well-known Jacobi and Gauss–Seidel iterations as special cases, is presented, and it is proved that the iterative solution consistently converges to the exact solution for any initial value.
- "On Iterative Solutions of General Coupled Matrix Equations" (journal article). TL;DR: This paper extends the well-known Jacobi and Gauss–Seidel iterations, presents a large family of iterative methods, which are then applied to develop iterative solutions to coupled Sylvester matrix equations, and proves that the iterative algorithm always converges to the (unique) solution for any initial values.