Filomat 26:3 (2012), 607–613
DOI 10.2298/FIL1203607W
Published by Faculty of Sciences and Mathematics, University of Niš, Serbia
Available at: http://www.pmf.ni.ac.rs/filomat
The optimal convergence factor of the gradient based iterative algorithm for linear matrix equations

Xiang Wang^a, Dan Liao^a

^a Department of Mathematics, Nanchang University, Nanchang 330031, China
Abstract. A hierarchical gradient based iterative algorithm of [L. Xie et al., Computers and Mathematics with Applications 58 (2009) 1441-1448] has been presented for finding the numerical solution of general linear matrix equations, and its convergence factor has been discussed by numerical experiments. However, the authors pointed out that how to choose the best convergence factor is still a project to be studied. In this paper, we discuss the optimal convergence factor for the gradient based iterative algorithm and derive it explicitly. Moreover, the theoretical results of this paper can be extended to other gradient-type methods. The results of numerical experiments are consistent with the theoretical findings.
1. Introduction
Numerical methods for solving matrix equations are of interest because such equations play an important role in various fields, such as neural networks [18], model reduction [14] and image processing [1]. Recently, iterative approaches for solving matrix equations and recursive identification for parameter estimation have received much attention, e.g., ([6]-[10], [11], [15]-[16], [20]-[22]). In [17], Xie et al. presented an efficient gradient based iterative algorithm for solving a class of matrix equations by applying the hierarchical identification principle ([2]-[4]), and the convergence properties of the method were investigated. The convergence rate depends on the convergence factor µ: the larger the convergence factor µ, the faster the algorithm converges. However, when the convergence factor µ is too large, the algorithm diverges. Therefore, there exists a best convergence factor µ. In [17], the authors pointed out that how to choose the best convergence factor µ is still a project to be studied. In this paper, we derive the optimal convergence factor. Results of numerical experiments verify the theoretical findings.

The paper is organized as follows: In Section 2, we introduce the gradient based iterative algorithm for a class of general linear matrix equations proposed by Xie et al. in [17]. The main results are presented in Section 3. Finally, we present numerical experiments to verify the theoretical findings.
2010 Mathematics Subject Classification. Primary 15A24; Secondary 65F20, 65F22, 65K10
Keywords. Linear matrix equations, optimal convergence factor, gradient based iterative algorithm
Received: 11 April 2011; Accepted: 14 August 2011
Communicated by Dragana Cvetković-Ilić
This work is supported by the NNSF of China (No. 11101204), the NSF of Jiangxi, China (No. 20114BAB201004), the Science Funds of the Education Department of Jiangxi Province (No. GJJ12011), and the Scientific Research Foundation of the Graduate School of Nanchang University (No. YC2011-S007).
Email address: wangxiang49@ncu.edu.cn (Xiang Wang)

2. The gradient based iterative algorithm for a class of general linear matrix equations
Consider the following general linear matrix equation of the form
$$\sum_{i=1}^{p} A_i X B_i + \sum_{i=1}^{q} C_i X^T D_i = F, \qquad (1)$$
where $A_i \in \mathbb{R}^{r\times m}$, $B_i \in \mathbb{R}^{n\times s}$, $C_i \in \mathbb{R}^{r\times n}$, $D_i \in \mathbb{R}^{m\times s}$ and $F = [f_1, f_2, \cdots, f_s] \in \mathbb{R}^{r\times s}$ are given constant matrices, and $X \in \mathbb{R}^{m\times n}$ is the unknown matrix to be solved for.
Let us introduce some notations which are used in [17]. The symbol $I$ or $I_n$ stands for an identity matrix of appropriate size or of size $n \times n$. For two matrices $M$ and $N$, $M \otimes N$ is their Kronecker product; for an $m \times n$ matrix $X = [x_1, x_2, \cdots, x_n] \in \mathbb{R}^{m\times n}$, $x_k \in \mathbb{R}^m$, $\mathrm{col}[X]$ is the $mn$-dimensional vector formed by the columns of $X$, i.e. $\mathrm{col}[X] = [x_1^T, x_2^T, \ldots, x_n^T]^T \in \mathbb{R}^{mn}$.
Referring to Al Zhour and Kilicman’s work [19], the $mn \times mn$ square matrix is defined by
$$P_{mn} = \sum_{i=1}^{m}\sum_{j=1}^{n} E_{ij} \otimes E_{ij}^T, \qquad (2)$$
where $E_{ij} = e_i e_j^T$ is called an elementary matrix of order $m \times n$, and $e_i$ ($e_j$) is a column vector of order $m \times 1$ ($n \times 1$) with a one in the $i$th ($j$th) position and zeros elsewhere. According to the definition above, we have
$$P_{mn}\,\mathrm{col}[X^T] = \mathrm{col}[X], \quad P_{mn}P_{nm} = I_{mn}, \quad P_{mn}^T = P_{mn}^{-1} = P_{nm}.$$
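To make the notation concrete, the following NumPy sketch (ours, not from the paper; the helper names commutation_matrix and col are hypothetical) builds $P_{mn}$ directly from Eq. (2) and checks the identities above in the square case $m = n$, where the index conventions are unambiguous.

```python
import numpy as np

def commutation_matrix(m: int, n: int) -> np.ndarray:
    """P_mn of Eq. (2): sum over i, j of E_ij (Kronecker) E_ij^T,
    with E_ij = e_i e_j^T the m x n elementary matrix."""
    P = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            E = np.zeros((m, n))
            E[i, j] = 1.0
            P += np.kron(E, E.T)
    return P

def col(X: np.ndarray) -> np.ndarray:
    """col[X]: stack the columns of X into a single vector."""
    return X.flatten(order="F")

# Check the stated identities in the square case m = n = 3.
m = n = 3
X = np.random.default_rng(0).standard_normal((m, n))
P_mn, P_nm = commutation_matrix(m, n), commutation_matrix(n, m)
assert np.allclose(P_mn @ col(X.T), col(X))      # P_mn col[X^T] = col[X]
assert np.allclose(P_mn @ P_nm, np.eye(m * n))   # P_mn P_nm = I_mn
assert np.allclose(P_mn.T, P_nm)                 # P_mn^T = P_mn^{-1} = P_nm
```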
In [17], the authors presented the following lemma.

Lemma 2.1. Let $S = \sum_{i=1}^{p} B_i^T \otimes A_i + \sum_{i=1}^{q} (D_i^T \otimes C_i)P_{nm}$. Then (1) has a unique solution if and only if $\mathrm{rank}[S, \mathrm{col}[F]] = \mathrm{rank}[S] = mn$ (i.e., $S$ has full column rank). In this case, the unique solution is given by $\mathrm{col}[X] = (S^TS)^{-1}S^T\mathrm{col}[F]$, and the corresponding homogeneous matrix equation in (1) with $F = 0$ has the unique solution $X = 0$.
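Lemma 2.1 translates directly into a dense Kronecker-product solver. The sketch below (our illustration under the lemma's full-column-rank assumption, reusing commutation_matrix and col from the previous sketch) forms $S$ and solves the normal equations:

```python
import numpy as np

def solve_direct(As, Bs, Cs, Ds, F):
    """Lemma 2.1: col[X] = (S^T S)^{-1} S^T col[F] with
    S = sum_i B_i^T (x) A_i + (sum_i D_i^T (x) C_i) P_nm."""
    m, n = As[0].shape[1], Bs[0].shape[0]
    P_nm = commutation_matrix(n, m)
    S = sum(np.kron(B.T, A) for A, B in zip(As, Bs))
    S = S + sum(np.kron(D.T, C) for C, D in zip(Cs, Ds)) @ P_nm
    x = np.linalg.solve(S.T @ S, S.T @ col(F))  # assumes S has full column rank
    return x.reshape((m, n), order="F")

# Quick check on a random square instance with p = q = 1.
rng = np.random.default_rng(1)
A, B, C, D, X_true = (rng.standard_normal((3, 3)) for _ in range(5))
F = A @ X_true @ B + C @ X_true.T @ D
assert np.allclose(solve_direct([A], [B], [C], [D], F), X_true)
```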
By Lemma 2.1, we can obtain the solution of (1), but doing so requires excessive computer memory, since $S$ is an $rs \times mn$ matrix. Therefore, iterative methods are preferred.

In [17], the authors presented the following gradient based iterative algorithm for solving (1) by applying the hierarchical identification principle.
Algorithm 1: The gradient based iterative algorithm for Eq. (1)
1. $X(k) = \frac{1}{p+q}\Big[\sum_{j=1}^{p} X_j(k) + \sum_{l=1}^{q} X_{p+l}(k)\Big]$,
2. $X_j(k) = X(k-1) + \mu A_j^T \Big[F - \sum_{i=1}^{p} A_i X(k-1) B_i - \sum_{i=1}^{q} C_i X^T(k-1) D_i\Big] B_j^T$,
3. $X_{p+l}(k) = X(k-1) + \mu D_l \Big[F - \sum_{i=1}^{p} A_i X(k-1) B_i - \sum_{i=1}^{q} C_i X^T(k-1) D_i\Big]^T C_l$.
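A direct transcription of the three steps into NumPy might look as follows (a minimal sketch, ours; note that all $p+q$ sub-iterates share the same residual, so it is computed once per sweep):

```python
import numpy as np

def gradient_iteration(As, Bs, Cs, Ds, F, mu, X0, iters=100):
    """Algorithm 1 for sum_i A_i X B_i + sum_i C_i X^T D_i = F."""
    p, q = len(As), len(Cs)
    X = X0.copy()
    for _ in range(iters):
        # Common residual F - sum_i A_i X B_i - sum_i C_i X^T D_i.
        R = (F - sum(A @ X @ B for A, B in zip(As, Bs))
               - sum(C @ X.T @ D for C, D in zip(Cs, Ds)))
        Xs = [X + mu * A.T @ R @ B.T for A, B in zip(As, Bs)]  # step 2
        Xs += [X + mu * D @ R.T @ C for C, D in zip(Cs, Ds)]   # step 3
        X = sum(Xs) / (p + q)                                  # step 1
    return X
```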
The algorithm above has been proved to be convergent under certain conditions.
Lemma 2.2. ([17]) If the equation in (1) has a unique solution $X$ and the convergence factor $\mu$ satisfies the following condition
$$0 < \mu < 2\Big(\sum_{j=1}^{p} \lambda_{\max}[A_jA_j^T]\,\lambda_{\max}[B_j^TB_j] + \sum_{l=1}^{q} \lambda_{\max}[C_lC_l^T]\,\lambda_{\max}[D_l^TD_l]\Big)^{-1},$$
then the iterative solution $X(k)$ given by Algorithm 1 converges to $X$, i.e., $\lim_{k\to\infty} X(k) = X$; or, the error $X(k) - X$ converges to zero for any initial value $X(0)$.

Denoting the error by $\tilde{X}(k) = X(k) - X$, we can get the error equation of Eq. (1) (see [17]) as
$$\mathrm{col}[\tilde{X}(k)] = \Big(I_{mn} - \frac{\mu}{p+q}\Phi\Big)\,\mathrm{col}[\tilde{X}(k-1)], \qquad (3)$$
where
$$\Phi \triangleq \sum_{j=1}^{p}\sum_{i=1}^{p} B_jB_i^T \otimes A_j^TA_i + \sum_{l=1}^{q}\sum_{i=1}^{q} C_l^TC_i \otimes D_lD_i^T + \sum_{j=1}^{p}\sum_{i=1}^{q} (B_jD_i^T \otimes A_j^TC_i)P_{nm} + \sum_{l=1}^{q}\sum_{i=1}^{p} (C_l^TA_i \otimes D_lB_i^T)P_{nm}.$$
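For small problems $\Phi$ can be assembled explicitly; the following sketch (ours, reusing commutation_matrix from the earlier sketch, and assuming non-empty coefficient lists) mirrors the four Kronecker sums term by term:

```python
import numpy as np

def build_phi(As, Bs, Cs, Ds):
    """The iteration matrix Phi of Eq. (3), assembled term by term."""
    m, n = As[0].shape[1], Bs[0].shape[0]
    P_nm = commutation_matrix(n, m)
    Phi = sum(np.kron(Bj @ Bi.T, Aj.T @ Ai)
              for (Aj, Bj) in zip(As, Bs) for (Ai, Bi) in zip(As, Bs))
    Phi = Phi + sum(np.kron(Cl.T @ Ci, Dl @ Di.T)
                    for (Cl, Dl) in zip(Cs, Ds) for (Ci, Di) in zip(Cs, Ds))
    Phi = Phi + sum(np.kron(Bj @ Di.T, Aj.T @ Ci)
                    for (Aj, Bj) in zip(As, Bs) for (Ci, Di) in zip(Cs, Ds)) @ P_nm
    Phi = Phi + sum(np.kron(Cl.T @ Ai, Dl @ Bi.T)
                    for (Cl, Dl) in zip(Cs, Ds) for (Ai, Bi) in zip(As, Bs)) @ P_nm
    return Phi
```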
According to Eq. (3), we can easily get the following lemma.

Lemma 2.3. If the equation in (1) has a unique solution $X$, then Algorithm 1 converges for any initial value $X(0)$ if and only if the spectral radius of the matrix $I_{mn} - \frac{\mu}{p+q}\Phi$ is less than one, that is, $\rho(I_{mn} - \frac{\mu}{p+q}\Phi) < 1$. If the matrix $\Phi$ is positive definite, then Algorithm 1 converges if and only if the convergence factor satisfies the following condition
$$0 < \mu < \frac{2(p+q)}{\lambda_{\max}(\Phi)}.$$

In other words, the closer the spectral radius of $I_{mn} - \frac{\mu}{p+q}\Phi$ is to 0, the faster the error $\tilde{X}(k)$ converges to zero. However, in [17] the authors did not discuss how to choose the best convergence factor $\mu$, and pointed out that this is a project to be studied in the future.

In the next section, we will discuss this problem and derive the optimal convergence factor formula.
3. The selection of the optimal convergence factor
Firstly, we present the following lemma, which is Theorem 3.4 in [19].
Lemma 3.1. Let $P_{mn}$ be the $mn \times mn$ matrix defined by (2), $A \in \mathbb{R}^{n\times m}$, $B \in \mathbb{R}^{m\times n}$. Then we have $P_{mn}(A \otimes B)P_{mn} = B \otimes A$.
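A quick numerical check of Lemma 3.1 with the $P_{mn}$ built earlier (our sketch, using a non-square pair $A \in \mathbb{R}^{2\times 3}$, $B \in \mathbb{R}^{3\times 2}$):

```python
import numpy as np

# Lemma 3.1 with m = 3, n = 2: A is n x m, B is m x n.
m, n = 3, 2
rng = np.random.default_rng(2)
A, B = rng.standard_normal((n, m)), rng.standard_normal((m, n))
P_mn = commutation_matrix(m, n)
assert np.allclose(P_mn @ np.kron(A, B) @ P_mn, np.kron(B, A))
```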
According to Lemma 3.1, it is straightforward to verify that the matrix $\Phi$ is symmetric. The following lemma plays an important role in determining the optimal convergence factor $\mu$; its proof is similar to that of Lemma 5 in [12].
Lemma 3.2. Let $a, b \in \mathbb{R}$ with $b > a$, and let $\mu > 0$. Then:
(a) If $b > a > 0$, then $\min_{0<\mu<2/b}\{\max\{|1-\mu a|, |1-\mu b|\}\} = \frac{b-a}{b+a}$, and the minimum is attained at the point $\mu = \frac{2}{a+b}$;
(b) If $b > 0 > a$, then $\min_{\mu>0}\{\max\{|1-\mu a|, |1-\mu b|\}\} > 1$, that is, for any $\mu > 0$ we have $\max\{|1-\mu a|, |1-\mu b|\} > 1$;
(c) If $a < b < 0$, then $\min_{\mu>0}\{\max\{|1-\mu a|, |1-\mu b|\}\} > 1$, and for any $\mu > 0$ we have $\max\{|1-\mu a|, |1-\mu b|\} > 1$.

Proof. (a) If $b > a > 0$, then we have
$$\max\{|1-\mu a|, |1-\mu b|\} = \begin{cases} 1-\mu a, & \mu \le \frac{2}{a+b}, \\ \mu b - 1, & \mu \ge \frac{2}{a+b}, \end{cases}$$
which implies $1-\mu a \ge 1 - \frac{2}{a+b}a = \frac{b-a}{b+a}$ and $\mu b - 1 \ge \frac{2}{a+b}b - 1 = \frac{b-a}{b+a}$, i.e., $\min_{0<\mu<2/b}\{\max\{|1-\mu a|, |1-\mu b|\}\} = \frac{b-a}{b+a}$, and the minimum is attained at the point $\mu = \frac{2}{a+b}$. The proof of (b) and (c) is trivial.
The following lemma is an immediate consequence of Lemma 3.2.

Lemma 3.3. If Algorithm 1 converges to the solution of Eq. (1) for any initial value $X(0)$, then the matrix $\Phi$ must be positive definite, and hence symmetric positive definite.

Now we can present the main result of this paper.

Theorem 3.4. If the matrix $\Phi$ is negative definite or indefinite, then Algorithm 1 will diverge for some initial values; otherwise, if $0 < \mu < \frac{2(p+q)}{\lambda_{\max}(\Phi)}$, it will converge, and in this case the optimal convergence factor is
$$\mu_{\mathrm{optimal}} = \frac{2(p+q)}{\lambda_{\min}(\Phi) + \lambda_{\max}(\Phi)}. \qquad (4)$$

Proof. The first part follows from Lemma 3.2(b) and (c). According to (3), the optimal convergence factor $\mu$ should be chosen to minimize the spectral radius of the matrix $I_{mn} - \frac{\mu}{p+q}\Phi$. As $\Phi$ is symmetric positive definite, this spectral radius is given by
$$\max\Big\{\Big|1 - \frac{\mu}{p+q}\lambda_{\min}(\Phi)\Big|,\ \Big|1 - \frac{\mu}{p+q}\lambda_{\max}(\Phi)\Big|\Big\},$$
which is less than one when $0 < \mu < \frac{2(p+q)}{\lambda_{\max}(\Phi)}$, so Algorithm 1 converges. Then by Lemma 3.2(a), the optimal convergence factor is given by (4).
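Once $\Phi$ is available, Eq. (4) and the convergence threshold of Lemma 2.3 are a few lines of code (our sketch):

```python
import numpy as np

def optimal_mu(Phi, p, q):
    """Eq. (4), assuming Phi is symmetric positive definite (Lemma 3.3).
    Returns (mu_optimal, upper convergence threshold of Lemma 2.3)."""
    lam = np.linalg.eigvalsh(Phi)  # eigenvalues in ascending order
    if lam[0] <= 0:
        raise ValueError("Phi is not positive definite; Algorithm 1 can diverge")
    mu_opt = 2 * (p + q) / (lam[0] + lam[-1])
    mu_max = 2 * (p + q) / lam[-1]
    return mu_opt, mu_max
```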
Remark 1. F. Ding et al. presented the gradient based algorithm for Sylvester matrix equations in [5], and Q. Niu et al. presented the relaxed gradient based algorithm for the same matrix equations in [13]. The authors only discussed the convergence factor $\mu$ by numerical experiments, and pointed out that how to choose the best convergence factor in [6] and the best relaxation factor in [13] is very difficult and is a subject to be studied in the future. In fact, by a similar idea, we can easily obtain the optimal convergence factor of [6] and the optimal relaxation factor of [13]. As well, we can show that Ding's algorithm in [6] and Niu's algorithm in [13] are completely equivalent (both numerically and mathematically) if we take $\mu_d = \omega(1-\omega)\mu_n$, where $\mu_d$ and $\mu_n$ are the convergence factors of Ding's algorithm and Niu's algorithm, respectively.
Remark 2. According to Theorem 3.4, to achieve good convergence, eigenvalue estimates are required in order to obtain the optimal or a near-optimal $\mu$, and this may cause difficulties. In addition, when $\lambda_{\max}(\Phi)$ is very large, the curve $\rho(I_{mn} - \frac{\mu}{p+q}\Phi)$ can be extremely sensitive near the optimal value of $\mu$. These observations are common to many iterative approaches that depend on an acceleration parameter.
4. Numerical examples
This section gives two examples, which are the same as those in [17], to verify the theoretical findings.
Example 1. ([17]) Suppose that $AX + X^TB = F$, where
$$A = \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \quad F = \begin{pmatrix} 8 & 8 \\ 5 & 2 \end{pmatrix}.$$
The exact solution of the matrix equation above is $X = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$. The matrices
$$P_{22} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad \Phi = \begin{pmatrix} 9 & 1 & 1 & 1 \\ 1 & 6 & 0 & 0 \\ 1 & 0 & 3 & 2 \\ 1 & 0 & 2 & 2 \end{pmatrix}.$$
By Theorem 3.4, the optimal convergence factor is 0.3961, and $\frac{2(1+1)}{\lambda_{\max}(\Phi)} \approx 0.414$. Take $X(0) = 10^{-6}\,\mathbf{1}_{2\times 2}$ and $\mu$ = 0.3961, 0.0961, 0.1961, 0.2961, 0.412. Applying Algorithm 1 to compute $X(k)$, the iterative errors $\delta := \|X(k)-X\|_F / \|X\|_F$ versus $k$ are shown in Fig. 1 (plotted with the Matlab command semilogy).
According to Fig. 1 and Fig. 2, it is clear that the larger the convergence factor $\mu$, the faster the convergence rate, and when the convergence factor $\mu$ is taken to be 0.3961, the convergence rate is fastest. However, when the convergence factor $\mu$ is greater than 0.3961, the convergence rate slows down, and when $\mu$ is greater than 0.414, Algorithm 1 diverges (see Fig. 2), which verifies the theoretical findings.

Fig. 1. The comparison of convergence rates for different $\mu$ ($\mu$ = 0.3961, 0.1961, 0.2961, 0.0961, 0.412).

Fig. 2. The convergence curve for $\mu$ = 0.414.
Also, if we take the termination condition of Algorithm 1 to be the relative error $\delta \le 10^{-5}$, we can plot the number of iterative steps $k$ versus the convergence factor $\mu$ (see Fig. 3); a sketch of this experiment is given after Fig. 3 below.
Fig. 3. The iterative steps k versus the convergence factor µ.
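The Fig. 3 experiment can be sketched as follows (ours; it reuses A, B and F from the previous sketch and specializes Algorithm 1 to $AX + X^TB = F$, for which the averaged update is $X \leftarrow X + \frac{\mu}{2}(A^TR + BR^T)$):

```python
import numpy as np

X_exact = np.array([[1.0, 2.0], [3.0, 4.0]])

def steps_to_tol(mu, tol=1e-5, kmax=2000):
    """Iterations of Algorithm 1 (specialized to AX + X^T B = F)
    until the relative error delta drops below tol."""
    X = 1e-6 * np.ones((2, 2))
    for k in range(1, kmax + 1):
        R = F - A @ X - X.T @ B
        X = X + 0.5 * mu * (A.T @ R + B @ R.T)  # average of steps 2 and 3
        delta = np.linalg.norm(X - X_exact) / np.linalg.norm(X_exact)
        if delta <= tol:
            return k
    return kmax

for mu in (0.0961, 0.1961, 0.2961, 0.3961):
    print(mu, steps_to_tol(mu))  # fewer steps as mu approaches the optimum
```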
Fig. 4. The comparison of convergence rates for different µ (µ = 0.0207, 0.0100, 0.0050, 0.0030, 0.0210).
Example 2. ([17]) Suppose that $A_1XB_1 + A_2XB_2 + C_1X^TD_1 + C_2X^TD_2 = F$, where
$$A_1 = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 0 & 1 \\ 3 & 1 \end{pmatrix}, \quad B_1 = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \quad B_2 = \begin{pmatrix} 3 & 1 \\ 2 & 1 \end{pmatrix},$$
$$C_1 = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}, \quad C_2 = \begin{pmatrix} 1 & 3 \\ 1 & 2 \end{pmatrix}, \quad D_1 = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \quad D_2 = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}, \quad F = \begin{pmatrix} 35 & 9 \\ 20 & 7 \end{pmatrix}.$$
The exact solution of the above linear matrix equation is
$$X = \begin{pmatrix} 1 & 2 \\ 3 & 1 \end{pmatrix}.$$
The matrices
$$P_{22} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad \Phi = \begin{pmatrix} 143 & 38 & 133 & 52 \\ 38 & 51 & 28 & 25 \\ 133 & 28 & 289 & 14 \\ 52 & 25 & 14 & 39 \end{pmatrix}.$$
.

Citations

- "The accelerated gradient based iterative algorithm for solving a class of generalized Sylvester-transpose matrix equation" (journal article). TL;DR: An accelerated gradient based algorithm, obtained by minimizing a certain criterion quadratic function, for solving the generalized Sylvester-transpose matrix equation AXB + CX^TD = F; it converges to the exact solution for any initial value provided that some appropriate assumptions are made.
- "Gradient-based iterative algorithm for solving the generalized coupled Sylvester-transpose and conjugate matrix equations over reflexive (anti-reflexive) matrices" (journal article). TL;DR: The generalized coupled Sylvester-transpose and conjugate matrix equations are considered, and an iterative algorithm is proposed for solving these coupled linear matrix equations over the group of reflexive (anti-reflexive) matrices.
- "Gradient Based Iterative Algorithm to Solve General Coupled Discrete-Time Periodic Matrix Equations over Generalized Reflexive Matrices" (journal article). Abstract: Discrete-time periodic matrix equations are encountered in periodic state feedback problems and model reduction of periodic descriptor systems. The aim of this paper is to compute the generalized reflexive solutions of the general coupled discrete-time periodic matrix equations. We introduce a gradient-based iterative (GI) algorithm for finding the generalized reflexive solutions of the general coupled discrete-time periodic matrix equations. It is shown that the introduced GI algorithm always converges to the generalized reflexive solutions for any initial generalized reflexive matrices. Finally, two numerical examples are investigated to confirm the efficiency of the GI algorithm.
- "Alternating direction method for generalized Sylvester matrix equation AXB + CYD = E" (journal article). TL;DR: Numerical experiments show that the proposed algorithms tend to deliver higher quality solutions with fewer iteration steps and less computing time than recent algorithms on the tested problems.
- "Gradient-based iterative algorithms for generalized coupled Sylvester-conjugate matrix equations" (journal article). TL;DR: By applying the hierarchical identification principle, a gradient-based iterative algorithm is suggested to solve a class of complex matrix equations; the iterative solutions given by the proposed algorithm converge to the exact solution for any initial matrices.
References

- "Model Reduction for Control System Design" (book). TL;DR: Model and controller reduction based on coprime factorization (CF) is proposed for low-order controller design, where the model is reduced by multiplicative approximation and the controller is reduced based on coprime factorization.
- "A recurrent neural network for solving Sylvester equation with time-varying coefficients" (journal article). TL;DR: A recurrent neural network with implicit dynamics is deliberately developed so that its trajectory is guaranteed to converge exponentially to the time-varying solution of a given Sylvester equation.
- "Gradient based iterative algorithms for solving a class of matrix equations" (journal article). TL;DR: A hierarchical identification principle is applied to study solving the Sylvester and Lyapunov matrix equations, and it is proved that the iterative solution consistently converges to the true solution for any initial value.
- "Iterative least-squares solutions of coupled Sylvester matrix equations" (journal article). TL;DR: A general family of iterative methods to solve linear equations, which includes the well-known Jacobi and Gauss–Seidel iterations as special cases, is presented, and it is proved that the iterative solution consistently converges to the exact solution for any initial value.
- "On Iterative Solutions of General Coupled Matrix Equations" (journal article). TL;DR: This paper extends the well-known Jacobi and Gauss–Seidel iterations, presents a large family of iterative methods, which are then applied to develop iterative solutions to coupled Sylvester matrix equations, and proves that the iterative algorithm always converges to the (unique) solution for any initial values.