Performance analysis of the generalized projection identification for time-varying systems

Feng Ding 1, Ling Xu 1,2, Quanmin Zhu 3

1. Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, PR China
2. School of Internet of Things Technology, Wuxi Vocational Institute of Commerce, Wuxi 214153, PR China
3. Department of Engineering Design and Mathematics, University of the West of England, Bristol BS16 1QY, UK

* Corresponding author: fding@jiangnan.edu.cn

February 23, 2016
Abstract: The least mean square methods include two typical parameter estimation algorithms: the projection algorithm, which is sensitive to noise, and the stochastic gradient algorithm, which is not capable of tracking time-varying parameters. On the basis of these two typical algorithms, this paper presents a generalized projection identification algorithm for time-varying systems and studies its convergence by using stochastic process theory. The analysis indicates that the generalized projection algorithm can track the time-varying parameters and requires less computational effort than the forgetting factor recursive least squares algorithm. The way of choosing the data window length so as to minimize the parameter estimation error upper bound is stated, and numerical examples are provided.
Keywords: Parameter estimation; recursive identification; time-varying system
1 Introduction
Establishing mathematical models of things or systems is a central task of the natural sciences. Mathematical models are very important in many areas such as controller design [1, 2], information filtering [3, 4], fault detection and diagnosis [5, 6], and state filtering and estimation [7–9]. System identification comprises the theory and methods of establishing mathematical models of systems [10–12], and parameter estimation methods are basic to system identification. Recently, Ding and Gu analyzed the performance of the auxiliary model-based stochastic gradient parameter estimation algorithm for state space systems with one-step state delay [13]. This paper considers an identification algorithm and its performance analysis for time-varying systems [14, 15],
A(t, z)y(t) = B(t, z)u(t) + v(t),   (1)

where {u(t)} and {y(t)} are the input and output sequences of the system, respectively, {v(t)} is a stochastic noise sequence with zero mean, z^{-1} is the unit backward shift operator: z^{-1}y(t) = y(t−1), and A(t, z) and B(t, z) are time-varying coefficient polynomials in z^{-1}:

A(t, z) := 1 + a_1(t)z^{-1} + a_2(t)z^{-2} + · · · + a_{n_a}(t)z^{-n_a},
B(t, z) := b_1(t)z^{-1} + b_2(t)z^{-2} + · · · + b_{n_b}(t)z^{-n_b}.
Define the time-varying parameter vector ϑ(t−1) ∈ R^n to be identified and the regressive information vector ψ(t) ∈ R^n consisting of the observations up to and including time (t−1):

ϑ(t−1) := [a_1(t), · · · , a_{n_a}(t), b_1(t), · · · , b_{n_b}(t)]^T ∈ R^n,
ψ(t) := [−y(t−1), −y(t−2), · · · , −y(t−n_a), u(t−1), u(t−2), · · · , u(t−n_b)]^T ∈ R^n,

where the superscript T denotes the vector transpose. Equation (1) can be written in the vector form

y(t) = ψ^T(t)ϑ(t−1) + v(t).   (2)
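For concreteness, the following is a minimal simulation sketch of the model (1)–(2); it is not part of the paper, and the function name simulate, the sinusoidal parameter trajectory and the noise level are illustrative assumptions.

```python
import numpy as np

def simulate(theta_of_t, na, nb, T, sigma_v=0.2, seed=0):
    """Generate {u(t)}, {y(t)} from y(t) = psi^T(t) theta(t-1) + v(t), Eq. (2)."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T)               # zero-mean, unit-variance input
    v = sigma_v * rng.standard_normal(T)     # zero-mean white observation noise
    y = np.zeros(T)
    for t in range(T):
        # psi(t) = [-y(t-1), ..., -y(t-na), u(t-1), ..., u(t-nb)]^T
        psi = np.array([-y[t - i] if t - i >= 0 else 0.0 for i in range(1, na + 1)]
                       + [u[t - i] if t - i >= 0 else 0.0 for i in range(1, nb + 1)])
        y[t] = psi @ theta_of_t(t) + v[t]
    return u, y

# e.g. na = nb = 1 with a slowly drifting a_1(t) and a constant b_1:
u, y = simulate(lambda t: np.array([0.5 * np.sin(0.01 * t), 1.0]), 1, 1, 200)
```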
The forgetting factor recursive least squares (FF-RLS) algorithm is effective for estimating the time-varying parameter vector [16]. In the literature, Lozano [17], and Canetti and Espana [18] analyzed the performance of the FF-RLS algorithms for time-invariant and time-varying systems, respectively. Unfortunately, as the forgetting factor approaches unity, their results (i.e., the parameter estimation error bounds) go to infinity even for time-invariant systems whose parameters are constant [15]. Bittanti et al studied the convergence properties of the directional FF-RLS algorithms for time-invariant deterministic systems [19]; for ergodic input-output data, Ljung and Priouret [20, 21] and Guo et al [22] obtained a parameter estimation error (PEE) upper bound of the form
E[‖ϑ̂(t) − ϑ(t)‖²] ≤ k_1(1 − λ) sup E[v²(t)] + (k_2/(1 − λ)) sup E[‖w(t)‖²] + O((1 − λ)^{3/2} + c(1 − λ)^{1/2})   (3)
for large enough t, where ϑ̂(t) is the estimate of ϑ(t), E represents the expectation operator, 0 < λ < 1 is the forgetting factor, k_1, k_2 and c are positive constants, and w(t) is the parameter changing rate. However, for deterministic time-varying systems, i.e., with observation noise v(t) ≡ 0, the PEE upper bound [i.e., the expression on the right-hand side of (3)] remains bounded as λ → 0. Unfortunately, this result is incompatible with the existing ones, because as λ → 0 the covariance matrix goes to infinity and it is impossible to obtain a bounded PEE. This motivates us to present a novel generalized stochastic gradient algorithm.
Although the forgetting factor recursive least squares algorithm can estimate the time-varying parameter vector ϑ(t), its computational load is heavy owing to computing the covariance matrix [15, 23]. Among the algorithms with lower computational complexity, the projection algorithm is sensitive to noise and the stochastic gradient (SG) algorithm is not capable of tracking time-varying parameters [23]. On the basis of the work in [24], this paper combines the advantages of the projection algorithm and the SG algorithm to present a generalized projection identification algorithm for time-varying systems, and studies the convergence performance of the proposed algorithm by using stochastic process theory. The generalized projection algorithm can track the time-varying parameters and requires less computational effort than the forgetting factor recursive least squares algorithm.
This paper is organized as follows. Section 2 gives several time-varying parameter estimation algorithms and
derives the generalized projection algorithm. Section 3 provides several lemmas to prove the main convergence
results in Section 4. Section 5 provides two numerical examples and summarizes some conclusions.
2 The generalized projection algorithm
Let us introduce some notation. Let I_n be an identity matrix of order n, let tr[X] denote the trace of a square matrix X, let the norm of X be ‖X‖ := (tr[X^T X])^{1/2} = (tr[XX^T])^{1/2}, and let λ_max[X] represent the maximum eigenvalue of the positive definite matrix X.
The following discusses several time-varying parameter estimation algorithms for identifying the parameter vector ϑ(t) in real time by using the input-output data, i.e., the observations {u(j), y(j): j ≤ t}.
Using the Newton method to minimize the cost function

J(ϑ(t)) := Σ_{j=1}^{t} [y(j) − ψ^T(j)ϑ(t)]²

gives the following recursive algorithm for estimating the parameter vector ϑ(t) [14]:

ϑ̂(t) = ϑ̂(t−1) + R^{-1}(t)ψ(t)[y(t) − ψ^T(t)ϑ̂(t−1)],   (4)
where ϑ̂(t) is the estimate of ϑ(t) at time t. Different choices of the matrix R(t) ∈ R^{n×n} lead to different identification algorithms, e.g., the forgetting factor recursive least squares algorithm, the projection algorithm, the finite data window least squares algorithm, the stochastic gradient algorithm, and so on.
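The recursion (4) can be prototyped generically before specializing R(t). In the hedged sketch below, the names identify and R_update are hypothetical: R_update(t, psi) stands for whichever rule, (5), (10), (14) or (20), supplies R(t), and may keep internal state across calls.

```python
import numpy as np

def identify(Psi, y, R_update):
    """Run recursion (4): theta(t) = theta(t-1) + R^{-1}(t) psi(t) e(t).
    Psi is a (T, n) array whose rows are psi(t); y is the (T,) output."""
    T, n = Psi.shape
    theta = np.zeros(n)
    for t in range(T):
        psi = Psi[t]
        e = y[t] - psi @ theta                            # innovation
        theta = theta + np.linalg.solve(R_update(t, psi), psi) * e
    return theta
```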
1. If we take R(t) := P^{-1}(t), with

P^{-1}(t) := λP^{-1}(t−1) + ψ(t)ψ^T(t), 0 < λ < 1,   (5)

then Equations (4) and (5) form the FF-RLS algorithm [15]:

ϑ̂(t) = ϑ̂(t−1) + P(t)ψ(t)[y(t) − ψ^T(t)ϑ̂(t−1)],   (6)
P^{-1}(t) = λP^{-1}(t−1) + ψ(t)ψ^T(t), 0 < λ < 1.   (7)
As the forgetting factor λ goes to unity, the FF-RLS algorithm in (6) and (7) reduces to the recursive least squares (RLS) algorithm:

ϑ̂(t) = ϑ̂(t−1) + P(t)ψ(t)[y(t) − ψ^T(t)ϑ̂(t−1)],   (8)
P^{-1}(t) = P^{-1}(t−1) + ψ(t)ψ^T(t), P(0) = p_0 I_n.   (9)

The number p_0 should be large enough if the regression variables take very small values.
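A minimal sketch of the FF-RLS algorithm (6)–(7) follows. It propagates P(t) through the matrix inversion lemma, a standard equivalent form (not spelled out in the paper) that avoids inverting P^{-1}(t) at each step; setting lam = 1 recovers the RLS algorithm (8)–(9).

```python
import numpy as np

def ff_rls(Psi, y, lam=0.98, p0=1e6):
    """FF-RLS (6)-(7) in inversion-lemma form; Psi rows are psi(t)."""
    T, n = Psi.shape
    theta = np.zeros(n)
    P = p0 * np.eye(n)                        # P(0) = p0 * I_n, p0 large
    for t in range(T):
        psi = Psi[t]
        gain = P @ psi / (lam + psi @ P @ psi)
        theta = theta + gain * (y[t] - psi @ theta)
        P = (P - np.outer(gain, psi @ P)) / lam
    return theta
```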
2. If we take R(t) := P^{-1}(t), with

P^{-1}(t) = Σ_{i=0}^{q−1} ψ(t−i)ψ^T(t−i)   (10)
          = P^{-1}(t−1) + ψ(t)ψ^T(t) − ψ(t−q)ψ^T(t−q),   (11)
then Equations (4) and (11) form the finite data window recursive least squares (FDW-RLS) algorithm:

ϑ̂(t) = ϑ̂(t−1) + P(t)ψ(t)[y(t) − ψ^T(t)ϑ̂(t−1)],   (12)
P^{-1}(t) = P^{-1}(t−1) + ψ(t)ψ^T(t) − ψ(t−q)ψ^T(t−q), P(0) = p_0 I_n,   (13)

where q is the length of the data window.
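A sketch of the FDW-RLS algorithm (12)–(13), under the same array conventions as above. For clarity it propagates P^{-1}(t) directly and solves a linear system at each step; a production implementation would instead downdate P(t) by the inversion lemma.

```python
import numpy as np

def fdw_rls(Psi, y, q, p0=1e6):
    """FDW-RLS (12)-(13): a sliding window of the last q regressors."""
    T, n = Psi.shape
    theta = np.zeros(n)
    Pinv = np.eye(n) / p0                     # P(0) = p0 * I_n
    for t in range(T):
        psi = Psi[t]
        Pinv = Pinv + np.outer(psi, psi)      # datum entering the window
        if t - q >= 0:                        # datum leaving the window
            old = Psi[t - q]
            Pinv = Pinv - np.outer(old, old)
        theta = theta + np.linalg.solve(Pinv, psi) * (y[t] - psi @ theta)
    return theta
```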
3. If we take R(t) := r(t)I_n, take the trace of both sides of Equation (7), and define

r(t) := tr[P^{-1}(t)] = λr(t−1) + ‖ψ(t)‖²,   (14)

then Equations (4) and (14) form the forgetting factor stochastic gradient (FFSG) algorithm (the forgetting gradient algorithm for short):

ϑ̂(t) = ϑ̂(t−1) + (ψ(t)/r(t))[y(t) − ψ^T(t)ϑ̂(t−1)],   (15)
r(t) = λr(t−1) + ‖ψ(t)‖², 0 < λ < 1, r(0) = 1.   (16)
As the forgetting factor λ goes to unity, the algorithm in (15)–(16) becomes the following stochastic gradient (SG) algorithm [23]:

ϑ̂(t) = ϑ̂(t−1) + (ψ(t)/r(t))[y(t) − ψ^T(t)ϑ̂(t−1)],   (17)
r(t) = r(t−1) + ‖ψ(t)‖², r(0) = 1.   (18)
Recently, a multi-innovation stochastic gradient algorithm was proposed to track time-varying parameters for
linear regression models [25].
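For reference, a minimal sketch of the FFSG algorithm (15)–(16); with lam = 1 it reduces to the SG algorithm (17)–(18).

```python
import numpy as np

def ffsg(Psi, y, lam=0.98):
    """FFSG (15)-(16); scalar step-size normalizer r(t), r(0) = 1."""
    T, n = Psi.shape
    theta = np.zeros(n)
    r = 1.0
    for t in range(T):
        psi = Psi[t]
        r = lam * r + psi @ psi                             # (16)
        theta = theta + (psi / r) * (y[t] - psi @ theta)    # (15)
    return theta
```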
4. If we take λ = 0 in (16), then the forgetting factor stochastic gradient algorithm in (15)–(16) becomes the projection (PJ) identification algorithm:

ϑ̂(t) = ϑ̂(t−1) + (ψ(t)/‖ψ(t)‖²)[y(t) − ψ^T(t)ϑ̂(t−1)].   (19)
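A sketch of the projection algorithm (19); the small constant eps is an added safeguard against a vanishing regressor and is not part of the paper's formulation.

```python
import numpy as np

def projection(Psi, y, eps=1e-12):
    """Projection algorithm (19)."""
    T, n = Psi.shape
    theta = np.zeros(n)
    for t in range(T):
        psi = Psi[t]
        theta = theta + (psi / (psi @ psi + eps)) * (y[t] - psi @ theta)
    return theta
```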
5. If we take R(t) := r(t)I_n, take the trace of both sides of Equation (10), and define

r(t) := tr[P^{-1}(t)] = Σ_{i=0}^{q−1} ‖ψ(t−i)‖²,   (20)

then Equations (4) and (20) form the generalized projection (GPJ) identification algorithm:

ϑ̂(t) = ϑ̂(t−1) + (ψ(t)/r(t))[y(t) − ψ^T(t)ϑ̂(t−1)],   (21)
r(t) = r(t−1) + ‖ψ(t)‖² − ‖ψ(t−q)‖², r(0) = 1.   (22)

When the data window length q = 1, the GPJ algorithm is the projection algorithm in (19); when we take q = t, the GPJ algorithm becomes the stochastic gradient algorithm in (17)–(18).
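A sketch of the GPJ algorithm (21)–(22). With q = 1 it behaves as the projection algorithm (19) (up to the initialization r(0) = 1), and with q = t it coincides with the SG algorithm (17)–(18); the sliding-window bookkeeping mirrors the FDW-RLS sketch above.

```python
import numpy as np

def gpj(Psi, y, q):
    """GPJ (21)-(22): r(t) tracks the last q squared regressor norms."""
    T, n = Psi.shape
    theta = np.zeros(n)
    r = 1.0                                   # r(0) = 1
    for t in range(T):
        psi = Psi[t]
        r = r + psi @ psi                     # ||psi(t)||^2 enters
        if t - q >= 0:
            r = r - Psi[t - q] @ Psi[t - q]   # ||psi(t-q)||^2 leaves
        theta = theta + (psi / r) * (y[t] - psi @ theta)   # (21)
    return theta
```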
3 The basic lemmas
Define the transition matrix

Φ(t+1, j) = [I_n − ψ(t)ψ^T(t)/r(t)]Φ(t, j), Φ(j, j) = I_n,

and the maximum eigenvalue

ρ(t) := λ_max[Φ^T(t+s, t)Φ(t+s, t)].

It follows that

Φ(t+1, t) = I_n − ψ(t)ψ^T(t)/r(t) ∈ R^{n×n}.
Lemma 1 For the system in (2) and the GPJ algorithm in (21)–(22), assume that there exist positive constants α and β and an integer s ≥ n such that the following strong persistent excitation (SPE) condition holds:

(SPE) αI_n ≤ (1/s) Σ_{j=0}^{s−1} ψ(t+j)ψ^T(t+j) ≤ βI_n, a.s., t ≥ 0.

Then the maximum eigenvalue ρ(t) satisfies

ρ(t) ≤ 1 − α²{(q+s)(sβ)²(s+1)²}^{-1} =: ρ < 1, a.s.
Proof Here, we follow the approach of [25] to prove this lemma. Let ζ_0 ∈ R^n be the unit eigenvector corresponding to the maximum eigenvalue ρ(t) of the matrix Φ^T(t+s, t)Φ(t+s, t), i.e., Φ^T(t+s, t)Φ(t+s, t)ζ_0 = ρ(t)ζ_0. Use the transition matrix Φ(t+1, j) to construct the difference equation

ξ(j+1) = Φ(j+1, j)ξ(j), ξ(t) = ζ_0.   (23)

Using the relation Φ(t, i)Φ(i, j) = Φ(t, j), we have

ξ(t+s) = Φ(t+s, t)ξ(t) = Φ(t+s, t)ζ_0.
Taking the norm of both sides and using the definition of the eigenvalue, we have

‖ξ(t+s)‖² = ζ_0^T Φ^T(t+s, t)Φ(t+s, t)ζ_0 = ζ_0^T ρ(t)ζ_0 = ρ(t).
Taking the norm of both sides of (23) gives

‖ξ(j+1)‖² = ξ^T(j+1)ξ(j+1)
          = ξ^T(j)Φ^T(j+1, j)Φ(j+1, j)ξ(j)
          = ξ^T(j)[I_n − ψ(j)ψ^T(j)/r(j)]²ξ(j)
          ≤ ξ^T(j)[I_n − ψ(j)ψ^T(j)/r(j)]ξ(j)
          = ξ^T(j)ξ(j) − [ψ^T(j)ξ(j)]²/r(j)
          = ‖ξ(j)‖² − [ψ^T(j)ξ(j)]²/r(j),

where the inequality holds because r(j) ≥ ‖ψ(j)‖², so that 0 ≼ ψ(j)ψ^T(j)/r(j) ≼ I_n. Thus we have

[ψ^T(j)ξ(j)]²/r(j) ≤ ‖ξ(j)‖² − ‖ξ(j+1)‖².
Replacing j with t + j gives

[ψ^T(t+j)ξ(t+j)]²/r(t+j) ≤ ‖ξ(t+j)‖² − ‖ξ(t+j+1)‖².

Summing for j from j = 0 to j = s − 1 telescopes the right-hand side:

Σ_{j=0}^{s−1} [ψ^T(t+j)ξ(t+j)]²/r(t+j) ≤ Σ_{j=0}^{s−1} (‖ξ(t+j)‖² − ‖ξ(t+j+1)‖²) = ‖ξ(t)‖² − ‖ξ(t+s)‖² = 1 − ρ(t).   (24)
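As a hedged numerical illustration of Lemma 1 (not from the paper), one can assemble Φ(t_0+s, t_0) from random Gaussian regressors and check that ρ(t_0) < 1; the dimensions n, q, s and the data below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, s, t0 = 4, 10, 6, 20
Psi = rng.standard_normal((t0 + s + 1, n))   # exciting random regressors

def r_of(t):
    """r(t) from (20): sum of the last q squared regressor norms."""
    return float(np.sum(Psi[max(0, t - q + 1):t + 1] ** 2))

Phi = np.eye(n)                              # accumulate Phi(t0 + s, t0)
for j in range(t0, t0 + s):
    Phi = (np.eye(n) - np.outer(Psi[j], Psi[j]) / r_of(j)) @ Phi

rho_t = np.linalg.eigvalsh(Phi.T @ Phi).max()
print(rho_t)                                 # observed to be strictly below 1
```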