IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 44, NO. 7, NOVEMBER 1998 2943
A Multistage Representation of the Wiener
Filter Based on Orthogonal Projections
J. Scott Goldstein, Senior Member, IEEE, Irving S. Reed, Fellow, IEEE, and Louis L. Scharf,
Fellow, IEEE
Abstract: The Wiener filter is analyzed for stationary complex Gaussian signals from an information-theoretic point of view. A dual-port analysis of the Wiener filter leads to a decomposition based on orthogonal projections and results in a new multistage method for implementing the Wiener filter using a nested chain of scalar Wiener filters. This new representation of the Wiener filter provides the capability to perform an information-theoretic analysis of previous, basis-dependent, reduced-rank Wiener filters. This analysis demonstrates that the recently introduced cross-spectral metric is optimal in the sense that it maximizes mutual information between the observed and desired processes. A new reduced-rank Wiener filter is developed based on this new structure which evolves a basis using successive projections of the desired signal onto orthogonal, lower dimensional subspaces. The performance is evaluated using a comparative computer analysis model and it is demonstrated that the low-complexity multistage reduced-rank Wiener filter is capable of outperforming the more complex eigendecomposition-based methods.

Index Terms: Adaptive filtering, mutual information, orthogonal projections, rank reduction, Wiener filtering.
I. INTRODUCTION
THIS paper is concerned with the discrete-time Wiener
filter. Here the desired signal, also termed a reference
signal, is assumed to be a scalar process and the observed
signal is assumed to be a vector process. By contrast, a
scalar Wiener filter is described by a desired signal and an
observed signal which are both scalar processes. The so-called
matrix Wiener filter, which is not addressed in this paper, is
characterized by both a desired signal and an observed signal
which are vector processes.
A new approach to Wiener filtering is presented and ana-
lyzed in this paper. The process observed by the Wiener filter
is first decomposed by a sequence of orthogonal projections.
This decomposition has the form of an analysis filterbank,
whose output is shown to be a process which is characterized
by a tridiagonal covariance matrix. The corresponding error-
synthesis filterbank is realized by means of a nested chain of
scalar Wiener filters. These Wiener filters can be interpreted as
well to be a Gram–Schmidt orthogonalization which results in
an error sequence for the successive stages of the decomposed
Wiener filter. This new multistage filter structure achieves the
identical minimum mean-square error that is obtained by the
original multidimensional Wiener filter.

Manuscript received February 22, 1997; revised March 25, 1998. This work was supported in part under a Grant from the Okawa Research Foundation.
J. S. Goldstein was with the Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089-2565 USA. He is now with MIT Lincoln Laboratory, Lexington, MA 02173-9108 USA (e-mail: scott@LL.MIT.EDU).
I. S. Reed is with the Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089-2565 USA.
L. L. Scharf is with the Department of Electrical and Computer Engineering, University of Colorado, Boulder, CO 80309-0425 USA.
Publisher Item Identifier S 0018-9448(98)06747-9.
The advantages realized by this new multistage Wiener
filter are due to the decomposition being designed from a
point of view in which the Wiener filter is treated as a dual-
port problem. The multistage decomposition of the Wiener
filter in the space spanned by the observed-data covariance
matrix utilizes all of the information available to determine a
“best” basis representation of the Wiener filter. Since all full-
rank decompositions of the space spanned by the observed-
data covariance matrix are simply different representations
of the same Wiener filter, the term “best” is used here to
describe the basis representation that comes closest to compactly
representing the estimation energy in the lowest rank subspace
without knowledge of the observed-data
inverse were known then also the Wiener filter would be
known, and the rank-one subspace spanned by the Wiener
filter would itself be the optimal basis vector.
Previous decompositions of the space spanned by the
observed-data covariance matrix only consider the Wiener
filtering problem from the perspective of a single-port problem.
In other words, the decompositions considered were based on
Gram–Schmidt, Householder, Jacobi, or principal-components
analyses of the observed-data covariance matrix (for example,
see [1]–[5] and the references contained therein). Treating the
Wiener filter as a dual-port problem, however, seems more
logical since the true problem at hand is not determining the
best representation of the observed data alone, but instead
finding the best representation of the useful portion of the
observed data in the task of estimating one scalar signal from
a vector observed-data process. Here, the projection of the
desired signal onto the space spanned by the columns of the
observed-data covariance matrix is utilized to determine the
basis set. This basis set is generated in a stage-wise manner
which maximizes the projected estimation energy in each
orthogonal coordinate.
An interesting decomposition which at first appears sim-
ilar in form to that presented in this paper is developed
for constrained adaptive Wiener filters in [6] and [7]. This
decomposition also treats the constrained Wiener filter as a
single-port problem and does not use the constraint (in place
of the desired signal, as detailed in Appendix B) in basis
determination; in fact, the technique proposed in [6] and [7]
removes the constraint itself from the adaptive portion of
the processor. In addition, this method decomposes a rank-
reduction matrix as opposed to the Wiener filter itself and
results in a modular structure which is not a nested recursion.
Thus while the modular structure in [6] and [7] is very
interesting in its own right, neither the decomposition nor the
recursion technique is similar to the new multistage Wiener
filter presented here.
Reduced-rank Wiener filtering is concerned with the com-
pression, or reduction in dimensionality, of the observed data
prior to Wiener filtering. The purpose of rank reduction is
to obtain a minimum mean-square error which is as close as
possible to that obtainable if all of the observed data were
available to linearly estimate the desired signal. The new
multistage Wiener filter structure leads to a natural means to
obtain rank reduction.
The performance of such a reduced-rank multistage Wiener
filter is compared by computer analysis to the well-known
principal components and the lesser known cross-spectral
methods of rank reduction. This analysis demonstrates that
the new method of rank reduction, using quite simply a
truncated version of the above-described nested chain of scalar
Wiener filters, is capable of outperforming these previous
approaches. Also an information-theoretic analysis of entropy
and mutual information is now possible due to this new
structure, which provides insight into these results. In partic-
ular, it is demonstrated that the cross-spectral method of rank
reduction maximizes the mutual information as a function of
rank relative to the eigenvector basis. The new reduced-rank
multistage Wiener filter does not utilize eigendecomposition
or eigenvector-pruning techniques.
Section II provides a brief description of the Wiener filter
in terms of the framework to be used in the remainder of this
paper. An introduction and analysis of this new representation
of the Wiener filter are presented in Section III. A summary of
previous reduced-rank Wiener filtering techniques is provided
in Section IV, where the reduced-rank multistage Wiener filter
is presented and its performance is evaluated via a comparative
computer analysis. Concluding remarks are given in Section V.
II. PRELIMINARIES
The classical Wiener filtering problem is depicted in Fig. 1, where there is a desired scalar signal $d(k)$, an $N$-dimensional observed-data vector $\mathbf{x}(k)$, and an $N$-dimensional Wiener filter $\mathbf{w}$. The error signal is denoted by $\epsilon(k)$. The Wiener filter requires that the signals be modeled as wide-sense-stationary random processes, and the information-theoretic analysis to be considered makes as well the complex Gaussian assumption. Thus in both cases there is no loss in generality to assume that all signals are zero-mean, jointly stationary, complex Gaussian random processes. The covariance matrix of the input vector process $\mathbf{x}(k)$ is given by

$$\mathbf{R}_x = E[\mathbf{x}(k)\mathbf{x}^H(k)] \qquad (1)$$

where $E[\cdot]$ denotes the expected-value operator and $(\cdot)^H$ is the complex conjugate transpose operator.

Fig. 1. The classical Wiener filter.

Similarly, the variance of the desired process is

$$\sigma_d^2 = E[|d(k)|^2]. \qquad (2)$$

The complex cross-correlation vector between the processes $\mathbf{x}(k)$ and $d(k)$ is given by

$$\mathbf{r}_{xd} = E[\mathbf{x}(k)d^*(k)] \qquad (3)$$

where $(\cdot)^*$ is the complex conjugate operator. The optimal linear filter, which minimizes the mean-square error between the desired signal $d(k)$ and its estimate

$$\hat d(k) = \mathbf{w}^H\mathbf{x}(k), \qquad (4)$$

is the classical Wiener filter

$$\mathbf{w} = \mathbf{R}_x^{-1}\mathbf{r}_{xd} \qquad (5)$$

for complex stationary processes. The resulting error is

$$\epsilon(k) = d(k) - \mathbf{w}^H\mathbf{x}(k). \qquad (6)$$

The minimum mean-square error (MMSE) is

$$\xi = \sigma_d^2 - \mathbf{r}_{xd}^H\mathbf{R}_x^{-1}\mathbf{r}_{xd} = \sigma_d^2(1 - \rho^2) \qquad (7)$$

where the squared canonical correlation $\rho^2$ [8]–[11] is

$$\rho^2 = \frac{\mathbf{r}_{xd}^H\mathbf{R}_x^{-1}\mathbf{r}_{xd}}{\sigma_d^2}. \qquad (8)$$
As will be seen in Section III-D, the squared canonical correlation provides a measure of the information present in the observed vector random process $\mathbf{x}(k)$ that is used to estimate the scalar random process $d(k)$.
Because of the assumed Gaussianity, the self-information or entropy of the signal process $d(k)$ is given by (see [12]–[15])

$$H(d) = \ln(\pi e \sigma_d^2) \qquad (9)$$

and the entropy of the vector input process $\mathbf{x}(k)$ is

$$H(\mathbf{x}) = \ln\!\left[(\pi e)^N \det(\mathbf{R}_x)\right] \qquad (10)$$

where $\det(\cdot)$ denotes the determinant operator. Next define an augmented vector $\mathbf{z}(k)$ by

$$\mathbf{z}(k) = \begin{bmatrix} d(k) \\ \mathbf{x}(k) \end{bmatrix}. \qquad (11)$$

Then, using (1)–(3) and (11), the covariance matrix associated with the vector process $\mathbf{z}(k)$ is given by

$$\mathbf{R}_z = E[\mathbf{z}(k)\mathbf{z}^H(k)] = \begin{bmatrix} \sigma_d^2 & \mathbf{r}_{xd}^H \\ \mathbf{r}_{xd} & \mathbf{R}_x \end{bmatrix} \qquad (12)$$

so that, by (10), the joint entropy of the random processes $d(k)$ and $\mathbf{x}(k)$ is given by

$$H(d,\mathbf{x}) = \ln\!\left[(\pi e)^{N+1} \det(\mathbf{R}_z)\right]. \qquad (13)$$

Thus by Shannon’s chain rule the conditional entropy $H(d\,|\,\mathbf{x})$, or what Shannon called the equivocation of $d(k)$ given $\mathbf{x}(k)$, is given by

$$H(d\,|\,\mathbf{x}) = H(d,\mathbf{x}) - H(\mathbf{x}) = \ln\!\left(\pi e\,\frac{\det(\mathbf{R}_z)}{\det(\mathbf{R}_x)}\right) = \ln\!\left(\pi e\,\sigma_d^2(1-\rho^2)\right). \qquad (14)$$

Now the mutual information $I(d;\mathbf{x})$ is the relative entropy between the joint distribution and the product distribution. That is, $I(d;\mathbf{x})$ represents the reduction in uncertainty of $d(k)$ due to the knowledge of $\mathbf{x}(k)$. This mutual information is given by

$$I(d;\mathbf{x}) = H(d) - H(d\,|\,\mathbf{x}) = -\ln(1-\rho^2). \qquad (15)$$

By definition the Wiener filter minimizes the mean-square error between the desired process and the filtered observed process. Therefore, the operation of this filter must determine that portion of the observed process which contains the most information about the desired process. Intuitively, for Gaussian processes one expects that a minimization of the mean-square error and a maximization of the mutual information are equivalent. This insight is mathematically realized through the multistage representation of the Wiener filter presented next in this paper.
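The agreement between the chain-rule computation (13) and (14) and the closed form in (15) holds for any admissible joint covariance. A short numerical check, using a randomly drawn Hermitian positive-definite $\mathbf{R}_z$ (an illustrative construction, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Draw a valid joint covariance R_z for z = [d; x] (Hermitian positive
# definite), then split it into the blocks of Eq. (12).
N = 5
M = rng.standard_normal((N + 1, N + 1)) + 1j * rng.standard_normal((N + 1, N + 1))
R_z = M @ M.conj().T + (N + 1) * np.eye(N + 1)

var_d = R_z[0, 0].real        # sigma_d^2
r_xd = R_z[1:, 0]             # E[x d*]
R_x = R_z[1:, 1:]             # E[x x^H]

# Differential entropies of circular complex Gaussians, Eqs. (9), (10), (13)
_, logdet_Rx = np.linalg.slogdet(R_x)
_, logdet_Rz = np.linalg.slogdet(R_z)
H_d = np.log(np.pi * np.e * var_d)
H_x = N * np.log(np.pi * np.e) + logdet_Rx
H_dx = (N + 1) * np.log(np.pi * np.e) + logdet_Rz

# Mutual information two ways: chain rule (14)-(15) vs. -ln(1 - rho^2)
I_chain = H_d + H_x - H_dx
rho2 = (r_xd.conj() @ np.linalg.solve(R_x, r_xd)).real / var_d
I_rho = -np.log(1.0 - rho2)
```

The equality follows from the Schur-complement identity $\det(\mathbf{R}_z) = \det(\mathbf{R}_x)\,\sigma_d^2(1-\rho^2)$.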
III. THE MULTISTAGE WIENER FILTER
The analysis developed herein emphasizes the standard, un-
constrained Wiener filter. It is noted that an identical approach
also solves the problem of quadratic minimization with linear
constraints [16] and the joint-process estimation problem, both
of which can be interpreted as a constrained Wiener filter.
The partitioned solution presented in [16] decomposes the
constraint in such a manner that the resulting Wiener filter
is unconstrained, as is further explored in the example given
in Section IV-C and Appendix B. It is further noted that
other constraints also may be decomposed similarly [17]. Thus
the constrained Wiener filter can be represented as an uncon-
strained Wiener filter with a prefiltering operation determined
by the constraint. It is seen next that the unconstrained Wiener
filter can also be represented as a nested chain of constrained
Wiener filters.
This new representation of the Wiener filter is achieved by
a multistage decomposition. This decomposition forms two
subspaces at each stage; one in the direction of the cross-
correlation vector at the previous stage and one in the subspace
orthogonal to this direction. Then the data orthogonal to the
cross-correlation vector is decomposed again in the same
manner, stage by stage. This process reduces the dimension of
the data vector at each stage. Thus a new coordinate system
for Wiener filtering is determined via a pyramid-like structured
decomposition which serves as an analysis filterbank. This
decomposition decorrelates the observed vector process at lags
greater than one, resulting in a tridiagonal covariance matrix
associated with the transformed vector process.
A nested chain of scalar Wiener filters forms an error-
synthesis filterbank, which operates on the output of the
analysis filterbank to yield an error process with the same
MMSE as the standard multidimensional Wiener filter. It is
demonstrated also that the error-synthesis filterbank can be
interpreted as an implementation of a Gram–Schmidt orthog-
onalization process.
A. An Equivalent Wiener Filtering Model
To obtain this new multistage decomposition, note that
the preprocessing of the observation data by a full-rank,
nonsingular, linear operator prior to Wiener filtering does not
modify the MMSE. This fact is demonstrated in Appendix A.
Now consider the particular nonsingular operator $\mathbf{T}_1$ with the structure

$$\mathbf{T}_1 = \begin{bmatrix} \mathbf{h}_1^H \\ \mathbf{B}_1 \end{bmatrix} \qquad (16)$$

where $\mathbf{h}_1$ is the normalized cross-correlation vector, a unit vector in the direction of $\mathbf{r}_{xd}$, given by

$$\mathbf{h}_1 = \frac{\mathbf{r}_{xd}}{\|\mathbf{r}_{xd}\|} \qquad (17)$$

and $\mathbf{B}_1$ is an operator whose rows span the nullspace of $\mathbf{h}_1^H$; i.e., $\mathbf{B}_1$ is the blocking matrix which annihilates those signal components in the direction of the vector $\mathbf{h}_1$ [18], [19] such that $\mathbf{B}_1\mathbf{h}_1 = \mathbf{0}$.

Two fast algorithms for obtaining such a unitary matrix $\mathbf{B}_1$ are described in [20], which use either the singular-value decomposition or the QR decomposition. For a $\mathbf{B}_1$ which is nonsingular, but not unitary, a new, very efficient implementation of the blocking matrix is presented in Appendix A.
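For concreteness, a blocking matrix of the unitary-rows type can be read off the full SVD of $\mathbf{h}_1$: the left singular vectors beyond the first are orthonormal and orthogonal to $\mathbf{h}_1$. The helper below is an illustrative construction (its name is ours), not the algorithm of [20] or of Appendix A:

```python
import numpy as np

def blocking_matrix(h):
    """Return an (N-1) x N matrix B with orthonormal rows spanning the
    nullspace of h^H, so that B @ h = 0.  Built from the full SVD of h
    viewed as an N x 1 matrix; the first left singular vector is
    proportional to h, and the remaining N-1 are orthogonal to it.
    (Illustrative construction, not the paper's Appendix A method.)"""
    h = np.asarray(h, dtype=complex).reshape(-1, 1)
    U, _, _ = np.linalg.svd(h, full_matrices=True)
    return U[:, 1:].conj().T

rng = np.random.default_rng(2)
N = 6
r = rng.standard_normal(N) + 1j * rng.standard_normal(N)
h1 = r / np.linalg.norm(r)                 # Eq. (17)
B1 = blocking_matrix(h1)                   # B1 h1 = 0
T1 = np.vstack([h1.conj()[None, :], B1])   # Eq. (16): T1 = [h1^H; B1]
```

With this choice the rows of $\mathbf{T}_1$ are orthonormal, so $\mathbf{T}_1$ is unitary as well as nonsingular.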
Let the new transformed data vector, formed by $\mathbf{T}_1$ operating on the observed-data vector, be given by

$$\mathbf{v}(k) = \mathbf{T}_1\mathbf{x}(k) = \begin{bmatrix} d_1(k) \\ \mathbf{x}_1(k) \end{bmatrix}, \qquad d_1(k) = \mathbf{h}_1^H\mathbf{x}(k), \quad \mathbf{x}_1(k) = \mathbf{B}_1\mathbf{x}(k). \qquad (18)$$

The transform-domain Wiener filter with the preprocessor $\mathbf{T}_1$ is shown in Fig. 2. The Wiener filter for the transformed process is computed now to have the form

$$\mathbf{w}_{\mathbf{v}} = \mathbf{R}_{\mathbf{v}}^{-1}\mathbf{r}_{\mathbf{v}d} = (\mathbf{T}_1\mathbf{R}_x\mathbf{T}_1^H)^{-1}\mathbf{T}_1\mathbf{r}_{xd}. \qquad (19)$$

Fig. 2. The transform-domain Wiener filter.

Next, the covariance matrix $\mathbf{R}_{\mathbf{v}}$, its inverse $\mathbf{R}_{\mathbf{v}}^{-1}$, and the cross-correlation vector $\mathbf{r}_{\mathbf{v}d}$ are expressed as

$$\mathbf{R}_{\mathbf{v}} = \mathbf{T}_1\mathbf{R}_x\mathbf{T}_1^H = \begin{bmatrix} \sigma_{d_1}^2 & \mathbf{r}_{x_1 d_1}^H \\ \mathbf{r}_{x_1 d_1} & \mathbf{R}_{x_1} \end{bmatrix} \qquad (20)$$

$$\mathbf{R}_{\mathbf{v}}^{-1} = \begin{bmatrix} q_{11} & \mathbf{q}^H \\ \mathbf{q} & \mathbf{Q} \end{bmatrix} \qquad (21)$$

$$\mathbf{r}_{\mathbf{v}d} = \mathbf{T}_1\mathbf{r}_{xd} = \begin{bmatrix} \delta_1 \\ \mathbf{0} \end{bmatrix} \qquad (22)$$

where, by (16)–(18), the scalar $\delta_1$ in (22) is obtained as

$$\delta_1 = \mathbf{h}_1^H\mathbf{r}_{xd} = \|\mathbf{r}_{xd}\|. \qquad (23)$$

The variance of $d_1(k)$ in (18) is calculated to be

$$\sigma_{d_1}^2 = \mathbf{h}_1^H\mathbf{R}_x\mathbf{h}_1. \qquad (24)$$

The covariance matrix $\mathbf{R}_{x_1}$ is given by

$$\mathbf{R}_{x_1} = \mathbf{B}_1\mathbf{R}_x\mathbf{B}_1^H. \qquad (25)$$

The cross-correlation vector $\mathbf{r}_{x_1 d_1}$ is computed to be

$$\mathbf{r}_{x_1 d_1} = \mathbf{B}_1\mathbf{R}_x\mathbf{h}_1 \qquad (26)$$

and the matrix $\mathbf{Q}$ in (21) is determined by the matrix inversion lemma for partitioned matrices [16]. In terms of the joint-process covariance matrix $\mathbf{R}_z$ in (12), the transformations described above in (20)–(26) may be represented by

$$\overline{\mathbf{T}}_1\mathbf{R}_z\overline{\mathbf{T}}_1^H = \begin{bmatrix} \sigma_d^2 & \delta_1 & \mathbf{0}^H \\ \delta_1 & \sigma_{d_1}^2 & \mathbf{r}_{x_1 d_1}^H \\ \mathbf{0} & \mathbf{r}_{x_1 d_1} & \mathbf{R}_{x_1} \end{bmatrix} \qquad (27)$$

where

$$\overline{\mathbf{T}}_1 = \begin{bmatrix} 1 & \mathbf{0}^H \\ \mathbf{0} & \mathbf{T}_1 \end{bmatrix}. \qquad (28)$$
The structure of the matrix $\mathbf{R}_{\mathbf{v}}$ in (20), its inverse in (21), and the diagrams in Figs. 1 and 2 suggest that a new $(N-1)$-dimensional “weight” vector $\mathbf{w}_1$ be defined by

$$\mathbf{w}_1 = \mathbf{R}_{x_1}^{-1}\mathbf{r}_{x_1 d_1} \qquad (29)$$

which is the Wiener filter for estimating the scalar $d_1(k)$ from the vector $\mathbf{x}_1(k)$. Then a new error $\epsilon_1(k)$, given by

$$\epsilon_1(k) = d_1(k) - \mathbf{w}_1^H\mathbf{x}_1(k), \qquad (30)$$

can be defined for the new Wiener filter $\mathbf{w}_1$, which is similar in form to the Wiener filter depicted in Fig. 1. The variance of the error $\epsilon_1(k)$ in (30) is computed readily to be

$$\xi_1 = \sigma_{d_1}^2 - \mathbf{r}_{x_1 d_1}^H\mathbf{R}_{x_1}^{-1}\mathbf{r}_{x_1 d_1}. \qquad (31)$$

Since $\sigma_d^2$ is the covariance of the scalar process $d(k)$, the identical MMSE, $\xi$, is achieved by the filtering diagram in Fig. 3, where the new scalar Wiener filter $w_1$ is defined by

$$w_1 = \frac{\delta_1}{\xi_1} \qquad (32)$$

and where by (18), (22), (23), and (30) the identity of the correlation between the scalar processes $d(k)$ and $\epsilon_1(k)$ with $\delta_1$ is shown by

$$E[d(k)\epsilon_1^*(k)] = E[d(k)d_1^*(k)] = \delta_1. \qquad (33)$$

Fig. 3. The first stage of the decomposition.

Fig. 4. The first chain in the nested recursion.

It is evident by (19)–(21) that the filtering diagrams in Figs. 2 and 3 are identical since

$$\mathbf{w}_{\mathbf{v}} = w_1\begin{bmatrix} 1 \\ -\mathbf{w}_1 \end{bmatrix}. \qquad (34)$$

Note that the scalar $\xi_1$ is also the MMSE of the nested lower dimensional Wiener filter with a new scalar signal $d_1(k)$ and a new observed signal vector $\mathbf{x}_1(k)$. The first stage of this decomposition partitions the $N$-dimensional Wiener filter into a scalar Wiener filter and an $(N-1)$-dimensional vector Wiener filter, where the reduced-dimension vector filter spans a space which is orthogonal to the space spanned by the scalar filter. Also note that the nested filtering structure in Fig. 3, which uses $\epsilon_1(k)$ to estimate $d(k)$, may be interpreted as a constrained Wiener filter which minimizes the error subject to the constraint that the desired signal has the gain and phase provided by the filter $\mathbf{h}_1$.
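The claim that the first-stage structure of Fig. 3 attains the original MMSE can be confirmed numerically from (16)-(32). The statistics below are synthetic, and the SVD-based blocking matrix is an illustrative choice, not the paper's Appendix A construction:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative second-order statistics (hypothetical numbers)
N = 6
M = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R_x = M @ M.conj().T + N * np.eye(N)
r_xd = rng.standard_normal(N) + 1j * rng.standard_normal(N)
var_d = 50.0        # chosen large enough for a valid joint covariance

# Full-rank Wiener filter and its MMSE, Eqs. (5) and (7)
w_full = np.linalg.solve(R_x, r_xd)
mmse_full = var_d - (r_xd.conj() @ w_full).real

# First-stage split, Eqs. (16)-(26)
h1 = r_xd / np.linalg.norm(r_xd)
U, _, _ = np.linalg.svd(h1.reshape(-1, 1), full_matrices=True)
B1 = U[:, 1:].conj().T                            # blocking matrix, B1 h1 = 0

delta1 = (h1.conj() @ r_xd).real                  # Eq. (23): ||r_xd||
var_d1 = (h1.conj() @ R_x @ h1).real              # Eq. (24)
R_x1 = B1 @ R_x @ B1.conj().T                     # Eq. (25)
r_x1d1 = B1 @ R_x @ h1                            # Eq. (26)

# Nested (N-1)-dimensional filter and scalar stage, Eqs. (29)-(32)
w1_vec = np.linalg.solve(R_x1, r_x1d1)            # Eq. (29)
xi1 = var_d1 - (r_x1d1.conj() @ w1_vec).real      # Eq. (31)
w1 = delta1 / xi1                                 # Eq. (32)
mmse_stage = var_d - delta1 ** 2 / xi1            # MMSE of the Fig. 3 diagram
```

The equality of the two MMSE values follows from (34): the transform-domain filter is $w_1$ applied to the first coordinate minus $w_1\mathbf{w}_1$ applied to the rest.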
B. A Multistage Representation of the Wiener Filter

The first stage decomposition results in the structure depicted in Fig. 3. The new $(N-1)$-dimensional vector Wiener filter $\mathbf{w}_1$ operates on the transformed $(N-1)$-dimensional data vector $\mathbf{x}_1(k)$ to estimate the new scalar signal $d_1(k)$, as shown in Fig. 4. This represents a Wiener filter which is identical in form to the original $N$-dimensional Wiener filter, except that it is one dimension smaller. Thus a recursion of scalar Wiener filters can be derived by following the outline given in Section III-A until the dimension of both the data and the corresponding Wiener filter is reduced to one at level $N-1$ in the tree. The error signal at each stage serves as the scalar observed process for the Wiener filter at the next stage. At each stage $i$, $2 \le i \le N-1$, the normalized cross-correlation vector $\mathbf{h}_i$ is computed in the same manner as (17) to be

$$\mathbf{h}_i = \frac{\mathbf{r}_{x_{i-1}d_{i-1}}}{\|\mathbf{r}_{x_{i-1}d_{i-1}}\|}. \qquad (35)$$

The blocking matrix $\mathbf{B}_i$, with $\mathbf{B}_i\mathbf{h}_i = \mathbf{0}$, may be computed using the method detailed in Appendix A, that presented in [20], or any other method which results in a valid $\mathbf{B}_i$. The covariance matrix $\mathbf{R}_{x_i}$ is computed corresponding to (25) as follows:

$$\mathbf{R}_{x_i} = \mathbf{B}_i\mathbf{R}_{x_{i-1}}\mathbf{B}_i^H \qquad (36)$$

and the cross-correlation vector $\mathbf{r}_{x_i d_i}$ is found recursively in the manner of (26) to be

$$\mathbf{r}_{x_i d_i} = \mathbf{B}_i\mathbf{R}_{x_{i-1}}\mathbf{h}_i. \qquad (37)$$

The scalar signal $d_i(k)$ and the $(N-i)$-dimensional observed-data vector $\mathbf{x}_i(k)$ at the $i$th stage are found in accordance with (18) as follows:

$$d_i(k) = \mathbf{h}_i^H\mathbf{x}_{i-1}(k) \qquad (38)$$

$$\mathbf{x}_i(k) = \mathbf{B}_i\mathbf{x}_{i-1}(k). \qquad (39)$$

The error signals at each stage, in analogy to (30), are given by

$$\epsilon_i(k) = d_i(k) - w_{i+1}\epsilon_{i+1}(k), \qquad i = 0, 1, \ldots, N-1 \qquad (40)$$

where $d_0(k) = d(k)$ and where it is notationally convenient to define the scalar output of the last signal blocking matrix in the chain, $\mathbf{x}_{N-1}(k)$, to be the $N$th element of both the sequences $d_i(k)$ and $\epsilon_i(k)$ as follows:

$$\epsilon_N(k) = d_N(k) = \mathbf{x}_{N-1}(k). \qquad (41)$$

The variances associated with the signals $d_i(k)$, $i = 1, 2, \ldots, N$, are defined by

$$\sigma_{d_i}^2 = E[|d_i(k)|^2] \qquad (42)$$

where $\sigma_{d_0}^2 = \sigma_d^2$. The scalar cross-correlations are computed in the same manner as (33) to be

$$\delta_i = E[d_{i-1}(k)\epsilon_i^*(k)] = \|\mathbf{r}_{x_{i-1}d_{i-1}}\|, \qquad i = 1, 2, \ldots, N \qquad (43)$$

where, using (41), the last term of the recursion in (43) is provided by the identity

$$\delta_N = E[d_{N-1}(k)d_N^*(k)]. \qquad (44)$$

The scalar Wiener filters $w_i$ are found from the Wiener–Hopf equation to be

$$w_i = \frac{\delta_i}{\xi_i}, \qquad i = 1, 2, \ldots, N \qquad (45)$$

where, for $i = N-1, \ldots, 1, 0$, the MMSE recursion yields

$$\xi_i = \sigma_{d_i}^2 - \frac{\delta_{i+1}^2}{\xi_{i+1}} \qquad (46)$$

similar to the results of (29), (31), and (32). In accordance with (41), the MMSE of the last stage is given by $\xi_N = \sigma_{d_N}^2$.

TABLE I
RECURSION EQUATIONS

The complete series of required recursion relationships are listed in Table I. An example of this decomposition is provided in Fig. 5. Note that this new multistage Wiener filter does not require an estimate of the covariance matrix or its inverse when the statistics are unknown, since the only requirements are estimates of the cross-correlation vectors and scalar correlations, which can be calculated directly from the data.
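The recursions above collect naturally into a short routine. The sketch below implements the forward statistics recursion and the backward MMSE recursion; the SVD-based unitary blocking matrix is an illustrative choice, and the optional `rank` argument (our addition) truncates the chain in the manner of the reduced-rank filter discussed in Section IV:

```python
import numpy as np

def mswf(R_x, r_xd, var_d, rank=None):
    """Multistage Wiener filter MMSE via the Table I recursions
    (a sketch; SVD-based unitary blocking matrices, and an optional
    `rank` that truncates the chain for reduced-rank filtering).
    Returns xi_0, the MMSE of Eq. (46) at stage 0."""
    N = len(r_xd)
    D = N if rank is None else rank
    sig2 = [var_d]          # sigma_{d_i}^2, starting with sigma_d^2
    delta = []              # delta_i = ||r_{x_{i-1} d_{i-1}}||, Eq. (43)
    R, r = R_x, r_xd
    for i in range(1, D + 1):
        delta.append(np.linalg.norm(r))
        h = r / np.linalg.norm(r)                     # Eqs. (17), (35)
        sig2.append((h.conj() @ R @ h).real)          # Eqs. (24), (42)
        if R.shape[0] == 1:                           # last stage, Eq. (41)
            break
        U, _, _ = np.linalg.svd(h.reshape(-1, 1), full_matrices=True)
        B = U[:, 1:].conj().T                         # blocking matrix
        r = B @ R @ h                                 # Eqs. (26), (37)
        R = B @ R @ B.conj().T                        # Eqs. (25), (36)
    # Backward MMSE recursion, Eq. (46), initialized at the final stage
    xi = sig2[-1]
    for i in range(len(delta) - 1, -1, -1):
        xi = sig2[i] - delta[i] ** 2 / xi
    return xi
```

With `rank=None` the full chain is built and the returned value equals the full-rank MMSE of (7); truncating the chain can only increase the MMSE.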
C. Analysis of the Multistage Wiener Filter

This new Wiener filter structure is naturally partitioned into an analysis filterbank and a synthesis filterbank. The analysis filterbank is pyramidal, and the resulting tree structure successively refines the signal $\mathbf{x}_{i-1}(k)$ in terms of $d_i(k)$, its component in the direction of the cross-correlation vector, and $\mathbf{x}_i(k)$, its components in the orthogonal subspace. The subspaces formed at level $i-1$ and level $i$ in the tree satisfy the direct-sum relationship

$$\mathcal{S}(\mathbf{R}_{x_{i-1}}) = \operatorname{span}(\mathbf{h}_i) \oplus \mathcal{S}(\mathbf{R}_{x_i}) \qquad (47)$$

where $\mathcal{S}(\cdot)$ denotes the linear subspace spanned by the columns of the covariance matrix and $\oplus$ represents a direct sum.

The operations of the analysis filterbanks are combined next into one lossless transfer matrix, which is given
