
Solving Rank-Deficient and Ill-posed Problems using
UTV and QR Factorizations
Leslie V. Foster
Department of Mathematics and Computer Science
San Jose State University
San Jose, CA 95192
foster@math.sjsu.edu
March 19, 2003
Abstract
The algorithm of Mathias and Stewart [A block QR algorithm and the singular value decomposition, Linear Algebra and Its Applications, 182:91-100, 1993] is examined as a tool for constructing regularized solutions to rank-deficient and ill-posed linear equations. The algorithm is based on a sequence of QR factorizations. If it is stopped after the first step it produces the same solution as the complete orthogonal decomposition used in LAPACK's xGELSY. However, we show that for low-rank problems a careful implementation can lead to an order of magnitude improvement in speed over xGELSY as implemented in LAPACK. We prove, under assumptions similar to assumptions used by others, that if the numerical rank is chosen at a gap in the singular value spectrum and if the initial factorization is rank-revealing then, even if the algorithm is stopped after the first step, approximately half the time its solutions are closer to the desired solution than are the singular value decomposition (SVD) solutions. Conversely, the SVD will be closer approximately half the time, and in this case overall the two algorithms are very similar in accuracy. We confirm this with numerical experiments. Although the algorithm works best for problems with a gap in the singular value spectrum, numerical experiments suggest that it may work well for problems with no gap.
1 Introduction
The solution to ill-posed or nearly rank-deficient linear equations is important in many
applications [18]. To solve these systems some form of regularization is usually used.
By regularization we mean the replacement of the original problem with a different,
better posed problem. For example, if the original problem is

$$\min \|b - Ax\| \qquad (1)$$

where $A$ is an $m \times n$ ill-conditioned matrix with $m \geq n$ and the norm is the Euclidean norm, it is often recommended to approximate $A$ with an exactly rank-deficient matrix $\hat{A}$ and solve for the minimum norm solution to (1) with $A$ replaced by $\hat{A}$. To construct $\hat{A}$ it is useful to decompose $A$ with a rank-revealing decomposition. The low-rank approximation $\hat{A}$ to $A$ can be obtained by truncating such decompositions.

*This research was supported in part by the Woodward bequest to the Department of Mathematics, San Jose State University. To appear in SIAM J. Matrix Anal. and Appl.
The singular value decomposition (SVD) is a very good, but expensive, decomposition. We use the complete orthogonal or UTV decomposition $A = UTV^T$ with $U$ orthogonal, $V$ orthogonal, and $T$ triangular. Here the superscript $T$ indicates transpose. Some of our results will apply to any UTV factorization and others to UTV factorizations produced by the algorithm of Mathias and Stewart [21]. This algorithm produces UTV factorizations by using a sequence of QR factorizations. We begin the algorithm with an initial UTV factorization of the form $A = UTV^T = QR\Pi^T$, where $Q$ is an orthogonal matrix, $R$ is upper triangular, and $\Pi$ is the permutation matrix produced by the standard QR algorithm with pivoting [1, 4, 20]. (We will also discuss a variation where initially $A^T$ is factored in this form.) If the algorithm is stopped after the first step it produces the same solution as the complete orthogonal decomposition used in LAPACK's xGELSY. However, we show that for low-rank problems a careful implementation can lead to an order of magnitude improvement in speed over the two routines, xGELSY and xGELSD, that LAPACK provides for solving rank-deficient problems. We prove, under assumptions similar to assumptions used by others about the true solution to (1) and the noise in $b$, that if the numerical rank is chosen at a gap in the singular value spectrum and if the initial factorization is rank-revealing [3, p. 22] then, even if the algorithm is stopped after the first step, approximately half the time its solutions are closer to the desired solution than are the singular value decomposition solutions. Conversely, the SVD will be closer approximately half the time, and in this case overall the two algorithms are very similar in accuracy. We confirm this with numerical experiments. Although the algorithm works best for problems with a gap in the singular value spectrum, numerical experiments suggest that it may work well for problems with no gap.
The paper is organized as follows. Following this introduction, in Section 2 we
discuss UTV factorizations in general. Section 3 discusses the algorithm in [21].
Section 4 focuses on perturbation errors and Section 5 on regularization errors. Section
6 describes implementation of the algorithm and numerical experiments. Section 7
has conclusions.
2 UTV factorizations
Consider any UTV factorization of $A$, $A = UTV^T$. Let $k$ be the rank of the low-rank approximation to $A$. It is useful to partition the factorization as follows. If $T$ is lower triangular ($T = L$) we partition $UTV^T$ as

$$A = UTV^T = ULV^T = \begin{pmatrix} \hat{U} & U_0 \end{pmatrix} \begin{pmatrix} \hat{L} & 0 \\ H & E \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \hat{V} & V_0 \end{pmatrix}^T. \qquad (2)$$

If $T$ is upper triangular ($T = R$) we partition $UTV^T$ as

$$A = UTV^T = URV^T = \begin{pmatrix} \hat{U} & U_0 \end{pmatrix} \begin{pmatrix} \hat{R} & F \\ 0 & G \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \hat{V} & V_0 \end{pmatrix}^T. \qquad (3)$$

In these equations $\hat{U}$ is $m \times k$, $U_0$ is $m \times (m-k)$, $\hat{V}$ is $n \times k$, $V_0$ is $n \times (n-k)$, $\hat{L}$ is $k \times k$, $H$ is $(n-k) \times k$, $E$ is $(n-k) \times (n-k)$, $\hat{R}$ is $k \times k$, $F$ is $k \times (n-k)$, and $G$ is $(n-k) \times (n-k)$. In equations (2) and (3) $U_0$ corresponds to the last two block rows in the block triangular matrices. If we do not need to distinguish whether $T$ is lower or upper triangular we will let $\hat{T}$ represent either $\hat{L}$ or $\hat{R}$. In each case we consider two low-rank approximations to $A$. If $T$ is either lower or upper triangular we will call $\hat{U}\hat{T}\hat{V}^T$ the corner low-rank approximation to $A$. If $T$ is lower triangular we call $U\,[\hat{L}^T\ \ H^T\ \ 0]^T\,\hat{V}^T$ the block-column low-rank approximation to $A$. Similarly, if $T$ is upper triangular we call $\hat{U}\,[\hat{R}\ \ F]\,V^T$ the block-row low-rank approximation to $A$.
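The distinction between the corner and block-row approximations can be seen when a pivoted QR factorization serves as the initial UTV factorization (a small sketch of ours, with SciPy's pivoted QR playing the role of $A = QR\Pi^T$):

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(1)
m, n, k = 8, 6, 3
# A matrix with a large gap after the k-th singular value.
sv = np.array([3.0, 2.0, 1.0, 1e-8, 1e-9, 1e-10])
A = np.linalg.qr(rng.standard_normal((m, n)))[0] @ np.diag(sv) \
    @ np.linalg.qr(rng.standard_normal((n, n)))[0].T

# Initial UTV factorization A = Q R Pi^T via pivoted QR: U = Q, T = R, V = Pi.
Q, R, piv = qr(A, mode='economic', pivoting=True)   # A[:, piv] = Q @ R
Pi = np.eye(n)[:, piv]

# Block-row approximation U_hat [R_hat  F] V^T: discards only G, error ||G||.
block_row = Q[:, :k] @ R[:k, :] @ Pi.T
# Corner approximation U_hat R_hat V_hat^T: discards F and G as well.
corner = Q[:, :k] @ R[:k, :k] @ Pi[:, :k].T

print(np.linalg.norm(A - block_row))  # tiny: ||G|| is on the order of s_{k+1}
print(np.linalg.norm(A - corner))     # can be much larger: pivoted QR does not
                                      # make ||F|| small by itself
```

The point of the contrast: for the initial pivoted QR the block-row error is $\|G\|$, while the corner error also includes $\|F\|$, which pivoted QR alone does not make small; the later steps of the algorithm (and the complete orthogonal decomposition in xGELSY) address exactly this.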
We will also partition the SVD of $A$ in a similar manner to (2) and (3):

$$A = U_S D V_S^T = \begin{pmatrix} \hat{U}_S & U_{S0} \end{pmatrix} \begin{pmatrix} \hat{D} & 0 \\ 0 & D_0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \hat{V}_S & V_{S0} \end{pmatrix}^T.$$

$\hat{A}_S = \hat{U}_S \hat{D} \hat{V}_S^T$ is the rank $k$ approximation produced by the SVD. We will use $s_1 \geq s_2 \geq \cdots \geq s_n$ to indicate the singular values of $A$. We will also use $\sigma_k(A)$, $1 \leq k \leq n$, to indicate the $k$th singular value of a matrix $A$. Note that the SVD produces the best rank $k$ approximation to $A$ in the sense that $\|A - \tilde{A}\|$ is minimized over all rank $k$ matrices $\tilde{A}$ when $\tilde{A} = \hat{A}_S$ [3, p. 12].
When solving equation (1) we will consider the regularized solution $x_T = \hat{A}_T^{+} b$ where the superscript $+$ indicates pseudoinverse and $\hat{A}_T$ is either a corner or block-row/column low-rank approximation to $A$ corresponding to a UTV factorization of $A$. It will be clear from the context whether $x_T$ refers to a corner or block-row/column low-rank approximation. We call $x_T$ the truncated UTV solution to (1). We assume in the rest of this paper that $\hat{T}$ and $\hat{D}$ are nonsingular. In this case the corner low-rank solution has a simple form: $x_T = \hat{V}\hat{T}^{-1}\hat{U}^T b$. The truncated SVD (TSVD) approximate solution to (1) is $x_S = \hat{V}_S \hat{D}^{-1} \hat{U}_S^T b$.
To evaluate the accuracy of $x_T$ we will assume that there is an underlying noiseless solution $x_0$ to equation (1) such that $Ax_0 = b_0$ and that in (1) $b = b_0 + \delta b$ where $\delta b$ is a noise vector in the right-hand side $b$. We will prove theorems and carry out numerical experiments that evaluate $x_T$ based on the value of $\|x_T - x_0\|$ and will compare $\|x_T - x_0\|$ with $\|x_S - x_0\|$. We might note that other authors [6, 8, 10] have focused on bounding $\|x_T - x_S\|$. In many cases the goal in solving (1) is to recover an underlying solution $x_0$ that is different from $x_S$ [18, 22]. In these cases comparison of $\|x_T - x_0\|$ with $\|x_S - x_0\|$ is of interest.
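Putting the definitions so far together, the following sketch (our construction, not the paper's code) compares $x_T$ and $x_S$ on such a problem. The lower-triangular UTV is built by one QR-of-$R^T$ step after pivoted QR, in the spirit of the first step of the algorithm of [21], though the paper's implementation differs:

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

def utv_corner_solve(A, b, k):
    """Corner truncated UTV solution x_T = V_hat L_hat^{-1} U_hat^T b from a
    lower-triangular UTV built by one QLP-style step: pivoted QR of A, then
    an (unpivoted) QR of R^T.  A sketch in the spirit of one step of [21]."""
    Q1, R, piv = qr(A, mode='economic', pivoting=True)  # A[:, piv] = Q1 @ R
    Q2, R2 = qr(R.T)                                    # R = R2.T @ Q2.T
    V = np.eye(A.shape[1])[:, piv] @ Q2                 # A = Q1 @ (R2.T) @ V.T
    y = solve_triangular(R2.T[:k, :k], Q1[:, :k].T @ b, lower=True)
    return V[:, :k] @ y

def tsvd_solve(A, b, k):
    """Truncated SVD solution x_S = V_hat_S D_hat^{-1} U_hat_S^T b."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ b) / s[:k])

rng = np.random.default_rng(2)
m, n, k = 40, 30, 8
sv = np.r_[np.ones(k), 1e-6 * np.logspace(0, -4, n - k)]  # gap after s_k
A = np.linalg.qr(rng.standard_normal((m, n)))[0] @ np.diag(sv) \
    @ np.linalg.qr(rng.standard_normal((n, n)))[0].T
x0 = rng.standard_normal(n)
b = A @ x0 + 1e-8 * rng.standard_normal(m)
# With a gap at k, the two regularized solutions are typically of very
# similar accuracy, as the analysis below suggests.
print(np.linalg.norm(utv_corner_solve(A, b, k) - x0),
      np.linalg.norm(tsvd_solve(A, b, k) - x0))
```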
Suppose that $C$ is some regularization operator so that $x = Cb$ is the regularized solution to (1). If $x_0$ is the underlying noiseless solution then

$$x - x_0 = Cb - x_0 = (CA - I)x_0 + C(\delta b) \quad \text{and} \qquad (4)$$

$$\|x - x_0\| \leq \|(CA - I)x_0\| + \|C(\delta b)\|. \qquad (5)$$

The two terms on the right are called, respectively, the regularization error and the perturbation error. In the case that $C$ corresponds to a corner low-rank solution calculated using a truncated UTV factorization, where $T$ is lower triangular, we have a sharper result than (5):

$$\|x - x_0\|^2 = \|(CA - I)x_0\|^2 + \|C(\delta b)\|^2. \qquad (6)$$

This result follows since, if $C = \hat{V}\hat{L}^{-1}\hat{U}^T$, then $C^T(CA - I) = 0$ follows easily, so the two error terms on the right of (4) are orthogonal.
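The orthogonality behind (6) is easy to confirm numerically; a small sketch (ours, reusing the QLP-style lower-triangular UTV construction from above):

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(3)
m, n, k = 30, 20, 5
A = np.linalg.qr(rng.standard_normal((m, n)))[0] \
    @ np.diag(np.logspace(0, -12, n)) \
    @ np.linalg.qr(rng.standard_normal((n, n)))[0].T
x0 = rng.standard_normal(n)
db = 1e-6 * rng.standard_normal(m)
b = A @ x0 + db

# Lower-triangular UTV: pivoted QR of A, then QR of R^T (QLP-style step).
Q1, R, piv = qr(A, mode='economic', pivoting=True)
Q2, R2 = qr(R.T)
U, L, V = Q1, R2.T, np.eye(n)[:, piv] @ Q2            # A = U @ L @ V.T

C = V[:, :k] @ np.linalg.inv(L[:k, :k]) @ U[:, :k].T  # C = V_hat L_hat^{-1} U_hat^T
x = C @ b
lhs = np.linalg.norm(x - x0) ** 2
rhs = (np.linalg.norm((C @ A - np.eye(n)) @ x0) ** 2
       + np.linalg.norm(C @ db) ** 2)
print(lhs, rhs)   # agree to rounding error, as equation (6) asserts
```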
Our first theorem relates $\|x_T - x_0\|$ and $\|x_S - x_0\|$.

Theorem 1 Define

$$\tilde{U} = U^T U_S = \begin{pmatrix} \hat{U}^T \hat{U}_S & \hat{U}^T U_{S0} \\ U_0^T \hat{U}_S & U_0^T U_{S0} \end{pmatrix} = \begin{pmatrix} \tilde{U}_{11} & \tilde{U}_{12} \\ \tilde{U}_{21} & \tilde{U}_{22} \end{pmatrix}, \qquad (7)$$

$$\tilde{V} = V^T V_S = \begin{pmatrix} \hat{V}^T \hat{V}_S & \hat{V}^T V_{S0} \\ V_0^T \hat{V}_S & V_0^T V_{S0} \end{pmatrix} = \begin{pmatrix} \tilde{V}_{11} & \tilde{V}_{12} \\ \tilde{V}_{21} & \tilde{V}_{22} \end{pmatrix}, \qquad (8)$$

$$M = \begin{pmatrix} -\hat{D}^{-1} \tilde{V}_{21}^T \tilde{V}_{21} \hat{D}^{-1} & \tilde{U}_{11}^T \hat{T}^{-T} \hat{T}^{-1} \tilde{U}_{12} \\ \tilde{U}_{12}^T \hat{T}^{-T} \hat{T}^{-1} \tilde{U}_{11} & \tilde{U}_{12}^T \hat{T}^{-T} \hat{T}^{-1} \tilde{U}_{12} \end{pmatrix} \qquad (9)$$

and

$$N = \begin{pmatrix} \tilde{V}_{21}^T \tilde{V}_{21} & \tilde{V}_{21}^T \tilde{V}_{22} \\ \tilde{V}_{22}^T \tilde{V}_{21} & -\tilde{V}_{12}^T \tilde{V}_{12} \end{pmatrix}. \qquad (10)$$

Also let $\widetilde{\delta b} = U_S^T \delta b$ and $\tilde{x}_0 = V_S^T x_0$, and let $x_T$ be the corner low-rank solution to (1) calculated from a truncated UTV factorization with $T$ lower triangular. Then

$$\|x_T - x_0\|^2 = \|x_S - x_0\|^2 + \widetilde{\delta b}^T M \,\widetilde{\delta b} + \tilde{x}_0^T N \tilde{x}_0. \qquad (11)$$
Proof. First note that

$$T = U^T U_S D V_S^T V = \tilde{U} D \tilde{V}^T = \begin{pmatrix} \tilde{U}_{11} & \tilde{U}_{12} \\ \tilde{U}_{21} & \tilde{U}_{22} \end{pmatrix} \begin{pmatrix} \hat{D} & 0 \\ 0 & D_0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \tilde{V}_{11} & \tilde{V}_{12} \\ \tilde{V}_{21} & \tilde{V}_{22} \end{pmatrix}^T. \qquad (12)$$

The perturbation error term for the SVD solution is $\|\hat{A}_S^{+}\delta b\|$ where $\hat{A}_S^{+} = \hat{V}_S \hat{D}^{-1} \hat{U}_S^T$, and for the UTV solution it is $\|\hat{A}_T^{+}\delta b\|$ where $\hat{A}_T^{+} = \hat{V}\hat{T}^{-1}\hat{U}^T$. Now $\|\hat{A}_S^{+}\delta b\|^2 = \|\hat{D}^{-1}\hat{U}_S^T U_S \widetilde{\delta b}\|^2 = \|\hat{D}^{-1}(I\ \ 0)\,\widetilde{\delta b}\|^2$. Note that since $\tilde{V}$ is orthogonal, $I = \tilde{V}_{11}^T\tilde{V}_{11} + \tilde{V}_{21}^T\tilde{V}_{21}$ and therefore $\hat{D}^{-2} = \hat{D}^{-1}\tilde{V}_{11}^T\tilde{V}_{11}\hat{D}^{-1} + \hat{D}^{-1}\tilde{V}_{21}^T\tilde{V}_{21}\hat{D}^{-1}$. Rewriting (12) as $T\tilde{V} = \tilde{U}D$, and since $T$ is lower triangular, it follows that $\tilde{V}_{11}\hat{D}^{-1} = \hat{T}^{-1}\tilde{U}_{11}$. We may conclude that

$$\|\hat{A}_S^{+}\delta b\|^2 = \widetilde{\delta b}^T \begin{pmatrix} \tilde{U}_{11}^T\hat{T}^{-T}\hat{T}^{-1}\tilde{U}_{11} + \hat{D}^{-1}\tilde{V}_{21}^T\tilde{V}_{21}\hat{D}^{-1} & 0 \\ 0 & 0 \end{pmatrix} \widetilde{\delta b}.$$

Next note that $\|\hat{A}_T^{+}\delta b\|^2 = \|(\hat{V}\hat{T}^{-1}\hat{U}^T)\,U_S\,\widetilde{\delta b}\|^2 = \|\hat{T}^{-1}(\tilde{U}_{11}\ \ \tilde{U}_{12})\,\widetilde{\delta b}\|^2$. Therefore

$$\|\hat{A}_T^{+}\delta b\|^2 = \widetilde{\delta b}^T \begin{pmatrix} \tilde{U}_{11}^T\hat{T}^{-T}\hat{T}^{-1}\tilde{U}_{11} & \tilde{U}_{11}^T\hat{T}^{-T}\hat{T}^{-1}\tilde{U}_{12} \\ \tilde{U}_{12}^T\hat{T}^{-T}\hat{T}^{-1}\tilde{U}_{11} & \tilde{U}_{12}^T\hat{T}^{-T}\hat{T}^{-1}\tilde{U}_{12} \end{pmatrix} \widetilde{\delta b}.$$

It now follows that

$$\|\hat{A}_T^{+}\delta b\|^2 = \|\hat{A}_S^{+}\delta b\|^2 + \widetilde{\delta b}^T M \widetilde{\delta b}. \qquad (13)$$

Using $\hat{A}_S^{+} = \hat{V}_S\hat{D}^{-1}\hat{U}_S^T$ it follows that the regularization error term for the truncated SVD satisfies $\|(\hat{A}_S^{+}A - I)x_0\|^2 = \|(\hat{A}_S^{+}A - I)V_S\tilde{x}_0\|^2 = \|(\hat{V}_S\hat{V}_S^T - I)V_S\tilde{x}_0\|^2 = \|(0\ \ I)\,\tilde{x}_0\|^2$. Also $\|(\hat{A}_T^{+}A - I)x_0\|^2 = \|(\hat{V}\hat{V}^T - I)V_S\tilde{x}_0\|^2 = \|V_0V_0^TV_S\tilde{x}_0\|^2 = \|V_0^T(\hat{V}_S\ \ V_{S0})\,\tilde{x}_0\|^2 = \|(\tilde{V}_{21}\ \ \tilde{V}_{22})\,\tilde{x}_0\|^2$. Using this result and $I = \tilde{V}_{22}^T\tilde{V}_{22} + \tilde{V}_{12}^T\tilde{V}_{12}$ (since $\tilde{V}$ is orthogonal) it follows that

$$\|(\hat{A}_T^{+}A - I)x_0\|^2 = \tilde{x}_0^T\left[\begin{pmatrix} 0 & 0 \\ 0 & I \end{pmatrix} + N\right]\tilde{x}_0 = \|(\hat{A}_S^{+}A - I)x_0\|^2 + \tilde{x}_0^T N\tilde{x}_0.$$

The theorem follows from this equation, (6), and (13). $\Box$
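Theorem 1 can be checked numerically. The sketch below (our construction) forms $\tilde{U}$, $\tilde{V}$, $M$, and $N$ as in (7)-(10) for a QLP-style lower-triangular UTV and compares the two sides of (11):

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(4)
m, n, k = 25, 15, 4
A = rng.standard_normal((m, n)) @ np.diag(np.logspace(0, -10, n))
x0 = rng.standard_normal(n)
db = 1e-5 * rng.standard_normal(m)
b = A @ x0 + db

# Full factors are needed here: U, Us are m x m and V, Vs are n x n.
Q1, R, piv = qr(A, pivoting=True)               # A = Q1 @ R @ Pi^T
Q2, R2 = qr(R.T)                                # R = R2.T @ Q2.T
U, L, V = Q1, R2.T, np.eye(n)[:, piv] @ Q2      # lower-triangular UTV
Us, s, Vst = np.linalg.svd(A)
Vs = Vst.T

Ut, Vt = U.T @ Us, V.T @ Vs                     # (7) and (8)
U11, U12 = Ut[:k, :k], Ut[:k, k:]
V12, V21, V22 = Vt[:k, k:], Vt[k:, :k], Vt[k:, k:]
Dinv, Tinv = np.diag(1 / s[:k]), np.linalg.inv(L[:k, :k])
M = np.block([[-Dinv @ V21.T @ V21 @ Dinv, U11.T @ Tinv.T @ Tinv @ U12],
              [U12.T @ Tinv.T @ Tinv @ U11, U12.T @ Tinv.T @ Tinv @ U12]])  # (9)
N = np.block([[V21.T @ V21, V21.T @ V22],
              [V22.T @ V21, -V12.T @ V12]])                                 # (10)

x_T = V[:, :k] @ Tinv @ U[:, :k].T @ b          # corner truncated UTV solution
x_S = Vs[:, :k] @ Dinv @ Us[:, :k].T @ b        # truncated SVD solution
dbt, x0t = Us.T @ db, Vs.T @ x0
lhs = np.linalg.norm(x_T - x0) ** 2
rhs = np.linalg.norm(x_S - x0) ** 2 + dbt @ M @ dbt + x0t @ N @ x0t
print(lhs, rhs)   # equal up to rounding, verifying (11)
```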
Note that for a UTV factorization chosen so that $M \neq 0$ and $N \neq 0$, it follows from (9) and (10) that $M$ and $N$ are symmetric indefinite matrices. Therefore if $M, N \neq 0$ in the UTV factorization of any matrix $A$, by (11) there exist solution vectors $x_0$ and noise vectors $\delta b$ such that the truncated UTV solution is closer to $x_0$ than is the truncated SVD solution. We will see in our numerical experiments in Section 6 that it is frequently true that $x_T$ is closer to $x_0$ than $x_S$ is (and, conversely, that $x_S$ is frequently closer to $x_0$ than $x_T$ is). In Sections 4 and 5 we will use Theorem 1 to explore reasons why this is true.
A result from [21] that we will need later relates the singular values of $A$, $\hat{T}$, $E$ and $G$. If $\|E\| < \sigma_k(\hat{T})$ and if $T$ is lower triangular then

$$\sigma_j(\hat{T}) \leq \sigma_j(A) \leq \sigma_j(\hat{T}) \left[ 1 + \frac{\|H\|^2}{\sigma_k^2(\hat{T}) - \|E\|^2} \right]^{1/2}, \quad \text{for } 1 \leq j \leq k, \qquad (14)$$

and

$$\sigma_{k+j}(A) \leq \sigma_j(E) \leq \sigma_{k+j}(A) \left[ 1 + \frac{\|H\|^2}{\sigma_k^2(\hat{T}) - \|E\|^2} \right]^{1/2}, \quad \text{for } 1 \leq j \leq n-k. \qquad (15)$$

If $T$ is upper triangular and if $\|G\| < \sigma_k(\hat{T})$ then equations (14) and (15) are also true with $H$ and $E$ replaced, respectively, by $F$ and $G$.
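A quick numerical check of (14) and (15) on a random matrix (our sketch; the hypothesis $\|E\| < \sigma_k(\hat{T})$ is asserted explicitly):

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(5)
n, k = 10, 4
A = rng.standard_normal((n, n)) @ np.diag(np.logspace(0, -6, n))

# Lower-triangular UTV via a QLP-style step; partition L as in (2).
Q1, R, piv = qr(A, pivoting=True)
Q2, R2 = qr(R.T)
L = R2.T
Th, H, E = L[:k, :k], L[k:, :k], L[k:, k:]

sA = np.linalg.svd(A, compute_uv=False)
sT = np.linalg.svd(Th, compute_uv=False)
sE = np.linalg.svd(E, compute_uv=False)
normH, normE = np.linalg.norm(H, 2), np.linalg.norm(E, 2)
assert normE < sT[-1]                       # hypothesis ||E|| < sigma_k(T_hat)
factor = np.sqrt(1 + normH**2 / (sT[-1]**2 - normE**2))

print(np.all(sT <= sA[:k]), np.all(sA[:k] <= sT * factor))   # (14)
print(np.all(sA[k:] <= sE), np.all(sE <= sA[k:] * factor))   # (15)
```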
In the later sections we will also use some of the results of [9], which we collect here. These results bound $\sin\theta$, the sine of the angle between the subspaces spanned by $\hat{V}$ and $\hat{V}_S$, and $\sin\phi$, the sine of the angle between the subspaces spanned by $\hat{U}$ and $\hat{U}_S$. Let $\tilde{U}$ and $\tilde{V}$ be defined by (7) and (8). Assume that $\|E\| < \sigma_k(\hat{T})$ and $\|G\| < \sigma_k(\hat{T})$. If $T$ is lower triangular then

$$\sin\phi = \|\tilde{U}_{12}\| = \|\tilde{U}_{21}\| \leq \frac{\sigma_k(\hat{T})\,\|H\|}{\sigma_k^2(\hat{T}) - \|E\|^2} \quad \text{and} \qquad (16)$$

$$\sin\theta = \|\tilde{V}_{12}\| = \|\tilde{V}_{21}\| \leq \frac{\|H\|\,\|E\|}{\sigma_k^2(\hat{T}) - \|E\|^2}. \qquad (17)$$

If $T$ is upper triangular then

$$\sin\phi = \|\tilde{U}_{12}\| = \|\tilde{U}_{21}\| \leq \frac{\|F\|\,\|G\|}{\sigma_k^2(\hat{T}) - \|G\|^2} \quad \text{and} \qquad (18)$$

$$\sin\theta = \|\tilde{V}_{12}\| = \|\tilde{V}_{21}\| \leq \frac{\sigma_k(\hat{T})\,\|F\|}{\sigma_k^2(\hat{T}) - \|G\|^2}. \qquad (19)$$

Note that in most cases of interest to us we will have $\|H\| \leq \|E\|$ and $\|F\| \leq \|G\|$. Assuming this, $\sin\theta$ and $\sin\phi$ can be small for either of two reasons: (1) $\|E\| \ll \sigma_k(\hat{T})$ and $\|G\| \ll \sigma_k(\hat{T})$, or (2) $\|H\| \ll \|E\|$ and $\|F\| \ll \|G\|$. The first condition will be true if there is a sufficiently large gap, at singular value $k$, in the singular values of $A$ and if the UTV factorization is rank-revealing (as defined in the next section). The second condition can be achieved by some of the algorithms for calculating UTV factorizations, even when there is not a gap in the singular values.
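The bounds (16) and (17) can likewise be checked numerically; scipy.linalg.subspace_angles gives the principal angles directly (our sketch, lower-triangular case):

```python
import numpy as np
from scipy.linalg import qr, subspace_angles

rng = np.random.default_rng(6)
n, k = 12, 5
A = rng.standard_normal((n, n)) @ np.diag(np.logspace(0, -8, n))

# Lower-triangular UTV via a QLP-style step: A = U L V^T.
Q1, R, piv = qr(A, pivoting=True)
Q2, R2 = qr(R.T)
U, L, V = Q1, R2.T, np.eye(n)[:, piv] @ Q2
Th, H, E = L[:k, :k], L[k:, :k], L[k:, k:]

Us, s, Vst = np.linalg.svd(A)
sin_phi = np.sin(subspace_angles(U[:, :k], Us[:, :k]).max())
sin_theta = np.sin(subspace_angles(V[:, :k], Vst.T[:, :k]).max())

sk = np.linalg.svd(Th, compute_uv=False)[-1]      # sigma_k(T_hat)
normH, normE = np.linalg.norm(H, 2), np.linalg.norm(E, 2)
assert normE < sk                                 # hypothesis of (16), (17)
gap = sk**2 - normE**2
print(sin_phi, sk * normH / gap)                  # (16): first <= second
print(sin_theta, normH * normE / gap)             # (17): first <= second
```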
