J-Orthogonal Matrices: Properties and Generation

Nicholas J. Higham
01 Jan 2003 · Vol. 45, Iss. 3, pp. 504-519

J-Orthogonal Matrices: Properties and Generation
Higham, Nicholas J.
2003
MIMS EPrint: 2006.69
Manchester Institute for Mathematical Sciences
School of Mathematics
The University of Manchester
Reports available from: http://eprints.maths.manchester.ac.uk/
And by contacting: The MIMS Secretary
School of Mathematics
The University of Manchester
Manchester, M13 9PL, UK
ISSN 1749-9097

J-ORTHOGONAL MATRICES: PROPERTIES AND GENERATION
NICHOLAS J. HIGHAM
Abstract. A real, square matrix Q is J-orthogonal if $Q^TJQ = J$, where the signature matrix $J = \operatorname{diag}(\pm 1)$. J-orthogonal matrices arise in the analysis and numerical solution of various matrix problems involving indefinite inner products, including, in particular, the downdating of Cholesky factorizations. We present techniques and tools useful in the analysis, application and construction of these matrices, giving a self-contained treatment that provides new insights. First, we define and explore the properties of the exchange operator, which maps J-orthogonal matrices to orthogonal matrices and vice versa. Then we show how the exchange operator can be used to obtain a hyperbolic CS decomposition of a J-orthogonal matrix directly from the usual CS decomposition of an orthogonal matrix. We employ the decomposition to derive an algorithm for constructing random J-orthogonal matrices with specified norm and condition number. We also give a short proof of the fact that J-orthogonal matrices are optimally scaled under two-sided diagonal scalings. We introduce the indefinite polar decomposition and investigate two iterations for computing the J-orthogonal polar factor: a Newton iteration involving only matrix inversion and a Schulz iteration involving only matrix multiplication. We show that these iterations can be used to J-orthogonalize a matrix that is not too far from being J-orthogonal.
Key words. J-orthogonal matrix, exchange operator, gyration operator, sweep operator, principal pivot transform, hyperbolic CS decomposition, two-sided scaling, indefinite least squares problem, hyperbolic QR factorization, indefinite polar decomposition, Newton's method, Schulz iteration
AMS subject classifications. 65F30, 15A18
1. Introduction. A matrix $Q \in \mathbb{R}^{n\times n}$ is J-orthogonal if

$$Q^TJQ = J, \qquad (1.1)$$

where $J = \operatorname{diag}(\pm 1)$ is a signature matrix. Clearly, Q is nonsingular and $QJQ^T = J$.
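To make the definition concrete, here is a minimal NumPy check (an illustration of ours, not from the paper) that a 2×2 hyperbolic rotation satisfies (1.1) with $J = \operatorname{diag}(1, -1)$; theta is an arbitrary illustrative parameter:

```python
import numpy as np

# A 2x2 hyperbolic rotation, the simplest nontrivial J-orthogonal matrix;
# theta is an arbitrary illustrative parameter.
theta = 0.7
Q = np.array([[np.cosh(theta), np.sinh(theta)],
              [np.sinh(theta), np.cosh(theta)]])
J = np.diag([1.0, -1.0])

# Verify (1.1): Q^T J Q = J holds to machine precision.
assert np.allclose(Q.T @ J @ Q, J)
```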
This type of matrix arises in hyperbolic problems, that is, problems where there is
an underlying indefinite inner product or weight matrix. We give two examples to
illustrate the utility of J-orthogonal matrices.
First consider the downdating problem of computing the Cholesky factorization of a positive definite matrix $C = A^TA - B^TB$, where $A \in \mathbb{R}^{p\times n}$ ($p \ge n$) and $B \in \mathbb{R}^{q\times n}$.
This task arises when solving a regression problem after some of the rows (namely those of B) of the data matrix are removed, and A in this case is usually upper triangular. Numerical stability considerations dictate that we should avoid explicit formulation of C. If we can find a J-orthogonal matrix Q such that

$$Q \begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} R \\ 0 \end{bmatrix}, \qquad (1.2)$$

with $J = \operatorname{diag}(I_p, -I_q)$ and $R \in \mathbb{R}^{n\times n}$ upper triangular, then

$$C = \begin{bmatrix} A \\ B \end{bmatrix}^T J \begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} A \\ B \end{bmatrix}^T Q^TJQ \begin{bmatrix} A \\ B \end{bmatrix} = R^TR,$$
Numerical Analysis Report 408, Manchester Centre for Computational Mathematics, September 2002. Revised March 2003.
Department of Mathematics, University of Manchester, Manchester, M13 9PL, England (higham@ma.man.ac.uk, http://www.ma.man.ac.uk/~higham/). This work was supported by Engineering and Physical Sciences Research Council grant GR/R22612.

so R is the desired Cholesky factor. The factorization (1.2) is a hyperbolic QR factorization; for details of how to compute it see, for example, [1].
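As a toy illustration of (1.2), and not the algorithm of [1], a single hyperbolic rotation performs the downdate when n = p = q = 1; the values a and b below are illustrative, with |a| > |b| so that $C = a^2 - b^2$ is positive definite:

```python
import numpy as np

# Downdate the 1x1 Cholesky factorization: find J-orthogonal Q with
# Q [a; b] = [r; 0], so that r^2 = a^2 - b^2.
a, b = 5.0, 3.0
t = b / a                      # |t| < 1 because a^2 - b^2 > 0
c = 1.0 / np.sqrt(1.0 - t**2)
Q = c * np.array([[1.0, -t],
                  [-t, 1.0]])  # a hyperbolic rotation
J = np.diag([1.0, -1.0])

Rvec = Q @ np.array([a, b])
assert np.allclose(Q.T @ J @ Q, J)          # Q is J-orthogonal
assert np.isclose(Rvec[1], 0.0)             # second entry annihilated
assert np.isclose(Rvec[0]**2, a**2 - b**2)  # R^T R = A^T A - B^T B
```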
A second example where J-orthogonal matrices play a key role is in the solution of the symmetric definite generalized eigenproblem $Ax = \lambda Bx$, where A and B are symmetric, some linear combination of them is positive definite, and B is nonsingular. Through the use of a congruence transformation (for example by using a block $LDL^T$ decomposition of B followed by a diagonalization of the block diagonal factor [38]) the problem can be reduced to $\widetilde{A}x = \lambda Jx$, for some signature matrix $J = \operatorname{diag}(\pm 1)$. If we can find a J-orthogonal Q such that $Q^T\widetilde{A}Q = D = \operatorname{diag}(d_i)$ then the eigenvalues are the diagonal elements of JD; such a Q can be constructed using a Jacobi algorithm of Veselić [40].
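A small numerical illustration of this reduction (the data below are hypothetical; scipy.linalg.eig solves the pencil $\widetilde{A}x = \lambda Jx$):

```python
import numpy as np
from scipy.linalg import eig

# If Q^T A Q = D with Q J-orthogonal, the pencil (A, J) has eigenvalues
# diag(JD): substitute x = Qz to get Dz = lambda Jz.
J = np.diag([1.0, -1.0])
D = np.diag([2.0, 5.0])
th = 0.3
Q = np.array([[np.cosh(th), np.sinh(th)],
              [np.sinh(th), np.cosh(th)]])   # J-orthogonal

Qinv = np.linalg.inv(Q)
A = Qinv.T @ D @ Qinv                        # then Q^T A Q = D

w = eig(A, J, right=False)                   # generalized eigenvalues
assert np.allclose(np.sort(w.real), np.sort(np.diag(J @ D)))
```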
In addition to these practical applications, J-orthogonal matrices are of significant theoretical interest. For example, they play a fundamental role in the study of J-contractive matrices [30], which are matrices X for which $XJX^T \le J$, where $A \ge 0$ denotes that the symmetric matrix A is positive semidefinite.
A matrix $Q \in \mathbb{R}^{n\times n}$ is $(J_1, J_2)$-orthogonal if

$$Q^TJ_1Q = J_2, \qquad (1.3)$$
where $J_1 = \operatorname{diag}(\pm 1)$ and $J_2 = \operatorname{diag}(\pm 1)$ are signature matrices having the same inertia. $(J_1, J_2)$-orthogonal matrices are also known as hyperexchange matrices and J-orthogonal matrices as hypernormal matrices [2]. Since $J_1$ and $J_2$ in (1.3) have the same inertia, $J_2 = PJ_1P^T$ for some permutation matrix P, and hence $(QP)^TJ_1(QP) = J_1$. A $(J_1, J_2)$-orthogonal matrix is therefore simply a column permutation of a $J_1$-orthogonal matrix, and so for the purposes of this work we can restrict our attention to J-orthogonal matrices. An application in which $(J_1, J_2)$-orthogonal matrices arise with $J_1$ and $J_2$ generally different is the HR algorithm of Brebner and Grad [5] and Bunse-Gerstner [6] for solving the standard eigenvalue problem for J-symmetric matrices. A matrix $A \in \mathbb{R}^{n\times n}$ is J-symmetric if AJ is symmetric, or, equivalently, if JA, $JA^T$ or $A^TJ$ is symmetric. Given a $J_0$-symmetric matrix A, the kth stage of the unshifted HR algorithm consists of factoring $A_k = H_kR_k$, where $H_k$ is $(J_k, J_{k+1})$-orthogonal with $J_{k+1} = H_k^TJ_kH_k$ and $R_k$ is upper triangular, and then setting $A_{k+1} = R_kH_k$. Computational details and convergence properties of the algorithm can be found in [6].
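As a quick numerical check of the column-permutation observation above (the 2×2 data are illustrative): if Q is $J_1$-orthogonal and P is any permutation, then QP is $(J_1, J_2)$-orthogonal with $J_2 = P^TJ_1P$:

```python
import numpy as np

th = 0.4
Q = np.array([[np.cosh(th), np.sinh(th)],
              [np.sinh(th), np.cosh(th)]])   # J1-orthogonal
J1 = np.diag([1.0, -1.0])
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])                   # swap the two columns

J2 = P.T @ J1 @ P
QP = Q @ P
assert np.allclose(QP.T @ J1 @ QP, J2)       # (1.3) with this J2
```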
Unlike the subclass of orthogonal matrices, J-orthogonal matrices can be arbi-
trarily ill conditioned. This poses interesting questions and difficulties in the design,
analysis and testing of algorithms and motivates our attempt to gain a better under-
standing of the class of J-orthogonal matrices.
The purpose of this paper is threefold. First we collect some interesting and not
so well-known properties of J-orthogonal matrices. In particular, we give a new proof
of the hyperbolic CS decomposition via the usual CS decomposition by exploiting the
exchange operator. The exchange operator is a tool that has found use in several areas
of mathematics and is known by several different names; we give a brief survey of its
properties and its history. We also give a new proof of the fact that J-orthogonal
matrices are optimally scaled under two-sided diagonal scalings. Our second aim is to
show how to generate random J-orthogonal matrices with specified singular values,
and in particular with specified norms and condition numbers—a capability that is
very useful for constructing test data for problems with an indefinite flavour. Finally,
we investigate two Newton iterations for computing a J-orthogonal matrix, one involv-
ing only matrix inversion, the other only matrix multiplication. Both iterations are

J-ORTHOGONAL MATRICES 3
shown to converge to the J-orthogonal factor in a certain indefinite polar decomposition under suitable conditions. Analogously to the case of orthogonal matrices and the
corresponding Newton iterations [15], [20], we show that these Newton iterations can
be used to J-orthogonalize a matrix that is not too far from being J-orthogonal. An
application is to the situation where a matrix that should be J-orthogonal turns out
not to be because of rounding or other errors and it is desired to J-orthogonalize it.
J-orthogonal matrices, and hyperbolic problems in general, are the subject of
much recent and current research, covering both theory and algorithms. This paper
provides a self-contained treatment that highlights some techniques and tools useful
in the analysis and application of these matrices; the treatment should also be of more
general interest.
Throughout, we take J to have the form

$$J = \begin{bmatrix} I_p & 0 \\ 0 & -I_q \end{bmatrix}, \quad p + q = n, \qquad (1.4)$$

and we use exclusively the 2-norm: $\|A\|_2 = \max_{x\ne 0} \|Ax\|_2/\|x\|_2$, where $\|x\|_2^2 = x^Tx$.
2. The exchange operator. Let $A \in \mathbb{R}^{n\times n}$ and consider the system

$$y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = Ax, \qquad (2.1)$$

where $y_1, x_1 \in \mathbb{R}^p$, $y_2, x_2 \in \mathbb{R}^q$, and $A_{11} \in \mathbb{R}^{p\times p}$ is nonsingular. We use this partitioning of A throughout the section. By solving the first equation in (2.1) for $x_1$ and then eliminating $x_1$ from the second equation we obtain

$$\begin{bmatrix} x_1 \\ y_2 \end{bmatrix} = \operatorname{exc}(A) \begin{bmatrix} y_1 \\ x_2 \end{bmatrix}, \qquad (2.2)$$

where

$$\operatorname{exc}(A) = \begin{bmatrix} A_{11}^{-1} & -A_{11}^{-1}A_{12} \\ A_{21}A_{11}^{-1} & A_{22} - A_{21}A_{11}^{-1}A_{12} \end{bmatrix}.$$
We call exc the exchange operator, since it exchanges $x_1$ and $y_1$ in (2.1). Note that the (2,2)-block of $\operatorname{exc}(A)$ is the Schur complement of $A_{11}$ in A. The definition of the exchange operator can be generalized to allow the "pivot matrix" $A_{11}$ to be any principal submatrix, but for our purposes this extra level of generality is not necessary.
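The operator translates directly into NumPy from the block formula above; this is a minimal sketch of ours (the function name exc and the pivot-size argument p are our own conventions):

```python
import numpy as np

def exc(A, p):
    """Exchange operator with pivot block A11 = A[:p, :p] (assumed nonsingular)."""
    A11, A12 = A[:p, :p], A[:p, p:]
    A21, A22 = A[p:, :p], A[p:, p:]
    A11inv = np.linalg.inv(A11)
    return np.block([[A11inv,        -A11inv @ A12],
                     [A21 @ A11inv,  A22 - A21 @ A11inv @ A12]])
```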
It is easy to see that the exchange operator is involutory,

$$\operatorname{exc}(\operatorname{exc}(A)) = A, \qquad (2.3)$$

and moreover that

$$\operatorname{exc}(JAJ) = J\operatorname{exc}(A)J = \operatorname{exc}(A^T)^T. \qquad (2.4)$$

This last identity shows that J is naturally associated with exc.
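Both identities are easy to confirm on random data with the exc sketch above (reused here, so run that block first):

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 2, 3
A = rng.standard_normal((p + q, p + q))
J = np.diag(np.r_[np.ones(p), -np.ones(q)])

assert np.allclose(exc(exc(A, p), p), A)                  # (2.3): involutory
assert np.allclose(exc(J @ A @ J, p), J @ exc(A, p) @ J)  # (2.4), first equality
assert np.allclose(exc(J @ A @ J, p), exc(A.T, p).T)      # (2.4), second equality
```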
We first address the nonsingularity of exc(A). The block LU factorization

$$\operatorname{exc}(A) = \begin{bmatrix} I & 0 \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} A_{11}^{-1} & -A_{11}^{-1}A_{12} \\ 0 & I \end{bmatrix} = \begin{bmatrix} I & 0 \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ 0 & I \end{bmatrix}^{-1} \equiv LR^{-1} \qquad (2.5)$$

will be useful.
Lemma 2.1. Let $A \in \mathbb{R}^{n\times n}$ with $A_{11}$ nonsingular. Then exc(A) is nonsingular if and only if $A_{22}$ is nonsingular. If A is nonsingular and $\operatorname{exc}(A^{-1})$ exists then exc(A) is nonsingular and

$$\operatorname{exc}(A)^{-1} = \operatorname{exc}(A^{-1}). \qquad (2.6)$$
Proof. For $A \in \mathbb{R}^{n\times n}$ with $A_{11}$ nonsingular, the block LU factorization (2.5) makes clear that exc(A) is nonsingular if and only if $A_{22}$ is nonsingular. The last part is obtained by rewriting (2.1) as $x = A^{-1}y$ and deriving the corresponding analogue of (2.2):

$$\begin{bmatrix} y_1 \\ x_2 \end{bmatrix} = \operatorname{exc}(A^{-1}) \begin{bmatrix} x_1 \\ y_2 \end{bmatrix}. \qquad (2.7)$$

It follows from (2.7) that for any $x_1$ and $y_2$ there is a unique $x_2$ and $y_1$, which implies from (2.2) that exc(A) is nonsingular and $\operatorname{exc}(A)^{-1} = \operatorname{exc}(A^{-1})$.
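A spot-check of (2.6) on generic random data (again assuming the exc helper above is in scope):

```python
import numpy as np

rng = np.random.default_rng(1)
p = 2
A = rng.standard_normal((5, 5))   # generically A, A11 and A22 are nonsingular

assert np.allclose(np.linalg.inv(exc(A, p)),   # exc(A)^{-1}
                   exc(np.linalg.inv(A), p))   # equals exc(A^{-1})
```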
Note that either of A and exc(A) can be singular without the other being singular, as shown by the examples with p = q = 1,

$$A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}, \quad \operatorname{exc}(A) = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix}, \qquad A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad \operatorname{exc}(A) = \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix}.$$
For completeness, we mention that for both exc(A) and $\operatorname{exc}(A^{-1})$ to exist and be nonsingular, it is necessary and sufficient that A, $A_{11}$ and $A_{22}$ be nonsingular.
The reason for our interest in the exchange operator is that it maps J-orthogonal matrices to orthogonal matrices and vice versa. Note that J-orthogonality of A implies that $A_{11}^TA_{11} = I + A_{21}^TA_{21}$ and hence that $A_{11}$ is nonsingular and exc(A) exists, but if A is orthogonal $A_{11}$ can be singular.
Theorem 2.2. Let $A \in \mathbb{R}^{n\times n}$. If A is J-orthogonal then exc(A) is orthogonal. If A is orthogonal and $A_{11}$ is nonsingular then exc(A) is J-orthogonal.

Proof. Proving the result by working directly with exc(A) involves some laborious algebra. A more elegant proof involving quadratic forms is given by Stewart and Stewart [36, sec. 2]. We give another proof, suggested by Chris Paige. Assume first that A is orthogonal with $A_{11}$ nonsingular. Then $\operatorname{exc}(A^T) = \operatorname{exc}(A^{-1})$ exists and Lemma 2.1 shows that exc(A) is nonsingular and $\operatorname{exc}(A)^{-1} = \operatorname{exc}(A^{-1}) = \operatorname{exc}(A^T)$. Hence, using (2.4),

$$I = \operatorname{exc}(A^T)\operatorname{exc}(A) = J\operatorname{exc}(A)^TJ \cdot \operatorname{exc}(A),$$

which shows that exc(A) is J-orthogonal.

If A is J-orthogonal then, as noted above, $A_{11}$ is nonsingular. Also $JA^TJ = A^{-1}$ and so from Lemma 2.1, $\operatorname{exc}(JA^TJ) = \operatorname{exc}(A^{-1}) = \operatorname{exc}(A)^{-1}$. But (2.4) shows that $\operatorname{exc}(JA^TJ) = \operatorname{exc}(A)^T$, and we conclude that exc(A) is orthogonal.
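Theorem 2.2 yields a simple recipe for manufacturing J-orthogonal test matrices: apply exc to a random orthogonal matrix. A sketch of ours using scipy.stats.ortho_group and the exc helper above:

```python
import numpy as np
from scipy.stats import ortho_group

p, q = 2, 3
n = p + q
J = np.diag(np.r_[np.ones(p), -np.ones(q)])

U = ortho_group.rvs(n, random_state=42)   # random orthogonal matrix
Q = exc(U, p)                             # J-orthogonal by Theorem 2.2
assert np.allclose(Q.T @ J @ Q, J)

V = exc(Q, p)                             # and back: V = U is orthogonal
assert np.allclose(V.T @ V, np.eye(n))
```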
As an example of a result of a different flavour, we give the following generalization to arbitrary p of a result obtained by Duffin, Hazony, and Morrison [10] for p = 1.

Theorem 2.3. Let $A \in \mathbb{R}^{n\times n}$ with $A_{11}$ nonsingular. Then $\operatorname{exc}(A) + \operatorname{exc}(A)^T$ is congruent to $A + A^T$.

Proof. Using (2.5) we have

$$\operatorname{exc}(A) + \operatorname{exc}(A)^T = LR^{-1} + R^{-T}L^T = R^{-T}(R^TL + L^TR)R^{-1}.$$
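Since congruent symmetric matrices have the same inertia (Sylvester's law of inertia), Theorem 2.3 can be spot-checked numerically (exc helper above assumed; inertia is our own hypothetical helper):

```python
import numpy as np

def inertia(S, tol=1e-10):
    """(#positive, #negative, #zero) eigenvalues of the symmetric matrix S."""
    w = np.linalg.eigvalsh(S)
    return (int(np.sum(w > tol)), int(np.sum(w < -tol)),
            int(np.sum(np.abs(w) <= tol)))

rng = np.random.default_rng(3)
p = 2
A = rng.standard_normal((5, 5))
E = exc(A, p)
assert inertia(E + E.T) == inertia(A + A.T)   # congruence preserves inertia
```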

Frequently Asked Questions (5)
Q1. What are the contributions in "J-Orthogonal Matrices: Properties and Generation" (Higham, 2003)?

The authors present techniques and tools useful in the analysis, application and construction of these matrices, giving a self-contained treatment that provides new insights. Then the authors show how the exchange operator can be used to obtain a hyperbolic CS decomposition of a J-orthogonal matrix directly from the usual CS decomposition of an orthogonal matrix. The authors introduce the indefinite polar decomposition and investigate two iterations for computing the J-orthogonal polar factor: a Newton iteration involving only matrix inversion and a Schulz iteration involving only matrix multiplication. The authors show that these iterations can be used to J-orthogonalize a matrix that is not too far from being J-orthogonal. 

Restoring lost orthogonality is a common requirement, for example in numerical solution of matrix differential equations having an orthogonal solution [17], or for computed eigenvector matrices of symmetric matrices. 

From standard analysis of this iteration (see, e.g., [23]) the authors know that $S_k$ converges quadratically to $\operatorname{sign}(S_0)$, which is the identity matrix since the spectrum of $S_0$ lies in the open right half-plane.

Unlike for orthogonal matrices, for general J-orthogonal matrices $\|Q\|_2$ can be arbitrarily large and this has implications for the attainable accuracy of the Newton and Schulz iterations in floating point arithmetic.

Such an iteration can be obtained by adapting the Schulz iteration, which exists in variants for computing the matrix inverse [31], the orthogonal polar factor [20], the matrix sign function [22], and the matrix square root [18].
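To make the Newton discussion concrete, here is a minimal sketch of a Newton-type J-orthogonalization; the update $X \leftarrow \tfrac{1}{2}(X + JX^{-T}J)$ is the natural hyperbolic analogue of the Newton iteration for the orthogonal polar factor and is our assumption here, not a verbatim statement of the paper's algorithm:

```python
import numpy as np

def j_orthogonalize(X, J, tol=1e-12, maxiter=50):
    """Newton-type iteration X <- (X + J X^{-T} J)/2; expected to converge
    to a J-orthogonal matrix when X is not too far from J-orthogonal."""
    for _ in range(maxiter):
        X_new = 0.5 * (X + J @ np.linalg.inv(X).T @ J)
        if np.linalg.norm(X_new - X) <= tol * np.linalg.norm(X):
            return X_new
        X = X_new
    return X

# Perturb an exactly J-orthogonal matrix, then restore J-orthogonality.
th = 0.5
Q = np.array([[np.cosh(th), np.sinh(th)],
              [np.sinh(th), np.cosh(th)]])
J = np.diag([1.0, -1.0])
rng = np.random.default_rng(4)
Qp = Q + 1e-4 * rng.standard_normal((2, 2))

Qr = j_orthogonalize(Qp, J)
assert np.allclose(Qr.T @ J @ Qr, J)
```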