Dense quantum coding and quantum finite automata

doi:10.1145/581771.581773

Dense Quantum Coding and Quantum Finite Automata

ANDRIS AMBAINIS

Institute for Advanced Study, Princeton, New Jersey

ASHWIN NAYAK

California Institute of Technology, Pasadena, California

AMNON TA-SHMA

Tel-Aviv University, Tel-Aviv, Israel

AND

UMESH VAZIRANI

University of California, Berkeley, California

Abstract. We consider the possibility of encoding m classical bits into many fewer n quantum bits

(qubits) so that an arbitrary bit from the original m bits can be recovered with good probability.

We show that nontrivial quantum codes exist that have no classical counterparts. On the other hand,

we show that quantum encoding cannot save more than a logarithmic additive factor over the best

classical encoding. The proof is based on an entropy coalescence principle that is obtained by viewing

Holevo’s theorem from a new perspective.

In the existing implementations of quantum computing, qubits are a very expensive resource.

Moreover, it is difficult to reinitialize existing bits during the computation. In particular, reinitialization

is impossible in NMR quantum computing, which is perhaps the most advanced implementation of

quantum computing at the moment. This motivates the study of quantum computation with restricted

memory and no reinitialization, that is, of quantum ﬁnite automata. It was known that there are

languages that are recognized by quantum ﬁnite automata with sizes exponentially smaller than those

of corresponding classical automata. Here, we apply our technique to show the surprising result that

there are languages for which quantum ﬁnite automata take exponentially more states than those of

corresponding classical automata.

Preliminary versions of this work appeared as Ambainis et al. [1999] and Nayak [1999b].

A. Ambainis was supported by the Berkeley Fellowship for Graduate Studies and, in part, by NSF

grant CCR-9800024; A. Nayak and U. Vazirani were supported by JSEP grant FDP 49620-97-1-

0220-03-98 and NSF grant CCR-9800024.

Authors’ addresses: A. Ambainis, Institute for Advanced Study, Einstein Dr., Princeton, NJ 08540,

e-mail: ambainis@ias.edu; A. Nayak, Computer Science Department and Institute for Quan-

tum Information, Mail Code 256-80, Pasadena, CA 91125, e-mail: nayak@cs.caltech.edu;

A. Ta-Shma, Computer Science Department, Tel-Aviv University, Ramat Aviv, Tel Aviv 69978, Israel,

e-mail: amnon@post.tau.ac.il; U. Vazirani, Computer Science Division, 671 Soda Hall, University of

California, Berkeley, CA, e-mail: vazirani@cs.berkeley.edu.

Permission to make digital/hard copy of part or all of this work for personal or classroom use is

granted without fee provided that the copies are not made or distributed for proﬁt or commercial

advantage, the copyright notice, the title of the publication, and its date appear, and notice is given

that copying is by permission of ACM, Inc. To copy otherwise, to republish, to post on servers, or to

redistribute to lists requires prior speciﬁc permission and/or a fee.

© 2002 ACM 0004-5411/04/0700-0496 $5.00

Journal of the ACM, Vol. 49, No. 4, July 2002, pp. 496–511.


Categories and Subject Descriptors: F.2.0 [Analysis of Algorithms and Problem Complexity]:

General; F.1.1 [Computation by Abstract Devices]: Models of Computation—automata (e.g. ﬁnite,

push-down, resource-bounded)

General Terms: Theory

Additional Key Words and Phrases: Automaton size, communication complexity, encoding, ﬁnite

automata, quantum communication, quantum computation

1. Introduction

The tremendous information processing capabilities of quantum mechanical sys-

tems may be attributed to the fact that the state of an n quantum bit (qubit) system

is given by a unit vector in a $2^n$-dimensional complex vector space. This suggests

the possibility that classical information might be encoded and transmitted with

exponentially fewer qubits. Yet, according to a fundamental result in quantum in-

formation theory, Holevo’s theorem [Holevo 1973], no more than n classical bits of

information can faithfully be transmitted by transferring n quantum bits from one

party to another. In view of this result, it is tempting to conclude that the exponen-

tially many degrees of freedom latent in the description of a quantum system must

necessarily stay hidden or inaccessible.

However, the situation is more subtle since the recipient of the n-qubit quan-

tum state has a choice of measurement he or she can make to extract information

about their state. In general, these measurements do not commute. Thus making a

particular measurement will disturb the system, thereby destroying some or all the

information that would have been revealed by another possible measurement. This

opens up the possibility of quantum random access codes, which encode classical

bits into many fewer qubits, such that the recipient can choose which bit of classical

information he or she would like to extract out of the encoding. We might think

of this as a disposable quantum phone book, where the contents of an entire tele-

phone directory are compressed into a few quantum bits such that the recipient of

these qubits can, via a suitably chosen measurement, look up any single telephone

number of his or her choice. Such quantum codes, if possible, would serve as a

powerful primitive in quantum communication.

To formalize this, say we wish to encode $m$ bits $b_1, \ldots, b_m$ into $n$ qubits ($m \gg n$).

Then a quantum random access encoding with parameters $m, n, p$ (or simply

an $m \overset{p}{\mapsto} n$ encoding) consists of an encoding map from $\{0, 1\}^m$ to mixed states

with support in $\mathbb{C}^{2^n}$, together with a sequence of $m$ possible measurements for the

recipient. The measurements are such that if the recipient chooses the $i$th measurement

and applies it to the encoding of $b_1 \cdots b_m$, the result of the measurement is $b_i$

with probability at least $p$.

The main point here is that since the m different possible measurements may

be noncommuting, the recipient cannot make the m measurements in succession

to recover all the encoded bits with a good chance of success. Thus the existence

of $m \overset{p}{\mapsto} n$ quantum random access codes with $m \gg n$ and $p > \frac{1}{2}$ does not

necessarily violate Holevo's bound. Furthermore, even though $\mathbb{C}^k$ can accommodate

only $k$ mutually orthogonal unit vectors, it can accommodate $a^k$ almost mutually

orthogonal unit vectors (i.e., vectors such that the inner product of any two has an

absolute value less than, say, $\frac{1}{10}$) for some $a > 1$. Indeed, there is no a priori reason

to rule out the existence of codes that represent $a^n$ classical bits in $n$ quantum bits

for some constant $a > 1$.

We start by showing that quantum encodings are more powerful than classical

ones. We describe a $2 \overset{0.85}{\mapsto} 1$ quantum encoding, and prove that there is no

$2 \overset{p}{\mapsto} 1$ classical encoding for any $p > \frac{1}{2}$. Our quantum encoding may be

generalized to a $3 \overset{0.78}{\mapsto} 1$ encoding, as was shown by Chuang [1997], and to

encodings of more bits into one quantum bit.
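The figure 0.85 can be checked numerically. The sketch below does not reproduce a construction from this section; it assumes the commonly described scheme in which the two bits select one of four real qubit states spaced evenly on a circle, bit 1 is read in the computational basis, and bit 2 is read in the Hadamard basis. The names `encode` and `success_probability` are illustrative only.

```python
import math

import numpy as np

# Assumed decoding bases: bit 1 is read in the computational basis, bit 2 in
# the Hadamard basis. Row k of each matrix is the basis vector for outcome k.
COMPUTATIONAL = np.array([[1.0, 0.0], [0.0, 1.0]])
HADAMARD = np.array([[1.0, 1.0], [1.0, -1.0]]) / math.sqrt(2)
BASES = (COMPUTATIONAL, HADAMARD)


def encode(b1: int, b2: int) -> np.ndarray:
    """Encode two bits into the real qubit state cos(t)|0> + sin(t)|1>."""
    t = (-1) ** b2 * (math.pi / 8 + b1 * math.pi / 4)
    return np.array([math.cos(t), math.sin(t)])


def success_probability(b1: int, b2: int, i: int) -> float:
    """Probability that measuring in basis i returns bit i of (b1, b2)."""
    state = encode(b1, b2)
    wanted = (b1, b2)[i]
    amplitude = BASES[i][wanted] @ state
    return float(amplitude ** 2)


# Worst case over all four inputs and both choices of which bit to read.
worst_case = min(
    success_probability(b1, b2, i)
    for b1 in (0, 1)
    for b2 in (0, 1)
    for i in (0, 1)
)
print(f"worst-case success probability: {worst_case:.4f}")
```

Under these assumptions every input and every choice of bit succeeds with probability $\cos^2(\pi/8)$, which rounds to 0.85.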

The main result in this paper is that (despite the potential of quantum encoding

shown by the arguments and results presented above) quantum encoding does not

provide much compression. We prove that any $m \overset{p}{\mapsto} n$ quantum encoding

satisfies $n \geq (1 - H(p))\,m$, where $H(p) = -p \log p - (1 - p) \log(1 - p)$ is the

binary entropy function. The main technique in the proof is the use of the entropy

coalescence lemma, which quantifies the increase in entropy when we take a convex

combination of mixed states. This lemma is obtained by viewing Holevo’s theorem

from a new perspective.
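To get a feel for the bound, the following sketch evaluates $(1 - H(p))\,m$ for a few parameter choices. It assumes logarithms are taken base 2, so that entropy is measured in bits; the function names are illustrative.

```python
import math


def binary_entropy(p: float) -> float:
    """H(p) = -p log2 p - (1 - p) log2(1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def min_qubits(m: int, p: float) -> float:
    """Lower bound (1 - H(p)) * m on the number of qubits n."""
    return (1 - binary_entropy(p)) * m


for m, p in [(2, 0.85), (3, 0.78), (1000, 0.85)]:
    print(f"m={m}, p={p}: n >= {min_qubits(m, p):.3f}")
```

For $m = 2$ and $p = 0.85$ the bound comes out at roughly 0.78, so it is consistent with the one-qubit encoding mentioned earlier, while for large $m$ it grows linearly in $m$.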

We turn to upper bounds on compression next, and show that the lower bound

is asymptotically tight up to an additive logarithmic term, and can be achieved

even with classical encoding. For any $p > 1/2$, we give a construction for $m \overset{p}{\mapsto} n$

classical codes with $n = (1 - H(p))\,m + O(\log m)$. Thus, even though quantum

random access codes can be more succinct as compared to classical codes, they

may be only a logarithmic number of qubits shorter.

In many of the existing quantum computing implementations, the complexity of

implementing the system grows tremendously as the number of qubits increases.

Moreover, even discarding one qubit and replacing it by a new qubit initialized

to $|0\rangle$ (often called a clean qubit) while keeping the total number of qubits the

same might be difﬁcult or impossible (as in NMR quantum computing [Nielsen

and Chuang 2000]). This has motivated a huge body of work on one-way quantum

ﬁnite automata (QFAs), which are devices that model computers with a small ﬁnite

memory. During the computation of a QFA, no clean qubits are allowed, and in

addition no intermediate measurements are allowed, except to decide whether to

accept or reject the input.

We deﬁne generalized one-way quantum ﬁnite automata (GQFAs) that capture

the most general quantum computation that can be carried out with restricted memory

and no extra clean qubits. In particular, the model allows arbitrary measurements

upon the state space of the automaton as long as the measurements can be carried out

without clean qubits. We believe our model accurately incorporates the capabilities

of today’s implementations of quantum computing.

In Kondacs and Watrous [1997] it was shown that not every language recognized

by a classical deterministic ﬁnite automaton (DFA) is recognized by a QFA. On the

other hand, there are languages that are recognized by QFAs with sizes exponentially

smaller than those of corresponding classical automata [Ambainis and Freivalds

1998]. It remained open whether for any language that can be recognized by a

one-way ﬁnite automaton both classically and quantum-mechanically, a classical

automaton can be efﬁciently simulated by a QFA with no extra clean qubits. We

answer this question in the negative.

We apply the entropy coalescence lemma in a computational setting to give a

lower bound on the size of GQFAs. We prove that there is a sequence of languages

for which the minimal GQFA has exponentially more states than the minimal DFA.

It may be surprising that despite their quantum power (and irreversible computation,

thanks to the intermediate measurements) GQFAs are exponentially less powerful

for certain languages than classical DFAs. This lower bound highlights the need

for clean qubits for efﬁcient computation.

2. Preliminaries

2.1. QUANTUM SYSTEMS. Just as a bit (an element of $\{0, 1\}$) is a fundamental

unit of classical information, a qubit is the fundamental unit of quantum information.

A qubit is described by a unit vector in the two-dimensional Hilbert space $\mathbb{C}^2$. Let

$|0\rangle$ and $|1\rangle$ be an orthonormal basis for this space.¹ In general, the state of the qubit

is a linear superposition of the form $\alpha|0\rangle + \beta|1\rangle$. The state of $n$ qubits is described

by a unit vector in the $n$-fold tensor product $\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2$. An orthonormal

basis for this space is now given by the $2^n$ vectors $|x\rangle$, where $x \in \{0, 1\}^n$. This is

often referred to as the computational basis. In general, the state of $n$ qubits is a

linear superposition of the $2^n$ computational basis states. Thus the description of

an $n$ qubit system requires $2^n$ complex numbers. This is arguably the source of the

astounding information processing capabilities of quantum computers.

The information in a set of qubits may be “read out” by measuring it in an

orthonormal basis, such as the computational basis. When a state $\sum_x \alpha_x |x\rangle$ is

measured in the computational basis, we get the outcome $x$ with probability $|\alpha_x|^2$.

More generally, a (von Neumann) measurement on a Hilbert space $\mathcal{H}$ is defined by a

set of orthogonal projection operators $\{P_i\}$. When a state $|\phi\rangle$ is measured according

to this set of projection operators, we get outcome $i$ with probability $\|P_i|\phi\rangle\|^2$.

Moreover, the state of the qubits "collapses" to (i.e., becomes) $P_i|\phi\rangle / \|P_i|\phi\rangle\|$,

when the outcome $i$ is observed. In order to retrieve information from an unknown

quantum state $|\phi\rangle$, it is sometimes advantageous to augment the state with some

ancillary qubits, so that the combined state is now $|\phi\rangle \otimes |\bar{0}\rangle$, before measuring

them jointly according to a set of operators $\{P_i\}$ as above. This is the most general

form of quantum measurement, and is called a positive operator valued measurement

(POVM).
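The projective measurement rule above can be sketched numerically. The example below is a minimal illustration using NumPy; the helper name `measure` and the particular state are assumptions chosen for the example, not notation from the paper.

```python
import numpy as np


def measure(state, projectors):
    """Return (probability, post-measurement state) for each outcome i."""
    results = []
    for P in projectors:
        unnormalized = P @ state
        # Outcome probability ||P_i|phi>||^2.
        prob = float(np.vdot(unnormalized, unnormalized).real)
        # Collapsed state P_i|phi> / ||P_i|phi>||, undefined if prob is 0.
        collapsed = unnormalized / np.sqrt(prob) if prob > 0 else None
        results.append((prob, collapsed))
    return results


# Example state (|0> + i|1>) / sqrt(2), measured in the computational basis.
phi = np.array([1.0, 1.0j]) / np.sqrt(2)
P0 = np.array([[1, 0], [0, 0]], dtype=complex)
P1 = np.array([[0, 0], [0, 1]], dtype=complex)

outcomes = measure(phi, [P0, P1])
probs = [prob for prob, _ in outcomes]
print(probs)
```

Each outcome here occurs with probability one half (up to floating-point rounding), and the collapsed states are $|0\rangle$ and $i|1\rangle$ respectively.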

2.2. DENSITY MATRICES. In general, a quantum system may be in a mixed

state: a probability distribution over superpositions. For example, such a mixed

state may result from the measurement of a pure state $|\phi\rangle$.

Consider the mixed state $\{p_i, |\phi_i\rangle\}$, where the superposition $|\phi_i\rangle$ occurs with

probability $p_i$. The behavior of this mixed state is completely characterized by its

density matrix $\rho = \sum_i p_i |\phi_i\rangle\langle\phi_i|$. (The "bra" notation $\langle\phi|$ here is used to denote

the conjugate transpose of the superposition (column vector) $|\phi\rangle$. Thus $|\phi\rangle\langle\phi|$

denotes the outer product of the vector with itself.) For example, under a unitary

transformation $U$, the mixed state $\{p_i, |\phi_i\rangle\}$ evolves as $\{p_i, U|\phi_i\rangle\}$, so that the

resulting density matrix is $U \rho U^\dagger$. When measured according to the projection

operators $\{P_j\}$, the probability $q_j$ of getting outcome $j$ is

$q_j = \sum_i p_i \|P_j|\phi_i\rangle\|^2 = \mathrm{Tr}(P_j \rho P_j)$,

and the residual density matrix is $P_j \rho P_j / q_j$. Thus, two mixed states

with the same density matrix have the same behavior under any physical operation.

We will therefore identify a mixed state with its density matrix.
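As a small illustration of why distinct mixtures can share a density matrix, the sketch below builds $\rho$ for two different ensembles and evaluates an outcome probability via $\mathrm{Tr}(P_j \rho P_j)$. The ensembles and the helper name `density_matrix` are choices made for this example only.

```python
import numpy as np


def density_matrix(ensemble):
    """rho = sum_i p_i |phi_i><phi_i| for a list of (p_i, |phi_i>) pairs."""
    return sum(p * np.outer(phi, phi.conj()) for p, phi in ensemble)


ket0 = np.array([1.0, 0.0], dtype=complex)
ket1 = np.array([0.0, 1.0], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

# Two different mixtures: equal weights on {|0>, |1>} and on {|+>, |->}.
rho_a = density_matrix([(0.5, ket0), (0.5, ket1)])
rho_b = density_matrix([(0.5, plus), (0.5, minus)])
same = np.allclose(rho_a, rho_b)

# Outcome probability q_j = Tr(P_j rho P_j) for the projector onto |+>.
P_plus = np.outer(plus, plus.conj())
q_plus = float(np.trace(P_plus @ rho_a @ P_plus).real)
print(same, round(q_plus, 6))
```

Both ensembles yield the matrix $I/2$, so no measurement can tell them apart; in particular the projector onto $|+\rangle$ fires with probability one half for either mixture.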

¹ This is Dirac's ket notation. $|\phi\rangle$ is another way of denoting a vector $\vec{\phi}$.


The following properties of density matrices follow from the deﬁnition. For any

density matrix ρ,

(1) $\rho$ is Hermitian, that is, $\rho = \rho^\dagger$;

(2) $\rho$ has unit trace, that is, $\mathrm{Tr}(\rho) = \sum_i \rho(i, i) = 1$;

(3) $\rho$ is positive semidefinite, that is, $\langle\psi|\rho|\psi\rangle \geq 0$ for all $|\psi\rangle$.

Thus, every density matrix is unitarily diagonalizable and has nonnegative real

eigenvalues that sum up to 1.
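For completeness, here is one way the three properties can be read off from the definition $\rho = \sum_i p_i |\phi_i\rangle\langle\phi_i|$. The derivation is a sketch that assumes only that the $p_i$ are real probabilities summing to 1 and that each $|\phi_i\rangle$ is a unit vector.

```latex
\begin{align*}
\rho^\dagger
  &= \sum_i p_i \bigl(|\phi_i\rangle\langle\phi_i|\bigr)^\dagger
   = \sum_i p_i\, |\phi_i\rangle\langle\phi_i| = \rho, \\
\mathrm{Tr}(\rho)
  &= \sum_i p_i\, \mathrm{Tr}\bigl(|\phi_i\rangle\langle\phi_i|\bigr)
   = \sum_i p_i\, \langle\phi_i|\phi_i\rangle
   = \sum_i p_i = 1, \\
\langle\psi|\rho|\psi\rangle
  &= \sum_i p_i\, \langle\psi|\phi_i\rangle\langle\phi_i|\psi\rangle
   = \sum_i p_i\, \bigl|\langle\psi|\phi_i\rangle\bigr|^2 \;\geq\; 0 .
\end{align*}
```

Hermiticity gives a unitary diagonalization with real eigenvalues, positive semidefiniteness makes those eigenvalues nonnegative, and the unit trace makes them sum to 1.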

Recall that the amount of randomness (or the uncertainty) in a classical proba-

bility distribution may be quantiﬁed by its Shannon entropy. Doing the same for

a mixed state is tricky because all mixed states consistent with a given density

matrix are physically indistinguishable, and therefore contain the same amount of

“entropy.” Before we do this, we recall the classical deﬁnitions.

2.3. CLASSICAL ENTROPY AND MUTUAL INFORMATION. The Shannon entropy

$S(X)$ of a classical random variable $X$ that takes values $x$ in some finite

set with probability $p_x$ is defined as

$$S(X) = -\sum_x p_x \log p_x.$$

The mutual information I(X : Y) of a pair of random variables X, Y is deﬁned by

$$I(X : Y) = S(X) + S(Y) - S(XY),$$

where XY denotes the joint random variable with marginals X and Y. It quantiﬁes

the amount of correlation between the random variables X and Y.

Fano’s inequality asserts that if Y can predict X well, then X and Y have large

mutual information. We use a simple form of Fano’s inequality, referring only to

Boolean variables X and Y.

FACT 2.1 (FANO'S INEQUALITY). Let $X$ be a uniformly distributed boolean random

variable, and let $Y$ be a boolean random variable such that $\Pr(X = Y) = p$.

Then $I(X : Y) \geq 1 - H(p)$.
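A quick numerical check of Fact 2.1 is sketched below, assuming base-2 logarithms. It computes $I(X : Y)$ from a joint distribution for two hypothetical choices of $Y$ that both agree with a uniform $X$ with probability $p = 0.85$: one where the errors are symmetric, and one where they are not.

```python
import math


def shannon_entropy(probs) -> float:
    """S = -sum_x p_x log2 p_x, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint) -> float:
    """I(X:Y) = S(X) + S(Y) - S(XY) for a joint distribution joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    pxy = [p for row in joint for p in row]
    return shannon_entropy(px) + shannon_entropy(py) - shannon_entropy(pxy)


p = 0.85
bound = 1 - shannon_entropy([p, 1 - p])

# Symmetric case: Y disagrees with X with probability 1 - p for either X.
symmetric = [[p / 2, (1 - p) / 2], [(1 - p) / 2, p / 2]]
info_symmetric = mutual_information(symmetric)

# Asymmetric case: Y is always right when X = 0 and right with
# probability 0.7 when X = 1, so Pr(X = Y) is still 0.85.
asymmetric = [[0.5, 0.0], [0.15, 0.35]]
info_asymmetric = mutual_information(asymmetric)

print(f"bound = {bound:.4f}")
print(f"symmetric I = {info_symmetric:.4f}, asymmetric I = {info_asymmetric:.4f}")
```

In the symmetric case the inequality holds with equality, while the asymmetric case gives strictly more mutual information than the bound requires.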

For other properties of these concepts we refer the reader to a standard text (such

as Cover and Thomas [1991]) on information theory.

2.4. VON NEUMANN ENTROPY. Consider the mixed state $X = \{p_i, |\phi_i\rangle\}$, where

the superposition $|\phi_i\rangle$ occurs with probability $p_i$. Since the constituent states $|\phi_i\rangle$

of the mixture are not perfectly distinguishable in general, we cannot define the

entropy of this mixture to be the Shannon entropy of $\{p_i\}$. Another way to see this

is that this mixture is equivalent to any other mixture with the same density matrix,

and so should have the same entropy as that mixture. Indeed, a special such equiv-

alent mixture can be obtained by diagonalizing the density matrix—the constituent

states of this mixture are orthogonal, and therefore perfectly distinguishable. Now,

the entropy of the density matrix can be deﬁned to be the Shannon entropy of

these probabilities.
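The recipe just described (diagonalize, then take the Shannon entropy of the eigenvalues) can be sketched as follows. The example assumes base-2 logarithms and uses NumPy's `eigvalsh`, which is intended for Hermitian matrices; the mixture chosen is purely illustrative.

```python
import numpy as np


def von_neumann_entropy(rho: np.ndarray) -> float:
    """Shannon entropy (base 2) of the eigenvalues of a density matrix."""
    eigenvalues = np.linalg.eigvalsh(rho)  # real, since rho is Hermitian
    eigenvalues = eigenvalues[eigenvalues > 1e-12]  # treat 0 log 0 as 0
    return float(-np.sum(eigenvalues * np.log2(eigenvalues)))


ket0 = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)

# Equal mixture of the non-orthogonal states |0> and |+>.
rho = 0.5 * np.outer(ket0, ket0) + 0.5 * np.outer(plus, plus)
entropy = von_neumann_entropy(rho)
print(f"S(rho) = {entropy:.4f} bits, versus 1 bit for the weights (1/2, 1/2)")
```

Because $|0\rangle$ and $|+\rangle$ are not orthogonal, the entropy of the density matrix comes out strictly below the 1 bit that the Shannon entropy of the mixing weights would suggest.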

To formalize this, recall that every density matrix ρ is unitarily diagonalizable:

$$\rho = \sum_j \lambda_j |\psi_j\rangle\langle\psi_j|,$$


References

Cover and Thomas [1991]. Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing).

Nielsen and Chuang [2000]. Quantum Computation and Quantum Information.

General properties of entropy.
