Least squares quantization in PCM

S. P. Lloyd
01 Mar 1982
IEEE Transactions on Information Theory, Vol. 28, Iss. 2, pp. 129-137
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-28, NO. 2, MARCH 1982
Least Squares Quantization in PCM
STUART P. LLOYD
Abstract
It has long been realized that in pulse-code modulation (PCM), with a given ensemble of signals to handle, the quantum values should be spaced more closely in the voltage regions where the signal amplitude is more likely to fall. It has been shown by Panter and Dite that, in the limit as the number of quanta becomes infinite, the asymptotic fractional density of quanta per unit voltage should vary as the one-third power of the probability density per unit voltage of signal amplitudes. In this paper the corresponding result for any finite number of quanta is derived; that is, necessary conditions are found that the quanta and associated quantization intervals of an optimum finite quantization scheme must satisfy. The optimization criterion used is that the average quantization noise power be a minimum. It is shown that the result obtained here goes over into the Panter and Dite result as the number of quanta becomes large. The optimum quantization schemes for $2^b$ quanta, $b = 1, 2, \cdots, 7$, are given numerically for Gaussian and for Laplacian distributions of signal amplitudes.
I. INTRODUCTION
THE BASIC IDEAS in the pulse-code modulation (PCM) system [1], [2, ch. 19] are the Shannon-Nyquist sampling theorem and the notion of quantizing the sample values.
The sampling theorem asserts that a signal voltage $s(t)$, $-\infty < t < \infty$, containing only frequencies less than $W$ cycles/s can be recovered from a sequence of its sample values according to
$$s(t) = \sum_{j=-\infty}^{\infty} s(t_j) K(t - t_j), \qquad -\infty < t < \infty, \tag{1}$$
where $s(t_j)$ is the value of $s$ at the $j$th sampling instant
$$t_j = \frac{j}{2W}, \qquad -\infty < j < \infty,$$
and where
$$K(t) = \frac{\sin 2\pi W t}{2\pi W t}, \qquad -\infty < t < \infty, \tag{2}$$
is a $(\sin t)/t$ pulse of the appropriate width.
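Equations (1) and (2) can be checked numerically. The following sketch (not part of the paper; the test signal, bandwidth, and window length are illustrative assumptions) reconstructs a band-limited cosine from its samples by summing sinc pulses over a finite window:

```python
import numpy as np

def K(t, W):
    """The interpolation pulse of eq. (2): sin(2*pi*W*t) / (2*pi*W*t)."""
    return np.sinc(2 * W * t)  # numpy's sinc(x) is sin(pi*x)/(pi*x)

def reconstruct(samples, t_j, t, W):
    """Finite-window version of the sum in eq. (1)."""
    return sum(s * K(t - tj, W) for s, tj in zip(samples, t_j))

# A 3 Hz cosine is band-limited below W = 5 cycles/s; sample at t_j = j/(2W).
W = 5.0
t_j = np.arange(-200, 201) / (2 * W)
samples = np.cos(2 * np.pi * 3.0 * t_j)

t = np.linspace(-1.0, 1.0, 101)
r = reconstruct(samples, t_j, t, W)
err = np.max(np.abs(r - np.cos(2 * np.pi * 3.0 * t)))  # small truncation error
```

The reconstruction is exact at the sample instants themselves; away from them the only error comes from truncating the infinite sum to a finite window.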
The pulse-amplitude modulation (PAM) system [2, ch. 16] is based on the sampling theorem alone. One sends over the system channel, instead of the signal values $s(t)$ for all times $t$, only a sequence
$$\cdots, s(t_{-1}), s(t_0), s(t_1), \cdots \tag{3}$$
of samples of the signal. The (idealized) receiver constructs the pulses $K(t - t_j)$ and adds them together with the received amplitudes $s(t_j)$, as in (1), to produce an exact reproduction of the original band-limited signal $s$.

Manuscript received May 1, 1981. The material in this paper was presented in part at the Institute of Mathematical Statistics Meeting, Atlantic City, NJ, September 10-13, 1957.
The author is with Bell Laboratories, Whippany Road, Whippany, NJ 07981.
PCM is a modification of this. Instead of sending the
exact sample values (3), one partitions the voltage range of
the signal into a finite number of subsets and transmits to
the receiver only the information as to which subset a
sample happens to fall in. Built into the receiver there is a
source of fixed representative voltages ("quanta"), one
for each of the subsets. When the receiver is informed that
a certain sample fell in a certain subset, it uses its quantum
for that subset as an approximation to the true sample
value and constructs a band-limited signal based on these
approximate sample values.
We define the noise signal as the difference between the
receiver-output signal and the original signal and the noise
power as the average square of the noise signal. The prob-
lem we consider is the following: given the number of
quanta and certain statistical properties of the signal, de-
termine the subsets and quanta that are best in minimizing
the noise power.
II. QUANTIZATION
Let us formulate the quantization process more explicitly. A quantization scheme consists of a class of sets $\{Q_1, Q_2, \cdots, Q_\nu\}$ and a set of quanta $\{q_1, q_2, \cdots, q_\nu\}$. The $\{Q_\alpha\}$ are any $\nu$ disjoint subsets of the voltage axis which, taken together, cover the entire voltage axis. The $\{q_\alpha\}$ are any $\nu$ finite voltage values. The number $\nu$ of quanta is to be regarded throughout as a fixed finite preassigned number.

We associate with a partition $\{Q_\alpha\}$ a label function $a(x)$, $-\infty < x < \infty$, defined for all (real) voltages $x$ by
$$a(x) = 1 \ \text{if $x$ lies in } Q_1,$$
$$a(x) = 2 \ \text{if $x$ lies in } Q_2,$$
$$\cdots \tag{4}$$
$$a(x) = \nu \ \text{if $x$ lies in } Q_\nu.$$
If $s(t_j)$ is the $j$th sample of the signal $s$, as in Section I, then we denote by $a_j$ the label of the set that this sample falls in:
$$a_j = a(s(t_j)), \qquad -\infty < j < \infty.$$
In PCM the signal sent over the channel is (in some code or another) the sequence of labels
$$\cdots, a_{-1}, a_0, a_1, \cdots, \tag{5}$$
each $a_j$ being one of the integers $\{1, 2, \cdots, \nu\}$. The technology of this transmission does not concern us, except that

we assume that such a sequence can be delivered to the receiver without error.

The receiver uses the fixed voltage $q_\alpha$ as an approximation to all sample voltages in $Q_\alpha$, $\alpha = 1, 2, \cdots, \nu$. That is, the receiver, being given the value of $a_j$ in the sequence (5), proceeds as if the $j$th sample of $s$ had value $q_{a_j}$ and produces the receiver-output signal
$$r(t) = \sum_{j=-\infty}^{\infty} q_{a_j} K(t - t_j), \qquad -\infty < t < \infty.$$
To put it another way, the system mutilates an actual sample voltage value $x$ to the quantized value $y(x)$ given by
$$y(x) = q_{a(x)}, \qquad -\infty < x < \infty, \tag{6}$$
and we may express the receiver output in terms of this as
$$r(t) = \sum_{j=-\infty}^{\infty} y(s(t_j)) K(t - t_j), \qquad -\infty < t < \infty. \tag{7}$$
Hence the noise signal, defined as
$$n(t) = r(t) - s(t), \qquad -\infty < t < \infty,$$
is given by
$$n(t) = \sum_{j=-\infty}^{\infty} z(s(t_j)) K(t - t_j), \qquad -\infty < t < \infty, \tag{8}$$
where
$$z(x) = y(x) - x, \qquad -\infty < x < \infty, \tag{9}$$
may be regarded as the quantization error added to a sample which has voltage value $x$.
Note that we assume that the receiver uses the nonrealizable pulses (2). If other pulses are used (e.g., step functions or other realizable pulses) there will be sampling noise, in general, even without quantization [3]. Our noise (8) is due strictly to quantization.

Finally we must emphasize that we assume that the $\{Q_\alpha\}$ and $\{q_\alpha\}$ are constant in time. In delta modulation and its refinements the $\{Q_\alpha\}$ and $\{q_\alpha\}$ change from sampling instant to sampling instant, depending on the past behavior of the signal being handled. Such systems are very difficult to treat theoretically.
III. NOISE POWER
Instead of working with a particular band-limited signal, we assume that there is given a probabilistic family of such signals. That is, the $s$ of the preceding sections and hence the various signals derived from it are to be regarded as stochastic processes [4]. We denote the underlying probability measure by $P\{\cdot\}$ and averages with respect to this measure (expectations) by $E\{\cdot\}$.

We use the following results of the probabilistic treatment. We assume that the $s$ process is stationary, so that the cumulative probability distribution function of a sample,
$$F(x) = P\{s(t) \le x\}, \qquad -\infty < x < \infty,$$
is independent of $t$, $-\infty < t < \infty$, as indicated by the notation. Then the average power of the $s$ process, assumed to be finite, is constant in time:
$$S = E\{s^2(t)\} = \int_{-\infty}^{\infty} x^2 \, dF(x), \qquad -\infty < t < \infty. \tag{10}$$
Moreover, the $r$ and $n$ processes have this same property; the average receiver-output power $R$ is given by
$$R = E\{r^2(t)\} = \int_{-\infty}^{\infty} y^2(x) \, dF(x), \qquad -\infty < t < \infty, \tag{11}$$
where $y(x)$ is defined in (6), and the noise power $N$ is
$$N = E\{n^2(t)\} = \int_{-\infty}^{\infty} z^2(x) \, dF(x), \qquad -\infty < t < \infty, \tag{12}$$
with $z(x)$ as in (9). (Detailed proofs of these statements, together with further assumptions used, are given in Appendix A.) The stochastic process problem is thus reduced to a problem in a single real variable: choose the $\{Q_\alpha\}$ and $\{q_\alpha\}$ so that the rightmost integral in (12) is as small as possible.
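The reduction to a single real variable makes the noise easy to evaluate numerically for any trial scheme. The sketch below (illustrative, not from the paper) approximates the rightmost integral in (12) on a discretized voltage axis, for a standard Gaussian amplitude distribution and a symmetric two-level scheme:

```python
import numpy as np

def noise_power(endpoints, quanta, x, mass):
    """N = integral of z^2(x) dF(x), eq. (12), with dF discretized as point masses.

    endpoints: interior interval endpoints x_1 < ... < x_{v-1}, as in (16);
    quanta:    representative values q_1 < ... < q_v.
    """
    labels = np.searchsorted(endpoints, x)   # which Q_a each grid point falls in
    z = quanta[labels] - x                   # quantization error z(x), eq. (9)
    return np.sum(z**2 * mass)

# Standard Gaussian dF(x), discretized on a fine grid.
x = np.linspace(-8.0, 8.0, 200001)
dx = x[1] - x[0]
mass = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi) * dx

# Two quanta placed symmetrically at the conditional means +/- sqrt(2/pi).
q = np.sqrt(2 / np.pi)
N = noise_power(np.array([0.0]), np.array([-q, q]), x, mass)
# N is close to 1 - 2/pi = 0.3634, the v = 2 noise figure of the Gaussian tables.
```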
IV. THE BEST QUANTA
We consider first the problem of minimizing $N$ with respect to the quanta $\{q_\alpha\}$ when the $\{Q_\alpha\}$ are fixed preassigned sets.

The $dF$ integral in (12) may be written more explicitly as
$$N = \sum_{\alpha=1}^{\nu} \int_{Q_\alpha} (q_\alpha - x)^2 \, dF(x). \tag{13}$$
(The sets $\{Q_\alpha\}$ must be measurable $[dF]$ if (11)-(13) are to have meaning, and we assume always that this is the case.) If we regard the given $F$ as describing the distribution of unit probability mass on the voltage axis [5, p. 57], then (13) expresses $N$ as the total moment of inertia of the sets $\{Q_\alpha\}$ around the respective points $\{q_\alpha\}$. It is a classical result that such a moment assumes its minimum value when each $q_\alpha$ is the center of mass of the corresponding $Q_\alpha$ (see, e.g., [5, p. 175]). That is,
$$q_\alpha = \frac{\int_{Q_\alpha} x \, dF(x)}{\int_{Q_\alpha} dF(x)}, \qquad \alpha = 1, 2, \cdots, \nu, \tag{14}$$
are the uniquely determined best quanta to use with a given partition $\{Q_\alpha\}$.
To avoid the continual mention of trivial cases we assume always that $F$ increases at no fewer than $\nu + 1$ points, so that the quantization noise does not vanish. Then none of the denominators in (14) will vanish, at least in an

optimum scheme. For if $Q_\alpha$ has vanishing mass it can be combined with some set $Q_\beta$ of nonvanishing mass (discarding $q_\alpha$) to give a scheme with $\nu - 1$ quanta and the same noise. Then one of the sets of this scheme can be divided into two sets and new quanta assigned to give a scheme with $\nu$ quanta and noise less than in the original scheme. (We omit the details.)
If the expression on the right in (14) is substituted for $q_\alpha$ in (13), there results
$$N = S - \sum_{\alpha=1}^{\nu} q_\alpha^2 \int_{Q_\alpha} dF(x),$$
where the $\{q_\alpha\}$ here are the optimum ones of (14). The sum on the right is the receiver-output power $R$ from (11). Hence when the $\{q_\alpha\}$ are centers of mass of the $\{Q_\alpha\}$, optimum or not, then $S = R + N$, which implies that the noise is orthogonal to the receiver output. One expects this in a least squares approximation, of course.
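The identity $S = R + N$ is easy to verify numerically for centroid quanta, even on a deliberately non-optimal partition. A small sketch (assumptions: standard Gaussian amplitudes, an arbitrary three-set partition, grid discretization of $dF$):

```python
import numpy as np

# Discretized standard Gaussian dF(x); S = integral of x^2 dF(x) is 1.
x = np.linspace(-8.0, 8.0, 400001)
mass = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi) * (x[1] - x[0])

# An arbitrary, deliberately non-optimal partition into v = 3 intervals.
endpoints = np.array([-0.5, 1.0])
labels = np.searchsorted(endpoints, x)

# Eq. (14): each quantum is the center of mass of its set Q_a.
quanta = np.array([np.sum((x * mass)[labels == a]) / np.sum(mass[labels == a])
                   for a in range(3)])

y = quanta[labels]                  # quantized value y(x), eq. (6)
S = np.sum(x**2 * mass)             # signal power, eq. (10)
R = np.sum(y**2 * mass)             # receiver-output power, eq. (11)
N = np.sum((y - x)**2 * mass)       # noise power, eq. (12)
# S equals R + N up to rounding, even though the partition is not optimal.
```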
V. THE BEST PARTITION
Now we find the best sets $\{Q_\alpha\}$ to use with a fixed preassigned set of quanta $\{q_\alpha\}$. The considerations of this section are independent of those of the preceding section. In particular, the best $\{Q_\alpha\}$ for given $\{q_\alpha\}$ may not have the $\{q_\alpha\}$ as their centers of mass.

We assume that the given $\{q_\alpha\}$ are distinct, since it will never happen in an optimum scheme that $q_\alpha = q_\beta$ for some $\alpha \ne \beta$. For if $q_\alpha = q_\beta$, then $Q_\alpha$ and $Q_\beta$ are effectively one set $Q_\alpha \cup Q_\beta$ as far as the noise (13) is concerned, and this set can be redivided into two sets and these two sets can be given distinct quantum values in such a way as to reduce the noise. (We omit the details.)
Consider the probability mass in a small interval around voltage value $x$. According to (13) any of this mass which is assigned to $q_\alpha$ (i.e., which lies in $Q_\alpha$) will contribute to the noise at rate $(q_\alpha - x)^2$ per unit mass. To minimize the noise, then, any mass in the neighborhood of $x$ should be assigned to a $q_\alpha$ for which $(q_\alpha - x)^2$ is the smallest of the numbers $(q_1 - x)^2, (q_2 - x)^2, \cdots, (q_\nu - x)^2$. In other words,¹
$$Q_\alpha \supset \{x : (q_\alpha - x)^2 \le (q_\beta - x)^2 \ \text{for all } \beta \ne \alpha\}, \qquad \alpha = 1, \cdots, \nu,$$
modulo sets of measure zero $[dF]$. This simplifies to
$$Q_\alpha \supset \{x : (q_\beta - q_\alpha)(x - \tfrac{1}{2}(q_\alpha + q_\beta)) \le 0 \ \text{for all } \beta \ne \alpha\}, \qquad \alpha = 1, 2, \cdots, \nu. \tag{15}$$
It is straightforward that the best $\{Q_\alpha\}$ are determined by (15) as the intervals whose endpoints bisect the segments between successive $\{q_\alpha\}$, except that the assignment of the endpoints is not determined. To make matters definite we let the $\{Q_\alpha\}$ be left-open and right-closed, so that the best

¹If $C(x)$ is a condition on $x$, then $\{x : C(x)\}$ denotes the set of all $x$ which satisfy $C(x)$.
partition to use with the given quanta is
$$Q_1 = \{x : -\infty < x \le x_1\}$$
$$Q_2 = \{x : x_1 < x \le x_2\}$$
$$\cdots \tag{16}$$
$$Q_{\nu-1} = \{x : x_{\nu-2} < x \le x_{\nu-1}\}$$
$$Q_\nu = \{x : x_{\nu-1} < x < \infty\},$$
where the endpoints $\{x_\alpha\}$ are given by
$$x_1 = \tfrac{1}{2}(q_1 + q_2)$$
$$x_2 = \tfrac{1}{2}(q_2 + q_3)$$
$$\cdots \tag{17}$$
$$x_{\nu-1} = \tfrac{1}{2}(q_{\nu-1} + q_\nu).$$
We have assumed, as we shall hereafter, that the indexing is such that $q_1 < q_2 < \cdots < q_\nu$.
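Conditions (16) and (17) amount to nearest-quantum assignment, which can be sketched as follows (an illustrative helper, not from the paper):

```python
import numpy as np

def best_endpoints(quanta):
    """Eq. (17): endpoints midway between adjacent (sorted, distinct) quanta."""
    q = np.sort(np.asarray(quanta, dtype=float))
    return (q[:-1] + q[1:]) / 2

def quantize(x, quanta):
    """Apply the best partition (16): left-open, right-closed intervals,
    so a value equal to an endpoint x_a goes to the left set Q_a."""
    q = np.sort(np.asarray(quanta, dtype=float))
    return q[np.searchsorted(best_endpoints(q), x)]

# With quanta {-1, 0, 1} the endpoints are {-0.5, 0.5}:
#   quantize(0.6, ...)  -> 1.0  (nearest quantum)
#   quantize(0.5, ...)  -> 0.0  (tie at an endpoint resolves leftward)
```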
VI. QUANTIZATION PROCEDURES
From Sections IV and V we know that we may confine our attention to quantization schemes defined by $2\nu - 1$ numbers
$$q_1 < x_1 < q_2 < x_2 < \cdots < q_{\nu-1} < x_{\nu-1} < q_\nu, \tag{18}$$
where the $\{x_\alpha\}$ are the endpoints of the intervals $\{Q_\alpha\}$, as in (16), and the $\{q_\alpha\}$ are the corresponding quanta. We will regard such a set of numbers as the Cartesian coordinates of a point $p$ in $(2\nu - 1)$-dimensional Euclidean space $E_{2\nu - 1}$. The noise as a function of $p$ has the form
$$N(p) = \int_{-\infty}^{x_1} (q_1 - x)^2 \, dF(x) + \int_{x_1}^{x_2} (q_2 - x)^2 \, dF(x) + \cdots + \int_{x_{\nu-1}}^{\infty} (q_\nu - x)^2 \, dF(x). \tag{19}$$
In an optimum scheme the $\{q_\alpha\}$ will be centers of mass of the corresponding $\{Q_\alpha\}$, (14), and the $\{x_\alpha\}$ will lie midway between adjacent $\{q_\alpha\}$, (17). From the derivations these conditions are sufficient that $N(p)$ be a minimum with respect to variations in each coordinate separately and hence are necessary conditions at a minimum of $N(p)$. As it turns out, however, they are not sufficient conditions for a minimum of $N(p)$. Points at which (14) and (17) are satisfied, which we term stationary points, while never local maxima, may be saddle points of $N(p)$. Moreover, among the stationary points there may be several local minima, only one of which is the sought absolute minimum of $N(p)$. These complications are discussed further in Appendix B. The author has not been able to determine sufficient conditions for an absolute minimum.
The derivations suggest one trial-and-error method for finding stationary points. A trial point $p^{(1)}$ in $E_{2\nu - 1}$ is

chosen as follows. The endpoints
$$-\infty < x_1^{(1)} < x_2^{(1)} < \cdots < x_{\nu-1}^{(1)} < \infty$$
are chosen arbitrarily except that each of the resulting $\{Q_\alpha^{(1)}\}$ should have nonvanishing mass. Then the centers of mass of these sets are taken as the first trial quanta $\{q_\alpha^{(1)}\}$.

These values will not satisfy the midpoint conditions (17), in general, so that the second trial point $p^{(2)}$ is taken to be
$$q_\alpha^{(2)} = q_\alpha^{(1)}, \qquad \alpha = 1, 2, \cdots, \nu,$$
$$x_\alpha^{(2)} = \tfrac{1}{2}(q_\alpha^{(2)} + q_{\alpha+1}^{(2)}), \qquad \alpha = 1, 2, \cdots, \nu - 1,$$
with appropriate modifications if any of the resulting $\{Q_\alpha^{(2)}\}$ have vanishing mass. This step does not increase the noise, in view of the discussion in Section V; that is, $N(p^{(2)}) \le N(p^{(1)})$.

The new $\{q_\alpha^{(2)}\}$, centers of mass (c.m.) of the old $\{Q_\alpha^{(1)}\}$, will not be centers of mass of the new $\{Q_\alpha^{(2)}\}$, in general; trial point $p^{(3)}$ is determined by
$$x_\alpha^{(3)} = x_\alpha^{(2)}, \qquad \alpha = 1, 2, \cdots, \nu - 1,$$
$$q_\alpha^{(3)} = \text{c.m. of } Q_\alpha^{(3)}, \qquad \alpha = 1, 2, \cdots, \nu.$$
For the resulting noise we have $N(p^{(3)}) \le N(p^{(2)})$.

We continue in this way, imposing conditions (14) and (17) alternately. There results a sequence of trial points
$$p^{(1)}, p^{(2)}, \cdots, \tag{20}$$
such that
$$N(p^{(1)}) \ge N(p^{(2)}) \ge \cdots.$$
The noise is nonnegative, so that $\lim_m N(p^{(m)})$ will exist, and we might hope that the sequence (20) had as a limit a local minimum of $N(p)$.

If the sequence (20) has no limit points then some of the $\{x_\alpha^{(m)}\}$ must become infinite with $m$; this corresponds to quantizing into fewer than $\nu$ quanta. Since we have assumed that $F$ increases at no fewer than $\nu + 1$ points, there will obviously be quantizing schemes with $\nu$ quanta for which the resulting noise is less than the optimum noise for $\nu - 1$ quanta. If $p^{(1)}$ is such a scheme then (20) will have limit points, using the property that $N(p^{(m)})$ is a decreasing sequence.²
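The alternation of (14) and (17) described above is an iterative descent, and a minimal numerical sketch is easy to write (a discretized $dF$ on a grid stands in for the distribution; the grid sizes and quantile initialization are illustrative choices, not from the paper):

```python
import numpy as np

def method_I(x, mass, v, n_iter=500):
    """Alternately impose the center-of-mass condition (14) and the
    midpoint condition (17), as in the trial-and-error method above.

    x, mass : a discretized approximation of dF(x);
    v       : number of quanta.
    Returns (quanta, endpoints) at the final trial point.
    """
    # Initial endpoints at quantiles, so every Q_a has nonvanishing mass.
    cdf = np.cumsum(mass) / np.sum(mass)
    ends = x[np.searchsorted(cdf, np.arange(1, v) / v)]
    for _ in range(n_iter):
        labels = np.searchsorted(ends, x)
        # (14): quanta <- centers of mass of the current sets Q_a.
        quanta = np.array([np.sum((x * mass)[labels == a]) /
                           np.sum(mass[labels == a]) for a in range(v)])
        # (17): endpoints <- midpoints of adjacent quanta.
        ends = (quanta[:-1] + quanta[1:]) / 2
    return quanta, ends

# Standard Gaussian, v = 4.
x = np.linspace(-8.0, 8.0, 200001)
mass = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
quanta, ends = method_I(x, mass, 4)
# quanta is approximately [-1.510, -0.453, 0.453, 1.510],
# ends   is approximately [-0.982, 0.0, 0.982].
```

Each sweep is exactly the pair of noise-nonincreasing steps of the text, so $N(p^{(m)})$ is nonincreasing along the iterates.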
Suppose $p^{(\infty)}$ is such a limit point. If each of the coordinate values $\{x_\alpha^{(\infty)}\}$ of $p^{(\infty)}$ is a continuity point of $F$ then it is easy to see that the coordinates of $p^{(\infty)}$ will satisfy both (14) and (17). In particular, if $N(p)$ has a unique stationary point $p_0$ (which is the minimum sought), then the sequence (20), unless it diverges, will converge to $p_0$.
Note, by the way, that at a local minimum of $N(p)$ the numbers $\{x_\alpha\}$ are necessarily continuity points of $F$. Suppose to the contrary that there is a nonvanishing amount of mass concentrated at one of the endpoints $\{x_\alpha\}$, and that the adjacent sets $Q_\alpha$ and $Q_{\alpha+1}$ are as in (16), so that the mass at $x_\alpha$ belongs to $Q_\alpha$. The centers of mass $q_\alpha$ and $q_{\alpha+1}$ will lie equidistant from $x_\alpha$ (17), and from (19) the noise will not change if we reassign the mass at $x_\alpha$ to $Q_{\alpha+1}$, retaining the given $\{q_\alpha\}$ as quanta. But $q_\alpha$ and $q_{\alpha+1}$ are definitely not centers of mass of the corresponding modified sets, and the noise will strictly decrease as $q_\alpha$ and $q_{\alpha+1}$ are moved to the new centers of mass. Thus the given configuration is not a local minimum, contrary to assumption. From this result and (19) we see that $N(p)$ is continuous in a neighborhood of a local minimum. We have proved also that there is no essential loss of generality in assuming the form (16) for the $\{Q_\alpha\}$.

²It seems likely that this condition, $N(p^{(1)}) \le$ (optimum noise for $\nu - 1$ quanta), is stronger than necessary for the nondivergence of (20).
We refer to the above trial-and-error method as Method I. Another trial-and-error method is the following one, Method II. To simplify the discussion we assume for the moment that $F$ is continuous and nowhere constant. We choose a trial value $q_1$ satisfying
$$q_1 < \int_{-\infty}^{\infty} x \, dF(x).$$
The condition that $q_1$ be the center of mass of $Q_1$ determines $x_1$ as the unique solution of
$$q_1 = \frac{\int_{-\infty}^{x_1} x \, dF(x)}{\int_{-\infty}^{x_1} dF(x)}.$$
The quantities $q_1$ and $x_1$ now being known, the first of conditions (17) determines $q_2$ as
$$q_2 = 2x_1 - q_1.$$
If this $q_2$ lies to the right of the center of mass of the interval $(x_1, \infty)$ then the trial chain terminates, and we start over again with a different trial value $q_1$. Otherwise, $x_1$ and $q_2$ being known, the second of conditions (14) serves to determine $x_2$ uniquely. Now the second of conditions (17) gives
$$q_3 = 2x_2 - q_2.$$
We continue in this way, obtaining successively $q_1, x_1, \cdots, q_{\nu-1}, x_{\nu-1}, q_\nu$; the last step is the determination of $q_\nu$ according to
$$q_\nu = 2x_{\nu-1} - q_{\nu-1}. \tag{21}$$
However, in this procedure we have not used the last of conditions (14):
$$q_\nu = \frac{\int_{x_{\nu-1}}^{\infty} x \, dF(x)}{\int_{x_{\nu-1}}^{\infty} dF(x)}, \tag{22}$$
and the $q_\nu$ obtained from (21) will not satisfy (22) in general. The discrepancy between the right members of (21) and (22) will vary continuously with the starting value $q_1$, and the method consists of running through such chains

using various starting values until the discrepancy is reduced to zero.

This method is applicable to more general $F$, with some obvious modifications. When $F$ has intervals of constancy the $\{x_\alpha\}$ may not be uniquely determined by conditions (14), and a trial chain may involve several arbitrary parameters besides $q_1$. Discontinuities of $F$ will cause no real trouble; since we know that the $\{x_\alpha\}$ of an optimum scheme are continuity points of $F$, a trial chain that does not have this property is discarded. We note that Method II may be used to locate all stationary points of $N(p)$.
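A numerical sketch of Method II follows (discretized $dF$, with a simple bisection on the starting value $q_1$ driving the final discrepancy to zero; all of this scaffolding is illustrative, not from the paper):

```python
import numpy as np

def chain_discrepancy(q1, x, mass, v):
    """Run one Method II chain from the trial value q1.

    Returns q_v from (21) minus the center of mass (22) of the last
    interval, or None if the chain cannot be continued.
    """
    cdf = np.cumsum(mass)
    cx = np.cumsum(x * mass)
    q, lo = q1, -1                      # lo: grid index of the previous endpoint
    for _ in range(v - 1):
        base_m = cdf[lo] if lo >= 0 else 0.0
        base_x = cx[lo] if lo >= 0 else 0.0
        # Center-of-mass condition (14): find the first endpoint x_a such that
        # the mean of dF over (x_{a-1}, x_a] reaches q_a (this mean increases
        # as the candidate endpoint moves right).
        means = (cx[lo + 1:] - base_x) / (cdf[lo + 1:] - base_m)
        hit = np.nonzero(means >= q)[0]
        if len(hit) == 0:
            return None                 # chain terminates; try another q1
        lo = lo + 1 + hit[0]
        q = 2 * x[lo] - q               # midpoint condition (17): q_{a+1} = 2 x_a - q_a
    tail_mean = (cx[-1] - cx[lo]) / (cdf[-1] - cdf[lo])
    return q - tail_mean                # right member of (21) minus that of (22)

# Standard Gaussian, v = 2: bisect on q1 until the discrepancy vanishes.
x = np.linspace(-8.0, 8.0, 200001)
mass = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi) * (x[1] - x[0])
lo_q, hi_q = -1.5, -0.2                 # discrepancy is negative / positive here
for _ in range(60):
    mid = 0.5 * (lo_q + hi_q)
    if chain_discrepancy(mid, x, mass, 2) < 0:
        lo_q = mid
    else:
        hi_q = mid
# lo_q converges to q_1 of the optimum two-level Gaussian scheme, about -0.798.
```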
VII. EXAMPLES
In all of the examples we now consider, the distribution of sample values is absolutely continuous with a sample probability density $f = F'$, which is an even function. If $N(p)$ has a unique stationary point, which we assume to be the case in the examples treated, then the optimum $\{q_\alpha\}$ and $\{x_\alpha\}$ will clearly be symmetrically distributed around the origin. In applications we are usually interested in having an even number of quanta, $\nu = 2\mu$, so we renumber the positive endpoints and quanta according to
$$0 = x_0 < q_1 < x_1 < \cdots < q_{\mu-1} < x_{\mu-1} < q_\mu; \tag{23}$$
the endpoints and quanta for the negative half-axis are the negatives of these.

We normalize to unit signal power, $S = 1$. The $\{q_\alpha\}$ and $\{x_\alpha\}$ for other values of $S$ are to be obtained by multiplying the numbers in the tables by $\sqrt{S}$.

The simplest case is the uniform distribution,
$$f(x) = \frac{1}{2\sqrt{3}}, \qquad -\sqrt{3} \le x \le \sqrt{3}.$$
Method II of the preceding section shows that $N(p)$ in this case has a unique stationary point, which is necessarily an absolute minimum. The optimum scheme is the usual one with $\nu$ equal intervals of width $2\sqrt{3}/\nu$ each, the quanta being the midpoints of these intervals. The minimum value of the noise is the familiar $N = 1/\nu^2$.
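The uniform case can be checked directly (the grid and the choice $\nu = 8$ below are illustrative):

```python
import numpy as np

# Unit-power uniform density: f(x) = 1/(2*sqrt(3)) on [-sqrt(3), sqrt(3)].
v = 8
a = np.sqrt(3.0)
edges = np.linspace(-a, a, v + 1)            # v equal intervals of width 2*sqrt(3)/v
quanta = (edges[:-1] + edges[1:]) / 2        # quanta at the midpoints

# Evaluate N by discretizing the integral in (12).
x = np.linspace(-a, a, 600001)
dx = x[1] - x[0]
labels = np.clip(np.searchsorted(edges[1:-1], x), 0, v - 1)
z = quanta[labels] - x                       # quantization error z(x)
N = np.sum(z**2 / (2 * a)) * dx
# N agrees with 1/v**2: each interval of width w contributes w**2/12
# times its probability mass, and here w = 2*sqrt(3)/v.
```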
Another case of possible interest is the Gaussian,
$$f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \qquad -\infty < x < \infty.$$

TABLE I
GAUSSIAN, ν = 2

α    q_α      x_α
1    0.7979   ∞

TABLE II
GAUSSIAN, ν = 4

α    q_α      x_α
1    0.4528   0.9816
2    1.5104   ∞

TABLE III
GAUSSIAN, ν = 8

α    q_α      x_α
1    0.2451   0.5006
2    0.7560   1.0500
3    1.3439   1.7480
4    2.1520   ∞

TABLE IV
GAUSSIAN, ν = 16

α    q_α      x_α
1    0.1284   0.2582
2    0.3880   0.5224
3    0.6568   0.7996
4    0.9423   1.0993
5    1.2562   1.4371
6    1.6181   1.8435
7    2.0690   2.4008
8    2.7326   ∞

TABLE VIII
GAUSSIAN: OPTIMUM NOISE FOR VARIOUS VALUES OF ν

ν      N            ν²N      νx₁
2      0.3634       1.452
4      0.1175       1.880    3.93
8      3.455×10⁻²   2.205    4.00
16     9.500×10⁻³   2.430    4.13
32
64
128
(∞)    (0)          (2.72)   (4.34)
The optimum schemes for $\nu = 2^b$, $b = 1, 2, \cdots, 7$, are given in Tables I-VII,³ respectively. The corresponding noise values appear in Table VIII together with the quantities $\nu^2 N$ and $\nu x_1$. The behavior of these latter with increasing $\nu$ hints at the existence of asymptotic properties; we examine this question in the next section.

For speech signals a distribution which has been found useful empirically is the Laplacian,⁴
$$f(x) = \frac{1}{\sqrt{2}} e^{-\sqrt{2}|x|}, \qquad -\infty < x < \infty.$$
The optimum quantizing schemes for this distribution for $\nu = 2^b$, $b = 1, 2, \cdots, 7$, are given in Tables IX-XV, respectively. The corresponding $N$, $\nu^2 N$, and $\nu x_1$ values are given in Table XVI; again, we notice certain regularities.

VIII. ASYMPTOTIC PROPERTIES

Let us assume that the distribution $F$ is absolutely continuous with density function $f = F'$, which is itself dif-

³Since some of the tables were never completed, those tables, although mentioned in the text, are not included in this paper.
⁴The author is indebted to V. Vyssotsky of the Acoustics Research Group for this information (private communication).

References

- "The Philosophy of PCM."
- P. F. Panter and W. Dite, "Quantization Distortion in Pulse-Count Modulation with Nonuniform Spacing of Levels."
- "Instantaneous Companding of Quantized Signals."