Open Access Journal Article

Down-scaling for better transform compression

TLDR
It is shown how down-sampling an image to a low resolution, then using JPEG at the lower resolution, and subsequently interpolating the result to the original resolution can improve the overall PSNR performance of the compression process.
Abstract
The most popular lossy image compression method used on the Internet is the JPEG standard. JPEG's good compression performance and low computational and memory complexity make it an attractive method for natural image compression. Nevertheless, as we go to low bit rates that imply lower quality, JPEG introduces disturbing artifacts. It is known that, at low bit rates, a down-sampled image, when JPEG compressed, visually beats the high resolution image compressed via JPEG to be represented by the same number of bits. Motivated by this idea, we show how down-sampling an image to a low resolution, then using JPEG at the lower resolution, and subsequently interpolating the result to the original resolution can improve the overall PSNR performance of the compression process. We give an analytical model and a numerical analysis of the down-sampling, compression and up-sampling process, that makes explicit the possible quality/compression trade-offs. We show that the image auto-correlation can provide a good estimate for establishing the down-sampling factor that achieves optimal performance. Given a specific budget of bits, we determine the down-sampling factor necessary to get the best possible recovered image in terms of PSNR.


Down-Scaling for Better Transform Compression
Alfred M. Bruckstein, Michael Elad, and Ron Kimmel
Abstract—The most popular lossy image compression method
used on the Internet is the JPEG standard. JPEG’s good compres-
sion performance and low computational and memory complexity
make it an attractive method for natural image compression.
Nevertheless, as we go to low bit rates that imply lower quality,
JPEG introduces disturbing artifacts. It is known that at low bit
rates a down-sampled image when JPEG compressed visually
beats the high resolution image compressed via JPEG to be
represented with the same number of bits. Motivated by this
idea, we show how down-sampling an image to a low resolution,
then using JPEG at the lower resolution, and subsequently
interpolating the result to the original resolution can improve the
overall PSNR performance of the compression process. We give an
analytical model and a numerical analysis of the down-sampling,
compression and up-sampling process, that makes explicit the
possible quality/compression trade-offs. We show that the image
auto-correlation can provide a good estimate for establishing the
down-sampling factor that achieves optimal performance. Given
a specific budget of bits, we determine the down-sampling factor
necessary to get the best possible recovered image in terms of
PSNR.
Index Terms—Bit allocation, image down-sampling, JPEG com-
pression, quantization.
I. INTRODUCTION
THE most popular lossy image compression method used
on the Internet is the JPEG standard [1]. Fig. 1 presents a
basic block diagram of the JPEG encoder. JPEG uses the Dis-
crete Cosine Transform (DCT) on image blocks of size 8 × 8
pixels. The fact that JPEG operates on small blocks is moti-
vated by both computational/memory considerations and the
need to account for the nonstationarity of the image. A quality
measure determines the (uniform) quantization steps for each
of the 64 DCT coefficients. The quantized coefficients of each
block are then zigzag-scanned into one vector that goes through
a run-length coding of the zero sequences, thereby clustering
long runs of insignificant, low-energy coefficients into short and
compact descriptors. Finally, the run-length sequence is fed to an
entropy coder, which can be a Huffman coding algorithm with either
a known dictionary or a dictionary extracted from the specific
statistics of the given image. A different alternative supported
by the standard is arithmetic coding.
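
As a rough illustration of the encoder path just described, the Python sketch below performs the block DCT, uniform quantization, and zigzag scan. The flat quantization table and the helper names are our own simplifications, not part of the JPEG standard, and the entropy-coding stage is omitted.

    # Minimal sketch of a JPEG-style transform step: 8x8 block DCT,
    # uniform quantization, and zigzag scanning (entropy coding omitted).
    # The flat quantization table Q is illustrative, not the standard table.
    import numpy as np
    from scipy.fft import dctn, idctn

    def zigzag_indices(n=8):
        """Return (row, col) pairs of an n x n block in zigzag order."""
        return sorted(((r, c) for r in range(n) for c in range(n)),
                      key=lambda rc: (rc[0] + rc[1],
                                      rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

    def encode_block(block, Q):
        """DCT-transform and quantize one 8x8 block; return zigzag-ordered levels."""
        coeffs = dctn(block.astype(float) - 128.0, norm='ortho')  # level-shifted 2-D DCT
        levels = np.round(coeffs / Q).astype(int)                 # uniform quantization
        return [levels[r, c] for r, c in zigzag_indices()]

    def decode_block(zz, Q):
        """Undo the zigzag scan, dequantize, and apply the inverse DCT."""
        levels = np.zeros((8, 8))
        for val, (r, c) in zip(zz, zigzag_indices()):
            levels[r, c] = val
        return idctn(levels * Q, norm='ortho') + 128.0

    # Coarser quantization steps give lower bit rates (and larger errors).
    Q = np.full((8, 8), 16.0)
    block = np.random.randint(0, 256, (8, 8))
    recon = decode_block(encode_block(block, Q), Q)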
Manuscript received May 29, 2001; revised April 22, 2003. The associate ed-
itor coordinating the review of this manuscript and approving it for publication
was Prof. Trac D. Tran.
A. M. Bruckstein and R. Kimmel are with the Computer Science Department,
The Technion—Israel Institute of Technology, Haifa 32000, Israel (e-mail:
freddy@cs.technion.ac.il; ron@cs.technion.ac.il).
M. Elad is with the Computer Science Department—SCCM Program, Stan-
ford University, Stanford, CA 94305 USA (e-mail: elad@sccm.stanford.edu).
Digital Object Identifier 10.1109/TIP.2003.816023

JPEG’s good middle and high rate compression performance
and low computational and memory complexity make it an at-
tractive method for natural image compression. Nevertheless, as
we go to low bit rates that imply lower quality, the JPEG com-
pression algorithm introduces disturbing blocking artifacts. It
appears that at low bit rates a down-sampled image when JPEG
compressed and later interpolated, visually beats the high res-
olution image compressed directly via JPEG using the same
number of bits. Whereas this property is known to some in the
industry (see, for example, [9]), it was never explicitly proposed
nor treated in the scientific literature. One might argue, however,
that the hierarchical JPEG algorithm implicitly uses this idea
when low bit-rate compression is considered [1].
Let us first establish this interesting property through a
simple experiment, testing the compression-decompression
performance both by visual inspection and quantitative
mean-square-error comparisons. An experimental result dis-
played in Fig. 2 shows indeed that both visually and in terms
of the Mean Square Error (or PSNR), one obtains better results
using down-sampling, compression, and interpolation after the
decompression. Two comments are in order at this stage: i)
throughout this paper, all experiments are done using Matlab
v.6.1. Thus, simple IJG-JPEG is used with fixed quantization
tables, and control over the compression is achieved via the
Quality parameter; and ii) throughout this paper, all experi-
ments applying down-sampling use an anti-aliasing pre-filter,
as Matlab 6.1 suggests, through its standard image resizing
function.
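
The following Python sketch (our own, not the authors' code) reproduces the flavor of this experiment with Pillow instead of Matlab's IJG-JPEG and imresize. The file name lena.png, the 0.5 scale factor, and the quality settings are illustrative; in practice one would tune the two quality values so that both variants spend the same number of bytes.

    # Direct JPEG vs. down-sample + JPEG + up-sample at comparable file sizes.
    # Pillow's JPEG encoder and LANCZOS filter stand in for Matlab's tools,
    # so the numbers will differ from those reported in the paper.
    import io
    import numpy as np
    from PIL import Image

    def psnr(a, b):
        mse = np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2)
        return 10 * np.log10(255.0 ** 2 / mse)

    def jpeg_roundtrip(img, quality):
        buf = io.BytesIO()
        img.save(buf, format='JPEG', quality=quality)
        nbytes = buf.tell()
        buf.seek(0)
        return Image.open(buf), nbytes          # decoded image and size in bytes

    def scaled_roundtrip(img, factor, quality):
        small = img.resize((int(img.width * factor), int(img.height * factor)),
                           Image.LANCZOS)        # LANCZOS low-pass filters on shrink
        dec, nbytes = jpeg_roundtrip(small, quality)
        return dec.resize(img.size, Image.LANCZOS), nbytes

    img = Image.open('lena.png').convert('L')    # hypothetical test image path
    direct, n1 = jpeg_roundtrip(img, quality=10)           # low quality, low rate
    scaled, n2 = scaled_roundtrip(img, 0.5, quality=50)    # tune so n2 is close to n1
    print('direct:', n1, 'bytes, PSNR', psnr(img, direct))
    print('scaled:', n2, 'bytes, PSNR', psnr(img, scaled))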
Let us explain this behavior from an intuitive perspective. As-
sume that for a given image we use blocks of 8 × 8 pixels in the
coding procedure. As we allocate too few bits (say 4 bits per
block on average), only the DC coefficients are coded and the
resulting image after decompression consists of essentially con-
stant valued blocks. Such an image will clearly exhibit strong
blocking artifacts. If instead the image is down-sampled by a
factor of 2, the coder is now effectively working with blocks of
16 × 16 and has an average budget of 4 × 4 = 16 bits to code
the coefficients. Thus, some bits will be allocated to higher order
DCT coefficients as well, and the blocks will exhibit more de-
tail. Moreover, as we up-scale the image at the decoding stage
we add another improving ingredient to the process, since inter-
polation further blurs the blocking effects. Thus, the down-sam-
pling approach is expected to produce results that are better both
visually and quantitatively.
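
To make the budget bookkeeping concrete, the short calculation below (our own illustration, assuming a total budget of 4 bits per 8 × 8 block at full resolution for a 512 × 512 image) shows how the average number of bits available per coded block grows as the image is down-scaled:

    # Bits available per 8x8 coded block when the image is down-scaled by
    # "factor" before compression, for a fixed total bit budget (illustrative).
    def bits_per_block(total_bits, width, height, factor):
        coded_blocks = (width * factor / 8.0) * (height * factor / 8.0)
        return total_bits / coded_blocks

    total = 4 * (512 / 8) * (512 / 8)      # 4 bits per block at full resolution
    for f in (1.0, 0.5, 0.25):
        print(f, bits_per_block(total, 512, 512, f))   # -> 4, 16, 64 bits per block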
In this paper we propose an analytical explanation to
the above phenomenon, along with a practical algorithm
to automatically choose the optimal down-sampling factor
for best PSNR. Following the method outlined in [4], we
derive an analytical model of the compression-decompression
reconstruction error as a function of the memory budget (i.e.,
the total number of bits), the (statistical) characteristics of the
image, and the down-sampling factor. We show that a simplistic

Fig. 1. JPEG encoder block diagram.
Fig. 2. Original image (on the left), JPEG compressed-decompressed image (middle), and down-sampled, JPEG compressed-decompressed, and up-sampled image
(right). The down-sampling factor was 0.5. The compressed 256 × 256 “Lena” image in both cases used 0.25 bpp, inducing MSEs of 219.5 and 193.12, respectively.
The compressed 512 × 512 “Barbara” image in both cases used 0.21 bpp, inducing MSEs of 256.04 and 248.42, respectively.
second order statistical model provides good estimates for the
down-sampling factors that achieve optimal performance.
This paper is organized as follows. Sections II–IV present the
analytic model and explore its theoretical implications. In Sec-
tion II we start the analysis by developing a model that describes
the compression-decompression error based on the quantiza-
tion error and the assumption that the image is a realization of
a Markov random field. Section III then introduces the impact
of bit-allocation so as to relate the expected error to the given
bit-budget. In Section IV we first establish several important pa-
rameters used by the model, and then use the obtained formula-
tion in order to graphically describe the trade-offs between the
total bit-budget, the expected error, and the coding block-size.
Section V describes an experimental setup that validates the pro-
posed model and its applicability for choosing the best down-sam-
pling factor for a given image with a given bit budget. Finally,
Section VI ends the paper with some concluding remarks.
II. ANALYSIS OF A CONTINUOUS “JPEG-STYLE” IMAGE
REPRESENTATION MODEL
In this section we start building a theoretical model for ana-
lyzing the expected reconstruction error when doing compres-
sion-decompression as a function of the total bits budget, the

characteristics of the image, and the down-sampling factor. Our
model considers the image over a continuous domain rather than
a discrete one, in order to simplify the derivation. The steps are
as follows.
1) We derive the expected compression-decompression
mean-square-error for a general image representation.
Slicing of the image domain into equal square blocks is
assumed.
2) We use the fact that the coding is done in the transform
domain using an orthonormal basis, to derive the error in-
duced by truncation only.
3) We extend the calculation to account for quantization
error of the nontruncated coefficients.
4) We specialize the image transform to the DCT basis.
5) We introduce an approximation for the quantization error,
as a function of the allocated bits.
6) We explore several possible bit-allocation policies and
introduce the overall bit-budget as a parameter into our
model.
At the end of this process we obtain an expression for the ex-
pected error as a function of the bit budget, scaling factor, and
the image characteristics. This function eventually allows us
to determine the optimal down-sampling factor in JPEG-like
image coding.
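
Operationally, once such an error model is available, the optimal down-sampling factor for a given bit budget can be found by a simple one-dimensional search. The short Python sketch below is ours; expected_mse stands in for the analytical expression developed in Sections II–IV and is not part of the paper.

    # Schematic use of the error model: pick the down-scaling factor that
    # minimizes the predicted MSE for a given bit budget. expected_mse is a
    # placeholder for the analytical expression derived in the paper.
    def optimal_factor(bit_budget, expected_mse, candidates=None):
        candidates = candidates or [i / 10.0 for i in range(2, 11)]  # 0.2 ... 1.0
        return min(candidates, key=lambda s: expected_mse(bit_budget, s))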
A. Compression-Decompression Expected Error
Assume we are given images on the unit square, realizations
of a 2-D random process, with second order statistics given
by (1).
Note that here we assume that the image is stationary. This is a
marked deviation from the real-life scenario, and this assump-
tion is made mainly to simplify our analysis. Nevertheless, as we
shall hereafter see, the obtained model succeeds in predicting
the down-sampling effect on compression-decompression per-
formance.
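
For concreteness, one standard second order model consistent with the wide-sense stationary Markov assumption used here is the separable exponential autocorrelation below; the symbols sigma^2 and lambda are our own placeholders and need not match the paper's equation (1) exactly.

    E[ y(x_1, x_2) \, y(x_1 + \tau_1, x_2 + \tau_2) ]
        = \sigma^2 \, e^{-\lambda ( |\tau_1| + |\tau_2| )}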
We assume that the image domain is sliced into equal square
regions (slices).
Assume that due to our coding of the original image we obtain
a compressed-decompressed result, which is an approximation
of the original image. We can measure the error in approximating
the original by this result as in (2), where the per-slice error
terms are defined in (3).
We shall, of course, be interested in the expected mean square
error of the digitization, i.e., (4). Note that the assumed wide-sense
stationarity of the image process implies that the expected per-slice
error is independent of the slice index, i.e., we have the same
expected mean square error over each slice of the image. Thus,
we can write (5).
Up to now we considered the quality measure used to evaluate
the approximation of the image in the digitization process. We
shall next consider the set of basis functions needed for repre-
senting the image over each slice.
B. Bases for Representing the Image Over Slices
In order to represent the image over each slice, we have to
choose an orthonormal basis of functions. The basis functions
must satisfy the usual orthonormality condition: the inner product
of any two of them over the slice is one if their indices coincide
and zero otherwise. If the set is indeed an orthonormal basis,
then we can write (6) as a representation of the image over the
slice in terms of an infinite set of coefficients (7). Suppose now
that we approximate the image over the slice by using only a
finite subset of the orthonormal functions, i.e., consider (8).
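
A hedged sketch of (6)-(8) in our own notation, with phi_{kl} the orthonormal basis functions over a slice and S the finite set of retained indices:

    y(x_1, x_2) = \sum_{k,l \ge 0} c_{kl} \, \phi_{kl}(x_1, x_2),
    \qquad
    c_{kl} = \int\!\!\int_{\text{slice}} y(x_1, x_2) \, \phi_{kl}(x_1, x_2) \, dx_1 dx_2,
    \qquad
    \hat{y}(x_1, x_2) = \sum_{(k,l) \in S} c_{kl} \, \phi_{kl}(x_1, x_2).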

The optimal coefficients in the approximation above turn out
to be the corresponding coefficients from the infinite representa-
tion. The mean square error of this approximation over a single
slice is given by (9), and hence (10). Its expectation is given by
(11), and hence (12).
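
In our notation, the content of (9)-(12) is the standard orthonormal-truncation result: the per-slice squared error equals the energy of the discarded coefficients, and its expectation equals the sum of their variances,

    \int\!\!\int_{\text{slice}} ( y - \hat{y} )^2 \, dx_1 dx_2 = \sum_{(k,l) \notin S} c_{kl}^2,
    \qquad
    E\Big[ \int\!\!\int_{\text{slice}} ( y - \hat{y} )^2 \, dx_1 dx_2 \Big]
        = \sum_{(k,l) \notin S} \sigma_{kl}^2,
    \quad \text{with } \sigma_{kl}^2 = E[ c_{kl}^2 ].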
C. Effect of Quantization of the Expansion Coefficient
Suppose that in the approximation (13) we can only use a finite
number of bits in representing the coefficients, which take values
in R. If a coefficient is represented/encoded with b bits, we can
describe it only via a quantized value that takes on 2^b values,
i.e., a set of 2^b representation levels. Let us now see how the
resulting quantization errors affect the overall error. We have
(14), and some algebraic steps lead to the expected error given
by (15).
Hence, in order to evaluate the expected error for a particular
representation, in which the image is sliced into pieces, over
each piece we use a subset of the possible basis functions, and
we quantize the retained coefficients with a finite number of
bits, we have to evaluate the resulting overall error expression.
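
A hedged sketch of that expression in our own notation, with q_{kl} the quantization error of a retained coefficient and sigma_{kl}^2 the variance of a discarded one:

    E[ \varepsilon^2 ]
        = \sum_{(k,l) \in S} E[ q_{kl}^2 ]
        + \sum_{(k,l) \notin S} \sigma_{kl}^2 .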
D. An Important Particular Case: Markov Process With
Separable Cosine Bases
We now return to the assumption that the statistics of the
image are given by (1), and we choose a separable cosine basis
over each slice.
This choice of using the DCT basis is motivated by our desire to
model the JPEG behavior. As is well known [6], the DCT offers
a good approximation of the KLT if the image is modeled as a
2-D random Markov field with very high correlation factor.

To compute the expected error for this case we need to eval-
uate the variances of the expansion coefficients, defined in (16).
We have (17); therefore, by separating the integrations we obtain
(18). Changing variables of integration yields (19). Let us define,
for compactness, the integral in (20); then we see that (21) holds.
An Appendix at the end of the paper derives this integral in closed
form, leading to the expression in (22), and hence (23).
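
The coefficient variances can also be checked numerically. The Python sketch below is our own, assuming the separable exponential autocorrelation introduced earlier and a 1-D orthonormal cosine basis on each axis; it evaluates the variances by direct integration rather than through the closed form of (22), and the symbols lam and delta are illustrative placeholders.

    # Numerical estimate of the separable coefficient variances under an
    # assumed exponential autocorrelation exp(-lam*|t1| - lam*|t2|) on a
    # square slice of side delta, with a 1-D cosine basis on each axis.
    import numpy as np
    from scipy.integrate import dblquad

    def phi(k, delta):
        """Orthonormal cosine basis function on [0, delta]."""
        if k == 0:
            return lambda u: np.sqrt(1.0 / delta)
        return lambda u: np.sqrt(2.0 / delta) * np.cos(np.pi * k * u / delta)

    def variance_1d(k, lam, delta):
        """One-axis factor: double integral of exp(-lam*|u-v|) * phi_k(u) * phi_k(v)."""
        f = phi(k, delta)
        val, _ = dblquad(lambda v, u: np.exp(-lam * abs(u - v)) * f(u) * f(v),
                         0.0, delta, lambda u: 0.0, lambda u: delta)
        return val

    def variance_2d(k, l, lam, delta):
        # Separable autocorrelation and basis => product of two 1-D factors.
        return variance_1d(k, lam, delta) * variance_1d(l, lam, delta)

    print(variance_2d(0, 0, lam=10.0, delta=1.0 / 8))  # low frequencies dominate
    print(variance_2d(1, 1, lam=10.0, delta=1.0 / 8))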
E. Incorporating the Effect of Coefficient Quantization
According to rate-distortion theory, if we assume either uni-
form or Gaussian random variables, there is a formula for evalu-
ating the mean-square-error due to quantization. This formula,
known to be accurate at high rates, is given in (24) [3]: a distribu-
tion-dependent constant times the coefficient variance, attenuated
exponentially in the number of bits allocated to that coefficient.
Putting the above results together, we get that the expected mean
square error in representing images from a process with Markov
statistics, by slicing the image plane into slices and using, over
each slice, a cosine basis, is given by (25). This expression gives
the expected error in terms of the slicing, the subset of retained
coefficients, and the bits allocated to them.
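
A hedged sketch of the structure of (24) and (25) in our own notation, where c is a distribution-dependent constant, sigma_{kl}^2 the coefficient variances, b_{kl} the bits given to coefficient (k, l), and S the set of retained coefficients; the paper's (25) additionally plugs in the Markov-model variances of (22)-(23) and accounts for the slicing:

    E[ q_{kl}^2 ] \approx c \, \sigma_{kl}^2 \, 2^{-2 b_{kl}},
    \qquad
    E[ \varepsilon^2 ] \approx \sum_{(k,l) \in S} c \, \sigma_{kl}^2 \, 2^{-2 b_{kl}}
                              + \sum_{(k,l) \notin S} \sigma_{kl}^2 .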
III. SLICING AND BIT-ALLOCATION OPTIMIZATION PROBLEMS
Suppose we consider (26).

Citations
Book

Super Resolution of Images and Video

TL;DR: It is clear that there is a strong interplay between the tools and techniques developed for SR and a number of other inverse problems encountered in sig...
Journal Article

Adaptive downsampling to improve image compression at low bit rates

TL;DR: This paper presents a new algorithm, based on the adaptive decision of appropriate downsampling directions/ratios and quantization steps, in order to achieve higher coding quality at low bit rates while taking local visual significance into consideration.
Journal Article

Lossy Point Cloud Geometry Compression via End-to-End Learning

TL;DR: A novel end-to-end Learned Point Cloud Geometry Compression framework that efficiently compresses point cloud geometry using deep neural network (DNN) based variational autoencoders (VAE) and exceeds the geometry-based point cloud compression (G-PCC) algorithm standardized by the Moving Picture Experts Group (MPEG).
Journal Article

Convolutional Neural Network-Based Block Up-Sampling for Intra Frame Coding

TL;DR: A new CNN structure for up-sampling is explored, which features deconvolution of feature maps, multi-scale fusion, and residue learning, making the network both compact and efficient.
Journal Article

Learning a Convolutional Neural Network for Image Compact-Resolution

TL;DR: The requirements of image CR are translated into operable optimization targets for training CNN-CR: the visual quality of the compact-resolved image is ensured by constraining its difference from a naively downsampled version, and the information loss of image CR is measured by upsampling/super-resolving the compact-resolved image and comparing that to the original image.
References
Book

Fundamentals of digital image processing

TL;DR: This book discusses two-dimensional systems and mathematical preliminaries and their applications in image analysis and computer vision, as well as image reconstruction from projections and image enhancement.
Book

Vector Quantization and Signal Compression

TL;DR: The authors explain the design and implementation of quantizers, addressing the very labor-intensive and therefore time-consuming and expensive process of designing and implementing a quantizer.
Journal Article

High performance scalable image compression with EBCOT

TL;DR: A new image compression algorithm is proposed, based on independent embedded block coding with optimized truncation of the embedded bit-streams (EBCOT), capable of modeling the spatially varying visual masking phenomenon.
Proceedings Article

High performance scalable image compression with EBCOT

TL;DR: A new image compression algorithm is proposed, based on independent embedded block coding with optimized truncation of the embedded bit-streams (EBCOT), capable of modeling the spatially varying visual masking phenomenon.
Frequently Asked Questions (2)
Q1. What have the authors contributed in "Down-scaling for better transform compression"?

Nevertheless, as the authors go to low bit rates that imply lower quality, JPEG introduces disturbing artifacts. Motivated by this idea, the authors show how down-sampling an image to a low resolution, then using JPEG at the lower resolution, and subsequently interpolating the result to the original resolution can improve the overall PSNR performance of the compression process. The authors show that the image auto-correlation can provide a good estimate for establishing the down-sampling factor that achieves optimal performance.

Further work is required in order to explore extensions and implementation issues, such as efficient estimation of the image statistics, local extraction of second order statistics, hierarchical slicing of the image into various block sizes, and more. Further work is also required to replace the approach presented here with a locally adaptive one, as is done naturally by wavelet coders.