Journal Article•DOI•

Image registration using log-polar mappings for recovery of large-scale similarity and projective transformations

Siavash Zokai, George Wolberg¹•Institutions (1)

01 Oct 2005-IEEE Transactions on Image Processing (IEEE Trans Image Process)-Vol. 14, Iss: 10, pp 1422-1434

TL;DR: A novel technique to recover large similarity transformations (rotation/scale/translation) and moderate perspective deformations among image pairs and achieves subpixel accuracy through the use of nonlinear least squares optimization.

read less

Abstract: This paper describes a novel technique to recover large similarity transformations (rotation/scale/translation) and moderate perspective deformations among image pairs We introduce a hybrid algorithm that features log-polar mappings and nonlinear least squares optimization The use of log-polar techniques in the spatial domain is introduced as a preprocessing module to recover large scale changes (eg, at least four-fold) and arbitrary rotations Although log-polar techniques are used in the Fourier-Mellin transform to accommodate rotation and scale in the frequency domain, its use in registering images subjected to very large scale changes has not yet been exploited in the spatial domain In this paper, we demonstrate the superior performance of the log-polar transform in featureless image registration in the spatial domain We achieve subpixel accuracy through the use of nonlinear least squares optimization The registration process yields the eight parameters of the perspective transformation that best aligns the two input images Extensive testing was performed on uncalibrated real images and an array of 10,000 image pairs with known transformations derived from the Corel Stock Photo Library of royalty-free photographic images

...read moreread less

Summary (3 min read)

Jump to: [Introduction] – [II. PREVIOUS WORK] – [B. Log-Polar Transform] – [D. Feature-Based Image Registration] – [III. MODIFIED LMA] – [A. Multiresolution Pyramid] – [B. Modified Levenberg–Marquardt Algorithm] – [IV. GLOBAL REGISTRATION USING LOG-POLAR TRANSFORM] – [V. EXPERIMENTAL RESULTS] – [A. Uncalibrated Test Images] and [B. Calibrated Test Images]

Introduction

Therefore, the images differ by large rotation and scale.
See [2] for a recent survey of image registration techniques.
Similarity measures like the zero-mean normalized sum of squared differences (SSD) and correlation coefficient are invariant to the linear intensity changes.
In Section II, the authors discuss related work on the standard Levenberg–Marquardt algorithm (LMA) and log-polar techniques.

II. PREVIOUS WORK

The authors discuss related work on the LMA and the log-polar techniques.
In Section II-A, the authors present a background of the Levenberg–Marquardt nonlinear least-squares optimization algorithm that is useful for achieving subpixel registration accuracy.
The log-polar transform is described in Section II-B.
In Section II-C, the authors discuss the Fourier–Mellin transform, its limitations, and a review of related work.
Section II-D discusses a feature-based method that can register images subjected to large scale changes (i.e., ) and arbitrary rotation.

B. Log-Polar Transform

The log-polar transformation is a nonlinear and nonuniform sampling of the spatial domain.
The log-polar transform has received considerable attention.
If the authors assume that and lie along the horizontal and vertical axes, respectively, then image shown in Fig. 2(a) will be mapped to image in Fig. 2(b) after a log-polar coordinate transformation.
The log-polar mapping is an accepted model of the representation of the retina in the primary visual cortex in primates, also known as V1 [23]–[25].
This bandwidth reduction helps us process a high resolution image only at the focus of attention while aware of a wider field of view.

D. Feature-Based Image Registration

Feature-based image registration algorithms extract salient structures, such as points, lines, curves, and regions, from graylevel images and establish correspondences between features using invariant descriptors.
In more recent feature-based work, registration for wide baseline applications has been reported in [48]–[52].
These results are promising in that they accommodate larger deformations.
These stages include corner detection, conversion to invariant descriptors, matching based on the Mahalanobis distance or k-d tree, and outlier removal using the RANSAC algorithm.
Whereas their methods are designed to operate under textured regions, they may fail in smooth regions.

III. MODIFIED LMA

The LMA solves the following system of equations in an iterative fashion: (5) where is the Hessian matrix and is the residual vector (6) (7).
The authors can improve the standard Levenberg–Marquardt optimization algorithm outlined above by adding two modifications.
The first modification includes the use of a multiresolution pyramid for both reference and target images.
The second modification virtually eliminates the calculation of the Hessian matrix (7) which would otherwise have been computed in every iteration.
The authors second modification is based on the work of [16], whereby registration was performed on medical images subjected to similarity transforms (rotation/scale/translation).

A. Multiresolution Pyramid

The original image, sitting at the base of the pyramid, is downsampled by a constant scale factor in each dimension to form the next level.
This is repeated from one level to the next until the tip of the pyramid is reached.
Second, the smoothness conditions imposed by successively bandlimiting the pyramid levels causes to be computed on smoother images.
An example of computed on two different pyramid levels is shown in Fig.
Thus, the relation between parameters is (11).

B. Modified Levenberg–Marquardt Algorithm

In the standard LMA, the authors calculate the vector and Hessian matrix in each iteration.
This is achieved by casting this problem into one where is transformed into , leaving unchanged from one iteration to the next.
An important distinction between the standard and modified LMA methods lie in the manner in which the unknown parameters are updated in each iteration.
Instead of minimizing (15a), the authors minimize (15c) with respect to the parameters .
For further details about resampling, see [57].

IV. GLOBAL REGISTRATION USING LOG-POLAR TRANSFORM

The authors have implemented a new algorithm for automatically finding the translation between both input images in the presence of large scale and rotation.
The radius and the center of the template are optionally given by the user.
The authors compute the base of the logarithm for log-polar transformation as follows: (20) where is the width of the input image ( diameter).
The normalized correlation coefficient similarity measure is given as follows: (21) where is the average of image .
Furthermore, the bulk of their computation is performed at the coarsest level where there are fewest pixels.

V. EXPERIMENTAL RESULTS

An analytical evaluation of the robustness of image registration algorithms is an elusive task.
Performance is highly dependent on the content of the input images.
Many proposed image registration algorithms in the literature have limited their published results to the use of a few reference images and their synthetically generated target images.
The reference and target images are taken by a camera with optical zoom.
In Section V-B, the authors test the robustness of their algorithm with a large suite of 10 000 image pairs.

A. Uncalibrated Test Images

A Canon PowerShot G3 digital camera with 4 optical zoom was used to capture a set of test images taken from natural and man-made scenes.
The authors method uses all pixels and does not depend on any specific feature set.
Images were acquired with (a) no magnification and (b) 4 magnification with unknown rotation about the optical axis.
In order to quantify registration accuracy, the authors compute the correlation coefficient in the overlapping area.
All of the non-SIFT methods failed to register the image pairs in Fig.

B. Calibrated Test Images

It is not feasible to capture a very large set of images with a variety of image content and transformation parameters.
The authors generated 10 000 target images from these random parameters.
First, the authors used their log-polar module to recover the global rotation, scale, and translation parameters.
The authors have tested their registration algorithm to create image mosaics by stitching together low resolution frames from several overlapping images.
In order to best expose any misalignment, the authors applied unweighted averaging upon the overlapping areas.

Did you find this useful? Give us your feedback

Figures (14)

Fig. 5. curve for rotation (standard LMA).

Fig. 12. Image set used to create a panorama image.

Fig. 13. Mosaic produced by stitching images shown in Fig. 12.

Fig. 1. Airborne imagery. (a) Observed images. (b) Reference image. (c) Registration overlays.

Fig. 6. curve for rotation (modified LMA). (a) 0 iterations, (b) 10 iterations, (c) 20 iterations, and (d) 30 iterations.

TABLE I REGISTRATION RESULTS FOR 10 000 IMAGE PAIRS

Fig. 10. Histogram of for the LP/LMA case.

Fig. 9. Registration failed when applied to database images that do not have discriminating information in the central region, as shown above.

Fig. 2. Log-polar coordinate transformation. (a) Input image. (b) Log-polar transformation.

Fig. 7. Four-dimensional search strategy. (a) A circular template from the center of the reference image is cropped. (b) For every position in the target image, a circular region is selected and compared against the circular template in (a) to find the best (T ; T ). (c) Search for (R; S) in the log-polar domain.

Fig. 3. Effects of optical and digital zoom on the power spectrum. (a) Reference image. (b) Target image (real). (c) Target image (synthetic).

Fig. 8. (a) Observed images: 4 zoom, arbitrary rotation, and moderate perspective. (b) Registration results highlighted on reference images.

Fig. 4. Example of (a) computed on two different pyramid levels.

Content maybe subject to copyright Report

1422 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 14, NO. 10, OCTOBER 2005

Image Registration Using Log-Polar Mappings

for Recovery of Large-Scale Similarity and

Projective Transformations

Siavash Zokai and George Wolberg, Senior Member, IEEE

Abstract—This paper describes a novel technique to recover

large similarity transformations (rotation/scale/translation) and

moderate perspective deformations among image pairs. We in-

troduce a hybrid algorithm that features log-polar mappings

and nonlinear least squares optimization. The use of log-polar

techniques in the spatial domain is introduced as a preprocessing

module to recover large scale changes (e.g., at least four-fold) and

arbitrary rotations. Although log-polar techniques are used in the

Fourier–Mellin transform to accommodate rotation and scale in

the frequency domain, its use in registering images subjected to

very large scale changes has not yet been exploited in the spatial

domain. In this paper, we demonstrate the superior performance

of the log-polar transform in featureless image registration in the

spatial domain. We achieve subpixel accuracy through the use

of nonlinear least squares optimization. The registration process

yields the eight parameters of the perspective transformation that

best aligns the two input images. Extensive testing was performed

on uncalibrated real images and an array of 10,000 image pairs

with known transformations derived from the Corel Stock Photo

Library of royalty-free photographic images.

Index Terms—Image registration, Levenberg–Marquardt non-

linear least-squares optimization, log-polar transform, perspective

transformation, similarity transformation.

I. INTRODUCTION

IGITAL image registration is a branch of computer vision

that deals with the geometric alignment of a set of im-

ages. The set may consist of two or more digital images taken

of a single scene at different times, from different sensors, or

from different viewpoints. A large body of research has been

drawn to this area due to its importance in remote sensing, med-

ical imaging, computer graphics, and computer vision. Despite

comprehensive research spanning over thirty years, robust tech-

niques to register images in the presence of large deformations

remains elusive. Most techniques fail unless the input images

are misaligned by moderate deformations.

The goal of registration is to establish geometric correspon-

dence between the images so that they may be transformed,

compared, and analyzed in a common reference frame. Regis-

tration is often necessary for 1) integrating information taken

Manuscript received April 27, 2004; revised October 11, 2004. This work was

supported in part by an ONR HBCU/MI Research and Education Program Grant

(N000140310511) and a PSC-CUNY Grant. The associate editor coordinating

the review of this manuscript and approving it for publication was Dr. Luca

Lucchese.

S. Zokai is with Brainstorm Technology LLC, New York, NY 10011 USA.

G. Wolberg is with the Department of Computer Science, City College of

New York, New York, NY 10031 USA (e-mail: wolberg@cs.ccny.cuny.edu).

Digital Object Identiﬁer 10.1109/TIP.2005.854501

from different sensors (i.e., multisensor data fusion), 2) ﬁnding

changes in images taken at different times or under different

conditions, 3) inferring three-dimensional (3-D) information

from images in which either the camera or the objects in the

scene have moved, and 4) for model-based object recognition.

The most common task associated with image registration

is the generation of large panoramic images for viewing and

analysis. Image mosaics, created by warping and blending

together several overlapping images, are central to this process.

Other common registration tasks include producing super-reso-

lution images from multiple images of the same scene, change

detection, motion stabilization, topographic mapping, and

multisensor image fusion.

This work attempts to register two images using one global

perspective transformation even in the presence of arbitrary ro-

tation angles and large scale changes (up to 5

zoom). Our

work is motivated by the problem of registering airborne im-

ages. These images are taken at vastly different times, altitudes,

and directions. Therefore, the images differ by large rotation and

scale. Also, the pitch and roll introduces moderate perspective.

In general, images of a 3-D scene do not differ by just one per-

spective transformation because the depth between the camera

and the objects introduces parallax. A global transformation

cannot align all features in such cases. We must, therefore, place

constraints on camera motion and/or our 3-D scene to produce

images that are free of parallax. One constraint requires the

camera motion to be limited to rotation, pan, tilt, and zoom about

a ﬁxed point, e.g, on a tripod. If this constraint is not satisﬁed,

then we may still have images free of parallax if the object’s 3-D

points

in the scene are far away from the camera, i.e.,

. This means that the scene is ﬂat and we are looking

at a planar object. In either case, we assume that the scene is

static and the lighting is ﬁxed between images. Nevertheless, we

have relaxed these conditions to accommodate local disparity

and linear changes in illumination.

A survey by Brown [1] introduces a framework in which all

registration techniques can be understood. The framework con-

sists of the feature space, similarity measure, search space, and

search strategy. The feature space extracts the information in

the images that will be used for matching. The search space is

the class of transformations, or deformation models, that is ca-

pable of aligning the images. The search strategy decides how

to choose the next transformation from this space, to be tested in

the search for the optimal transformation. The similarity mea-

sure determines the relative merit for each test. Search continues

according to the search strategy until a transformation is found

ZOKAI AND WOLBERG: IMAGE REGISTRATION USING LOG-POLAR MAPPINGS 1423

Fig. 1. Airborne imagery. (a) Observed images. (b) Reference image. (c) Registration overlays.

whose similarity measure is satisfactory. Numerous registration

techniques have been proposed based on choosing a speciﬁc fea-

ture, deformation model, optimization method, and/or similarity

measure. See [2] for a recent survey of image registration tech-

niques.

For image registration, we need to recover the geo-

metric transformation and/or intensity function. Let

and be the reference and observed images, re-

spectively. The relationship between these images is

, where is a two-dimensional

(2-D) geometric transformation operator that relates the

coordinates in to the coordinates in and is the

intensity function.

The estimation of the intensity function

is useful when we

want to register images taken from different sensors or when il-

lumination is changed by automatic gain exposure of a camera.

Comparametric equations have been introduced to model the in-

tensity function

[3]. Although these equations are nonlinear,

a piecewise linear method has been developed to estimate

and simultaneously [4].

Mutual information is a similarity measure that has recently

been introduced for multimodal medical image registration [5],

[6]. Correlation ratio is another similarity measure for multi-

modal image registration and has proven to perform better than

mutual information [7]. Multimodal image registration has been

studied extensively in the medical imaging domain. In this work,

we assume that the intensity function is linear. Similarity mea-

sures like the zero-mean normalized sum of squared differences

(SSD) and correlation coefﬁcient are invariant to the linear in-

tensity changes.

This paper describes a hierarchical image registration system.

We model the mapping function as a perspective transforma-

tion. The algorithm estimates the perspective parameters neces-

sary to register any two misaligned digital images. The parame-

ters are selected to minimize the SSD between the two images.

They are computed iteratively in a coarse-to-ﬁne hierarchical

framework using a variation of the Levenberg–Marquadt non-

linear least squares optimization method. This approach yields

a robust solution that precisely registers images with subpixel

accuracy.

The primary drawback of the optimization-based approach

is that it may fail unless the two images are misaligned by a

moderate difference in scale, rotation, and translation. In order

to address this problem, we introduce a log-polar registration

module to bring the images into approximate alignment, even in

the presence of arbitrary rotation angles and large scale changes.

Its purpose is to furnish a good initial estimate to the perspec-

tive registration module that is based on nonlinear least squares

optimization.

The scope of this work shall prove useful for various ap-

plications, including the registration of aerial images, and the

formation of image mosaics. Note that aerial imagery may be

acquired from uncalibrated airborne cameras subjected to yaw,

pitch, and roll at various altitudes. Since the terrain appears ﬂat

from moderately high altitude, it is an ideal candidate for reg-

istration using a single perspective transformation. An example

demonstrating the registration of two aerial images in the pres-

ence of large scale/rotation and moderate perspective is shown

in Fig. 1. The image in Fig. 1(a) is automatically registered to

that in Fig. 1(b), as depicted by the highlighted rectangle.

In Section II, we discuss related work on the standard Lev-

enberg–Marquardt algorithm (LMA) and log-polar techniques.

Section III describes a modiﬁed LMA for improving the per-

formance of the standard LMA and Section IV presents our

proposed log-polar method. In Section V, we demonstrate the

success of the log-polar transform in recovering large deforma-

tions by comparing registration accuracy with and without the

log-polar registration module. A signiﬁcant increase in correct

matches is attributed to our algorithm. A secondary compar-

ison was made by replacing the log-polar module with the well-

known Fourier–Mellin transform. Again, our log-polar module

proved superior to the Fourier–Mellin transform for achieving

high perspective registration accuracy.

II. P

REVIOUS WORK

In this section, we discuss related work on the LMA and the

log-polar techniques. In Section II-A, we present a background

of the Levenberg–Marquardt nonlinear least-squares optimiza-

tion algorithm that is useful for achieving subpixel registration

accuracy. The log-polar transform is described in Section II-B.

1424 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 14, NO. 10, OCTOBER 2005

In Section II-C, we discuss the Fourier–Mellin transform, its

limitations, and a review of related work. Section II-D discusses

a feature-based method that can register images subjected to

large scale changes (i.e.,

) and arbitrary rotation.

A. LMA

There is a vast literature of work in the related ﬁelds of image

registration, motion estimation, image mosaics, and video in-

dexing that make use of a nonlinear least-squares optimization

technique known as the LMA. Most algorithms exploit a hier-

archical approach due to computational efﬁciency in handling

large displacements. Algorithms for hierarchical motion esti-

mation [8]–[10] and image mosaicing [11]–[20] usually assume

small deformations among image pairs. For instance, a dense

image sequence is required to stitch the frames together [14],

[18]. The problem of assembling a large set of images into a

common reference frame is simpliﬁed when the inter-frame de-

formations are small. The LMA uses the SSD as the similarity

measure between two images (or regions)

(1)

and the discrete form is

Note that is a geometric transformation applied to image

to map it from its coordinate system to the coordi-

nate system of

. In our case, the subscript is a 3 3 per-

spective transformation matrix and

is the number of pixels.

B. Log-Polar Transform

The log-polar transformation is a nonlinear and nonuniform

sampling of the spatial domain. Nonlinearity is introduced by

polar mapping, while nonuniform sampling is the result of log-

arithmic scaling. Despite the difﬁculties of nonlinear processing

for computer vision applications, the log-polar transform has re-

ceived considerable attention. Consider the log-polar

coordinate system, where denotes radial distance from the

center

and denotes angle. Any point can be rep-

resented in polar coordinates

(2)

(3)

Applying a polar coordinate transformation to an image

maps radial lines in Cartesian space to horizontal lines in the

polar coordinate space. We shall denote the transformed image

. If we assume that and lie along the horizontal and ver-

tical axes, respectively, then image

shown in Fig. 2(a) will be

mapped to image

in Fig. 2(b) after a log-polar coordinate

transformation.

Fig. 2. Log-polar coordinate transformation. (a) Input image. (b) Log-polar

transformation.

The motivation for considering the log-polar transform stems

from its biological origins. The ﬁrst reported discoveries of log-

polar mappings in the primate visual system were reported in

[21] and [22]. The log-polar mapping is an accepted model of

the representation of the retina in the primary visual cortex in

primates, also known as V1 [23]–[25]. The nonuniform sam-

pling that simulates logarithmic scale takes place in the retina

and the nerve endings from the retina are connected to the visual

cortex by a special mapping. This mapping realizes the polar

transformation by a simple rewiring. The radial nerve endings

are connected horizontally to the visual cortex. Due to these bi-

ological origins, the log-polar transform has often been referred

to as the retino-cortical transform [26]. The log-polar transform

has two principal advantages: 1) rotation and scale invariance

and 2) the spatially varying sampling in the retina is the solu-

tion to reduce the amount of information traversing the optical

nerve while maintaining high resolution in the fovea and cap-

turing a wide ﬁeld of view. This bandwidth reduction helps us

process a high resolution image only at the focus of attention

while aware of a wider ﬁeld of view. Several researchers have

designed log-polar sensors for active and real-time vision appli-

cations [27]–[31]. These efforts sought to make the leap from

biological hardware to VLSI hardware.

C. Fourier–Mellin Transform

The Fourier–Mellin registration method is based on phase

correlation and the properties of Fourier analysis. The phase

correlation method can ﬁnd the translation between two im-

ages. The Fourier–Mellin transform extends phase correlation

to handle images related by both translation and rotation

[32]–[39]. According to the rotation and translation properties

of the Fourier transform, the transforms are related by

We can see that the magnitude of spectra is a rotated

replica of

. Both spectrum share the same center of rotation.

ZOKAI AND WOLBERG: IMAGE REGISTRATION USING LOG-POLAR MAPPINGS 1425

Fig. 3. Effects of optical and digital zoom on the power spectrum.

(a) Reference image. (b) Target image (real). (c) Target image (synthetic).

We can recover this rotation by representing the spectra and

in polar coordinates

(4)

The Fourier magnitude in polar coordinates differs only

by translation. We can use the phase-correlation method to

ﬁnd this translation and estimate

. This method has been

extended to ﬁnd scale by mapping the Fourier magnitude to

log-polar coordinates. Therefore, one ﬁnds scale and rotation

by phase-correlation, which recovers the amount of shifts in

space. One advantage of this method is that it tol-

erates additive noise. The method, however, can only recover

moderate scales and rotations. This difﬁculty can be understood

by realizing that large rotation and scale changes exacerbate the

border effects when computing the Fourier transform. These

problems are minimized in the rare case when the images are

periodic. Therefore, a large translation, or scale introduces

additional pixel information that can dramatically alter the

Fourier coefﬁcients.

In early papers on Fourier–Mellin, the border problems were

not investigated. They were, however, reported recently in [40]

and [41], where the authors showed that rotation and scale in-

troduce aliasing in the low frequencies. They have suggested

that two preprocessing steps are needed to alleviate the aliasing

problem. First, the image must be multiplied by a radial mask

consisting of a 2-D Gaussian function. Second, a low-pass ﬁlter

must be applied to remove the offending low frequencies. The

researchers in [35] reported that they recovered scale factors up

to 1.8 and 80

rotations.

It is important to note that the literature is replete with syn-

thetic examples for the Fourier–Mellin registration method. In

particular, a reference image is always matched against a scaled

and rotated version of itself. This serves to defer the problem of

handling the ﬁne details introduced by an actual optical zoom.

Conversely, when the image undergoes miniﬁcation, translation,

or rotation, additional real data seeps into the target image, not

just black pixels. Note that artiﬁcial black backgrounds can help

underlying content.

An example demonstrating the differences between digital

and optical zoom is shown in Fig. 3. As is expected, the shape

of the spectrum in Fig. 3(c) conforms to the inverse relationship

between space and frequency. However, the spectra of Fig. 3(b)

reﬂects the fact that the images were taken with optical zoom

and minor perspective distortion was introduced due to real hand

movement. Although the Fourier–Mellin transform is able to

correctly register the synthetic image shown in Fig. 3(c), the

image in Fig. 3(b) deﬁes recovery because of the lack of simi-

larity in its spectra compared to that of the reference image.

An important contribution of this work is that we introduce

a new method based on the log-polar transform in the spatial

domain that works robustly with real images.

D. Feature-Based Image Registration

Feature-based image registration algorithms extract salient

structures, such as points, lines, curves, and regions, from

graylevel images and establish correspondences between

features using invariant descriptors. Early work in this area

includes [42]–[47]. This work, however, is generally limited to

small geometric deformations.

In more recent feature-based work, registration for wide base-

line applications has been reported in [48]–[52]. These results

are promising in that they accommodate larger deformations.

Finding local and invariant features is an important tool for

detecting correspondences between different views of a scene.

In [50], the authors detect quadrilateral and elliptical locally

afﬁne regions for ﬁnding the fundamental matrix in wide-base-

line stereo images. In [51] and [53], the authors look for locally

afﬁne regions. They compute several degrees of moments in

these regions to build feature vectors for wide-baseline stere-

oscopy [53] and image retrieval [51]. Their work tolerates only

small scale changes.

Recently, several researchers at INRIA and University of

British Columbia developed methods for recovering large-scale

deformation based on scale-space theory [49], [54]–[56]. The

INRIA method computes interest points at different scales, cal-

culating at each scale a set of local descriptors that are invariant

to rotation, translation, and illumination. The Mahalanobis

distance is then used to ﬁnd the corresponding interest points

between two images. In order to remove outliers, they use the

RANSAC algorithm with constraints based on collections of

points. In the work of Lowe and his colleagues, a scale-invariant

feature transform (SIFT) is introduced to ﬁnd features and a

k-d tree is used to match features across multiple images [48],

[49]. To our knowledge, the techniques described in [49] and

[54] are the only works that are applied to outdoor images

with large scale factors (i.e.,

) derived from optical zoom

cameras (not digital zoom). Our registration algorithm is able

to properly register all of their test data. Their methods consist

of a series of complex stages that are not prone to direct hard-

ware implementation. These stages include corner detection,

conversion to invariant descriptors, matching based on the

Mahalanobis distance or k-d tree, and outlier removal using the

RANSAC algorithm. Whereas their methods are designed to

operate under textured regions, they may fail in smooth regions.

III. M

ODIFIED LMA

The LMA solves the following system of equations in an it-

erative fashion:

(5)

1426 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 14, NO. 10, OCTOBER 2005

where is the Hessian matrix and is the residual

vector

(6)

(7)

We can improve the standard Levenberg–Marquardt opti-

mization algorithm outlined above by adding two modiﬁcations.

The ﬁrst modiﬁcation includes the use of a multiresolution

pyramid for both reference and target images. The second

modiﬁcation virtually eliminates the calculation of the Hessian

matrix (7) which would otherwise have been computed in every

iteration. Our second modiﬁcation is based on the work of [16],

whereby registration was performed on medical images sub-

jected to similarity transforms (rotation/scale/translation). We

have extended their method to recover perspective parameters.

A. Multiresolution Pyramid

A multiresolution pyramid consists of a set of images repre-

senting an image in multiple resolutions. The original image,

sitting at the base of the pyramid, is downsampled by a constant

scale factor in each dimension to form the next level. This is re-

peated from one level to the next until the tip of the pyramid is

reached. The image size at level

is reduced from the original

by a factor of

in each dimension. Level 0, at the base of the

pyramid, is referred to as the ﬁnest level. Level

, at the tip

of the pyramid, is known as the coarsest level.

Multiresolution pyramids supply us with two major advan-

tages. First, when we apply the Levenberg–Marquardt method

to the coarsest level of the pyramid, the number of pixels is re-

duced by a factor of

. We get large computational gains

because most of the iterations are executed in the coarsest level,

consisting of fewer pixels. Second, the smoothness conditions

imposed by successively bandlimiting the pyramid levels causes

to be computed on smoother images. This smoothness

property helps prevent getting trapped in local minimas. An ex-

ample of

computed on two different pyramid levels is

shown in Fig. 4. Since the coarsest level retains large-scale fea-

tures only, the registration algorithm proceeds from the coarsest

level to progressively ﬁner levels, where small corrections due to

ﬁner details are integrated. This approach passes the computed

parameters as an initial estimate to the next ﬁner level. The pa-

rameters must be scaled properly across successive levels. Let

the scale factor between the levels be

: ,

, , and , where

(8a)

(8b)

Fig. 4. Example of



(

)

computed on two different pyramid levels.

Substituting the coordinates of the next ﬁner level into the above

equations yields

(9)

Multiplying both sides by

gives us

(10)

Thus, the relation between parameters is

(11)

In our case,

, so the translation parameters and are

multiplied by two and

and divided by two.

B. Modiﬁed Levenberg–Marquardt Algorithm

In the standard LMA, we calculate the

vector and Hes-

sian matrix

in each iteration. In this section, we review a

modiﬁed LMA that realizes performance gains by eliminating

the calculation of the Hessian matrix at each iteration. Consider

the following objective function to establish a similarity mea-

sure between

and

(12)

HTML Viewer

Frequently Asked Questions (13)

Q1. What are the contributions in "Image registration using log-polar mappings for recovery of large-scale similarity and projective transformations" ?

This paper describes a novel technique to recover large similarity transformations ( rotation/scale/translation ) and moderate perspective deformations among image pairs. The authors introduce a hybrid algorithm that features log-polar mappings and nonlinear least squares optimization. The use of log-polar techniques in the spatial domain is introduced as a preprocessing module to recover large scale changes ( e. g., at least four-fold ) and arbitrary rotations. In this paper, the authors demonstrate the superior performance of the log-polar transform in featureless image registration in the spatial domain.

Q2. What have the authors stated for future works in "Image registration using log-polar mappings for recovery of large-scale similarity and projective transformations" ?

Additional future work will accelerate correlation. It is worthwhile to examine whether this process may be accelerated by positioning the sliding window on areas of high information content only. Entropy, variance, or other statistically discriminating techniques can be used to quantify information content. Recent success with scale-invariant interest points ( e. g., SIFT ) suggest that the log-polar windows should be centered at these extracted positions.

Q3. What is the purpose of the log-polar transform?

The log-polar transform has two principal advantages: 1) rotation and scale invariance and 2) the spatially varying sampling in the retina is the solution to reduce the amount of information traversing the optical nerve while maintaining high resolution in the fovea and capturing a wide field of view.

Q4. What is the purpose of the INRIA method?

The INRIA method computes interest points at different scales, calculating at each scale a set of local descriptors that are invariant to rotation, translation, and illumination.

Q5. What measures have been used to test the similarity of a polar image?

The authors have extensively tested several similarity measures, including normalized correlation coefficient, phase correlation, and mutual information.

Q6. Why does the image in Fig. 3(b) defie recovery?

Although the Fourier–Mellin transform is able to correctly register the synthetic image shown in Fig. 3(c), the image in Fig. 3(b) defies recovery because of the lack of similarity in its spectra compared to that of the reference image.

Q7. What is the main contribution of this work?

An important contribution of this work is that the authors introduce a new method based on the log-polar transform in the spatial domain that works robustly with real images.

Q8. What is the new method for finding scale, rotation, and translation parameters?

The new method is based on multiresolution log-polar transformations to simultaneously find the best scale, rotation, and translation parameters.

Q9. How did the authors show the importance of the log-polar module?

In order to show the importance of the log-polar module, the authors ran the LMA without the estimated initial parameters from the log-polar module.

Q10. How many pairs of images are shown in Fig. 1?

The correlation coefficient values for the thirty pairs mentioned above are all above 0.9, which is very good considering camera noise and artifacts introduced by warping to produce the target images.

Q11. How does the LMA solve the system of equations?

The LMA solves the following system of equations in an iterative fashion:(5)where is the Hessian matrix and is the residual vector(6)(7)The authors can improve the standard Levenberg–Marquardt optimization algorithm outlined above by adding two modifications.

Q12. What was the first report of the authors?

They were, however, reported recently in [40] and [41], where the authors showed that rotation and scale introduce aliasing in the low frequencies.

Q13. Why does the Fourier–Mellin transform help register two images?

Note that artificial black backgrounds can help register two images because it ensures that the authors consider the same underlying content.

Image registration using log-polar mappings for recovery of large-scale similarity and projective transformations

Summary (3 min read)

Introduction

II. PREVIOUS WORK

B. Log-Polar Transform

D. Feature-Based Image Registration

III. MODIFIED LMA

A. Multiresolution Pyramid

B. Modified Levenberg–Marquardt Algorithm

IV. GLOBAL REGISTRATION USING LOG-POLAR TRANSFORM

V. EXPERIMENTAL RESULTS

A. Uncalibrated Test Images

B. Calibrated Test Images

Figures (14)

Citations

Cites methods from "Image registration using log-polar ..."

Cites methods from "Image registration using log-polar ..."

Cites methods from "Image registration using log-polar ..."

Cites background from "Image registration using log-polar ..."

References

"Image registration using log-polar ..." refers background in this paper

"Image registration using log-polar ..." refers background in this paper

Related Papers (5)

Frequently Asked Questions (13)

Q1. What are the contributions in "Image registration using log-polar mappings for recovery of large-scale similarity and projective transformations" ?

Q2. What have the authors stated for future works in "Image registration using log-polar mappings for recovery of large-scale similarity and projective transformations" ?

Q3. What is the purpose of the log-polar transform?

Q4. What is the purpose of the INRIA method?

Q5. What measures have been used to test the similarity of a polar image?

Q6. Why does the image in Fig. 3(b) defie recovery?

Q7. What is the main contribution of this work?

Q8. What is the new method for finding scale, rotation, and translation parameters?

Q9. How did the authors show the importance of the log-polar module?

Q10. How many pairs of images are shown in Fig. 1?

Q11. How does the LMA solve the system of equations?

Q12. What was the first report of the authors?

Q13. Why does the Fourier–Mellin transform help register two images?