
A Partial Intensity Invariant Feature Descriptor for Multimodal Retinal Image Registration

TLDR
The proposed novel highly distinctive local feature descriptor named partial intensity invariant feature descriptor (PIIFD) is so distinctive that it can be correctly identified even in nonvascular areas and far outperforms existing algorithms in terms of robustness, accuracy, and computational efficiency.


IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 57, NO. 7, JULY 2010
A Partial Intensity Invariant Feature Descriptor for Multimodal Retinal Image Registration
Jian Chen, Jie Tian, Fellow, IEEE, Noah Lee, Jian Zheng, R. Theodore Smith, and Andrew F. Laine, Fellow, IEEE
Abstract—Detection of vascular bifurcations is a challenging task in multimodal retinal image registration. Existing algorithms based on bifurcations usually fail in correctly aligning poor quality retinal image pairs. To solve this problem, we propose a novel highly distinctive local feature descriptor named partial intensity invariant feature descriptor (PIIFD) and describe a robust automatic retinal image registration framework named Harris-PIIFD. PIIFD is invariant to image rotation, partially invariant to image intensity, affine transformation, and viewpoint/perspective change. Our Harris-PIIFD framework consists of four steps. First, corner points are used as control point candidates instead of bifurcations since corner points are sufficient and uniformly distributed across the image domain. Second, PIIFDs are extracted for all corner points, and a bilateral matching technique is applied to identify corresponding PIIFD matches between image pairs. Third, incorrect matches are removed and inaccurate matches are refined. Finally, an adaptive transformation is used to register the image pairs. PIIFD is so distinctive that it can be correctly identified even in nonvascular areas. When tested on 168 pairs of multimodal retinal images, the Harris-PIIFD far outperforms existing algorithms in terms of robustness, accuracy, and computational efficiency.
Index Terms—Harris detector, local feature, multimodal registration, partial intensity invariance, retinal images.
Manuscript received February 20, 2009; revised July 12, 2009 and November 20, 2009; accepted January 15, 2010. Date of publication February 18, 2010; date of current version June 16, 2010. This work was supported in part by the National Eye Institute under Grant R01 EY015520-01, by the NYC Community Trust (RTS), by unrestricted funds from Research to Prevent Blindness, by the Project for the National Basic Research Program of China (973) under Grant 2006CB705700, by the Changjiang Scholars and Innovative Research Team in University (PCSIRT) under Grant IRT0645, by the CAS Hundred Talents Program, by the CAS Scientific Research Equipment Development Program under Grant YZ200766, by the Knowledge Innovation Project of the Chinese Academy of Sciences under Grant KGCX2-YW-129 and Grant KSCX2-YW-R-262, by the National Natural Science Foundation of China under Grant 30672690, Grant 30600151, Grant 60532050, Grant 60621001, Grant 30873462, Grant 60910006, Grant 30970769, and Grant 30970771, by the Beijing Natural Science Fund under Grant 4071003, and by the Science and Technology Key Project of the Beijing Municipal Education Commission under Grant KZ200910005005. Asterisk indicates corresponding author.
J. Chen was with the Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China, and also with the Department of Biomedical Engineering, Columbia University, New York, NY 10027 USA. He is now with IBM China Research Laboratory, Beijing 100027, China (e-mail: jianchen@cn.ibm.com).
J. Tian is with the Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China (e-mail: tian@ieee.org).
N. Lee is with the Department of Biomedical Engineering, Columbia University, New York, NY 10027 USA (e-mail: nl2168@columbia.edu).
J. Zheng is with the Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China (e-mail: zhengjian@fingerpass.net.cn).
R. T. Smith is with the Retinal Image Analysis Laboratory, Edward S. Harkness Eye Institute, and the Department of Ophthalmology, Columbia University, New York, NY 10027 USA (e-mail: rts1@columbia.edu).
A. F. Laine is with the Heffner Biomedical Imaging Laboratory, Department of Biomedical Engineering, Columbia University, New York, NY 10027 USA (e-mail: laine@columbia.edu).
Digital Object Identifier 10.1109/TBME.2010.2042169
Fig. 1. (a) and (b) Poor quality retinal images taken at different stages. Traditional feature-based approaches usually fail to register this image pair since it is hard to detect the vasculatures in (b).
I. INTRODUCTION
THE PURPOSE of retinal image registration is to spatially align two or more retinal images for clinical review of disease progression. These images come from different screening events and are usually taken at different times or with different fields of view. An accurate registration is helpful to diagnose various kinds of retinal diseases such as glaucoma, diabetes, and age-related macular degeneration [1]–[4], [54]. However, automatic accurate registration becomes a problem when registering poor quality multimodal retinal images (severely affected by noise or pathology). For example, it is difficult to register an image pair taken years apart and acquired with different sensors, due to possible differences in the field of view and modality characteristics [5]–[8]. Retinopathy may cause severe changes in the appearance of the whole retina, such as obscure vasculature patterns (see Fig. 1). Registration algorithms that rely on vascular information may fail to correctly align such image pairs.
Thus, in this paper, we propose a novel distinctive partial
intensity invariant feature descriptor (PIIFD) and describe a fully
automatic algorithm to register poor quality multimodal retinal
image pairs. In the following, we will first briefly introduce prior
work regarding existing retinal registration algorithms and then
propose our Harris-PIIFD framework.
A. Prior Work
Existing registration algorithms can be classified as area-based and feature-based approaches [9]–[11]. The area-based approaches [12]–[24] compare and match the intensity differences of an image pair under a similarity metric such as mutual information [19]–[22] and cross correlation [12], [15], and then apply an optimization technique [23], [24] to maximize the similarity metric by searching in the transformation space.
The similarity metric is expected to reach its optimum when two images are properly registered. However, in the case of low overlapping area registration, the area-based approaches usually fail [55]. In other words, the similarity metric is usually misled by nonoverlapping areas. To overcome this problem, a widely used solution is to assign a region of interest (ROI) within one or both images for computing the similarity metric [24]. The area-based approaches are also sensitive to illumination changes and significant initial misalignment, suggesting that area-based approaches may be susceptible to occlusion, background changes caused by pathologies, and pose changes of the camera [35].
Compared with area-based registration, feature-based approaches [25]–[41] are more appropriate for retinal image registration. Feature-based approaches typically involve extracting features and searching for a transformation that optimizes the correspondence between these features. The bifurcations of the retinal vasculature, the optic disc, and the fovea [27], [28] are examples of such widely used feature cues. The main advantage of feature-based approaches is their robustness against illumination changes. However, extraction of such features in poor quality images is difficult. Feature-based approaches for retinal image registration usually distinguish themselves through minor differences and rely on the assumption that the vasculature network can be extracted. For instance, the use of different structures of the retina as landmark points [27], a focus on improving the performance of the landmark extraction algorithm [36], a narrowing down of the search space by manually or automatically assigning “matched” points [30], and a more complicated mapping strategy to estimate the most plausible transformation from a pool of possible landmark matches [26] have been described, and all of them rely on the extraction of the retinal vasculature.
A hybrid approach that effectively combines both area-based and feature-based approaches has also been proposed [55]; however, it still relies on retinal vasculature.
General feature-based approaches that do not rely on the vasculature have also been discussed. The scale invariant feature transform (SIFT), an algorithm for extracting distinctive invariant features, has been proposed [42]–[46]. The SIFT features proposed in this algorithm are invariant to image scale and rotation and provide robust matching across a substantial range of affine distortion, change in 3-D viewpoint/perspective, addition of noise, and changes in illumination. These features are highly distinctive in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. However, SIFT is designed for monomodal image registration, and its scale invariance strategy usually cannot provide sufficient control points for high order transformations. Another local feature named speeded up robust features (SURF) [57] has also been proposed, which, as claimed by its authors, is several times faster and more robust against different image transformations than SIFT. SURF is based on the Haar wavelet, and its good performance is achieved by building on the strengths of SIFT and simplifying SIFT to the essential [57]. Soon after, a SURF-based retinal image registration method that does not depend on the vasculature was proposed [59]; however, it is still only applicable to monomodal image registration.
Fig. 2. Pair of poor quality multimodal retinal images. These two images were
taken from the same eye.
The general dual bootstrap iterative closest point algorithm (GDB-ICP) [35], [60], which uses “corner” points and “face” points as correspondence cues, is more efficient than other existing algorithms. To our knowledge, the GDB-ICP algorithm is the best algorithm reported for poor quality retinal image registration. There are two versions of this approach. The first version uses Lowe’s multiscale keypoint detector and the SIFT descriptor [42]–[46] to provide initial matches. In comparison, the second version uses the central line extraction algorithm [36] to extract the bifurcations of the vasculature to provide initial matches. The GDB-ICP algorithm is then applied to iteratively expand the area around the initial matches by mapping the “corner” or “face” points. The authors state that only one correct initial match is enough for the subsequent iterative registration process. However, in some extreme cases no correct match can be detected by their two initial matching methods. Further, for very poor quality images, even if there are some correct initial matches, the GDB-ICP algorithm may still fail because the distribution of “corner” and “face” points is severely affected by noise.
B. Problem Statement and Proposed Method
As mentioned earlier, the existing algorithms cannot register poor quality multimodal image pairs in which the vasculature is severely affected by noise or artifacts. Retinal image registration can be broken down into two situations: multimodal image registration and poor quality image registration. The existing algorithms can achieve good performance when these two situations are not combined. On one hand, vasculature-based registration methods can correctly align good-quality multimodal retinal image pairs. On the other hand, some robust local features such as SIFT and SURF can achieve satisfactory results for poor quality monomodal registration. However, it is hard to register poor quality multimodal retinal images. An illustration of retinal image registration combining these two situations is shown in Fig. 2, in which the two images are of poor quality and different modalities.
A robust local feature descriptor may make the registration of poor quality multimodal retinal images successful, as long as it solves the following two problems: 1) the gradient orientations at corresponding locations in multimodal images may point in opposite directions and the gradient magnitudes usually change; thus, how can a local feature achieve intensity invariance, or at least partial intensity invariance? and 2) the main orientations of corresponding control points in multimodal images usually point in opposite directions supposing that the two images are properly registered; how can a local feature achieve rotation invariance?
Fig. 3. Flowchart of our registration framework. The key contribution of this study (see Section II-C) is highlighted in bold.
In this paper, we propose a novel highly distinctive local feature descriptor named PIIFD [58] and describe a robust automatic retinal image registration framework named Harris-PIIFD to solve the aforementioned registration problem. PIIFD is invariant to image rotation, partially invariant to image intensity, affine transformation, and viewpoint/perspective change. Note that PIIFD is a hybrid area-feature descriptor since the area-based structural outline is transformed into a feature vector.
The remainder of this paper is organized as follows. Section II is devoted to the proposed Harris-PIIFD framework, including the novel PIIFD feature descriptor. Section III describes the experimental settings and reports the experimental results. Discussion and conclusion are given in Section IV.
II. PROPOSED REGISTRATION FRAMEWORK
Our suggested Harris-PIIFD framework comprises the following seven distinct steps.
1) Detect corner points by a Harris detector [47].
2) Assign a main orientation for each corner point.
3) Extract the PIIFD surrounding each corner point.
4) Match the PIIFDs with bilateral matching.
5) Remove any incorrect matches.
6) Refine the locations of each match.
7) Select the transformation mode.
The flowchart of the Harris-PIIFD framework is shown in Fig. 3. First, corner points are used as control point candidates instead of bifurcations (step 1) since corner points are sufficient and uniformly distributed across the image domain. We assume that there are two subsets of control point candidates that can be identically matched across the two images. Second, PIIFDs are extracted relative to the main orientations of the control point candidates, and therefore achieve invariance to image rotation, and a bilateral matching technique is applied to identify corresponding PIIFD matches between image pairs (steps 2–4). Third, incorrect matches are removed and inaccurate matches are refined (steps 5–6). Finally, an adaptive transformation is applied to register the image pairs based on these matched control point candidates (step 7).
Fig. 4. Spatial distributions of the control point candidates represented by (a) bifurcations of the vasculature detected by an automatic central line extraction method and (b) corner points detected by a Harris detector.
Three preprocessing operations are applied before detecting control point candidates: 1) convert the input image to grayscale; 2) scale the intensities of the input image to the full 8-bit intensity range [0, 255]; and 3) zoom the image out or in to a fixed size (about 1000 × 1000 pixels in this paper). The third operation is not necessary but has two advantages: 1) some image-size-sensitive parameters can be held fixed and 2) the scale difference can be reduced in some cases.
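As a rough illustration of these three operations, the sketch below uses OpenCV; the min-max intensity stretching, the choice of resampling filter, and the exact target size handling are assumptions of this sketch, since the paper does not specify them.

```python
import cv2
import numpy as np

def preprocess(img, target=1000):
    """Sketch of the three preprocessing steps: grayscale, 8-bit stretch, fixed size."""
    # 1) convert to grayscale if the input is a color image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) if img.ndim == 3 else img
    # 2) stretch intensities to the full 8-bit range [0, 255]
    gray = gray.astype(np.float32)
    gray = (gray - gray.min()) / max(float(gray.max() - gray.min()), 1e-6) * 255.0
    gray = gray.astype(np.uint8)
    # 3) resize so that the longer side is about 1000 pixels (fixed working size)
    scale = target / max(gray.shape[:2])
    return cv2.resize(gray, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
```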
A. Detect Corner Points by Harris Detector
The lack of control points is likely to result in an unsuccessful registration for a feature-based algorithm. In retinal image registration, bifurcations are usually regarded as control point candidates. However, it is hard to extract the bifurcations in some cases, especially in poor quality retinal images. Take the image in Fig. 1(b), for example: only four bifurcations are detected by a central line extraction algorithm [see Fig. 4(a)] [36]. On the contrary, a large number of Harris corners are detected and uniformly distributed across the image domain [see Fig. 4(b)]. Therefore, we introduce the Harris detector [47] to generate control point candidates in our registration framework. The basic concept of the Harris detector is to measure the changes in all directions when the image is convolved with a Gaussian window, and the changes can be represented by image gradients. For an image I, assume the traditional image gradients are given as follows:
$$\begin{bmatrix} G_{xt} \\ G_{yt} \end{bmatrix} = \begin{bmatrix} \partial I/\partial x \\ \partial I/\partial y \end{bmatrix}. \qquad (1)$$
Thus, the Harris detector can be mathematically expressed as

$$M = \begin{bmatrix} G_{xt}^2 & G_{xt}G_{yt} \\ G_{yt}G_{xt} & G_{yt}^2 \end{bmatrix} \otimes h \qquad (2)$$

$$R = \det(M) - k\,\mathrm{tr}^2(M) \qquad (3)$$

where h is a Gaussian window, k is a constant (usually k = 0.04–0.06 [47]), and det and tr are the determinant and trace of the matrix, respectively. Given a point p(x, y), it is considered
as a corner point if and only if R(p) > 0. For more details about the Harris detector, please refer to [47].
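For illustration, a minimal NumPy/SciPy sketch of the response in (1)–(3) could look as follows; the local-maximum suppression, the window sizes, and the value k = 0.04 are assumptions layered on top of the text above, and this is not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def harris_response(img, sigma=1.5, k=0.04):
    """Harris measure R = det(M) - k*tr(M)^2 from (2)-(3)."""
    I = img.astype(np.float64)
    Gyt, Gxt = np.gradient(I)                      # traditional gradients, as in (1)
    # entries of M: squared/cross gradients smoothed by the Gaussian window h
    Sxx = gaussian_filter(Gxt * Gxt, sigma)
    Syy = gaussian_filter(Gyt * Gyt, sigma)
    Sxy = gaussian_filter(Gxt * Gyt, sigma)
    return (Sxx * Syy - Sxy ** 2) - k * (Sxx + Syy) ** 2

def harris_corners(img, sigma=1.5, k=0.04, max_points=200):
    """Keep up to max_points local maxima with positive response, strongest first."""
    R = harris_response(img, sigma, k)
    peaks = (R == maximum_filter(R, size=9)) & (R > 0)
    ys, xs = np.nonzero(peaks)
    order = np.argsort(R[ys, xs])[::-1][:max_points]
    return list(zip(xs[order], ys[order]))         # (x, y) candidate locations
```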
Extracting the PIIFDs is the most time-consuming stage of the proposed Harris-PIIFD framework, and its runtime is directly proportional to the number of corner points (control point candidates). It has been confirmed that 200 Harris corner points are sufficient for subsequent processing; thus, in our experiments, about 200 Harris corner points are detected by automatically tuning the σ of the Gaussian window.
The corner points in our framework are not directly used as features for the registration algorithm. Instead, they just provide the locations for calculating PIIFDs. Thus, the proposed method can still work if these corner points are disturbed within the neighborhood or even replaced by a set of randomly distributed points. The only difference may be a change in accuracy.
B. Assign Main Orientation to Each Corner Point
A main orientation that is relative to the local gradient is assigned to each control point candidate before extracting the PIIFD. Thus, the PIIFD can be represented relative to this orientation and therefore achieves invariance to image rotation. In the present study, we introduce a continuous method, average squared gradients [48], [49], to assign the main orientation. This method uses the averaged perpendicular direction of the gradient, which is limited within [0, π), to represent a control point candidate's main orientation. For an image I, the new gradient [G_x  G_y]^T is expressed as follows:
$$\begin{bmatrix} G_x \\ G_y \end{bmatrix} = \operatorname{sgn}(G_{yt}) \begin{bmatrix} G_{xt} \\ G_{yt} \end{bmatrix} \qquad (4)$$

where G_xt and G_yt are the traditional gradients defined in (1).
In this equation, the second element of the gradient vector is always positive for the reason that opposite directions of gradients indicate equivalent main orientations. To compute the main orientation, the image gradients should be averaged or accumulated within an image window. Opposite gradients will cancel each other if they are directly averaged or accumulated, but they are supposed to reinforce each other because they indicate the same main orientation. A solution to this problem is to square the gradient vector in the complex domain before averaging. The squared gradient vector [G_s,x  G_s,y]^T is given by

$$\begin{bmatrix} G_{s,x} \\ G_{s,y} \end{bmatrix} = \begin{bmatrix} G_x^2 - G_y^2 \\ 2\,G_x G_y \end{bmatrix}. \qquad (5)$$
Next, the average squared gradient [Ḡ_s,x  Ḡ_s,y]^T is calculated within a Gaussian-weighted circular window

$$\begin{bmatrix} \bar{G}_{s,x} \\ \bar{G}_{s,y} \end{bmatrix} = \begin{bmatrix} G_{s,x} \otimes h_\sigma \\ G_{s,y} \otimes h_\sigma \end{bmatrix} \qquad (6)$$

where h_σ is the Gaussian-weighted kernel and the operator ⊗ means convolution. The σ of the Gaussian window can neither be too small nor too big, for the reason that the average orientation computed in a small window is sensitive to noise and in a large window cannot represent the local orientation. In this study, the σ of the Gaussian window is set to five pixels empirically.
Fig. 5. Extracting PIIFD relative to the main orientation of a control point candidate. (a) Neighborhood surrounding the control point candidate (centered point) is decided relative to the main orientation. (b) Orientation histogram extracted from the highlighted small square in (a).
The main orientation φ of each neighborhood, with 0 ≤ φ < π, is given by

$$\varphi = \frac{1}{2}\begin{cases} \tan^{-1}\!\bigl(\bar{G}_{s,y}/\bar{G}_{s,x}\bigr) + \pi, & \bar{G}_{s,x} \ge 0 \\ \tan^{-1}\!\bigl(\bar{G}_{s,y}/\bar{G}_{s,x}\bigr) + 2\pi, & \bar{G}_{s,x} < 0 \wedge \bar{G}_{s,y} \ge 0 \\ \tan^{-1}\!\bigl(\bar{G}_{s,y}/\bar{G}_{s,x}\bigr), & \bar{G}_{s,x} < 0 \wedge \bar{G}_{s,y} < 0. \end{cases} \qquad (7)$$

Thus, for each control point candidate p(x, y), its main orientation is assigned to φ(x, y).
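A compact sketch of (4)–(7) is shown below; the three-branch form of (7) collapses to a single arctan2 expression, which the code uses. The dense (whole-image) evaluation and the handling of zero G_yt are implementation assumptions, not the authors' code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def main_orientation_map(img, sigma=5.0):
    """Averaged-squared-gradient orientation of (4)-(7); sigma = 5 pixels as in the text."""
    I = img.astype(np.float64)
    Gyt, Gxt = np.gradient(I)
    # (4): flip each gradient so that its y-component is non-negative
    s = np.where(Gyt >= 0, 1.0, -1.0)
    Gx, Gy = s * Gxt, s * Gyt
    # (5): squared gradient vector in the complex domain
    Gsx = Gx ** 2 - Gy ** 2
    Gsy = 2.0 * Gx * Gy
    # (6): average within a Gaussian-weighted window
    Gsx_bar = gaussian_filter(Gsx, sigma)
    Gsy_bar = gaussian_filter(Gsy, sigma)
    # (7): halve the doubled angle; equivalent to the piecewise form, an orientation modulo pi
    return 0.5 * (np.arctan2(Gsy_bar, Gsx_bar) + np.pi)

# the main orientation of a candidate at (x, y) is then phi[y, x]
# phi = main_orientation_map(img); orientation = phi[y, x]
```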
The SIFT algorithm uses an orientation histogram to calculate the main orientation [42]. However, the main orientations in multimodal images calculated by the orientation histogram may point in unrelated directions. This may result in many incorrect matches. In addition, the orientation histogram is discrete, so its directional resolution is limited by the number of histogram bins. Compared with the orientation histogram, our average squared gradients method is continuous, more accurate, and more computationally efficient. As long as the structural outlines are the same, the main orientations calculated by our method remain the same. Therefore, our method for calculating the main orientation is suitable for multimodal image registration.
C. Extract PIIFD Surrounding Each Corner Point
Given the main orientation of each control point candidate (corner point extracted by the Harris detector), we can extract the local feature in a manner invariant to image rotation [42] and partially invariant to image intensity. As shown in Fig. 5(a), suppose the centered point is a control point candidate and the big square, which consists of 4 × 4 small squares, is the local neighborhood surrounding this control point candidate. Note that the main orientation of this control point candidate is illustrated by the arrow. The size of the neighborhood is a tradeoff between distinctiveness and computational efficiency. In Lowe's SIFT algorithm, the size of the neighborhood is automatically decided by the scale of the control point. By carefully investigating the retinal images, we empirically set the size to a fixed 40 × 40 pixels in our experiments, for the reason that the scale difference is slight.
To extract the PIIFD, the image gradient magnitudes and orientations are sampled in this local neighborhood. In order to
achieve orientation invariance, the gradient orientations are rotated relative to the main orientation. For a given small square in this neighborhood [e.g., the highlighted small square shown in Fig. 5(a)], an orientation histogram that evenly covers 0°–360° with 16 bins (0°, 22.5°, 45°, ..., 337.5°) is formed. The
gradient magnitude of each pixel that falls into this small square is accumulated into the corresponding histogram entry. It is important to avoid boundary effects, in which the descriptor abruptly changes as a sample shifts smoothly from one histogram to another or from one orientation to another. Therefore, bilinear interpolation is used to distribute the value of each gradient sample into adjacent histogram bins. The processes of extracting PIIFD and SIFT are almost the same; therefore, PIIFD and SIFT share some common characteristics. For example, both PIIFD and SIFT are partially invariant to affine transformation [42]–[46].
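The following sketch illustrates the bin-interpolation idea for one subsquare's 16-bin histogram; splitting each sample's magnitude linearly between the two nearest orientation bins is one way to realize the interpolation mentioned above (the spatial part of the bilinear weighting is omitted), and the details are assumptions of this sketch rather than the authors' code.

```python
import numpy as np

def orientation_histogram(mag, ang, n_bins=16):
    """16-bin orientation histogram of one subsquare.
    `mag`, `ang`: gradient magnitudes and orientations (radians) already
    rotated relative to the main orientation.  Each sample's magnitude is
    split between the two nearest bins to soften boundary effects."""
    hist = np.zeros(n_bins)
    bin_width = 2.0 * np.pi / n_bins                    # 22.5 degrees per bin
    pos = (np.asarray(ang, float).ravel() % (2.0 * np.pi)) / bin_width
    lo = np.floor(pos).astype(int) % n_bins
    hi = (lo + 1) % n_bins
    w = pos - np.floor(pos)                             # weight of the upper bin
    m = np.asarray(mag, float).ravel()
    np.add.at(hist, lo, m * (1.0 - w))
    np.add.at(hist, hi, m * w)
    return hist
```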
In an image, an outline is a line marking the multiple contours or boundaries of an object or a figure. The basic idea of achieving partial intensity invariance involves extracting the descriptor from the image outlines. This is based on the assumption that regions of similar anatomical structure in one image correspond to regions in the other image that also consist of similar outlines (although probably with different intensity values from those of the first image). In this study, image outline extraction is simplified to extracting the constrained image gradients. The gradient orientations at corresponding locations in multimodal images may point in opposite directions, and the gradient magnitudes usually change. In order to achieve partial intensity invariance, two operations are applied to the image gradients. First, we normalize the gradient magnitudes piecewise to reduce the influence of changes in gradient magnitude. In the neighborhood surrounding each control point candidate, we normalize the strongest 20% of gradient magnitudes to 1, the second 20% to 0.75, and, continuing in the same way, the last 20% to 0. Second, we convert the orientation histogram with 16 bins to a degraded orientation histogram with only 8 bins (0°, 22.5°, 45°, ..., 157.5°) by summing opposite directions [see Fig. 5(b)]. If the intensities of this local neighborhood change between two image modalities (for instance, some dark vessels become bright), then the gradients in this area will also change. However, the outlines of this area will remain almost unchanged. The degraded orientation histogram constrains the gradient orientation from 0 to π, so the histogram is invariant when the gradient orientation rotates by 180°. Consequently, the descriptor achieves partial invariance to the aforementioned intensity change. The second operation is based on the assumption that the gradient orientations at corresponding locations in multimodal images point in the same or opposite directions. It is difficult to prove this assumption mathematically since "multimodal image" is not a well-defined notion, although for intensity-inverse images (an ideal situation) the assumption holds exactly. Admittedly, the degraded orientation histogram is not as distinctive as the original one, but this loss of distinctiveness is an acceptable cost for achieving partial invariance to image intensity. For the case shown in Fig. 5, there are in total 4 × 4 = 16 orientation histograms (one for each small square). All these histograms can be denoted by
$$H = \begin{bmatrix} H_{11} & H_{12} & H_{13} & H_{14} \\ H_{21} & H_{22} & H_{23} & H_{24} \\ H_{31} & H_{32} & H_{33} & H_{34} \\ H_{41} & H_{42} & H_{43} & H_{44} \end{bmatrix} \qquad (8)$$

where H_ij denotes an orientation histogram with eight bins.
The main orientations of corresponding control points may point in opposite directions in a multimodal image pair. This situation will still occur even if we have already constrained the gradient orientations to the range [0°, 180°], and it breaks the rotation invariance. For example, the main orientations of corresponding control points extracted from an image and its version rotated by 180° always point in opposite directions. In this paper, we propose a linear combination of two subdescriptors to solve this problem. One subdescriptor is the matrix H computed by (8). The other subdescriptor is a rotated version of H: Q = rot(H, 180°). The combined descriptor, PIIFD, can be calculated as follows:
$$\mathrm{des} = \begin{bmatrix} (H_1 + Q_1) \\ (H_2 + Q_2) \\ c\,|H_3 - Q_3| \\ c\,|H_4 - Q_4| \end{bmatrix} \qquad (9)$$

$$H_i = [\, H_{i1} \;\; H_{i2} \;\; H_{i3} \;\; H_{i4} \,] \qquad (10)$$

$$Q_i = [\, Q_{i1} \;\; Q_{i2} \;\; Q_{i3} \;\; Q_{i4} \,] \qquad (11)$$
where c is a parameter to tune the proportion of magnitude in this local descriptor. The absolute value of the descriptor is normalized in the next step. In our algorithm, c is adaptively determined by making the maxima of the two parts the same. The goal of the linear combination is to make the final descriptor invariant to the two opposite directions. This linear combination is reversible, so it does not reduce the distinctiveness of the descriptor. It is obvious that PIIFD is a 4 × 4 × 8 matrix. For the convenience of matching, it is quantized to a vector with 128 elements. Finally, the PIIFD is normalized to unit length.
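To illustrate how (8)–(11) fit together, the sketch below folds the 16-bin histograms to 8 degraded bins, forms the rotated subdescriptor Q, and combines the two parts as in (9). It assumes the piecewise magnitude normalization was already applied when the histograms were accumulated, and it interprets rot(H, 180°) as a reversal of the 4 × 4 spatial order with the folded bins left unchanged; both are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

def combine_piifd(H16):
    """Build the 128-element PIIFD vector from the 4x4 grid of 16-bin
    histograms, following (8)-(11).  H16 has shape (4, 4, 16)."""
    # fold opposite gradient directions together: 16 bins -> 8 degraded bins
    H = H16[..., :8] + H16[..., 8:]
    # rotated subdescriptor Q = rot(H, 180 deg): reverse the spatial order of
    # the subsquares (assumption: the folded bins themselves are unchanged)
    Q = H[::-1, ::-1, :]
    # (9): sums for the first two rows, scaled absolute differences for the rest
    top = np.concatenate([(H[0] + Q[0]).ravel(), (H[1] + Q[1]).ravel()])
    bottom = np.concatenate([np.abs(H[2] - Q[2]).ravel(), np.abs(H[3] - Q[3]).ravel()])
    # c chosen so that the maxima of the two parts coincide, as described above
    c = top.max() / max(bottom.max(), 1e-12)
    des = np.concatenate([top, c * bottom])            # 64 + 64 = 128 elements
    return des / max(np.linalg.norm(des), 1e-12)       # normalize to unit length
```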
D. Match PIIFDs by Bilateral Matching Method
We use the best-bin-first (BBF) algorithm [50] to match the correspondences between two images. This algorithm identifies the approximate closest neighbors of points in high-dimensional spaces. It is approximate in the sense that it returns the closest neighbor with high probability. Suppose that the set of all PIIFDs of image I_1 is F_1 and the set of I_2 is F_2; then, for a given PIIFD f_1i ∈ F_1, a set of distances from f_1i to F_2 is defined as follows:
$$D(f_{1i}, F_2) = \{\, f_{1i} \cdot f_{2j} \mid f_{2j} \in F_2 \,\} \qquad (12)$$
where · is the dot product of vectors. It is obvious that this set comprises all the distances between f_1i and the descriptors in I_2. Let f_2j+ and f_2j++ be the biggest and second-biggest elements of D(f_1i, F_2), which correspond to f_1i's closest and second-closest neighbors, respectively. If the closest neighbor is significantly closer than the second-closest neighbor, f_2j++/f_2j+ < t,
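The excerpt breaks off here, but the nearest-neighbor test just introduced can be illustrated with a brute-force sketch: the similarities are the dot products of (12), the ratio test keeps a candidate match only when the second-best similarity is clearly smaller than the best, and the "bilateral" aspect is read here as requiring agreement in both matching directions. The threshold value t, the brute-force search in place of BBF, and the mutual-consistency reading are assumptions of this sketch.

```python
import numpy as np

def match_piifds(F1, F2, t=0.9):
    """Ratio-test matching between two sets of unit-length PIIFDs
    (rows of F1 and F2), kept only when the match is mutual."""
    S = F1 @ F2.T                                    # dot-product similarities, as in (12)
    def one_way(S):
        best = np.argmax(S, axis=1)
        ranked = np.sort(S, axis=1)
        first, second = ranked[:, -1], ranked[:, -2]
        keep = second < t * first                    # second-best must be clearly worse
        return {i: int(j) for i, (j, ok) in enumerate(zip(best, keep)) if ok}
    fwd, bwd = one_way(S), one_way(S.T)
    # keep only pairs that agree in both directions ("bilateral" matches)
    return [(i, j) for i, j in fwd.items() if bwd.get(j) == i]
```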

Citations
Journal ArticleDOI

Retinal Imaging and Image Analysis

TL;DR: Methods for 2-D fundus imaging and techniques for 3-D optical coherence tomography (OCT) imaging are reviewed and aspects of image acquisition, image analysis, and clinical relevance are treated together considering their mutually interlinked relationships.
Journal ArticleDOI

Image Matching from Handcrafted to Deep Features: A Survey

TL;DR: This survey introduces feature detection, description, and matching techniques from handcrafted methods to trainable ones and provides an analysis of the development of these methods in theory and practice, and briefly introduces several typical image matching-based applications.
Journal ArticleDOI

RIFT: Multi-Modal Image Matching Based on Radiation-Variation Insensitive Feature Transform

TL;DR: Experimental results show that RIFT is superior to SIFT and SAR-SIFT on multi-modal images and is the first feature matching algorithm that can achieve good performance on all the abovementioned types of multi-modal images.
Journal ArticleDOI

Remote Sensing Image Matching Based on Adaptive Binning SIFT Descriptor

TL;DR: Experimental results show that the proposed AB-SIFT matching method is more robust and accurate than state-of-the-art methods, including the SIFT, DAISY, the gradient location and orientation histogram, the local intensity order pattern, and the binary robust invariant scale keypoint.
Journal ArticleDOI

MODS: Fast and robust method for two-view matching

TL;DR: An improved method for tentative correspondence selection, applicable both with and without view synthesis, and a modification of the standard first-to-second nearest distance rule that increases the number of correct matches by 5–20% at no additional computational cost are introduced.
References
Journal ArticleDOI

Distinctive Image Features from Scale-Invariant Keypoints

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Proceedings ArticleDOI

Object recognition from local scale-invariant features

TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

Distinctive Image Features from Scale-Invariant Keypoints

TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images that can then be used to reliably match objects in differing images.
Proceedings ArticleDOI

A Combined Corner and Edge Detector

TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work.
Book ChapterDOI

SURF: speeded up robust features

TL;DR: A novel scale- and rotation-invariant interest point detector and descriptor, coined SURF (Speeded Up Robust Features), which approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster.
Frequently Asked Questions (15)
Q1. What are the contributions in "A partial intensity invariant feature descriptor for multimodal retinal image registration" ?

To solve this problem, the authors propose a novel highly distinctive local feature descriptor named partial intensity invariant feature descriptor ( PIIFD ) and describe a robust automatic retinal image registration framework named Harris-PIIFD. 

The degraded orientation histogram constrains the gradient orientation from 0 to π, and then the histogram achieves invariance when the gradient orientation rotates by 180◦. 

A main orientation that is relative to the local gradient is assigned to each control point candidate before extracting the PIIFD. 

It takes approximately 41.3 min to register all 168 pairs of retinal images using their Harris-PIIFD algorithm (14.75 s per pair, standard deviation of 4.65 s). 

In a neighborhood surrounding each control point candidate, the authors normalize the first 20% strongest gradient magnitudes to 1, second 20% to 0.75, and by parity of reasoning the last 20% to 0. 

PIIFDs are extracted relative to the main orientations of control point candidates therefore achieve invariance to image rotation, and a bilateral matching technique is applied to identify corresponding PIIFDs matches between image pairs (steps 2–4). 

A reliable and fair evaluation method is very important for measuring the performance since there is no public retinal registration dataset. 

In this test, 400 pairs of corresponding control points and 400 pairs of noncorresponding control points are chosen from 20 pairs of different-modal retinal images. 

A robust local feature descriptor may bring to success the registration of poor quality multimodal retinal images, as long as it solves the following two problems: 1) the gradient orientations at corresponding locations in multimodal images may point to opposite directions and the gradient magnitudes usually change. 

Given the main orientation of each control point candidate (corner point extracted by Harris Detector), the authors can extract the local feature in a manner invariant to image rotation [42] and partially invariant to image intensity. 

In their experiments, the average number of control point candidates is 231, the average number of initial matches (including incorrect matches) is 64.6, and the average number of final matches (after removing incorrect matches) is 43.2. 

It has been confirmed that 200 Harris corner points are sufficient for subsequent processing, thus, in their experiments about 200 Harris corner points are detected by automatically tuning the sigma of Gaussian window. 

For a given small square in this neighborhood [e.g., the highlighted small square shown in Fig. 5(a)], an orientation histogram, which evenly covers 0°–360° with 16 bins (0°, 22.5°, 45°, ..., 337.5°), is formed.
This task takes approximately 10 h, and afterward the authors develop a program to estimate the transformation parameters and overlapping percentage. 

8. The results of this experiment indicate that their proposed Harris-PIIFD can provide robust matching when the scale factor is below 1.8.