Proceedings Article

A non-local MRF model for heritage architectural image completion

TL;DR: This work proposes a non-local MRF model for image completion problem, which represents the patches in the target region of the image as random variables in an MRF, and introduces a novel energy function on these variables.
Abstract: MRF models have shown state-of-the-art performance for many computer vision tasks. In this work, we propose a non-local MRF model for image completion problem. The goal of image completion is to fill user specified "target" region with patches of "source" regions in a way that is visually plausible to an observer. We represent the patches in the target region of the image as random variables in an MRF, and introduce a novel energy function on these variables. Each variable takes a label from a label set which is a collection of patches of the source region. The quality of the image completion is determined by the value of the energy function. The non-locality in the MRF is achieved through long range pairwise potentials. These long range pairwise potentials are defined to capture the inherent repeating patterns present in heritage architectural images. We minimize this energy function using Belief Propagation to obtain globally optimal image completion. We have tested our method on a wide variety of images and shown superior performance over previously published results for this task.

Summary (4 min read)

1. INTRODUCTION

  • Image completion is an important and challenging computer vision task.
  • Image completion is an important part of many computer vision applications such as scratch removal, object removal and reconstruction of damaged architectural parts in an image.
  • The authors also give the reasoning behind long range potentials and define a novel non-local MRF energy function such that its minimum corresponds to the globally optimal image completion.
  • The authors then discuss experimental settings and present results of proposed method in Section 5.

Statistical Methods.

  • These methods make use of parametric statistical models for image completion.
  • These models (wavelet coefficients [13], colour histogram [6]) are used for representation of image characteristics.
  • Initially, the output image is generated keeping the missing regions as pure noise.
  • Statistical methods are only useful in case of texture synthesis.
  • Moreover, they produce blurred outputs for natural images.

PDE-Based Methods.

  • Partial Differential Equation (PDE) based methods use diffusion process for image completion.
  • The boundary filling uses diffusion process simulated by solving the partial differentiation equations.
  • Chan et al. [3] use variational model for region filling.
  • PDE-based approaches perform well in cases where the missing region is smooth and non-textured.
  • They fail in case of large inpainting regions.

Exemplar-Based Methods.

  • Exemplar based techniques have been the most successful approaches in presence of large unknown regions.
  • Exemplar based techniques have been widely used for image completion recently.
  • Finally, the greedy approach leads to a bias caused due to selection of a few incorrect patches in the priority based mechanism.
  • These methods do not take advantage of repeating patterns inherently present in many architectural images.
  • Contrary to this, the authors incorporate the repetition present in the image by including long-range potentials and find the global minima of the non-local MRF energy.

3. THE IMAGE COMPLETION PROBLEM

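  • Given a source region $S$ and a target region $T$, the image completion problem is to fill the target region such that it agrees with its surroundings.
  • The sites form a set $\mathcal{S} = \{X_0, X_1, \ldots, X_m\}$ where each $X_i$ is a spatial position of size $w \times h$ in the image.
  • Figure 2 shows the formulation of image completion as a labeling problem.
  • The labeling is found by approximately minimizing the MRF (Gibbs) energy $E(x) = \sum_{i=1}^{m} E_i(x_i) + \sum_{N} E_{ij}(x_i, x_j)$. (1)
  • $E_i(\cdot)$, $E_{ij}(\cdot)$ and $N$ correspond to the data term, the smoothness term and the neighborhood system defined in the MRF, respectively.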

Data term.

  • The data term computation for image completion problem is not straightforward because only boundary sites are visible whereas interior sites are hidden.
  • (Recall that in any image completion problem user provides a mask which needs to be filled).
  • Without loss of generality each patch (xis and Lis) can be represented as a vector of size w×h.
  • The authors define the unary cost of random variable xi taking label Li as follows.
  • $E_i(x_i = L_i) = \sum_{m=0}^{l} k_m (x^i_m - L^i_m)^2$. (2) In other words, the data term measures the agreement between random variable $x_i$ and label $L_i$ in terms of the sum of squared distance (ssd) of known pixels.

Smoothness term.

  • The data term alone cannot give coherent completion.
  • To enforce coherency in the completed image, the authors define a smoothness term such that overlapping region of neighboring labels have least sum of squared distance.
  • The authors define the smoothness term as follows.
  • The process of data and smoothness term computation is pictorially depicted in Figure 3.

Long range potentials.

  • In addition to the data and smoothness terms, the authors wish to capture the inherent repetitive patterns present in heritage architectural images.
  • To achieve this, the authors add an extra term to the MRF energy, which they call long range pairwise potentials.
  • The long range pairwise potentials are defined between a patch and its repeating offset at distance τ.
  • (The repeating offset computation is described in the next subsection.)
  • Once the energy is formulated (Equation 5), the problem of image completion becomes equivalent to finding the configuration x∗ corresponding to the global minimum of the energy function.

3.1 Repeating Offset Computation

  • Many archaeological monument images contain repeating patterns.
  • These repeating patterns vary in complexity which makes image completion a challenging task.
  • The repetitions can be of any size and along any direction.
  • The authors use the fact that patches which are part of the repetition will repeat with some common offset.

1. Finding Nearest Similar Patches.

  • For every patch in the source region of the image the authors find the nearest most similar patch.
  • The similarity is defined using the sum of squared differences (ssd) between the patches.
  • This is just to ignore the nearby patches which are likely to be similar but do not contribute towards the repetition offset.
  • In their testing, θ was set to be 1/15th of the maximum of image height and image width.
  • Therefore, to overcome the high computation cost, the authors use Approximate Nearest Neighbour(ANN).

2. Histogram and Offset Generation.

  • Once the authors obtain the offsets corresponding to each pixel in the source region, they need to combine the results in order to obtain the correct repetition offsets.
  • H(τ ) gives the count of the number of patches having their individual offset as τ .
  • Now, to get the prominent repetition offsets, the authors analyze the histogram counts of the offsets and select offsets with highest count.
  • Since the image may contain many repetitive patterns which are prominent, thus the authors generate the top C offsets to capture varied repetitions (C = 10 was used in their experiments).

3.2 Graph Construction and inference

  • The authors solve the energy minimization problem on a corresponding graph, where each random variable is represented as a node in the graph.
  • To capture repeating patterns in the image, the authors also join non-local nodes at offset τ .
  • The authors further group nodes in this graph into two categories: visible and hidden.
  • The cost of a node taking some label Li is determined by the unary cost defined in Equation 2. Similar to [10], if a node is highly likely to take some label, the authors declare that node “committed” and give it higher priority in the subsequent inference procedure.

Inference.

  • In their experiments, the patch size is set dynamically as per the image resolution and aspect ratio with the minimum dimension of 4×.
  • For all the examples, the belief thresholds for pruning and confidence is set to −2ssd0 and −ssd0 respectively, where −ssd0 represents a predefined mediocre ssd between the patches.
  • Figure 7 shows the results of object removal using their method.
  • Apart from object removal and ruined wall reconstruction the authors also use their method for an interesting application known as background replacement.
  • The authors also study the importance of long range potentials.

4. SUB-MODULARITY AND METRICITY

  • The authors prove that the non-local MRF energy function defined in Equation 5 is sub-modular and semimetric.
  • Further, since the sum of sub-modular functions is a sub-modular function [16], the energy function defined in Equation 5 is a sub-modular energy function for every pair of labels.
  • The energy function defined in Equation 5 is basically composed of sum of squared distance (ssd) between two vectors, thus it would be sufficient to prove ssd as a semi-metric.
  • Then, since Euclidean distance satisfies the triangle inequality, the authors can write.
  • The proof of sub-modularity and semi-metricity of the energy functions also guarantees that popular move making algorithm α-β swap can be efficiently used to find the global minima of this energy with a constant approximation [2].

5. EXPERIMENTS AND RESULTS

  • The authors present a detailed evaluation of their method on a large collection of images captured from Indian heritage sites.
  • To show the generality of the method, the authors also include few synthetic images and natural images in their test datasets.
  • Given an image and user provided mask, their problem is to complete the masked region in a way that is visually plausible to observer.
  • The authors evaluate various components of their approach to justify their choices.
  • The dataset for their experiments comprises a large variety of images of Indian heritage sites including Hampi, Konark and Golkonda Fort, among others.

Approximate Nearest Neighbour.

  • In the process of repetition offset computation, the authors use an Approximate Nearest Neighbour (ANN) technique in order to find the most similar patches.
  • For a resolution of 100 × 100, a brute force method takes around 2 minutes to process the entire image and generate the offsets.
  • With ANN, the time is reduced to 0.1 seconds.
  • The threshold radius (θ) is set to 1/15th of the maximum of the image width and height.
  • C = 10 most frequent offsets are chosen for their experiments.

6. CONCLUSIONS

  • In this work the authors address the problem of image completion.
  • The image completion problem is formulated in a principled framework.
  • The authors model the repeating patterns inherently present in images using long range potentials and solve the problem in non-local MRF framework.
  • The authors prove that the proposed MRF energy is sub-modular and semi-metric.
  • Experimental results on a wide collection of images show that the authors' method clearly outperforms popular techniques such as exemplar based inpainting [4].


A Non-local MRF model for Heritage Architectural
Image Completion
Deepan Gupta
Vaidehi Chhajer
Anand Mishra
C. V. Jawahar
Center for Visual Information Technology, IIIT Hyderabad, India
http://cvit.iiit.ac.in/
ABSTRACT
MRF models have shown state-of-the-art performance for many computer vision tasks. In this work, we propose a non-local MRF model for image completion problem. The goal of image completion is to fill user specified “target” region with patches of “source” regions in a way that is visually plausible to an observer. We represent the patches in the target region of the image as random variables in an MRF, and introduce a novel energy function on these variables. Each variable takes a label from a label set which is a collection of patches of the source region. The quality of the image completion is determined by the value of the energy function. The non-locality in the MRF is achieved through long range pairwise potentials. These long range pairwise potentials are defined to capture the inherent repeating patterns present in heritage architectural images. We minimize this energy function using Belief Propagation to obtain globally optimal image completion.
We have tested our method on a wide variety of images and shown superior performance over previously published results for this task.
Keywords
Inpainting, MRF, Belief Propagation
Equal contribution
{deepan.gupta,vaidehi.chhajer}@students.iiit.ac.in
anand.mishra@research.iiit.ac.in
jawahar@iiit.ac.in
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ICVGIP ’12, December 16-19, 2012, Mumbai, India
Copyright 2012 ACM 978-1-4503-1660-6/12/12 ...$15.00.
1. INTRODUCTION
Image completion is an important and challenging computer vision task. The goal of an image completion algorithm is to reconstruct the missing regions within an image in a way that is visually plausible to an observer. In most cases, the missing region (called the target region) is filled in by using the information from the rest of the image (called the source region). Image completion is an important part of many computer vision applications such as scratch removal, object removal and reconstruction of damaged architectural parts in an image. Moreover, image completion has applications in the field of photo editing and restoration.
Figure 1: Many successful applications of our method. (a) Object removal: the signboard on the window is successfully removed. (b) Reconstruction: the originally broken Colosseum has been successfully reconstructed using our approach.
Image completion is a highly researched area in computer vision [4, 5, 7, 8, 10, 11]. However, due to the complexity of the images, results leave a lot to be desired. In this work, we propose a non-local MRF technique to complete images. (Note that non-local MRF [15] has been successfully applied to image restoration in the past.) The strength of our formulation is its capability to use non-local repeating patterns in an MRF energy minimization framework. We use Belief Propagation [12] to find the minimum of this energy, i.e., the optimal image completion. A few successful applications of our method are shown in Figure 1. It can be seen that our method performs well for tasks like object removal and reconstruction.
India has some of the greatest architectural monuments in the world. With advances in computer vision techniques, it is now possible to capture the glory of heritage architectural images for both showcasing and preservation for future generations. Our work can be considered a small step towards this. We primarily focus on completion of images taken from Indian heritage architecture, although we also test our method on many other images to show the generality of the method.
Contributions. The contribution of this work is twofold. (1) We propose a non-local optimization framework to capture repeating patterns inherently present in the image (especially the images of our interest, i.e. heritage architectural images). We model these repetitions via long range potentials in an MRF. (2) We prove that the proposed energy function is sub-modular as well as semi-metric. This proof guarantees that the energy function can be efficiently minimized via move making algorithms like α-β swap [2]. However, the study of energy minimization techniques is beyond the scope of this work. Hence, we restrict ourselves to Belief Propagation for the implementation of our method.
Outline of the paper. The remainder of the paper is organized as follows. We discuss related work in Section 2. In Section 3, the image completion problem is formulated as a labeling problem. In this section, we also give the reasoning behind long range potentials and define a novel non-local MRF energy function such that its minimum corresponds to the globally optimal image completion. In Section 4, we prove that the proposed non-local MRF energy function is sub-modular and semi-metric. We then discuss experimental settings and present results of the proposed method in Section 5. Finally, Section 6 concludes our work.
2. RELATED WORK
There has been significant research in the field of image completion, and various approaches have been put forward over the last few decades. These approaches can be grouped into four major categories: (1) Statistical Methods, (2) Partial Differential Equation (PDE)-Based Methods, (3) Exemplar-Based Methods, and (4) Global Optimization based Methods.
Statistical Methods.
These methods make use of parametric statistical models for image completion. These models (wavelet coefficients [13], colour histogram [6]) are used for representation of image characteristics. The idea is to estimate the missing region and fill it using an iterative process. Initially, the output image is generated keeping the missing regions as pure noise. These regions undergo iterative noise reduction to produce the final output. Statistical methods are only useful in case of texture synthesis. Moreover, they produce blurred outputs for natural images.
PDE-Based Methods.
Partial Differential Equation (PDE) based methods use a diffusion process for image completion. The idea is to start the region-filling from the boundary of the missing region and then propagate towards the interior. The boundary filling uses a diffusion process simulated by solving partial differential equations. In [1], region-filling is done by propagating image Laplacians in the direction of the isophotes. Chan et al. [3] use a variational model for region filling.
PDE-based approaches perform well in cases where the missing region is smooth and non-textured. However, they fail in case of large inpainting regions.
Exemplar-Based Methods.
Exemplar based techniques have been the most successful approaches in the presence of large unknown regions. The visible patches of the image are used as a training set to infer the unknown parts, which are then filled by simply copying the content of these known patches. Exemplar based techniques have been widely used for image completion recently. Criminisi et al. [4] propose a priority-based mechanism which combines texture synthesis and isophote driven inpainting for image completion. This approach, though isophote driven, is not capable of maintaining the structural consistency of the image. Hung et al. [8] propose Bezier curves to determine missing edge information, hence preserving structure consistency. The damaged regions are then inpainted using exemplar based methods. However, there are three major pitfalls of these methods. Firstly, the confidence map is computed based on heuristics and may not be applicable to a general case. Secondly, once a patch has been assigned to an unknown region, it cannot be changed. Finally, the greedy approach leads to a bias caused by the selection of a few incorrect patches in the priority based mechanism. These incorrect completions have a spiraling effect which destabilizes the inpainting process.
Global Optimization based methods.
There has been huge interest in the discrete optimization community in the image completion problem in recent years [5, 14]. However, these methods do not take advantage of repeating patterns inherently present in many architectural images. In contrast, we model the repetitions in the energy function itself and thus define a better energy function for the problem. Closest to our work is [10]. Here the authors try to tackle the drawbacks of the exemplar-based approach by posing image filling as a discrete global optimization problem. It uses the exemplar-based framework and a Markov Random Field (MRF) for image completion. The idea is to minimize the energy of the MRF using the Priority-Belief Propagation (Priority-BP) optimization scheme. The approach works well for the majority of cases. However, in images where repetitions are prominent, it produces relatively poor output. The method ignores the fact that repetition, if present, may carry critical information about the missing region. Contrary to this, we incorporate the repetition present in the image by including long-range potentials and find the global minimum of the non-local MRF energy.
3. THE IMAGE COMPLETION PROBLEM
Given a source region $S$ and a target region $T$, the image completion problem is to fill the target region such that it agrees with its surroundings. We define the image completion problem in a labeling problem framework where overlapping spatial positions in the image can be considered as a set of sites and patches of size $w \times h$ sampled from the source region can be considered as labels. In other words, the sites form a set $\mathcal{S} = \{X_0, X_1, \ldots, X_m\}$ where each $X_i$ is a spatial position of size $w \times h$ in the image. Similarly, the label set $L = \{L_0, L_1, \ldots, L_n\}$ is a set where each $L_i$ is a patch of size $w \times h$ sampled from the source region $S$. Figure 2 shows the formulation of image completion as a labeling problem.

Each site $X_i$ can take a random value $x_i \in \{L_0, \ldots, L_n\}$. The labeling problem here is to find the optimal function $f : \mathcal{S} \rightarrow L$ from sites to labels. The optimality criterion is defined based on the quality of the image completion. In general, this is an NP-hard problem. However, it can be solved approximately by finding the minimum of the Gibbs energy (also known as MRF energy) of the following form:
$$E(x) = \sum_{i=1}^{m} E_i(x_i) + \sum_{N} E_{ij}(x_i, x_j). \tag{1}$$
$E_i(\cdot)$, $E_{ij}(\cdot)$ and $N$ correspond to the data term, the smoothness term and the neighborhood system defined in the MRF, respectively. The data term measures the agreement with the available observations whereas the smoothness term is used to enforce spatial coherence. The minimum of this energy function corresponds to the optimal image completion.
Data term.
The data term computation for the image completion problem is not straightforward because only boundary sites are visible whereas interior sites are hidden. (Recall that in any image completion problem the user provides a mask which needs to be filled.) Without loss of generality, each patch (the $x_i$s and $L_i$s) can be represented as a vector of size $w \times h$, i.e. patches can be represented as vectors in a vector space as follows.
$$x_i = [x^i_0, \ldots, x^i_l]^T, \qquad L_i = [L^i_0, \ldots, L^i_l]^T,$$
where $l = w \times h$ and each $x^i_i$ is either hidden or in $[0, 255]^3$, whereas $L^i_i \in [0, 255]^3, \ \forall i$. To distinguish visible and hidden nodes, we introduce a binary vector $K = \{k_1, k_2, \ldots, k_l\}$ such that $k_m$ takes value 0 if $x^i_m$ is hidden and 1 otherwise. We define the unary cost of random variable $x_i$ taking label $L_i$ as follows.
$$E_i(x_i = L_i) = \sum_{m=0}^{l} k_m (x^i_m - L^i_m)^2. \tag{2}$$
In other words, the data term measures the agreement between random variable $x_i$ and label $L_i$ in terms of the sum of squared distance (ssd) of known pixels. Thus, the cost of $x_i$ taking label $L_i$ is low if the sum of squared distance (ssd) between $x_i$ and $L_i$ is low.
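The masked sum of squared distances in Equation 2 can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it assumes grayscale patches flattened into plain Python lists (the paper works with colour values in $[0, 255]^3$), and the names `data_term`, `site`, `label` and `known` are hypothetical.

```python
def data_term(site, label, known):
    """Masked SSD between a site patch and a candidate label (cf. Equation 2).

    site  : flattened pixel values of the patch at a site (hidden entries
            may hold any placeholder value, they are ignored)
    label : flattened pixel values of a candidate label patch
    known : binary mask, 1 where the site pixel is visible, 0 where hidden
    """
    if not (len(site) == len(label) == len(known)):
        raise ValueError("site, label and mask must have the same length")
    return sum(k * (s - l) ** 2 for s, l, k in zip(site, label, known))


# A 2x2 patch whose last two pixels lie inside the hole (placeholder 0).
site = [10, 20, 0, 0]
known = [1, 1, 0, 0]
good_label = [10, 22, 200, 210]   # agrees with the visible pixels
bad_label = [90, 80, 0, 0]        # disagrees with the visible pixels

cost_good = data_term(site, good_label, known)
cost_bad = data_term(site, bad_label, known)
```

Only the two visible pixels contribute, so the label that matches them receives the lower cost regardless of what it proposes for the hidden pixels.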
Smoothness term.
The data term alone cannot give coherent completion. To enforce coherency in the completed image, we define a smoothness term such that the overlapping regions of neighboring labels have the least sum of squared distance. We define the smoothness term as follows.
$$E_{ij}(x_i = L_i, x_j = L_j) = \sum_{m=0}^{\mathrm{size}(\psi)} \delta(X^i_m \in \psi)\, \delta(X^j_m \in \psi)\, (L^i_m - L^j_m)^2. \tag{3}$$
Here $\psi$ is the overlapping region between sites (i.e. patches) $X_i$ and $X_j$, and $\delta(\cdot)$ is an indicator function. The process of data and smoothness term computation is pictorially depicted in Figure 3.
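The overlap cost of Equation 3 can be illustrated for the simplest case of two horizontally adjacent sites. The sketch below is an assumption-laden toy, not the paper's code: patches are grayscale lists of rows, the overlap is a whole number of columns, and `smoothness_term` is a hypothetical helper name.

```python
def smoothness_term(left_label, right_label, overlap):
    """SSD over the overlapping columns of two horizontally adjacent labels.

    left_label, right_label : patches as lists of rows (same height and width)
    overlap                 : number of columns shared by the two sites
    """
    width = len(left_label[0])
    cost = 0
    for row_l, row_r in zip(left_label, right_label):
        # The last `overlap` columns of the left patch cover the same
        # pixels as the first `overlap` columns of the right patch.
        for a, b in zip(row_l[width - overlap:], row_r[:overlap]):
            cost += (a - b) ** 2
    return cost


left = [[1, 2, 3, 4],
        [5, 6, 7, 8]]
seamless = [[3, 4, 9, 9],
            [7, 8, 9, 9]]
clashing = [[0, 0, 9, 9],
            [0, 0, 9, 9]]

cost_seamless = smoothness_term(left, seamless, overlap=2)
cost_clashing = smoothness_term(left, clashing, overlap=2)
```

A pair of labels that agree on the shared pixels costs nothing, while a pair that disagrees is penalised by the squared differences in the overlap.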
Figure 2: Image completion as a labeling problem. Overlapping patch positions in the image and patches sampled from the source region represent sites and labels respectively. The labeling problem here is to find the optimal labeling from sites to labels.
Figure 3: Data term and smoothness term computations. The data term is the agreement between the labels and the node in terms of ssd. Only the visible area contributes to the ssd. The smoothness term is computed based on ssd in overlapping regions of labels. (Note that labels here are basically a collection of patches.) (Best viewed in colour).
Long range potentials.
In addition to the data and smoothness terms, we wish to capture the inherent repetitive patterns present in heritage architectural images. To achieve this, we add an extra term to the MRF energy which we call long range pairwise potentials. The long range pairwise potentials are defined between a patch and its repeating offset at distance $\tau$. (We describe the repeating offset computation in the next subsection.) This long range potential forces a node to take a label similar to the patch at offset $\tau$. Mathematically, the long range potential $E^{lr}(\cdot, \cdot)$ is defined as follows.
$$E^{lr}(x_i = L_i, x_k = L_j) = \sum_{m=1}^{l} k_m (x^i_m - L^j_m)^2. \tag{4}$$
Note that here $x_i$ and $x_k$ are non-local, i.e. they are at distance $\tau$. (Here $\tau$ is a repeating offset; in other words, the image has a repeating pattern at offset $\tau$.) This definition of the long range potential ensures less penalty if $x_i$ and $x_k$ take similar labels.
Figure 4: Many architectural images contain repeating patterns. One such example is shown here (Best viewed in colour).
Thus we modify Equation 1 to the non-local MRF energy as follows.
$$E(x) = \sum E_i(x_i) + \sum_{N} E_{ij}(x_i, x_j) + \sum_{\mathrm{dist}(x_i, x_k) = \tau} E^{lr}(x_i, x_k). \tag{5}$$
Once the energy is formulated, the problem of image completion becomes equivalent to finding the configuration $x^*$ corresponding to the global minimum of the energy function. The graph construction corresponding to this energy function and the inference (energy minimization) are discussed in Section 3.2.
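To make the three-part structure of Equation 5 concrete, the following sketch evaluates such an energy for a given labeling on a toy problem. Everything here is illustrative and assumed: labels are small integers, the squared label difference stands in for the patch-based costs of Equations 3 and 4, and `total_energy` and its arguments are hypothetical names.

```python
def total_energy(labeling, unary, local_edges, long_range_edges,
                 pairwise, long_range):
    """Evaluate a non-local MRF energy of the form of Equation 5.

    labeling         : dict node -> chosen label index
    unary            : dict node -> list of per-label data costs
    local_edges      : iterable of (i, j) pairs from the neighborhood system
    long_range_edges : iterable of (i, k) pairs separated by a repeating offset
    pairwise, long_range : functions (label_a, label_b) -> cost
    """
    energy = sum(unary[i][labeling[i]] for i in labeling)
    energy += sum(pairwise(labeling[i], labeling[j]) for i, j in local_edges)
    energy += sum(long_range(labeling[i], labeling[k])
                  for i, k in long_range_edges)
    return energy


def sq_diff(a, b):
    """Toy stand-in for the patch-based pairwise costs."""
    return (a - b) ** 2


# Three nodes in a row, two labels; nodes 0 and 2 are one "period" apart.
unary = {0: [0, 5], 1: [4, 1], 2: [0, 5]}
local_edges = [(0, 1), (1, 2)]
long_range_edges = [(0, 2)]

e_uniform = total_energy({0: 0, 1: 0, 2: 0}, unary, local_edges,
                         long_range_edges, sq_diff, sq_diff)
e_mixed = total_energy({0: 0, 1: 1, 2: 0}, unary, local_edges,
                       long_range_edges, sq_diff, sq_diff)
```

Comparing the energies of candidate labelings in this way is what an inference algorithm does implicitly when it searches for the minimizing configuration.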
3.1 Repeating Offset Computation
Many archaeological monument images contain repeating patterns. One such example is shown in Figure 4. These repeating patterns vary in complexity, which makes image completion a challenging task. The repetitive pattern may carry significant information about the region which is to be completed (inpainted). Hence, the idea is to make use of the various repetitions present in the input image and boost the region-filling.
The repetitions can be of any size and along any direction. In order to capture both, we make use of $(p, q)$ offset pairs (the repeating offset $\tau$), which correspond to the x-direction and y-direction repetition offsets respectively.
In this step, we compute offsets that can effectively represent the inherent repetition in the image. We use the fact that patches which are part of the repetition will repeat with some common offset. The distance between a patch and its nearest similar patch will account for the repetition offset. Offset generation consists of the following steps.
1. Finding Nearest Similar Patches.
For every patch in the source region of the image we find the nearest most similar patch. For every patch $P$ belonging to the source region,
$$\tau(x) = \arg\min_{\tau} \|P(x) - P(x + \tau)\|^2; \quad |\tau| > \theta.$$
Here $\tau$ is a 2D-coordinate offset $(p, q)$, $P(x)$ is the patch centered at $x$, and $P(x + \tau(x))$ is the nearest similar patch. $\tau(x)$ represents the offset obtained for the pixel $x$. The similarity is defined using the sum of squared differences (ssd) between the patches. A lower ssd corresponds to higher similarity. The parameter $\theta$ represents the threshold radius for the nearest patch. The patch must lie outside this range. This is just to ignore the nearby patches which are likely to be similar but do not contribute towards the repetition offset. This threshold varies with the images based on their sizes. In our testing, $\theta$ was set to be 1/15th of the maximum of image height and image width.
Figure 5: The proposed graphical model. There are two types of nodes in the graph: visible (in the boundary and source region) and hidden (in the interior region of the target). Hidden nodes are shown by filled circles. Local pairwise potentials are shown via red edges and non-local long range potentials are shown via green edges. We use loopy belief propagation for inference in this graphical model (Best viewed in colour).
The brute force search for the nearest patch can be computationally expensive, as for each patch we need to traverse the entire source region. Therefore, to overcome the high computation cost, we use Approximate Nearest Neighbour (ANN). Using approximate nearest neighbour,
$$\tau(x) = ANN(P(x)); \quad |\tau| > \theta.$$
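The thresholded search above can be sketched with the brute force variant, which is easy to verify on a tiny image. This is an illustrative toy under several assumptions: the image is a grayscale list of rows, patches are addressed by their top-left corner rather than their centre, the whole image is treated as source region, and no ANN library is used. The function names are hypothetical.

```python
def ssd(image, y0, x0, y1, x1, size):
    """SSD between two size x size patches given by their top-left corners."""
    return sum((image[y0 + dy][x0 + dx] - image[y1 + dy][x1 + dx]) ** 2
               for dy in range(size) for dx in range(size))


def nearest_offset(image, y, x, size, theta):
    """Brute-force search for the offset (p, q) of the most similar patch.

    Offsets whose length does not exceed theta are skipped, so trivially
    similar neighbouring patches cannot win. Ties keep the first offset found.
    """
    height, width = len(image), len(image[0])
    best_cost, best_offset = None, None
    for yy in range(height - size + 1):
        for xx in range(width - size + 1):
            p, q = xx - x, yy - y
            if p * p + q * q <= theta * theta:
                continue
            cost = ssd(image, y, x, yy, xx, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_offset = cost, (p, q)
    return best_offset


# A 4x8 image whose columns repeat with period 4; rows differ by a
# brightness ramp, so only a horizontal shift gives an exact match.
pattern = [0, 5, 9, 2]
image = [[pattern[c % 4] + 10 * r for c in range(8)] for r in range(4)]
offset = nearest_offset(image, y=0, x=0, size=2, theta=1.5)
```

The quadratic cost of this loop over all patch pairs is what motivates replacing it with an approximate nearest neighbour search on larger images.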
2. Histogram and Offset Generation.
Once we obtain the offsets corresponding to each pixel in the source region, we need to combine the results in order to obtain the correct repetition offsets. To achieve this we represent $\tau$ as a 2D-plane and generate a histogram count of all the $\tau$ offsets, i.e.
$$H(\tau) = \sum_{\tau(x)} \delta(\tau(x) = \tau).$$
$H(\tau)$ gives the count of the number of patches having their individual offset as $\tau$. Now, to get the prominent repetition offsets, we analyze the histogram counts of the offsets and select the offsets with the highest counts. Since the image may contain many prominent repetitive patterns, we generate the top $C$ offsets to capture varied repetitions ($C = 10$ was used in our experiments).
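The histogram step amounts to counting identical offsets and keeping the most frequent ones. A minimal sketch using the standard library follows; the per-patch offsets are made-up sample data and `dominant_offsets` is a hypothetical helper name.

```python
from collections import Counter


def dominant_offsets(offsets, top_c):
    """Return the `top_c` most frequent offsets, i.e. the peaks of H(tau)."""
    histogram = Counter(offsets)            # H(tau): offset -> patch count
    return [tau for tau, _ in histogram.most_common(top_c)]


# Hypothetical per-patch offsets: most patches repeat 12 pixels to the right,
# a smaller group repeats 8 pixels down, and one match is spurious.
offsets = [(12, 0)] * 5 + [(0, 8)] * 3 + [(7, 3)]
top_two = dominant_offsets(offsets, top_c=2)
```

Keeping several peaks rather than a single one lets the model represent images that repeat along more than one direction.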
3.2 Graph Construction and inference
Figure 6: Overview of our method. First, the user selects a mask. Based on the user-selected mask, a graph is constructed where each node represents an overlapping spatial position in the image. These nodes are connected via a 4-neighborhood system $N$. Moreover, to capture inherent repeatability, two nodes at a distance of a repeating offset are also connected (these edges are shown in yellow colour). To find repeating patterns in the image we use approximate nearest neighbour search (ANN). After graph construction, the graph is labeled using a popular inference technique: BP. The final output of our method is shown in the rightmost image (Best viewed in colour).
We solve the energy minimization problem on a corresponding graph, where each random variable is represented as a node in the graph. Nodes in a 4-neighborhood system are connected via edges. To capture repeating patterns in the image, we also join non-local nodes at offset $\tau$. (This offset is computed based on the procedure described in Section 3.1.) We further group nodes in this graph into two categories: visible and hidden. The nodes belonging to the source region and the boundary of the target region are visible, whereas interior nodes of the target region are hidden. Each node takes a label from the label set $L = \{L_0, L_1, \ldots, L_n\}$ where each $L_i$ is a patch of size $w \times h$. The cost of a node taking some label $L_i$ is determined by the unary cost defined in Equation 2. Further, the joint costs of two neighboring and two non-local nodes taking labels $L_i$ and $L_j$, defined in Equations 3 and 4 respectively, give the weights of the edges. Similar to [10], if a node is highly likely to take some label, we declare that node “committed” and give it higher priority in the subsequent inference procedure. The proposed graphical model is shown and explained in Figure 5.
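The edge structure described in this subsection can be sketched as follows. The code is a simplified illustration under stated assumptions: nodes sit on a regular grid indexed by (row, column), repeating offsets are already expressed in node units, and `build_edges` is a hypothetical name rather than part of the authors' system.

```python
def build_edges(rows, cols, offsets):
    """Edges of a grid graph with 4-neighborhood and long range links.

    Nodes are (row, col) grid positions. `offsets` holds repeating offsets
    (p, q) expressed in node units: p along columns, q along rows.
    """
    local_edges, long_range_edges = [], []
    for r in range(rows):
        for c in range(cols):
            # Right and down neighbours give each 4-neighborhood edge once.
            if c + 1 < cols:
                local_edges.append(((r, c), (r, c + 1)))
            if r + 1 < rows:
                local_edges.append(((r, c), (r + 1, c)))
            for p, q in offsets:
                rr, cc = r + q, c + p
                if 0 <= rr < rows and 0 <= cc < cols:
                    long_range_edges.append(((r, c), (rr, cc)))
    return local_edges, long_range_edges


local_edges, long_range_edges = build_edges(rows=3, cols=4, offsets=[(3, 0)])
```

Each long range edge adds a loop to the graph, which is one reason the inference below is approximate rather than exact.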
Inference.
For inference on the proposed graphical model, we use a popular message-passing algorithm known as loopy belief propagation (BP). Belief propagation was first proposed in [12]. It iteratively tries to find the maximum a posteriori (MAP) estimate by propagating messages (beliefs) from each node to its neighbors. (Recall that in the MAP-MRF framework, the MAP estimate corresponds to the global minimum of the MRF energy.) Although loopy BP is not theoretically guaranteed to converge on grids, it has been shown experimentally to yield strong local minima for a wide range of computer vision problems [17].
The proposed method is summarized in Figure 6.
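To make the message-passing step concrete, here is a minimal min-sum (energy-domain) loopy BP sketch in Python. It is a generic textbook variant under stated assumptions, not the authors' code: the function name `min_sum_bp`, the fixed iteration count, and the omission of the priority scheduling and label pruning mentioned in the paper are all simplifications. The `pairwise(u, v, a, b)` callback is hypothetical and must satisfy `pairwise(u, v, a, b) == pairwise(v, u, b, a)`; the edge list is assumed to contain no duplicate node pairs.

```python
def min_sum_bp(unary, edges, pairwise, iters=10):
    """Approximate MAP labeling by min-sum loopy BP.

    unary:    dict node -> list of label costs
    edges:    list of undirected (u, v) pairs, each pair listed once
    pairwise: function (u, v, label_u, label_v) -> cost
    """
    n_labels = len(next(iter(unary.values())))
    nbrs = {u: [] for u in unary}
    msgs = {}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
        msgs[(u, v)] = [0.0] * n_labels  # message from u to v
        msgs[(v, u)] = [0.0] * n_labels  # message from v to u

    for _ in range(iters):
        new_msgs = {}
        for (u, v) in msgs:
            # Cost at u from its unary term and all incoming messages except v's.
            h = [unary[u][a] + sum(msgs[(w, u)][a] for w in nbrs[u] if w != v)
                 for a in range(n_labels)]
            m = [min(h[a] + pairwise(u, v, a, b) for a in range(n_labels))
                 for b in range(n_labels)]
            base = min(m)  # normalise so messages stay bounded
            new_msgs[(u, v)] = [x - base for x in m]
        msgs = new_msgs

    labeling = {}
    for u in unary:
        belief = [unary[u][a] + sum(msgs[(w, u)][a] for w in nbrs[u])
                  for a in range(n_labels)]
        labeling[u] = min(range(n_labels), key=lambda a: belief[a])
    return labeling
```

On a tree-structured graph this procedure is exact; on a grid with long-range edges it is only a heuristic, consistent with the convergence caveat above.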
4. SUB-MODULARITY AND METRICITY
In this section, we prove that the non-local MRF energy
function defined in Equation 5 is sub-modular and semi-
metric.
Statement 1. The energy function defined in Equation 5
is a sub-modular function for every pair of labels.
Proof. A function of a single variable is trivially sub-modular [9]. Thus, it suffices to prove that the pairwise terms $E_{ij}(\cdot, \cdot)$ and $E^{lr}_{ij}(\cdot, \cdot)$ are sub-modular for every pair of labels. To prove sub-modularity, we need to show the following:
$$\sum_{i} E(L_i, L_i) \le \sum_{i \neq j} E(L_i, L_j), \quad \forall i, j.$$
Since $E(\cdot, \cdot)$ is a sum of squared distances between two vectors, $E(L_i, L_i) = 0, \ \forall i$. Moreover, the sum of squared distances between any two unequal vectors is always positive, which implies that $E(\cdot, \cdot)$ is a sub-modular function. This proof of sub-modularity extends to the long range potentials without loss of generality. Further, since a sum of sub-modular functions is a sub-modular function [16], the energy function defined in Equation 5 is sub-modular for every pair of labels.
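The two facts the proof relies on, namely that the pairwise cost of identical labels is zero and that the cost between distinct labels is positive, can be checked numerically. The snippet below is only a sanity check on a hypothetical toy label set (three 3-pixel "patches"), with a plain sum of squared differences standing in for $E(\cdot, \cdot)$; it does not replace the argument above.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical toy labels; real labels would be w x h source patches.
labels = [[0, 0, 0], [1, 2, 3], [4, 0, 1]]
n = len(labels)

# Left-hand side: cost of assigning the same label to both nodes.
same_label_cost = sum(ssd(l, l) for l in labels)

# Right-hand side: cost summed over all ordered pairs of distinct labels.
diff_label_cost = sum(ssd(labels[i], labels[j])
                      for i in range(n) for j in range(n) if i != j)

inequality_holds = same_label_cost <= diff_label_cost
```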
Statement 2. The energy function defined in Equation 5
is a semi-metric.
Proof. The energy function defined in Equation 5 is composed of sums of squared distances (ssd) between two vectors, so it is sufficient to prove that ssd is a semi-metric. Here we show that ssd has all three properties that are necessary and sufficient for a semi-metric. We also show that ssd does not always satisfy the triangle inequality, and thus is not a metric.
1. Non-negativity: ssd between two vectors is always greater than zero, i.e.
$$\mathrm{ssd}(L_i, L_j) > 0, \quad \forall i \neq j.$$
2. Identity of indiscernibles: ssd between two vectors is equal to zero iff both vectors are equal, i.e.,
$$\mathrm{ssd}(L_i, L_j) = 0 \iff i = j, \quad \forall i, j.$$
3. Symmetry: ssd between two vectors is a symmetric function, i.e.
$$\mathrm{ssd}(L_i, L_j) = \mathrm{ssd}(L_j, L_i), \quad \forall i, j.$$
4. Triangle inequality: To attempt to prove this, let us first start with the Euclidean distance between two vectors. Let $\mathrm{dist}(L_i, L_j)$ be the Euclidean distance between vectors $L_i$ and $L_j$. Since the Euclidean distance satisfies the triangle inequality, we can write
$$\mathrm{dist}(L_i, L_j) \le \mathrm{dist}(L_i, L_k) + \mathrm{dist}(L_k, L_j), \quad \forall k.$$
Squaring the above inequality yields
$$\mathrm{ssd}(L_i, L_j) \le \mathrm{ssd}(L_i, L_k) + \mathrm{ssd}(L_k, L_j) + 2\,\mathrm{dist}(L_i, L_k)\,\mathrm{dist}(L_k, L_j).$$
Recall that ssd is the square of the Euclidean distance. Since Euclidean distances are non-negative, i.e. $2\,\mathrm{dist}(L_i, L_k)\,\mathrm{dist}(L_k, L_j) \ge 0$, the bound on $\mathrm{ssd}(L_i, L_j)$ carries an extra non-negative cross term. In other words, we can always find $L_i$, $L_j$ and $L_k$ such that ssd violates the triangle inequality, which is why ssd is a semi-metric rather than a metric.
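A concrete instance makes the failure of the triangle inequality for ssd easy to see. The sketch below uses three hypothetical one-dimensional "patches" placed at 0, 1 and 2; the values are chosen purely for illustration and do not come from the paper.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Three collinear one-pixel labels (illustrative values only).
L_i, L_k, L_j = [0.0], [1.0], [2.0]

direct = ssd(L_i, L_j)                   # (0 - 2)^2 = 4
via_k = ssd(L_i, L_k) + ssd(L_k, L_j)    # 1 + 1 = 2
triangle_holds = direct <= via_k         # False: 4 > 2

# The three semi-metric properties still hold for this example.
symmetric = ssd(L_i, L_j) == ssd(L_j, L_i)
zero_on_equal = ssd(L_i, L_i) == 0
positive_on_distinct = ssd(L_i, L_k) > 0
```

The gap between `direct` and `via_k` is exactly the cross term $2\,\mathrm{dist}(L_i, L_k)\,\mathrm{dist}(L_k, L_j) = 2$ from the squared inequality.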

Citations
Journal ArticleDOI
TL;DR: Experimental results show that the proposed method achieves generally better performance than nine state-of-the-art methods in terms of the abilities of maintaining structure coherence and the computational cost on inpainting different kinds of degraded images.

10 citations

Journal ArticleDOI
TL;DR: Experimental results show that the proposed Structure Offsets Statistics based image inpainting algorithm is superior to several state-of-the-art approaches in terms of the abilities of maintaining structure coherence and neighborhood consistence and the computational efficiency.
Abstract: Image inpainting technique recovers the missing regions of an image using information from known regions and it has shown success in various application fields. As a popular kind of methods, Markov Random Field (MRF)-based methods are able to produce better results than earlier diffusion-based and sparse-based methods on inpainting images with big holes. However, for images with complex structures, the results are still not quite pleasant and some inpainting trails exist. The direction feature is an important factor for image understanding and human eye visual requirements, and exploiting multi-direction features is of great potential to further improve inpainting performance. Following the idea, this paper proposes a Structure Offsets Statistics based image inpainting algorithm by exploiting multiple direction features under the framework of MRF-based methods. Specifically, when selecting proper labels, multi-direction features are extracted and applied to construct a structure image and a non-structure image, and the candidate labels are chosen from the offsets of structure and non-structure images. Meanwhile, the multi-direction features are applied to construct a new smooth term for the energy equation which is then solved by graph-cut optimization technology. Experimental results show that on inpainting tasks with various complexities, the proposed method is superior to several state-of-the-art approaches in terms of the abilities of maintaining structure coherence and neighborhood consistence and the computational efficiency.

5 citations


Cites background from "A non-local MRF model for heritage ..."

  • ...[31] incorporated long range pairwise potentials into energy equation in order to capture the inherent repeating patterns for inpainting heritage architectural images....

Journal Article
TL;DR: An algorithm that enhances and extends a previously proposed algorithm and it provides faster inpainting and dynamic settings selection, which would allow the end user to obtain a collection of output images from which the finest result can be chosen.
Abstract: Inpainting is the technique which is used for the modification of a particular image in an undetectable form, is as ancient as art itself. There are a number of applications for inpainting, they varies from the restoration of damaged paintings and photographs to the removal/replacement of selected objects. There have been several approaches proposed for the same. In this paper, we present an algorithm that enhances and extends a previously proposed algorithm and it provides faster inpainting. Using our approach, one can use this to inpaint large regions (e.g. to remove an object etc.) as well as it is used to recover small portions (e.g. restore a photograph by removing cracks etc.). The inpainting method is based on the exemplar based approach. The basic idea behind this approach is to find exemplars (i.e. patches) from the image and replace the lost data with it. This technique can be used for the restoration of damaged photographs or damaged film. In contrast with previous approaches, the technique here introduced has an advantage. That is, it does not require the user to specify where the novel information comes from. This is automatically done (and in a fast way), thereby it allows the system to simultaneously fill-in numerous regions which contain completely different structures and surrounding backgrounds. This paper looks forward to improve the algorithm so that the computational complexity is further improved while retaining the quality of inpainting. Here the inpainting algorithm presented here is not meant to be used for inpainting images, but for videos also. We are also investing methods to improve this algorithm to make it more robust so that it can be used with videos in this paper itself. We would like to provide dynamic settings selection, which would allow the end user to obtain a collection of output images from which the finest result can be chosen.
References
Book
01 Jan 1988
TL;DR: Probabilistic Reasoning in Intelligent Systems as mentioned in this paper is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty, and provides a coherent explication of probability as a language for reasoning with partial belief.
Abstract: From the Publisher: Probabilistic Reasoning in Intelligent Systems is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty. The author provides a coherent explication of probability as a language for reasoning with partial belief and offers a unifying perspective on other AI approaches to uncertainty, such as the Dempster-Shafer formalism, truth maintenance systems, and nonmonotonic logic. The author distinguishes syntactic and semantic approaches to uncertainty—and offers techniques, based on belief networks, that provide a mechanism for making semantics-based systems operational. Specifically, network-propagation techniques serve as a mechanism for combining the theoretical coherence of probability theory with modern demands of reasoning-systems technology: modular declarative inputs, conceptually meaningful inferences, and parallel distributed computation. Application areas include diagnosis, forecasting, image interpretation, multi-sensor fusion, decision support systems, plan recognition, planning, speech recognition—in short, almost every task requiring that conclusions be drawn from uncertain clues and incomplete information. Probabilistic Reasoning in Intelligent Systems will be of special interest to scholars and researchers in AI, decision theory, statistics, logic, philosophy, cognitive psychology, and the management sciences. Professionals in the areas of knowledge-based systems, operations research, engineering, and statistics will find theoretical and computational tools of immediate practical use. The book can also be used as an excellent text for graduate-level courses in AI, operations research, or applied probability.

15,671 citations

Journal ArticleDOI
TL;DR: This work presents two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves that allow important cases of discontinuity preserving energies.
Abstract: Many tasks in computer vision involve assigning a label (such as disparity) to every pixel. A common constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, e.g., at object boundaries. These tasks are naturally stated in terms of energy minimization. The authors consider a wide class of energies with various smoothness constraints. Global minimization of these energy functions is NP-hard even in the simplest discontinuity-preserving case. Therefore, our focus is on efficient approximation algorithms. We present two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves. These moves can simultaneously change the labels of arbitrarily large sets of pixels. In contrast, many standard algorithms (including simulated annealing) use small moves where only one pixel changes its label at a time. Our expansion algorithm finds a labeling within a known factor of the global minimum, while our swap algorithm handles more general energy functions. Both of these algorithms allow important cases of discontinuity preserving energies. We experimentally demonstrate the effectiveness of our approach for image restoration, stereo and motion. On real data with ground truth, we achieve 98 percent accuracy.

7,413 citations


"A non-local MRF model for heritage ..." refers background or methods in this paper

  • ...Reader is encouraged to see [2] for details of the move making algorithms....

  • ...The proof of sub-modularity and semi-metricity of the energy functions also guarantees that popular move making algorithm α-β swap can be efficiently used to find the global minima of this energy with a constant approximation [2]....

  • ...This proof guarantees that the energy function can be efficiently minimized via move making algorithm like α-β swap [2]....

Proceedings ArticleDOI
01 Jul 2000
TL;DR: A novel algorithm for digital inpainting of still images that attempts to replicate the basic techniques used by professional restorators, and does not require the user to specify where the novel information comes from.
Abstract: Inpainting, the technique of modifying an image in an undetectable form, is as ancient as art itself. The goals and applications of inpainting are numerous, from the restoration of damaged paintings and photographs to the removal/replacement of selected objects. In this paper, we introduce a novel algorithm for digital inpainting of still images that attempts to replicate the basic techniques used by professional restorators. After the user selects the regions to be restored, the algorithm automatically fills-in these regions with information surrounding them. The fill-in is done in such a way that isophote lines arriving at the regions' boundaries are completed inside. In contrast with previous approaches, the technique here introduced does not require the user to specify where the novel information comes from. This is automatically done (and in a fast way), thereby allowing to simultaneously fill-in numerous regions containing completely different structures and surrounding backgrounds. In addition, no limitations are imposed on the topology of the region to be inpainted. Applications of this technique include the restoration of old photographs and damaged film; removal of superimposed text like dates, subtitles, or publicity; and the removal of entire objects from the image like microphones or wires in special effects.

3,830 citations


"A non-local MRF model for heritage ..." refers background in this paper

  • ...In [1] region-filling is done by propagating image Laplacians in the direction of the isophotes....

Journal ArticleDOI
TL;DR: The simultaneous propagation of texture and structure information is achieved by a single, efficient algorithm that combines the advantages of two approaches: exemplar-based texture synthesis and block-based sampling process.
Abstract: A new algorithm is proposed for removing large objects from digital images. The challenge is to fill in the hole that is left behind in a visually plausible way. In the past, this problem has been addressed by two classes of algorithms: 1) "texture synthesis" algorithms for generating large image regions from sample textures and 2) "inpainting" techniques for filling in small image gaps. The former has been demonstrated for "textures"-repeating two-dimensional patterns with some stochasticity; the latter focus on linear "structures" which can be thought of as one-dimensional patterns, such as lines and object contours. This paper presents a novel and efficient algorithm that combines the advantages of these two approaches. We first note that exemplar-based texture synthesis contains the essential process required to replicate both texture and structure; the success of structure propagation, however, is highly dependent on the order in which the filling proceeds. We propose a best-first algorithm in which the confidence in the synthesized pixel values is propagated in a manner similar to the propagation of information in inpainting. The actual color values are computed using exemplar-based synthesis. In this paper, the simultaneous propagation of texture and structure information is achieved by a single , efficient algorithm. Computational efficiency is achieved by a block-based sampling process. A number of examples on real and synthetic images demonstrate the effectiveness of our algorithm in removing large occluding objects, as well as thin scratches. Robustness with respect to the shape of the manually selected target region is also demonstrated. Our results compare favorably to those obtained by existing techniques.

3,066 citations


"A non-local MRF model for heritage ..." refers background or methods in this paper

  • ...Experimental results on a wide collection of images show that we clearly outperform popular technique like exemplar based inpainting [4]....

  • ...Image completion is a highly researched area in computer vision [4, 5, 7, 8, 10, 11]....

  • ...[4] propose a priority-based mechanism which combines texture synthesis and isophote driven inpainting for image completion....

  • ...We compare our method with the well known exemplar based method [4]....

  • ...On the other hand, greedy algorithm like [4] clearly fails in removing these objects....

Journal ArticleDOI
TL;DR: A universal statistical model for texture images in the context of an overcomplete complex wavelet transform is presented, demonstrating the necessity of subgroups of the parameter set by showing examples of texture synthesis that fail when those parameters are removed from the set.
Abstract: We present a universal statistical model for texture images in the context of an overcomplete complex wavelet transform. The model is parameterized by a set of statistics computed on pairs of coefficients corresponding to basis functions at adjacent spatial locations, orientations, and scales. We develop an efficient algorithm for synthesizing random images subject to these constraints, by iteratively projecting onto the set of images satisfying each constraint, and we use this to test the perceptual validity of the model. In particular, we demonstrate the necessity of subgroups of the parameter set by showing examples of texture synthesis that fail when those parameters are removed from the set. We also demonstrate the power of our model by successfully synthesizing examples drawn from a diverse collection of artificial and natural textures.

1,978 citations


"A non-local MRF model for heritage ..." refers methods in this paper

  • ...These models (wavelet coefficients [13], colour histogram [6]) are used for representation of image characteristics....

Frequently Asked Questions (13)
Q1. What are the contributions in "A non-local mrf model for heritage architectural image completion" ?

In this work, the authors propose a non-local MRF model for image completion problem. The authors represent the patches in the target region of the image as random variables in an MRF, and introduce a novel energy function on these variables. The authors have tested their method on a wide variety of images and shown superior performance over previously published results for this task. The non-locality in the MRF is achieved through long range pairwise potentials. These long range pairwise potentials are defined to capture the inherent repeating patterns present in heritage architectural images. 

Apart from object removal and ruined wall reconstruction the authors also use their method for an interesting application known as background replacement. 

For all the examples, the belief thresholds for pruning and confidence is set to −2ssd0 and −ssd0 respectively, where −ssd0 represents a predefined mediocre ssd between the patches. 

Identity of indiscernibles: ssd between two vectors is equal to zero iff both vectors are equal, i.e., $\mathrm{ssd}(L_i, L_j) = 0 \iff i = j, \ \forall i, j$.

The proof of sub-modularity and semi-metricity of the energy functions also guarantees that popular move making algorithm α-β swap can be efficiently used to find the global minima of this energy with a constant approximation [2]. 

To enforce coherency in the completed image, the authors define a smoothness term such that overlapping region of neighboring labels have least sum of squared distance. 

The authors solve the energy minimization problem on a corresponding graph, where each random variable is represented as a node in the graph. 

Since the image may contain many prominent repetitive patterns, the authors generate the top C offsets to capture varied repetitions (C = 10 was used in their experiments).

The authors define the image completion problem in a labeling problem framework where overlapping spatial positions in image can be considered as a set of sites and patches of size w × h sampled from source region can be considered as labels. 

$E_i(x_i = L_i) = \sum_{m=0}^{l} k_m (x_{im} - L_{im})^2$ (2). In other words, the data term measures the agreement between random variable $x_i$ and label $L_i$ in terms of the sum of squared distance (ssd) of known pixels.

In the process of repetition offset computation, the authors use Approximate Nearest Neighbour1 technique in order to find the most similar patches. 

The dataset for their experiments comprises a large variety of images of Indian heritage sites, including Hampi, Konark, Golkonda Fort etc.

The labeling problem here is to find the optimal function f∗ : S → L. Optimality criteria is defined based on quality of the image completion.