A non-local MRF model for heritage architectural image completion
1. INTRODUCTION
- Image completion is an important and challenging computer vision task.
- Image completion is an important part of many computer vision applications such as scratch removal, object removal and reconstruction of damaged architectural parts in an image.
- The authors also give the reasoning behind long range potentials and define a novel non-local MRF energy function such that its minimum corresponds to the globally optimal image completion.
- The authors then discuss the experimental settings and present the results of the proposed method in Section 5.
Statistical Methods.
- These methods make use of parametric statistical models for image completion.
- These models (wavelet coefficients [13], colour histogram [6]) are used for representation of image characteristics.
- Initially, the output image is generated keeping the missing regions as pure noise.
- Statistical methods are only useful in case of texture synthesis.
- Moreover, they produce blurred outputs for natural images.
PDE-Based Methods.
- Partial Differential Equation (PDE) based methods use a diffusion process for image completion.
- The boundary filling uses a diffusion process simulated by solving partial differential equations.
- Chan et al. [3] use a variational model for region filling.
- PDE-based approaches perform well in cases where the missing region is smooth and non-textured.
- They fail in case of large inpainting regions.
Exemplar-Based Methods.
- Exemplar based techniques have been the most successful approaches in presence of large unknown regions.
- Exemplar based techniques have been widely used for image completion recently.
- Finally, the greedy approach introduces a bias, because a few incorrect patches selected by the priority-based mechanism propagate into later choices.
- These methods do not take advantage of repeating patterns inherently present in many architectural images.
- In contrast, the authors incorporate the repetition present in the image by including long-range potentials and find the global minimum of the non-local MRF energy.
3. THE IMAGE COMPLETION PROBLEM
- Given a source region S and a target region T the image completion problem is to fill the target region such that it agrees with its surroundings.
- The sites form a set {X1, …, Xm}, where each Xi is a spatial position of size w×h in the image.
- Figure 2 shows the formulation of image completion as a labeling problem.
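- The MRF energy takes the standard pairwise form $E(x) = \sum_{i} E_i(x_i) + \sum_{(i,j) \in N} E_{ij}(x_i, x_j)$ (1), where $E_i(\cdot)$, $E_{ij}(\cdot)$ and $N$ correspond to the data term, the smoothness term and the neighborhood system defined in the MRF, respectively.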
Data term.
- The data term computation for image completion problem is not straightforward because only boundary sites are visible whereas interior sites are hidden.
- (Recall that in any image completion problem the user provides a mask which needs to be filled.)
- Without loss of generality, each patch (each xi and each Li) can be represented as a vector of size w×h.
- The authors define the unary cost of random variable xi taking label Li as follows.
- $E_i(x_i = L_i) = \sum_{m=0}^{l} k_m (x_{im} - L_{im})^2$ (2). In other words, the data term measures the agreement between random variable xi and label Li in terms of the sum of squared distance (ssd) over known pixels.
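As a rough illustration of Equation 2, the data term can be sketched as a masked sum of squared distances. The function name `data_term` and the reading of k_m as a 0/1 visibility mask are assumptions of this sketch, not details confirmed by the summary.

```python
from typing import Sequence


def data_term(site: Sequence[float], label: Sequence[float],
              known: Sequence[int]) -> float:
    """Masked SSD between a site patch and a candidate label patch.

    `site` and `label` are flattened patches of equal length. `known[m]` is 1
    where pixel m of the site is visible and 0 where it lies inside the hole
    (assumed meaning of k_m in Equation 2).
    """
    if not (len(site) == len(label) == len(known)):
        raise ValueError("patch vectors and mask must have equal length")
    return sum(k * (x - l) ** 2 for x, l, k in zip(site, label, known))


# Hypothetical 4-pixel patches: the last two site pixels are hidden.
site = [10.0, 12.0, 0.0, 0.0]
label = [11.0, 12.0, 50.0, 60.0]
known = [1, 1, 0, 0]
cost = data_term(site, label, known)  # only the two visible pixels contribute
```

Hidden pixels carry zero weight, so a site that lies entirely inside the hole has the same (zero) unary cost for every label and must be resolved by the pairwise terms.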
Smoothness term.
- The data term alone cannot give coherent completion.
- To enforce coherency in the completed image, the authors define a smoothness term such that the overlapping regions of neighboring labels have the least sum of squared distance.
- The authors define the smoothness term as follows.
- The process of data and smoothness term computation is pictorially depicted in Figure 3.
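The smoothness term described above can be sketched as the ssd over the region where two neighboring patches overlap. The sketch below handles only horizontally adjacent patches; the function name `smoothness_term` and the `overlap` parameter (number of shared columns) are illustrative assumptions.

```python
from typing import List

Patch = List[List[float]]  # rows of pixel values


def smoothness_term(left: Patch, right: Patch, overlap: int) -> float:
    """SSD over the columns shared by two horizontally adjacent patches.

    The last `overlap` columns of `left` are compared with the first
    `overlap` columns of `right`.
    """
    if len(left) != len(right):
        raise ValueError("patches must have the same height")
    width = len(left[0])
    if not 0 < overlap <= width:
        raise ValueError("overlap must be between 1 and the patch width")
    total = 0.0
    for row_l, row_r in zip(left, right):
        for a, b in zip(row_l[width - overlap:], row_r[:overlap]):
            total += (a - b) ** 2
    return total


# Hypothetical 2x3 patches sharing a single column.
left = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]
right = [[3.0, 9.0, 9.0],
         [7.0, 9.0, 9.0]]
cost = smoothness_term(left, right, overlap=1)
```

A vertical neighbor would be handled the same way with rows in place of columns.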
Long range potentials.
- In addition to the data and smoothness terms, the authors wish to capture the inherent repetitive patterns present in heritage architectural images.
- To achieve this, the authors add an extra term to the MRF energy, which they call long-range pairwise potentials.
- The long-range pairwise potentials are defined between a patch and its repeating offset at distance τ.
- The repeating offset computation is described in the next subsection.
- Once the energy (Equation 5) is formulated, the problem of image completion becomes equivalent to finding the configuration x∗ corresponding to the global minimum of the energy function.
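To make the structure of such an energy concrete, here is a minimal sketch that evaluates a labeling as the sum of unary costs, local pairwise costs and long-range pairwise costs. Everything in it is hypothetical: the cost table, the edge lists and the simple 0/1 and 0/2 pairwise penalties are placeholders rather than the ssd-based terms of the paper, and the exhaustive search is only feasible for toy problems (it is not the inference procedure the authors use).

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Edge = Tuple[int, int]
Pairwise = Callable[[int, int], float]


def total_energy(labeling: Sequence[int], unary: List[List[float]],
                 local_edges: List[Edge], long_range_edges: List[Edge],
                 smooth: Pairwise, long_range: Pairwise) -> float:
    """Unary costs plus pairwise costs over local and non-local edges."""
    energy = sum(unary[i][labeling[i]] for i in range(len(labeling)))
    energy += sum(smooth(labeling[i], labeling[j]) for i, j in local_edges)
    energy += sum(long_range(labeling[i], labeling[j])
                  for i, j in long_range_edges)
    return energy


# Toy problem: 3 sites, 2 labels, one long-range edge between sites 0 and 2.
unary = [[0.0, 5.0], [2.0, 1.0], [4.0, 0.0]]
local_edges = [(0, 1), (1, 2)]
long_range_edges = [(0, 2)]
smooth = lambda a, b: 0.0 if a == b else 1.0      # placeholder penalty
long_range = lambda a, b: 0.0 if a == b else 2.0  # placeholder penalty

# Exhaustive search over all 2^3 labelings (toy sizes only).
best = min(product(range(2), repeat=3),
           key=lambda lab: total_energy(lab, unary, local_edges,
                                        long_range_edges, smooth, long_range))
best_energy = total_energy(best, unary, local_edges, long_range_edges,
                           smooth, long_range)
```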
3.1 Repeating Offset Computation
- Many archaeological monument images contain repeating patterns.
- These repeating patterns vary in complexity which makes image completion a challenging task.
- The repetitions can be of any size and along any direction.
- The authors use the fact that patches which are part of the repetition will repeat with some common offset.
1. Finding Nearest Similar Patches.
- For every patch in the source region of the image the authors find the nearest most similar patch.
- The similarity is defined using the sum of squared differences (ssd) between the patches.
- Patches closer than a threshold radius θ are ignored, since nearby patches are likely to be similar but do not contribute towards the repetition offset.
- In their testing, θ was set to be 1/15th of the maximum of image height and image width.
- An exhaustive search over all patches is computationally expensive, so the authors use Approximate Nearest Neighbour (ANN) search instead.
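The search described above can be sketched with a brute-force loop. Note the substitutions: this sketch uses exhaustive search instead of ANN, measures distance between patch corners with the Euclidean norm, and breaks ties in favour of the closer patch. The function names and the toy value θ = 2 are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Image = List[List[int]]
Pos = Tuple[int, int]


def patch_ssd(img: Image, a: Pos, b: Pos, size: int) -> int:
    """SSD between the size x size patches with top-left corners a and b."""
    return sum(
        (img[a[0] + dy][a[1] + dx] - img[b[0] + dy][b[1] + dx]) ** 2
        for dy in range(size)
        for dx in range(size)
    )


def repetition_votes(img: Image, size: int, theta: float) -> Dict[Pos, Pos]:
    """For every patch, the offset to its most similar patch at least theta away."""
    h, w = len(img), len(img[0])
    positions = [(y, x) for y in range(h - size + 1)
                 for x in range(w - size + 1)]
    votes: Dict[Pos, Pos] = {}
    for p in positions:
        far = [q for q in positions
               if math.hypot(q[0] - p[0], q[1] - p[1]) >= theta]
        if not far:
            continue
        # Lowest SSD first; among equally similar patches prefer the nearest.
        best = min(far, key=lambda q: (patch_ssd(img, p, q, size),
                                       math.hypot(q[0] - p[0], q[1] - p[1])))
        votes[p] = (best[0] - p[0], best[1] - p[1])
    return votes


# Toy image whose columns repeat with period 3; rows are made distinct.
image = [[(x % 3) + 10 * y for x in range(9)] for y in range(3)]
votes = repetition_votes(image, size=2, theta=2.0)
```

On this toy image every patch votes for a purely horizontal offset that is a multiple of the period, which is the behaviour the histogram step relies on.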
2. Histogram and Offset Generation.
- Once the authors obtain the offsets corresponding to each pixel in the source region, they need to combine the results in order to obtain the correct repetition offsets.
- H(τ) gives the number of patches whose individual offset is τ.
- Now, to get the prominent repetition offsets, the authors analyze the histogram counts of the offsets and select offsets with highest count.
- Since the image may contain several prominent repetitive patterns, the authors generate the top C offsets to capture varied repetitions (C = 10 was used in their experiments).
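The histogram step can be sketched with a counter over per-patch offsets. The function name `dominant_offsets` and the sample votes are hypothetical; only the idea of counting offsets and keeping the top C comes from the text.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Offset = Tuple[int, int]


def dominant_offsets(offsets: Iterable[Offset], top_c: int = 10) -> List[Offset]:
    """Return the top_c most frequent offsets, most frequent first."""
    histogram = Counter(offsets)  # plays the role of H(tau)
    return [tau for tau, _count in histogram.most_common(top_c)]


# Hypothetical per-patch offsets (dy, dx).
votes = [(0, 3), (0, 3), (0, -3), (0, 3), (2, 0), (0, -3)]
top = dominant_offsets(votes, top_c=2)
```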
3.2 Graph Construction and Inference
- The authors solve the energy minimization problem on a corresponding graph, where each random variable is represented as a node in the graph.
- To capture repeating patterns in the image, the authors also join non-local nodes at offset τ .
- The authors further group nodes in this graph into two categories: visible and hidden.
- The cost of a node taking some label Li is determined by the unary cost defined in Equation 2. Similar to [10], if a node is highly likely to take some label, the authors declare that node “committed” and give it higher priority in the subsequent inference procedure.
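The graph described above can be sketched as a grid of nodes with two edge sets: local edges between 4-connected neighbors and non-local edges between nodes separated by a repetition offset τ. The function name `build_edges`, the row-major node numbering and the expression of offsets in grid steps are assumptions of this sketch.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def build_edges(rows: int, cols: int,
                offsets: Sequence[Tuple[int, int]]) -> Tuple[List[Edge], List[Edge]]:
    """Local (4-neighbor) and non-local (offset) edges on a rows x cols grid."""

    def node(r: int, c: int) -> int:
        return r * cols + c  # row-major node id

    local: List[Edge] = []
    non_local: List[Edge] = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                local.append((node(r, c), node(r, c + 1)))
            if r + 1 < rows:
                local.append((node(r, c), node(r + 1, c)))
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    non_local.append((node(r, c), node(rr, cc)))
    return local, non_local


# 2 x 4 grid with a single horizontal repetition offset of 3 grid steps.
local, non_local = build_edges(2, 4, [(0, 3)])
```

Passing both τ and −τ would create each non-local edge twice, so a caller would normally keep only one direction per offset.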
Inference.
- In their experiments, the patch size is set dynamically as per the image resolution and aspect ratio with the minimum dimension of 4×.
- For all the examples, the belief thresholds for pruning and confidence are set to −2ssd0 and −ssd0 respectively, where ssd0 represents a predefined mediocre ssd between the patches.
- Figure 7 shows the results of object removal using their method.
- Apart from object removal and ruined wall reconstruction the authors also use their method for an interesting application known as background replacement.
- The authors also study the importance of long range potentials.
4. SUB-MODULARITY AND METRICITY
- The authors prove that the non-local MRF energy function defined in Equation 5 is sub-modular and semi-metric.
- Further, since the sum of sub-modular functions is a sub-modular function [16], the energy function defined in Equation 5 is a sub-modular energy function for every pair of labels.
- The energy function defined in Equation 5 is composed of sums of squared distance (ssd) between two vectors, so it suffices to prove that ssd is a semi-metric.
- Since the Euclidean distance satisfies the triangle inequality, the authors use this property in the derivation.
- The proof of sub-modularity and semi-metricity of the energy function also guarantees that the popular move-making algorithm α-β swap can be used efficiently to find the global minimum of this energy with a constant approximation [2].
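As a small numerical sanity check (not a proof), the sketch below verifies the properties usually required of a semi-metric on a handful of made-up, pairwise distinct label vectors: non-negativity, symmetry, and zero distance only between identical vectors. The label values are hypothetical.

```python
from typing import Sequence


def ssd(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical, pairwise distinct label vectors.
labels = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]

symmetric = all(ssd(a, b) == ssd(b, a) for a in labels for b in labels)
non_negative = all(ssd(a, b) >= 0 for a in labels for b in labels)
identity = all((ssd(a, b) == 0) == (a == b) for a in labels for b in labels)

# Squared distances need not obey the triangle inequality: here 4 > 1 + 1,
# which is why ssd is treated as a semi-metric rather than a metric.
triangle_holds = (ssd(labels[0], labels[2])
                  <= ssd(labels[0], labels[1]) + ssd(labels[1], labels[2]))
```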
5. EXPERIMENTS AND RESULTS
- The authors present a detailed evaluation of their method on a large collection of images captured from Indian heritage sites.
- To show the generality of the method, the authors also include a few synthetic and natural images in their test datasets.
- Given an image and a user-provided mask, the problem is to complete the masked region in a way that is visually plausible to an observer.
- The authors evaluate various components of their approach to justify their choices.
- The dataset for their experiments comprises a large variety of images of Indian heritage sites, including Hampi, Konark and Golkonda Fort.
Approximate Nearest Neighbour.
- In the process of repetition offset computation, the authors use an Approximate Nearest Neighbour technique to find the most similar patches.
- For a resolution of 100 × 100, a brute force method takes around 2 minutes to process the entire image and generate the offsets.
- With ANN, the time is reduced to 0.1 seconds.
- The threshold radius (θ) is set to 1/15th of the maximum of the image width and height.
- C = 10 most frequent offsets are chosen for their experiments.
6. CONCLUSIONS
- In this work the authors address the problem of image completion.
- The image completion problem is formulated in a principled framework.
- The authors model the repeating patterns inherently present in images using long range potentials and solve the problem in non-local MRF framework.
- The authors prove that the proposed MRF energy is sub-modular and semi-metric.
- Experimental results on a wide collection of images show that the method clearly outperforms popular techniques such as exemplar-based inpainting [4].
Frequently Asked Questions (13)
Q2. What is the main purpose of the method?
Apart from object removal and ruined wall reconstruction the authors also use their method for an interesting application known as background replacement.
Q3. How many ssds are used in the example?
For all the examples, the belief thresholds for pruning and confidence are set to −2ssd0 and −ssd0 respectively, where ssd0 represents a predefined mediocre ssd between the patches.
Q4. what is the ssd between two vectors?
Identity of indiscernibles: the ssd between two vectors is equal to zero iff both vectors are equal, i.e., ssd(Li, Lj) = 0 ⇐⇒ i = j, ∀ i, j.
Q5. What is the proof of sub-modularity and semi-metricity of the energy functions?
The proof of sub-modularity and semi-metricity of the energy function also guarantees that the popular move-making algorithm α-β swap can be used efficiently to find the global minimum of this energy with a constant approximation [2].
Q6. What is the definition of a smoothness term?
To enforce coherency in the completed image, the authors define a smoothness term such that the overlapping regions of neighboring labels have the least sum of squared distance.
Q7. How do the authors solve the energy minimization problem?
The authors solve the energy minimization problem on a corresponding graph, where each random variable is represented as a node in the graph.
Q8. How many repetitions are used in the graph?
Since the image may contain several prominent repetitive patterns, the authors generate the top C offsets to capture varied repetitions (C = 10 was used in their experiments).
Q9. What is the definition of the image completion problem?
The authors define the image completion problem in a labeling framework, where overlapping spatial positions in the image are treated as a set of sites and patches of size w × h sampled from the source region are treated as labels.
Q10. What is the cost of xi taking label Li?
$E_i(x_i = L_i) = \sum_{m=0}^{l} k_m (x_{im} - L_{im})^2$ (2). In other words, the data term measures the agreement between random variable xi and label Li in terms of the sum of squared distance (ssd) over known pixels.
Q11. How do the authors find the similar patches?
In the process of repetition offset computation, the authors use an Approximate Nearest Neighbour technique to find the most similar patches.
Q12. What are the datasets for their experiments?
The dataset for their experiments comprises a large variety of images of Indian heritage sites, including Hampi, Konark and Golkonda Fort.
Q13. What is the definition of the labeling problem?
The labeling problem here is to find the optimal function f∗ : S → L. The optimality criterion is defined based on the quality of the image completion.