A Multiscale Variable-Grouping Framework for MRF Energy Minimization
TL;DR: A multiscale approach for minimizing the energy of Markov Random Fields whose energy functions include arbitrary pairwise potentials. Evaluated on real-world datasets, it achieves competitive performance in relatively short run-times.
Abstract: We present a multiscale approach for minimizing the energy associated with Markov Random Fields (MRFs) with energy functions that include arbitrary pairwise potentials. The MRF is represented on a hierarchy of successively coarser scales, where the problem on each scale is itself an MRF with suitably defined potentials. These representations are used to construct an efficient multiscale algorithm that seeks a minimal-energy solution to the original problem. The algorithm is iterative and features a bidirectional crosstalk between fine and coarse representations. We use consistency criteria to guarantee that the energy is nonincreasing throughout the iterative process. The algorithm is evaluated on real-world datasets, achieving competitive performance in relatively short run-times.
Summary
- Furthermore, it is generally agreed that coarse-to-fine schemes are less sensitive to local minima and can produce higher-quality label assignments [4, 11, 16].
- The authors' approach uses existing inference algorithms together with a variable-grouping procedure, referred to as coarsening, which produces a hierarchy of successively coarser representations of the MRF problem in order to efficiently explore relevant subsets of the space of possible label assignments.
- The method can efficiently incorporate any initializable inference algorithm that can deal with general pairwise potentials, e.g., QPBO-I and LSA-TR, yielding significantly lower energy values than those obtained with standard use of these methods.
- Furthermore, the authors suggest grouping variables based on the magnitude of their statistical correlation, regardless of whether the variables are assumed to take the same label at the minimum energy.
2.1. The coarsening procedure
- The authors denote the MRF (or its graph) whose energy they aim to minimize, and its corresponding search space, by G(0)(V(0), E(0), φ(0)) and X(0), respectively, and use the shorthand G(0) to refer to these elements.
- Then, in each such group the authors select one vertex to be the “seed variable” (or seed vertex) of the group.
- Next, the authors eliminate all but the seed vertex in each group and define the coarser graph, G(t+1), whose vertices correspond to the seed vertices of the fine graph G(t).
- The first term sums up the unary potentials of variables in [ṽ], and the second term takes into account the energy of pairwise potentials of all internal pairs u,w ∈ [ṽ].
- It is readily seen that the coarsening procedure satisfies consistency: substituting a labeling assignment of G(t+1) into Eqs. (4) and (5) verifies that the energy at scale t of the interpolated labeling equals the coarse-scale energy, for any interpolation rule.
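The coarsening and consistency property described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: Eqs. (4) and (5) are not reproduced in this summary, so the coarse potentials below are reconstructed from the verbal description (unary and internal pairwise terms fold into the seed's unary potential, and pairs that cross groups form coarse pairwise potentials). The names `coarsen`, `interpolate`, `seed_of`, and `interp`, and the toy potentials, are assumptions made for the example.

```python
from itertools import product

def energy(unary, pairwise, x):
    """Sum of unary and pairwise potentials for labeling x."""
    e = sum(phi[x[v]] for v, phi in unary.items())
    return e + sum(phi[x[u]][x[w]] for (u, w), phi in pairwise.items())

def coarsen(unary, pairwise, seed_of, interp, L):
    """Build coarse potentials over the seed variables (hypothetical helper).

    seed_of[v]   : seed of the group containing v (seeds map to themselves)
    interp[v][l] : label assigned to v when its seed takes label l
    """
    c_unary = {s: [0] * L for s in set(seed_of.values())}
    c_pair = {}
    # Unary potentials of all group members fold into the seed's unary term.
    for v, phi in unary.items():
        for l in range(L):
            c_unary[seed_of[v]][l] += phi[interp[v][l]]
    for (u, w), phi in pairwise.items():
        su, sw = seed_of[u], seed_of[w]
        if su == sw:
            # Internal pair: its energy depends only on the seed's label.
            for l in range(L):
                c_unary[su][l] += phi[interp[u][l]][interp[w][l]]
        else:
            # Pair crossing two groups: contributes to a coarse pairwise term.
            table = c_pair.setdefault((su, sw), [[0] * L for _ in range(L)])
            for a, b in product(range(L), repeat=2):
                table[a][b] += phi[interp[u][a]][interp[w][b]]
    return c_unary, c_pair

def interpolate(x_coarse, seed_of, interp):
    """Map a coarse labeling back to the fine scale."""
    return {v: interp[v][x_coarse[s]] for v, s in seed_of.items()}

# Toy chain 0-1-2-3 with binary labels; groups {0,1} (seed 0) and {2,3} (seed 2).
L = 2
unary = {0: [0, 3], 1: [2, 0], 2: [1, 1], 3: [4, 0]}
pairwise = {(0, 1): [[0, 2], [2, 0]],
            (1, 2): [[1, -1], [3, 0]],
            (2, 3): [[5, 0], [0, 5]]}
seed_of = {0: 0, 1: 0, 2: 2, 3: 2}
# Variable 1 copies its seed; variable 3 takes the opposite label of its seed.
interp = {0: [0, 1], 1: [0, 1], 2: [0, 1], 3: [1, 0]}

c_unary, c_pair = coarsen(unary, pairwise, seed_of, interp, L)
checks = []
for labels in product(range(L), repeat=2):
    xc = dict(zip([0, 2], labels))
    fine = energy(unary, pairwise, interpolate(xc, seed_of, interp))
    checks.append(fine == energy(c_unary, c_pair, xc))
print(all(checks))
```

The check enumerates every coarse labeling and compares the coarse energy with the fine energy of the interpolated labeling. Integer potentials are used so the comparison is exact.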
2.2. The multiscale algorithm
- The key ingredient of this paper is the multiscale algorithm which takes after the classical V-cycle employed in multigrid numerical solvers for partial differential equations [5, 21].
- This process comprises a single iteration or cycle.
- Coarsening halts when the number of variables is sufficiently small, say |V(t)| < N, and an exact solution can be easily recovered, e.g., via exhaustive search.
- Computational complexity and choice of inference module.
- Note that the inference algorithm should not be run until convergence, because its goal is not to find a global optimum of the search sub-space; rather, a small number ...
- Figure 2 (labels: inference module, coarsening, interpolation, finest scale, coarsest scale).
- The multiscale framework described so far is not monotonic, due to the fact that the initial state at a coarse level may incur a higher energy than that of the fine state from which it is derived.
- To see this, let x(t) denote the state at level t, right before the coarsening stage of a V-cycle.
- As noted above, coarse-scale variables inherit the current state of seed variables.
- If the energy associated with x(t+1) happens to be higher than the energy associated with x(t) then monotonicity is compromised.
- To avoid this undesirable behavior the authors modify the interpolation rule such that if x(t+1) was inherited from x(t) then x(t+1) will be mapped back to x(t) by the interpolation.
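The V-cycle and the monotonicity fix described above can be sketched as a small recursive routine. Everything here is an illustrative assumption rather than the authors' code: the inference module is a single sweep of iterated conditional modes (a stand-in for QPBO-I, LSA-TR, and similar methods), `group_pairs` is a hypothetical greedy pairing that replaces the entropy-based grouping of Sec. 2.4, and the default interpolation rule simply copies the seed's label. The one detail taken directly from the text is that the seed's inherited label is mapped back to each variable's current label, so the coarse initial state has exactly the fine energy.

```python
import random
from itertools import product

def energy(unary, pair, x):
    return (sum(phi[x[v]] for v, phi in unary.items())
            + sum(phi[x[u]][x[w]] for (u, w), phi in pair.items()))

def icm_sweep(unary, pair, x, L):
    """Stand-in inference module: one sweep of iterated conditional modes."""
    for v in unary:
        x[v] = min(range(L), key=lambda l: energy(unary, pair, {**x, v: l}))
    return x

def group_pairs(unary, pair):
    """Hypothetical grouping: greedily pair the endpoints of edges."""
    seed_of = {}
    for u, w in pair:
        if u not in seed_of and w not in seed_of:
            seed_of[u], seed_of[w] = u, u
    for v in unary:
        seed_of.setdefault(v, v)
    return seed_of

def coarsen(unary, pair, x, seed_of, L):
    interp = {}
    for v, s in seed_of.items():
        interp[v] = list(range(L))   # default rule: copy the seed's label
        interp[v][x[s]] = x[v]       # monotone rule: inherited label maps back to x[v]
    cu = {s: [0] * L for s in set(seed_of.values())}
    cp = {}
    for v, phi in unary.items():
        for l in range(L):
            cu[seed_of[v]][l] += phi[interp[v][l]]
    for (u, w), phi in pair.items():
        su, sw = seed_of[u], seed_of[w]
        if su == sw:
            for l in range(L):
                cu[su][l] += phi[interp[u][l]][interp[w][l]]
        else:
            t = cp.setdefault((su, sw), [[0] * L for _ in range(L)])
            for a, b in product(range(L), repeat=2):
                t[a][b] += phi[interp[u][a]][interp[w][b]]
    return cu, cp, interp

def v_cycle(unary, pair, x, L, n_min=4):
    if len(unary) < n_min:           # coarsest scale: exhaustive search
        vs = list(unary)
        best = min(product(range(L), repeat=len(vs)),
                   key=lambda ls: energy(unary, pair, dict(zip(vs, ls))))
        return dict(zip(vs, best))
    x = icm_sweep(unary, pair, x, L)                 # inference before coarsening
    seed_of = group_pairs(unary, pair)
    if len(set(seed_of.values())) == len(unary):     # nothing left to group
        return x
    cu, cp, interp = coarsen(unary, pair, x, seed_of, L)
    xc = v_cycle(cu, cp, {s: x[s] for s in cu}, L, n_min)   # seeds inherit labels
    x = {v: interp[v][xc[s]] for v, s in seed_of.items()}   # interpolation
    return icm_sweep(unary, pair, x, L)              # inference after interpolation

random.seed(0)
L, n = 3, 8
unary = {v: [random.randint(0, 9) for _ in range(L)] for v in range(n)}
edges = [(v, v + 1) for v in range(n - 1)] + [(0, 4), (2, 6)]
pair = {e: [[random.randint(-5, 5) for _ in range(L)] for _ in range(L)]
        for e in edges}
x = {v: 0 for v in range(n)}
energies = [energy(unary, pair, x)]
for _ in range(3):                   # three V-cycles
    x = v_cycle(unary, pair, x, L)
    energies.append(energy(unary, pair, x))
print(energies)
```

Because the coarse energy of any labeling equals the fine energy of its interpolation, and each component (ICM sweep, exhaustive search) never increases the energy of its input, the recorded energies should form a nonincreasing sequence.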
2.4. Variable-grouping by conditional entropy
- The authors next describe their approach for variable-grouping and the selection of a seed variable in each group.
- Heuristically, the authors would like v to be a seed variable, whose labeling determines that of u via the interpolation, if they are relatively confident of what the label of u should be given just the label of v. Conditional entropy measures the uncertainty in the state of one random variable given the state of another random variable.
- The authors then proceed with the variable-grouping procedure; for each variable they must determine its status, namely whether it is a seed variable or an interpolated variable whose seed must be determined.
- This is achieved by examining directed edges one-by-one according to the order by which they are stored in the binned-score list.
- The process terminates when the status of all the variables has been set.
- The algorithm was implemented in the framework of OpenGM, a C++ template library that offers several inference algorithms and a collection of datasets to evaluate on.
- The authors use QPBO-I and LSA-TR for binary models, and Swap/Expand-QPBO (αβ-swap/α-expand with a QPBO-I binary step) and Lazy-Flipper with a search depth of 2 for multilabel models.
- Unless otherwise indicated, 3 V-cycles were applied on “hard” energy models (Sec. 3.1) and a single V-cycle on Potts models (Sec. 3.2).
- Hence, the authors resort to comparing multiscale to single-scale inference for algorithms which can be applied in their framework without modifications.
- For each dataset the authors also report the “Ace” inference method for that dataset, where algorithms are ranked according to the percentage of instances on which they achieve the best energy and by their run-time.
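Returning to the variable-grouping step of Sec. 2.4, the scoring and seed selection can be sketched as follows. The summary does not give the formula for the local approximation, so this sketch assumes a local Gibbs distribution built only from the two unary potentials and the connecting pairwise potential; that choice, the function names, and the use of an exact sort in place of the binned-score list are all assumptions. The greedy pass follows the description: directed edges are examined in order of increasing score, and an edge from v to u makes v a seed and u an interpolated variable whenever u is still undecided and v is not already interpolated.

```python
import math

def cond_entropy(phi_u, phi_v, phi_uv):
    """H(x_u | x_v) under the assumed local distribution
    p(x_u, x_v) proportional to exp(-(phi_u[x_u] + phi_v[x_v] + phi_uv[x_u][x_v]))."""
    Lu, Lv = len(phi_u), len(phi_v)
    w = [[math.exp(-(phi_u[a] + phi_v[b] + phi_uv[a][b])) for b in range(Lv)]
         for a in range(Lu)]
    z = sum(map(sum, w))
    h = 0.0
    for b in range(Lv):
        p_v = sum(w[a][b] for a in range(Lu)) / z
        for a in range(Lu):
            p = w[a][b] / z
            if p > 0:
                h -= p * math.log(p / p_v)
    return h

def group_variables(unary, pair):
    """Return seed_of[v] for every variable (seeds map to themselves)."""
    scored = []  # entries: (score, seed candidate, interpolated candidate)
    for (u, w), phi in pair.items():
        phi_t = [list(col) for col in zip(*phi)]   # transposed table: phi_t[x_w][x_u]
        scored.append((cond_entropy(unary[w], unary[u], phi_t), u, w))  # H(x_w | x_u)
        scored.append((cond_entropy(unary[u], unary[w], phi), w, u))    # H(x_u | x_w)
    seed_of = {}
    for _, v, u in sorted(scored):
        # u must be undecided; v must be undecided or already a seed.
        if u not in seed_of and seed_of.get(v, v) == v:
            seed_of[v] = v
            seed_of[u] = v
    for v in unary:                  # leftover variables become their own seeds
        seed_of.setdefault(v, v)
    return seed_of

# Toy example: variables 0 and 1 are strongly coupled, 1 and 2 only weakly.
unary = {0: [0.0, 0.0], 1: [0.0, 0.0], 2: [0.0, 0.0]}
pair = {(0, 1): [[0.0, 5.0], [5.0, 0.0]],
        (1, 2): [[0.0, 0.1], [0.1, 0.0]]}
h_strong = cond_entropy(unary[0], unary[1], pair[(0, 1)])
h_weak = cond_entropy(unary[1], unary[2], pair[(1, 2)])
seed_of = group_variables(unary, pair)
print(h_strong < h_weak, seed_of)
```

In the toy example the strongly coupled pair has the lower conditional entropy, so it is grouped first and the weakly coupled variable remains its own seed. With no coupling and uniform unary potentials the score reduces to the log of the number of labels, which is its maximum.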
3.1. Hard energies
- Concretely, the datasets are split into 3 categories: those for which (all/some/none) of the instances are solved to optimality.
- The authors follow these notions when they refer to hard models, with special attention to the type of pairwise interaction.
- Detailed results are presented in Table 2.
- The Scribble dataset is an image segmentation task with a user-interactive interface, in which the user is asked to mark boundaries of objects in the scene (see Fig. 4).
- The authors have presented a multiscale framework for MRF energy minimization that uses variable grouping to form coarser levels of the problem.
- The authors demonstrated these concepts with an algorithm that groups variables based on a local approximation of their conditional entropy, namely based on an estimate of their statistical correlation.
- The algorithm was evaluated on a collection of datasets and results indicate that it is beneficial to apply existing single-scale methods within the presented multiscale algorithm.
- There are many possible directions for further developments, beginning with the interpolation rule.
- Indeed, even the set of labels can be expanded on a coarse scale to enrich the coarse search sub-space.
Cites background or methods from "A Multiscale Variable-Grouping Fram..."
...We thank anonymous reviewers for pointing out the references [37, 11]....
...Also, the multicut problem can be transformed into a Markov random field and solved with primal heuristics there, as done for the “scribbles” dataset in [37, 11]....
"A Multiscale Variable-Grouping Fram..." refers background in this paper
...These benefits follow from the fact that, although only local interactions are encoded, the model is global in nature, and by working at multiple scales information is propagated more efficiently [9, 16]....
...Considerable research has been reported in the literature on approximating (1) in a coarse-to-fine framework [4, 6, 9, 11, 14, 15, 16, 17, 18]....
..., grouping together square patches of variables in a grid [9, 11]....
...Coarse-to-fine methods have been shown to be beneficial in terms of running time [6, 9, 16, 18]....