# Partitioning into colorful components by minimum edge deletions

## Summary

### 1 Introduction

- The authors study a maximum parsimony approach to the discovery of heterogeneous components in vertex-colored graphs, the Colorful Components problem. It is an edge modification problem originating from biological applications in sequence and network alignment.
- The first application of Colorful Components stems from Multiple Sequence Alignment.
- Colorful Components is a special case of the well-known NP-hard Multicut problem, which takes as input an undirected graph and a set of vertex pairs and asks for a minimum number of edge deletions that disconnect each given vertex pair.
- First, the authors observe that Colorful Components is NP-hard even in trees.
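
The relationship to Multicut described above can be made concrete with a small sketch. It assumes a graph given as a vertex list, an edge list, and a `color` dictionary; the function names `connected_components`, `is_colorful`, and `multicut_pairs` are illustrative and do not come from the paper. A graph is treated as solved when no connected component contains two vertices of the same color, and the Multicut pairs are simply all pairs of vertices that share a color.

```python
from collections import defaultdict
from itertools import combinations


def connected_components(vertices, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        component, stack = set(), [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            component.add(u)
            for w in adjacency[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(component)
    return components


def is_colorful(vertices, edges, color):
    """True if no connected component contains two vertices of the same color."""
    for component in connected_components(vertices, edges):
        colors = [color[v] for v in component]
        if len(colors) != len(set(colors)):
            return False
    return True


def multicut_pairs(vertices, color):
    """Vertex pairs a Multicut solver would have to disconnect: all same-colored pairs."""
    return [(u, v) for u, v in combinations(vertices, 2) if color[u] == color[v]]
```

For example, on the path a-b-c with colors red, blue, red, the only pair to separate is (a, c), and deleting either edge yields colorful components.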

### 2 Computational Hardness

- The authors present hardness results for two restricted variants of Colorful Components.
- Proposition 1. Colorful Components is NP-hard even in trees with diameter four.
- For each clause Cj of φ containing the variables xp, xq, and xr, the authors connect the three corresponding variable cycles by a clause gadget.
- Now, since at least 6m edges are deleted in the variable cycles, this means that for each clause Cj exactly four edges incident with aj are deleted by S. Consequently, for each variable cycle either all even or all odd edges are deleted.
- Altogether, this shows the correctness of the reduction.

### 3 Algorithms

- While Theorem 1 shows that Colorful Components is NP-hard for three colors, for two colors it can be solved in polynomial time via computing a maximum matching in bipartite graphs.
- First, the authors describe a simple O(c^k · m)-time search tree algorithm.
- Now, branch into the c cases to destroy this bad path by edge deletion, and for each case recursively solve the resulting instance.
- In the first case, the authors have visited at most c+1 vertices until a vertex pair with the same color has been found.
- The authors note that Rule 1 provides a trivial kernelization [9] for Colorful Components with respect to the combined parameter (k, c): after exhaustive data reduction, the instance has at most 2kc vertices, since an edge deletion can produce at most two colorful components, each of size at most c.
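
The search tree idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a simple graph given as vertex and edge lists plus a `color` dictionary, and the names `tree_path`, `find_bad_path`, and `colorful_components` are hypothetical. The breadth-first search stops as soon as two discovered vertices share a color, so at most c+1 vertices have been discovered and the returned path has at most c edges; the solver then branches on deleting each edge of that path.

```python
from collections import defaultdict, deque


def tree_path(parent, a, b):
    """Edges of the path between a and b in the BFS tree given by parent pointers."""
    ancestors_of_a, x = [], a
    while x is not None:
        ancestors_of_a.append(x)
        x = parent[x]
    index = {x: i for i, x in enumerate(ancestors_of_a)}
    path, x = [], b
    while x not in index:  # climb from b to the lowest common ancestor
        path.append((parent[x], x))
        x = parent[x]
    for i in range(index[x]):  # add a's side of the path up to that ancestor
        path.append((ancestors_of_a[i + 1], ancestors_of_a[i]))
    return path


def find_bad_path(vertices, edges, color):
    """Return the edges of a path joining two same-colored vertices, or None."""
    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    explored = set()
    for start in vertices:
        if start in explored:
            continue
        parent = {start: None}
        first_with_color = {color[start]: start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if w in parent:
                    continue
                parent[w] = u
                if color[w] in first_with_color:
                    return tree_path(parent, first_with_color[color[w]], w)
                first_with_color[color[w]] = w
                queue.append(w)
        explored.update(parent)  # this whole component is already colorful
    return None


def colorful_components(vertices, edges, color, k):
    """Return at most k edges whose deletion makes every component colorful, or None."""
    path = find_bad_path(vertices, edges, color)
    if path is None:
        return []
    if k == 0:
        return None
    for a, b in path:  # branch on each edge of the bad path
        remaining = [e for e in edges if set(e) != {a, b}]
        solution = colorful_components(vertices, remaining, color, k - 1)
        if solution is not None:
            return [(a, b)] + solution
    return None
```

Calling `colorful_components` with increasing values of `k` until it returns a list yields a minimum-size deletion set. The sketch rebuilds the adjacency structure in every call, so it favors readability over the running time stated in the paper.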

### 4 Formulation as Weighted Multi-Multiway Cut

- In the Colorful Components formulation, it is not possible to simplify a graph based on the knowledge that two vertices belong to the same connected component; the authors would like to be able to merge two such vertices.
- For this, the authors first need to allow not just a single color per vertex, but a set; moreover, they need to allow edge weights.
- Using the merge operation, the authors can do a simple branching on an edge [3]: either delete the edge, or merge its endpoints; in the experimental part this will be referred to as edge branching.
- Note that merging does not necessarily decrease the parameter; but it is easy to see that if the authors branch on each edge of a forbidden path successively, then the last edge of the path cannot be merged since it connects vertices with intersecting color sets.
- The factor 3 has been tuned heuristically.
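
The merge operation and edge branching described above can be sketched as follows. This is an illustrative sketch under assumptions rather than the authors' code: edges are stored as a dictionary from `frozenset({u, v})` to an integer weight, each vertex carries a `frozenset` of colors, and the names `merge` and `edge_branching` are hypothetical. Merging unites the color sets and sums the weights of edges that become parallel; branching either deletes an edge, paying its weight, or merges its endpoints when their color sets are disjoint.

```python
def merge(weights, colors, u, v):
    """Contract v into u: unite the color sets and sum weights of parallel edges."""
    new_weights = {}
    for edge, w in weights.items():
        if edge == frozenset((u, v)):
            continue  # the edge between the merged vertices disappears
        renamed = frozenset(u if x == v else x for x in edge)
        new_weights[renamed] = new_weights.get(renamed, 0) + w
    new_colors = {x: c for x, c in colors.items() if x != v}
    new_colors[u] = colors[u] | colors[v]
    return new_weights, new_colors


def edge_branching(weights, colors, budget):
    """Minimum total weight (at most budget) of edges to delete, or None if impossible."""
    if budget < 0:
        return None
    if not weights:
        return 0  # no edges left: every merged vertex is a colorful component
    edge = next(iter(weights))
    u, v = tuple(edge)
    candidates = []
    # Branch 1: delete the edge and pay its weight.
    rest = {e: w for e, w in weights.items() if e != edge}
    deleted = edge_branching(rest, colors, budget - weights[edge])
    if deleted is not None:
        candidates.append(deleted + weights[edge])
    # Branch 2: merge the endpoints, allowed only for disjoint color sets.
    if not (colors[u] & colors[v]):
        merged = edge_branching(*merge(weights, colors, u, v), budget)
        if merged is not None:
            candidates.append(merged)
    return min(candidates) if candidates else None
```

Because the merge branch does not reduce the budget, this plain version is exponential in the number of edges; it is meant only to show how color sets and edge weights make merging well defined.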

### 5 Experiments

- The authors performed experiments with instances from the multiple sequence alignment application.
- The source code and the test instances are available under the GNU GPL license at http://fpt.akt.tu-berlin.de/colcom/.
- To efficiently find data reduction opportunities with Rule 2 and Rule 3, the authors try starting with each vertex and successively add more vertices with disjoint colors that minimize the cut to other edges, until they have either found a reduction opportunity or no more vertices can be added.
- For the heuristics, the authors compare the solution quality for the 112 instances for which they know the optimal solution.
- Finally, for the instances for which an exact solution was found, the authors compared the solution quality of the alignments obtained by using DIALIGN with and without the partial alignment columns indicated by an exact solution for Colorful Components, by the merge heuristic, and by the min-cut heuristic.

### 6 Outlook

- In preliminary experiments with network alignment data, the authors found that allowing only one protein of each species to be matched was, while a natural model, too strict.
- Generalizing Colorful Components to allow a constant number of occurrences of each color for the connected components could result in improved network alignments.

##### Citations

### Cites background or methods or result from "Partitioning into colorful componen..."

...Previously, we showed that it is NP-hard even in three-colored graphs with maximum degree six [4], and proposed an exact branching algorithm with running time O((c − 1)^k · |E|) where k is the number of deleted edges....

[...]

...2(a), we compare the running times for the three approaches and additionally the branching algorithm from [4], with a time limit of 15 minutes....

[...]

...Similar to our previous results for multiple sequence alignment [4], the merge-based heuristic gives an excellent approximation here....

[...]

...Before starting the solver, we use data reduction as described before [4]....

[...]

...For completeness, we briefly recall this greedy heuristic [4]....

[...]

##### References

### "Partitioning into colorful componen..." refers background in this paper

...0 benchmark [14], using the diafragm 1....

[...]

...0 benchmark [14] each time within five minutes on a standard PC, with up to 5 000 vertices and 13 000 edges....

[...]

### "Partitioning into colorful componen..." refers background in this paper

...We note that Rule 1 provides a trivial kernelization [8] for Colorful Components with respect to the combined parameter (k, c): obviously, after exhaustive data reduction, the instance has at most 2kc vertices, since an edge deletion can produce at most two colorful components, each of size at most c....

[...]

### "Partitioning into colorful componen..." refers background in this paper

...Note that Multicut is NP-hard and MaxSNP-hard even if the input is a star, that is, a tree consisting of a central vertex with attached degree-1 vertices [7]....

[...]