
Showing papers on "Approximation algorithm" published in 2004


Journal ArticleDOI
TL;DR: This article presents new results on using a greedy algorithm, orthogonal matching pursuit (OMP), to solve the sparse approximation problem over redundant dictionaries and develops a sufficient condition under which OMP can identify atoms from an optimal approximation of a nonsparse signal.
Abstract: This article presents new results on using a greedy algorithm, orthogonal matching pursuit (OMP), to solve the sparse approximation problem over redundant dictionaries. It provides a sufficient condition under which both OMP and Donoho's basis pursuit (BP) paradigm can recover the optimal representation of an exactly sparse signal. It leverages this theory to show that both OMP and BP succeed for every sparse input signal from a wide class of dictionaries. These quasi-incoherent dictionaries offer a natural generalization of incoherent dictionaries, and the cumulative coherence function is introduced to quantify the level of incoherence. This analysis unifies all the recent results on BP and extends them to OMP. Furthermore, the paper develops a sufficient condition under which OMP can identify atoms from an optimal approximation of a nonsparse signal. From there, it argues that OMP is an approximation algorithm for the sparse problem over a quasi-incoherent dictionary. That is, for every input signal, OMP calculates a sparse approximant whose error is only a small factor worse than the minimal error that can be attained with the same number of terms.
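
For readers who want to see the greedy step concretely, here is a minimal NumPy sketch of OMP under the usual formulation (the dictionary, sparsity level, and toy example below are illustrative assumptions, not taken from the paper): each iteration selects the atom most correlated with the residual and then re-fits the coefficients by least squares on the selected support.

```python
# Illustrative sketch of orthogonal matching pursuit (OMP); names and the toy
# dictionary are hypothetical, not from the paper.
import numpy as np

def omp(D, y, m):
    """Greedily select m atoms (columns of dictionary D) to approximate y."""
    residual = y.copy()
    support = []
    for _ in range(m):
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        correlations[support] = -np.inf          # do not reselect atoms
        support.append(int(np.argmax(correlations)))
        # Re-fit coefficients on the chosen support by least squares, which
        # keeps the residual orthogonal to the selected atoms.
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
    return support, coeffs

# Example: a random dictionary with unit-norm columns and a 3-sparse signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
x_true = np.zeros(256)
x_true[[5, 80, 200]] = [1.0, -0.7, 0.5]
support, coeffs = omp(D, D @ x_true, m=3)
print(sorted(support))
```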

3,865 citations


Book
01 Jan 2004
TL;DR: This paper establishes the possibility of stable recovery under a combination of sufficient sparsity and favorable structure of the overcomplete system, and shows that similar stability is also available using the basis pursuit and matching pursuit algorithms.
Abstract: Overcomplete representations are attracting interest in signal processing theory, particularly due to their potential to generate sparse representations of signals. However, in general, the problem of finding sparse representations must be unstable in the presence of noise. This paper establishes the possibility of stable recovery under a combination of sufficient sparsity and favorable structure of the overcomplete system. Considering an ideal underlying signal that has a sufficiently sparse representation, it is assumed that only a noisy version of it can be observed. Assuming further that the overcomplete system is incoherent, it is shown that the optimally sparse approximation to the noisy data differs from the optimally sparse decomposition of the ideal noiseless signal by at most a constant multiple of the noise level. As this optimal-sparsity method requires heavy (combinatorial) computational effort, approximation algorithms are considered. It is shown that similar stability is also available using the basis pursuit and matching pursuit algorithms. Furthermore, it is shown that these methods result in sparse approximation of the noisy data that contains only terms also appearing in the unique sparsest representation of the ideal noiseless sparse signal.

2,365 citations


Journal ArticleDOI
31 Aug 2004
TL;DR: It is shown that the data complexity of some queries is #P-complete, which implies that these queries do not admit any efficient evaluation methods, and an optimization algorithm is described that can compute efficiently most queries.
Abstract: We describe a system that supports arbitrarily complex SQL queries on probabilistic databases. The query semantics is based on a probabilistic model and the results are ranked, much like in Information Retrieval. Our main focus is efficient query evaluation, a problem that has not received attention in the past. We describe an optimization algorithm that can compute efficiently most queries. We show, however, that the data complexity of some queries is #P-complete, which implies that these queries do not admit any efficient evaluation methods. For these queries we describe both an approximation algorithm and a Monte-Carlo simulation algorithm.
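
As a rough illustration of the Monte-Carlo flavor of evaluation (a generic sketch over a tuple-independent model, not the paper's system or query language), the probability that a query answer holds can be estimated by sampling possible worlds:

```python
# Illustrative sketch: estimating a query answer's probability over a
# tuple-independent probabilistic database by Monte-Carlo sampling.
import random

def monte_carlo_probability(tuple_probs, query_holds, trials=10_000):
    """tuple_probs: dict tuple_id -> probability that the tuple is present.
    query_holds: function taking the set of present tuple_ids, returning bool."""
    hits = 0
    for _ in range(trials):
        world = {t for t, p in tuple_probs.items() if random.random() < p}
        if query_holds(world):
            hits += 1
    return hits / trials

# Toy example: the answer appears if tuple 'a' is present, or both 'b' and 'c'.
probs = {"a": 0.3, "b": 0.5, "c": 0.8}
est = monte_carlo_probability(probs, lambda w: "a" in w or {"b", "c"} <= w)
print(est)  # exact value is 1 - 0.7 * (1 - 0.4) = 0.58
```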

1,113 citations


Journal ArticleDOI
25 Jun 2004
TL;DR: This formulation is motivated from a document clustering problem in which one has a pairwise similarity function f learned from past data, and the goal is to partition the current set of documents in a way that correlates with f as much as possible; it can also be viewed as a kind of “agnostic learning” problem.
Abstract: We consider the following clustering problem: we have a complete graph on n vertices (items), where each edge (u, v) is labeled either + or − depending on whether u and v have been deemed to be similar or different. The goal is to produce a partition of the vertices (a clustering) that agrees as much as possible with the edge labels. That is, we want a clustering that maximizes the number of + edges within clusters, plus the number of − edges between clusters (equivalently, minimizes the number of disagreements: the number of − edges inside clusters plus the number of + edges between clusters). This formulation is motivated from a document clustering problem in which one has a pairwise similarity function f learned from past data, and the goal is to partition the current set of documents in a way that correlates with f as much as possible; it can also be viewed as a kind of “agnostic learning” problem. An interesting feature of this clustering formulation is that one does not need to specify the number of clusters k as a separate parameter, as in measures such as k-median or min-sum or min-max clustering. Instead, in our formulation, the optimal number of clusters could be any value between 1 and n, depending on the edge labels. We look at approximation algorithms for both minimizing disagreements and for maximizing agreements. For minimizing disagreements, we give a constant factor approximation. For maximizing agreements we give a PTAS, building on ideas of Goldreich, Goldwasser, and Ron (1998) and de la Vega (1996). We also show how to extend some of these results to graphs with edge labels in [−1, +1], and give some results for the case of random noise.
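
The objective in this formulation is easy to state in code. The sketch below (with made-up vertex names and labels) just counts disagreements for a candidate clustering; it is not an approximation algorithm, only the quantity that the algorithms above try to minimize.

```python
# Counting disagreements of a clustering against +/- edge labels on a complete
# graph; vertex names and labels are illustrative.
from itertools import combinations

def disagreements(labels, clustering):
    """labels: dict mapping frozenset({u, v}) -> '+' or '-';
    clustering: dict mapping vertex -> cluster id."""
    bad = 0
    for u, v in combinations(clustering, 2):
        same = clustering[u] == clustering[v]
        sign = labels[frozenset((u, v))]
        if (same and sign == '-') or (not same and sign == '+'):
            bad += 1
    return bad

verts = ['a', 'b', 'c', 'd']
labels = {frozenset(e): '+' for e in combinations(verts, 2)}
labels[frozenset(('c', 'd'))] = '-'
print(disagreements(labels, {'a': 0, 'b': 0, 'c': 0, 'd': 0}))  # 1: the '-' edge c-d lies inside
print(disagreements(labels, {'a': 0, 'b': 0, 'c': 0, 'd': 1}))  # 2: '+' edges a-d and b-d are cut
```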

996 citations


Journal ArticleDOI
TL;DR: This work analyzes local search heuristics for the metric k-median and facility location problems and shows that local search with swaps has a locality gap of 5 and introduces a new local search operation which opens one or more copies of a facility and drops zero or more facilities.
Abstract: We analyze local search heuristics for the metric k-median and facility location problems. We define the locality gap of a local search procedure for a minimization problem as the maximum ratio of a locally optimum solution (obtained using this procedure) to the global optimum. For k-median, we show that local search with swaps has a locality gap of 5. Furthermore, if we permit up to p facilities to be swapped simultaneously, then the locality gap is 3+2/p. This is the first analysis of a local search for k-median that provides a bounded performance guarantee with only k medians. This also improves the previously known 4-approximation for this problem. For uncapacitated facility location, we show that local search, which permits adding, dropping, and swapping a facility, has a locality gap of 3. This improves the bound of 5 given by M. Korupolu, C. Plaxton, and R. Rajaraman [Analysis of a Local Search Heuristic for Facility Location Problems, Technical Report 98-30, DIMACS, 1998]. We also consider a capacitated facility location problem where each facility has a capacity and we are allowed to open multiple copies of a facility. For this problem we introduce a new local search operation which opens one or more copies of a facility and drops zero or more facilities. We prove that this local search has a locality gap between 3 and 4.
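
A bare-bones version of the single-swap local search for k-median looks like the following sketch (the distance matrix and starting solution are arbitrary illustrations; the factor-5 guarantee comes from the paper's analysis of local optima, not from anything in this code).

```python
# Illustrative single-swap local search for k-median on a toy metric.
def kmedian_cost(dist, medians, clients):
    return sum(min(dist[c][m] for m in medians) for c in clients)

def local_search_kmedian(dist, facilities, clients, k):
    medians = set(list(facilities)[:k])          # arbitrary initial solution
    improved = True
    while improved:
        improved = False
        for out in list(medians):
            for inc in facilities:
                if inc in medians:
                    continue
                candidate = (medians - {out}) | {inc}
                if kmedian_cost(dist, candidate, clients) < kmedian_cost(dist, medians, clients):
                    medians, improved = candidate, True
                    break
            if improved:
                break
    return medians

# Toy metric given as a dict-of-dicts distance matrix; facilities double as clients.
pts = {0: 0.0, 1: 1.0, 2: 1.2, 3: 5.0, 4: 5.3}
dist = {i: {j: abs(pi - pj) for j, pj in pts.items()} for i, pi in pts.items()}
print(local_search_kmedian(dist, pts.keys(), pts.keys(), k=2))
```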

671 citations


Proceedings ArticleDOI
26 Sep 2004
TL;DR: This paper presents an efficient solution to determine the user-AP associations for max-min fair bandwidth allocation, and shows the strong correlation between fairness and load balancing, which enables the use of load balancing techniques for obtaining near-optimal max-min fair bandwidth allocation.
Abstract: Recent studies on operational wireless LANs (WLANs) have shown that user load is often unevenly distributed among wireless access points (APs). This unbalanced load results in unfair bandwidth allocation among users. We observe that the unbalanced load and unfair bandwidth allocation can be greatly alleviated by intelligently associating users to APs, termed association control, rather than having users greedily associate with the APs of best received signal strength. In this study, we present an efficient algorithmic solution to determine the user-AP associations that ensure max-min fair bandwidth allocation. We provide a rigorous formulation of the association control problem that considers bandwidth constraints of both the wireless and backhaul links. Our formulation indicates the strong correlation between fairness and load balancing, which enables us to use load balancing techniques for obtaining near optimal max-min fair bandwidth allocation. Since this problem is NP-hard, we present algorithms that achieve a constant-factor approximate max-min fair bandwidth allocation. First, we calculate a fractional load balancing solution, where users can be associated with multiple APs simultaneously. This solution guarantees the fairest bandwidth allocation in terms of max-min fairness. Then, by utilizing a rounding method we obtain an efficient integral association. In particular, we provide a 2-approximation algorithm for unweighted greedy users and a 3-approximation algorithm for weighted and bounded-demand users. In addition to bandwidth fairness, we also consider time fairness and we show it can be solved optimally. We further extend our schemes for the on-line case where users may join and leave. Our simulations demonstrate that the proposed algorithms achieve close to optimal load balancing and max-min fairness and they outperform commonly used heuristic approaches.

537 citations


Proceedings ArticleDOI
17 May 2004
TL;DR: In the presence of indivisibilities, it is shown that there exist allocations in which the envy is bounded by the maximum marginal utility, and an algorithm for computing such allocations is presented.
Abstract: We study the problem of fairly allocating a set of indivisible goods to a set of people from an algorithmic perspective. Fair division has been a central topic in the economic literature and several concepts of fairness have been suggested. The criterion that we focus on is envy-freeness. In our model, a monotone utility function is associated with every player specifying the value of each subset of the goods for the player. An allocation is envy-free if every player prefers her own share to the share of any other player. When the goods are divisible, envy-free allocations always exist. In the presence of indivisibilities, we show that there exist allocations in which the envy is bounded by the maximum marginal utility, and present a simple algorithm for computing such allocations. We then look at the optimization problem of finding an allocation with minimum possible envy. In the general case the problem is not solvable or approximable in polynomial time unless P = NP. We consider natural special cases (e.g., additive utilities) which are closely related to a class of job scheduling problems. Approximation algorithms as well as inapproximability results are obtained. Finally we investigate the problem of designing truthful mechanisms for producing allocations with bounded envy.
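
For the additive-utility special case mentioned above, the envy of an allocation is straightforward to compute; the following sketch (with hypothetical players and goods) evaluates the maximum envy, the quantity that the optimization versions try to minimize.

```python
# Maximum envy of an allocation under additive utilities; players, goods, and
# numbers are illustrative.
def max_envy(utilities, allocation):
    """utilities[p][g]: additive value of good g for player p;
    allocation[p]: set of goods assigned to player p."""
    def value(p, bundle):
        return sum(utilities[p][g] for g in bundle)
    return max(
        value(p, allocation[q]) - value(p, allocation[p])
        for p in allocation for q in allocation if p != q
    )

utilities = {"alice": {"x": 3, "y": 1, "z": 1}, "bob": {"x": 2, "y": 2, "z": 2}}
allocation = {"alice": {"x"}, "bob": {"y", "z"}}
print(max_envy(utilities, allocation))  # -1: non-positive, so no player envies another
```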

514 citations


Journal ArticleDOI
TL;DR: The paper presents a general method of designing constant-factor approximation algorithms for some discrete optimization problems with assignment-type constraints, and uses it to obtain better performance guarantees for some well-known problems including MAXIMUM COVERAGE, MAX CUT, and some of their generalizations.
Abstract: The paper presents a general method of designing constant-factor approximation algorithms for some discrete optimization problems with assignment-type constraints. The core of the method is a simple deterministic procedure of rounding of linear relaxations (referred to as pipage rounding). With the help of the method we design approximation algorithms with better performance guarantees for some well-known problems including MAXIMUM COVERAGE, MAX CUT with given sizes of parts and some of their generalizations.
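
A simplified sketch of the pipage idea for MAXIMUM COVERAGE is shown below (assuming a fractional solution x with integral sum is already given, e.g. from an LP solver; the instance and the helper coverage function are illustrative, and this is not the paper's full method). Two fractional coordinates are repeatedly shifted in opposite directions to an endpoint that does not decrease the coverage objective, which is convex along that direction.

```python
# Simplified pipage-style rounding for maximum coverage (illustrative sketch).
def coverage(sets, weights, x):
    # F(x) = sum over elements of w_e * (1 - prod over sets containing e of (1 - x_j))
    total = 0.0
    for e, w in weights.items():
        miss = 1.0
        for j, S in enumerate(sets):
            if e in S:
                miss *= 1.0 - x[j]
        total += w * (1.0 - miss)
    return total

def pipage_round(sets, weights, x, eps=1e-9):
    x = list(x)
    while True:
        frac = [j for j, v in enumerate(x) if eps < v < 1 - eps]
        if len(frac) < 2:
            break
        i, j = frac[0], frac[1]
        # Along the direction (x_i + t, x_j - t) the objective is convex in t,
        # so one of the two extreme moves is at least as good as staying put.
        candidates = []
        for t in (min(1 - x[i], x[j]), -min(x[i], 1 - x[j])):
            y = list(x)
            y[i], y[j] = x[i] + t, x[j] - t
            candidates.append((coverage(sets, weights, y), y))
        x = max(candidates)[1]
    return [round(v) for v in x]

sets = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]
weights = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}
print(pipage_round(sets, weights, [0.5, 0.5, 0.5, 0.5]))  # picks 2 of the 4 sets
```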

417 citations


Proceedings ArticleDOI
26 Apr 2004
TL;DR: In this article, three approximation algorithms for a variation of the set k-cover problem, where the objective is to partition the sensors into covers such that the number of covers that include an area, summed over all areas, is maximized, are presented.
Abstract: Wireless sensor networks (WSNs) are emerging as an effective means for environment monitoring. This paper investigates a strategy for energy efficient monitoring in WSNs that partitions the sensors into covers, and then activates the covers iteratively in a round-robin fashion. This approach takes advantage of the overlap created when many sensors monitor a single area. Our work builds upon previous work by Slijepcevic and Potkonjak (2001), where the model is first formulated. We have designed three approximation algorithms for a variation of the set k-cover problem, where the objective is to partition the sensors into covers such that the number of covers that include an area, summed over all areas, is maximized. The first algorithm is randomized and partitions the sensors, in expectation, within a fraction 1 - (1/e) (≈ 0.63) of the optimum. We present two other deterministic approximation algorithms. One is a distributed greedy algorithm with a 1/2 approximation ratio and the other is a centralized greedy algorithm with a 1 - (1/e) approximation ratio. We show that it is NP-complete to guarantee better than 15/16 of the optimal coverage, indicating that all three algorithms perform well with respect to the best approximation algorithm possible in polynomial time, assuming P ≠ NP. Simulations indicate that in practice, the deterministic algorithms perform far above their worst case bounds, consistently covering more than 72% of what is covered by an optimum solution. Simulations also indicate that the increase in longevity is proportional to the amount of overlap amongst the sensors. The algorithms are fast, easy to use, and according to simulations, significantly increase the longevity of sensor networks. The randomized algorithm in particular seems quite practical.
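
The randomized algorithm referred to above amounts to a uniform random assignment of sensors to covers; the sketch below (with a toy sensor-to-area map) shows such an assignment and the objective it is evaluated on. The 1 - (1/e) guarantee holds in expectation and is due to the paper's analysis, not to anything in this code.

```python
# Illustrative random partition of sensors into k covers and the coverage objective.
import random

def random_k_covers(sensor_areas, k, seed=0):
    """sensor_areas: dict sensor -> set of areas it monitors."""
    rng = random.Random(seed)
    covers = [set() for _ in range(k)]
    for sensor in sensor_areas:
        covers[rng.randrange(k)].add(sensor)
    return covers

def objective(covers, sensor_areas, areas):
    # For each area, count how many covers contain at least one sensor monitoring it.
    return sum(
        sum(1 for cover in covers
            if any(area in sensor_areas[s] for s in cover))
        for area in areas
    )

sensor_areas = {"s1": {"A", "B"}, "s2": {"B", "C"}, "s3": {"A", "C"}, "s4": {"C"}}
covers = random_k_covers(sensor_areas, k=2)
print(covers, objective(covers, sensor_areas, areas={"A", "B", "C"}))
```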

386 citations


Proceedings ArticleDOI
07 Mar 2004
TL;DR: An on-line distributed protocol that relies only on the local information available at each sensor node within the aggregation tree, and a pseudo-polynomial time approximation algorithm based on dynamic programming are developed.
Abstract: We study the problem of scheduling packet transmissions for data gathering in wireless sensor networks. The focus is to explore the energy-latency tradeoffs in wireless communication using techniques such as modulation scaling. The data aggregation tree - a multiple-source single-sink communication paradigm - is employed for abstracting the packet flow. We consider a real-time scenario where the data gathering must be performed within a specified latency constraint. We present algorithms to minimize the overall energy dissipation of the sensor nodes in the aggregation tree subject to the latency constraint. For the off-line problem, we propose (a) a numerical algorithm for the optimal solution, and (b) a pseudo-polynomial time approximation algorithm based on dynamic programming. We also discuss techniques for handling interference among the sensor nodes. Simulations have been conducted for both long-range communication and short-range communication. The simulation results show that compared with the classic shutdown technique, between 20% and 90% energy savings can be achieved by our techniques, under different settings of several key system parameters. We also develop an on-line distributed protocol that relies only on the local information available at each sensor node within the aggregation tree. Simulation results show that between 15% and 90% energy conservation can be achieved by the on-line protocol. The adaptability of the protocol with respect to variations in the packet size and latency constraint is also demonstrated through several run-time scenarios.

383 citations


Proceedings ArticleDOI
11 Oct 2004
TL;DR: This article designs a centralized approximation algorithm that delivers a near-optimal (within a factor of O(lg n)) solution, and presents a distributed version of the algorithm.
Abstract: In overdeployed sensor networks, one approach to conserve energy is to keep only a small subset of sensors active at any instant. In this article, we consider the problem of selecting a minimum size connected K-cover, which is defined as a set of sensors M such that each point in the sensor network is "covered" by at least K different sensors in M, and the communication graph induced by M is connected. For the above optimization problem, we design a centralized approximation algorithm that delivers a near-optimal (within a factor of O(lg n)) solution, and present a distributed version of the algorithm. We also present a communication-efficient localized distributed algorithm which is empirically shown to perform well.

Journal ArticleDOI
TL;DR: The clustering problem is the task of making the fewest changes to the edge set of an input graph so that it becomes a cluster graph, and it is shown that Cluster Editing is NP-complete, Cluster Deletion is NP-hard to approximate to within some constant factor, and Cluster Completion is polynomial.

Journal ArticleDOI
TL;DR: A (1+2/e)-approximation algorithm is obtained, which is a significant improvement on the previously known approximation guarantees, and works by rounding an optimal fractional solution to a linear programming relaxation.
Abstract: We consider the uncapacitated facility location problem. In this problem, there is a set of locations at which facilities can be built; a fixed cost fi is incurred if a facility is opened at location i. Furthermore, there is a set of demand locations to be serviced by the opened facilities; if the demand location j is assigned to a facility at location i, then there is an associated service cost proportional to the distance between i and j, cij. The objective is to determine which facilities to open and an assignment of demand points to the opened facilities, so as to minimize the total cost. We assume that the distance function c is symmetric and satisfies the triangle inequality. For this problem we obtain a (1+2/e)-approximation algorithm, where 1 + 2/e ≈ 1.736, which is a significant improvement on the previously known approximation guarantees. The algorithm works by rounding an optimal fractional solution to a linear programming relaxation. Our techniques use properties of optimal solutions to the linear program, randomized rounding, as well as a generalization of the decomposition techniques of Shmoys, Tardos, and Aardal [Proceedings of the 29th ACM Symposium on Theory of Computing, El Paso, TX, 1997, pp. 265--274].

Journal ArticleDOI
TL;DR: An optimization criterion is presented for discriminant analysis that extends the optimization criteria of the classical Linear Discriminant Analysis through the use of the pseudoinverse when the scatter matrices are singular, overcoming a limitation of classical LDA.
Abstract: An optimization criterion is presented for discriminant analysis. The criterion extends the optimization criteria of the classical Linear Discriminant Analysis (LDA) through the use of the pseudoinverse when the scatter matrices are singular. It is applicable regardless of the relative sizes of the data dimension and sample size, overcoming a limitation of classical LDA. The optimization problem can be solved analytically by applying the Generalized Singular Value Decomposition (GSVD) technique. The pseudoinverse has been suggested and used for undersampled problems in the past, where the data dimension exceeds the number of data points. The criterion proposed in this paper provides a theoretical justification for this procedure. An approximation algorithm for the GSVD-based approach is also presented. It reduces the computational complexity by finding subclusters of each cluster and uses their centroids to capture the structure of each cluster. This reduced problem yields much smaller matrices to which the GSVD can be applied efficiently. Experiments on text data, with up to 7,000 dimensions, show that the approximation algorithm produces results that are close to those produced by the exact algorithm.

Proceedings ArticleDOI
17 Oct 2004
TL;DR: This work presents the first linear time (1 + ε)-approximation algorithm for the k-means problem for fixed k and ε, which runs in O(nd) time.
Abstract: We present the first linear time (1 + ε)-approximation algorithm for the k-means problem for fixed k and ε. Our algorithm runs in O(nd) time, which is linear in the size of the input. Another feature of our algorithm is its simplicity - the only technique involved is random sampling.
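
To convey the random-sampling flavor in the simplest case k = 1 (a hedged illustration, not the paper's algorithm for general k): the centroid of a small uniform sample of the points already gives a cost close to that of the exact centroid.

```python
# Illustration of the sampling idea for the 1-means case.
import numpy as np

def one_means_cost(points, center):
    return float(((points - center) ** 2).sum())

rng = np.random.default_rng(1)
points = rng.standard_normal((10_000, 3)) + np.array([2.0, -1.0, 0.5])

optimal = points.mean(axis=0)                      # exact 1-means optimum
sample = points[rng.choice(len(points), size=20)]  # small uniform sample
estimate = sample.mean(axis=0)

print(one_means_cost(points, estimate) / one_means_cost(points, optimal))  # close to 1
```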

Proceedings ArticleDOI
13 Jun 2004
TL;DR: The problem of approximating the cut-norm of a given real matrix is MAX SNP hard, and the algorithm combines semidefinite programming with a rounding technique based on Grothendieck's Inequality to provide an efficient approximation algorithm.
Abstract: The cut-norm ||A||_C of a real matrix A = (a_ij), with i ∈ R and j ∈ S, is the maximum, over all I ⊂ R, J ⊂ S, of the quantity |Σ_{i∈I, j∈J} a_ij|. This concept plays a major role in the design of efficient approximation algorithms for dense graph and matrix problems. Here we show that the problem of approximating the cut-norm of a given real matrix is MAX SNP hard, and provide an efficient approximation algorithm. This algorithm finds, for a given matrix A = (a_ij), i ∈ R, j ∈ S, two subsets I ⊂ R and J ⊂ S, such that |Σ_{i∈I, j∈J} a_ij| ≥ ρ ||A||_C, where ρ > 0 is an absolute constant satisfying ρ > 0.56. The algorithm combines semidefinite programming with a rounding technique based on Grothendieck's Inequality. We present three known proofs of Grothendieck's inequality, with the necessary modifications which emphasize their algorithmic aspects. These proofs contain rounding techniques which go beyond the random hyperplane rounding of Goemans and Williamson [12], allowing us to transfer various algorithms for dense graph and matrix problems to the sparse case.
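
For very small matrices the cut-norm can be computed by brute force directly from the definition, which is useful for sanity checks; the sketch below does exactly that and does not reproduce the SDP-plus-Grothendieck rounding of the paper.

```python
# Brute-force cut-norm from the definition (exponential time; tiny matrices only).
from itertools import chain, combinations
import numpy as np

def cut_norm_bruteforce(A):
    rows, cols = A.shape
    def subsets(n):
        return chain.from_iterable(combinations(range(n), r) for r in range(n + 1))
    return max(
        abs(A[np.ix_(I, J)].sum()) if I and J else 0.0
        for I in subsets(rows) for J in subsets(cols)
    )

A = np.array([[1.0, -2.0], [3.0, 0.5]])
print(cut_norm_bruteforce(A))  # 4.0, attained by I = {0, 1}, J = {0}
```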

Book
01 Jan 2004
TL;DR: This thesis improves a long line of previous results and gives a 1.52-approximation algorithm for the uncapacitated facility location problem, and considers several important generalizations of the uncapacitated problem, including: (1) Capacitated facility location: each facility can serve only a certain number of clients.
Abstract: One of the most important aspects of logistics is to decide where to locate new facilities such as plants, distribution centers, and retailers. Facility location models not only have important applications in designing distribution systems, but also often form identifiable parts of other practical problems. However, many discrete location problems are NP-hard and the scale of the instances arising in practice is often too large to be solved optimally. In this thesis, we focus on polynomial time approximation algorithms for solving the well-known uncapacitated facility location problem and its generalizations. In the uncapacitated problem, we are given a set of clients; a set of possible locations for facilities, the cost of opening a facility at each location, and the cost of connecting each client to a facility at each location. The objective is to open facilities at a subset of these locations, and connect each client to an open facility to minimize the sum of facility opening and connection costs. We assume that connection costs obey the triangle inequality. It is known that a 1.46-approximation algorithm for the uncapacitated problem would imply P = NP. In this thesis, we improve a long line of previous results and give a 1.52-approximation algorithm for the uncapacitated problem. We also consider several important generalizations of the uncapacitated problem, including: (1) Capacitated facility location: Each facility can serve only a certain number of clients. We present a multi-exchange local search algorithm for this problem. We show its approximation ratio is between 3 + 2√2 − ε and 3 + 2√2 + ε for any given constant ε > 0. (2) Multi-level facility location: The demands must be routed among the facilities in a hierarchical order. We give the best combinatorial algorithm for this problem. For the special case when there are only two levels of facilities, we give a 1.77-approximation algorithm, which is currently the best. (3) Dynamic facility location: The demand varies between time-periods. In addition to the question of where to locate facilities, this problem also addresses the question of when to locate them. We develop the first approximation algorithm for this problem.

Journal ArticleDOI
TL;DR: This work devise a version of randomized rounding that is incentive compatible, giving a truthful mechanism for combinatorial auctions with single parameter agents (e.g., "single minded bidders") that approximately maximizes the social value of the auction.
Abstract: Mechanism design seeks algorithms whose inputs are provided by selfish agents who would lie if it were to their advantage. Incentive-compatible mechanisms compel the agents to tell the truth by making it in their self-interest to do so. Often, as in combinatorial auctions, such mechanisms involve the solution of NP-hard problems. Unfortunately, approximation algorithms typically destroy incentive compatibility. Randomized rounding is a commonly used technique for designing approximation algorithms. We devise a version of randomized rounding that is incentive-compatible, giving a truthful mechanism for combinatorial auctions with single parameter agents (e.g., "single minded bidders") that approximately maximizes the social value of the auction. We discuss two orthogonal notions of truthfulness for a randomized mechanism (truthfulness with high probability and truthfulness in expectation) and give a mechanism that achieves both simultaneously. We consider combinatorial auctions where multiple copies of many different item...

Journal ArticleDOI
TL;DR: This work considers the optimal investment and operational planning of gas field developments under uncertainty in gas reserves and presents a novel stochastic programming model that incorporates the decision-dependence of the scenario tree.

Proceedings ArticleDOI
13 Jun 2004
TL;DR: A constant-factor approximation is developed for a generalization of the orienteering problem in which both the start and the end nodes of the path are fixed, improving on the previously best known 4-approximation of [6].
Abstract: Given a metric space G on n nodes, with a start node r and deadlines D(v) for each vertex v, we consider the Deadline-TSP problem of finding a path starting at r that visits as many nodes as possible by their deadlines. We also consider the more general Vehicle Routing with Time-Windows problem, in which each node v also has a release-time R(v) and the goal is to visit as many nodes as possible within their "time-windows" [R(v), D(v)]. No good approximations were known previously for these problems on general metric spaces. We give an O(log n) approximation algorithm for Deadline-TSP, and extend this algorithm to an O(log² n) approximation for the Time-Window problem. We also give a bicriteria approximation algorithm for both problems: Given an ε > 0, our algorithm produces a (1/ε) approximation, while exceeding the deadlines by a factor of 1+ε. We use as a subroutine for these results a constant-factor approximation that we develop for a generalization of the orienteering problem in which both the start and the end nodes of the path are fixed. In the process, we give a 3-approximation to the orienteering problem, improving on the previously best known 4-approximation of [6].

Book ChapterDOI
22 Aug 2004
TL;DR: In this paper, a variant of the maximum coverage problem with group budget constraints (MCG) is studied, where each set S_i is a subset of a given ground set X and the goal is to pick k sets to maximize the cardinality of their union, but with the additional restriction that at most one set be picked from each group.
Abstract: We study a variant of the maximum coverage problem which we label the maximum coverage problem with group budget constraints (MCG). We are given a collection of sets S = {S_1, S_2, ..., S_m} where each set S_i is a subset of a given ground set X. In the maximum coverage problem the goal is to pick k sets from S to maximize the cardinality of their union. In the MCG problem S is partitioned into groups G_1, G_2, ..., G_l. The goal is to pick k sets from S to maximize the cardinality of their union but with the additional restriction that at most one set be picked from each group. We motivate the study of MCG by pointing out a variety of applications. We show that the greedy algorithm gives a 2-approximation algorithm for this problem which is tight in the oracle model. We also obtain a constant factor approximation algorithm for the cost version of the problem. We then use MCG to obtain the first constant factor approximation algorithms for the following problems: (i) multiple depot k-traveling repairmen problem with covering constraints and (ii) orienteering problem with time windows when the number of time windows is a constant.
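
The greedy rule analyzed in the paper is simple to state: repeatedly take the set with the largest marginal coverage among groups that have not yet contributed a set, stopping after k picks. The sketch below (with a toy instance) is one straightforward reading of that rule.

```python
# Illustrative greedy for maximum coverage with group budget constraints.
def greedy_mcg(groups, k):
    """groups: list of groups, each a list of frozensets of elements."""
    covered, picked, used_groups = set(), [], set()
    for _ in range(k):
        best = None
        for gi, group in enumerate(groups):
            if gi in used_groups:
                continue
            for S in group:
                gain = len(S - covered)
                if best is None or gain > best[0]:
                    best = (gain, gi, S)
        if best is None or best[0] == 0:
            break
        gain, gi, S = best
        covered |= S
        picked.append(S)
        used_groups.add(gi)
    return picked, covered

groups = [[frozenset({1, 2, 3}), frozenset({4})],
          [frozenset({3, 4, 5})],
          [frozenset({6})]]
print(greedy_mcg(groups, k=2))
```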

Journal ArticleDOI
TL;DR: A randomized approximation algorithm for centrality in weighted graphs is described that estimates the centrality of all vertices with high probability within a (1 + ε) factor in near-linear time for graphs exhibiting the small world phenomenon.
Abstract: Social studies researchers use graphs to model group activities in social networks. An important property in this context is the centrality of a vertex: the inverse of the average distance to each other vertex. We describe a randomized approximation algorithm for centrality in weighted graphs. For graphs exhibiting the small world phenomenon, our method estimates the centrality of all vertices with high probability within a (1 + ε) factor in near-linear time.
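
One way to implement the sampling estimator (the constants and the toy graph below are illustrative assumptions): run single-source shortest paths from a few random pivots, scale the summed distances into an estimate of each vertex's average distance, and invert.

```python
# Illustrative pivot-sampling estimate of centrality (inverse average distance).
import heapq, random

def dijkstra(graph, src):
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def estimate_centrality(graph, num_pivots=3, seed=0):
    n = len(graph)
    pivots = random.Random(seed).sample(list(graph), num_pivots)
    dists = [dijkstra(graph, p) for p in pivots]
    # Scale the sampled distance sums into an estimate of the average distance.
    avg = {v: n * sum(d[v] for d in dists) / (num_pivots * (n - 1)) for v in graph}
    return {v: (1.0 / a if a > 0 else float("inf")) for v, a in avg.items()}

graph = {  # small weighted undirected graph as adjacency lists
    "a": [("b", 1.0), ("c", 2.0)], "b": [("a", 1.0), ("c", 1.0), ("d", 3.0)],
    "c": [("a", 2.0), ("b", 1.0), ("d", 1.0)], "d": [("b", 3.0), ("c", 1.0)],
}
print(estimate_centrality(graph))
```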

Journal ArticleDOI
TL;DR: This paper shows that MAP is complete for NP^PP and provides further negative complexity results for algorithms based on variable elimination and introduces a generic MAP approximation framework that allows MAP approximation on networks that are too complex to even exactly solve the easier problems, Pr and MPE.
Abstract: MAP is the problem of finding a most probable instantiation of a set of variables given evidence. MAP has always been perceived to be significantly harder than the related problems of computing the probability of a variable instantiation (Pr), or the problem of computing the most probable explanation (MPE). This paper investigates the complexity of MAP in Bayesian networks. Specifically, we show that MAP is complete for NP^PP and provide further negative complexity results for algorithms based on variable elimination. We also show that MAP remains hard even when MPE and Pr become easy. For example, we show that MAP is NP-complete when the networks are restricted to polytrees, and even then can not be effectively approximated. Given the difficulty of computing MAP exactly, and the difficulty of approximating MAP while providing useful guarantees on the resulting approximation, we investigate best effort approximations. We introduce a generic MAP approximation framework. We provide two instantiations of the framework; one for networks which are amenable to exact inference (Pr), and one for networks for which even exact inference is too hard. This allows MAP approximation on networks that are too complex to even exactly solve the easier problems, Pr and MPE. Experimental results indicate that using these approximation algorithms provides much better solutions than standard techniques, and provide accurate MAP estimates in many cases.

Journal ArticleDOI
TL;DR: A polynomial-time approximation scheme is obtained for k-partial vertex cover on planar graphs and for covering k points in R^d by disks, and an approximation ratio of 4/3 is also obtained.

Proceedings ArticleDOI
27 Jun 2004
TL;DR: A bicriteria polynomial-time approximation algorithm with an O(log² n) approximation is presented for any constant ν > 1, and it is shown that no polynomial-time approximation algorithm can guarantee a finite approximation ratio unless P = NP.
Abstract: In this paper we consider the problem of (k, ν)-balanced graph partitioning - dividing the vertices of a graph into k almost equal size components (each of size less than ν · n/k) so that the capacity of edges between different components is minimized. This problem is a natural generalization of several other problems such as minimum bisection, which is the (2, 1)-balanced partitioning problem. We present a bicriteria polynomial time approximation algorithm with an O(log² n)-approximation for any constant ν > 1. For ν = 1 we show that no polytime approximation algorithm can guarantee a finite approximation ratio unless P = NP. Previous work has only considered the (k, ν)-balanced partitioning problem for ν ≥ 2.

Proceedings ArticleDOI
07 Mar 2004
TL;DR: This work is the first attempt to derive a performance-guaranteed polynomial time approximation algorithm for jointly solving the three problems that arise in determining energy-efficient communication strategies over a multi-hop wireless network.
Abstract: With increasing interest in energy constrained multi-hop wireless networks (Bambos, N. et al., 1991), a fundamental problem is one of determining energy efficient communication strategies over these multi-hop networks. The simplest problem is one where a given source node wants to communicate with a given destination, with a given rate over a multi-hop wireless network, using minimum power. Here the power refers to the total amount of power consumed over the entire network in order to achieve this rate between the source and the destination. There are three decisions that have to be made (jointly) in order to minimize the power requirement: (1) The path(s) that the data has to take between the source and the destination (Routing). (2) The power with which each link transmission is done (Power Control). (3) Depending on the interference or the MAC characteristics, the time slots in which specific link transmissions have to take place (Scheduling). To the best of our knowledge, ours is the first attempt to derive a performance guaranteed polynomial time approximation algorithm for jointly solving these three problems. We formulate the overall problem as an optimization problem with non-linear objective function and non-linear constraints. We then derive a polynomial time 3-approximation algorithm to solve this problem. We also present a simple version of the algorithm, with the same performance bound, which involves solving only shortest path problems and which is quite efficient in practice. Our approach readily extends to the case where there are multiple source-destination pairs that have to communicate simultaneously over the multi-hop network.

Journal Article
TL;DR: It is observed that a solution of the semidefinite relaxation for vertex cover, when strengthened with the triangle inequalities, can be transformed into a solution to a balanced cut problem, and therefore the existence of big well-separated sets in the sense of Arora et al.
Abstract: We reduce the approximation factor for the vertex cover to 2 − Θ(1/√(log n)) (instead of the previous 2 − Θ((ln ln n)/(2 ln n)) obtained by Bar-Yehuda and Even [1985] and Monien and Speckenmeyer [1985]). The improvement of the vanishing factor comes as an application of the recent results of Arora et al. [2004] that improved the approximation factor of the sparsest cut and balanced cut problems. In particular, we use the existence of two big and well-separated sets of nodes in the solution of the semidefinite relaxation for balanced cut, proven by Arora et al. [2004]. We observe that a solution of the semidefinite relaxation for vertex cover, when strengthened with the triangle inequalities, can be transformed into a solution of a balanced cut problem, and therefore the existence of big well-separated sets in the sense of Arora et al. [2004] translates into the existence of a big independent set.

Journal ArticleDOI
TL;DR: Packing integer programs capture a core problem that directly relates to both vector scheduling and vector bin packing, namely, the problem of packing a maximum number of vectors in a single bin of unit height.
Abstract: We study the approximability of multidimensional generalizations of three classical packing problems: multiprocessor scheduling, bin packing, and the knapsack problem. Specifically, we study the vector scheduling problem, its dual problem, namely, the vector bin packing problem, and a class of packing integer programs. The vector scheduling problem is to schedule n d-dimensional tasks on m machines such that the maximum load over all dimensions and all machines is minimized. The vector bin packing problem, on the other hand, seeks to minimize the number of bins needed to schedule all n tasks such that the maximum load on any dimension across all bins is bounded by a fixed quantity, say, 1. Such problems naturally arise when scheduling tasks that have multiple resource requirements. Finally, packing integer programs capture a core problem that directly relates to both vector scheduling and vector bin packing, namely, the problem of packing a maximum number of vectors in a single bin of unit height. We obtain a variety of new algorithmic as well as inapproximability results for these three problems.
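
To make the vector bin packing setting concrete, here is a simple first-fit baseline (an illustrative heuristic, not one of the paper's algorithms): each d-dimensional task goes into the first bin whose load stays at most 1 in every coordinate.

```python
# First-fit baseline for vector bin packing on a toy instance.
def first_fit_vector(tasks):
    bins = []        # each bin is the coordinate-wise load vector of its tasks
    assignment = []  # assignment[i] = index of the bin holding task i
    for task in tasks:
        for b, load in enumerate(bins):
            if all(l + t <= 1.0 for l, t in zip(load, task)):
                bins[b] = [l + t for l, t in zip(load, task)]
                assignment.append(b)
                break
        else:
            bins.append(list(task))
            assignment.append(len(bins) - 1)
    return assignment, bins

tasks = [(0.5, 0.2), (0.4, 0.9), (0.3, 0.3), (0.6, 0.1)]
print(first_fit_vector(tasks))  # two bins suffice for this instance
```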

Proceedings ArticleDOI
22 Aug 2004
TL;DR: This paper addresses the issue of overwhelmingly large output size by introducing and studying the following problem: what are the k sets that best approximate a collection of frequent item sets? Simple polynomial-time approximation algorithms are provided.
Abstract: One of the most well-studied problems in data mining is computing the collection of frequent item sets in large transactional databases. One obstacle for the applicability of frequent-set mining is that the size of the output collection can be far too large to be carefully examined and understood by the users. Even restricting the output to the border of the frequent item-set collection does not help much in alleviating the problem. In this paper we address the issue of overwhelmingly large output size by introducing and studying the following problem: What are the k sets that best approximate a collection of frequent item sets? Our measure of approximating a collection of sets by k sets is defined to be the size of the collection covered by the k sets, i.e., the part of the collection that is included in one of the k sets. We also specify a bound on the number of extra sets that are allowed to be covered. We examine different problem variants for which we demonstrate the hardness of the corresponding problems and we provide simple polynomial-time approximation algorithms. We give empirical evidence showing that the approximation methods work well in practice.

Proceedings ArticleDOI
01 Oct 2004
TL;DR: The MEGA algorithm is proposed, which yields a minimum-energy data gathering topology in O(n³) time for foreign coding, along with the first approximation algorithm for this problem, with approximation ratio 2(1 + √2) and running time O(m + n log n).
Abstract: In this paper, we consider energy-efficient gathering of correlated data in sensor networks. We focus on single-input coding strategies in order to aggregate correlated data. For foreign coding we propose the MEGA algorithm which yields a minimum-energy data gathering topology in O(n³) time. We also consider self-coding for which the problem of finding an optimal data gathering tree was recently shown to be NP-complete; with LEGA, we present the first approximation algorithm for this problem with approximation ratio 2(1 + √2) and running time O(m + n log n).