
Showing papers on "Greedy algorithm published in 2011"


Journal ArticleDOI
TL;DR: It is shown that under conditions on the mutual incoherence and the minimum magnitude of the nonzero components of the signal, the support of the signal can be recovered exactly by the OMP algorithm with high probability.
Abstract: We consider the orthogonal matching pursuit (OMP) algorithm for the recovery of a high-dimensional sparse signal based on a small number of noisy linear measurements. OMP is an iterative greedy algorithm that selects at each step the column that is most correlated with the current residual. In this paper, we present a fully data-driven OMP algorithm with explicit stopping rules. It is shown that under conditions on the mutual incoherence and the minimum magnitude of the nonzero components of the signal, the support of the signal can be recovered exactly by the OMP algorithm with high probability. In addition, we also consider the problem of identifying significant components in the case where some of the nonzero components are possibly small. It is shown that in this case the OMP algorithm will still select all the significant components before possibly selecting incorrect ones. Moreover, with modified stopping rules, the OMP algorithm can ensure that no zero components are selected.
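For reference, a minimal sketch of the generic OMP iteration described above (NumPy); the `tol` residual-norm rule is a placeholder, not the paper's explicit data-driven stopping rules:

```python
import numpy as np

def omp(X, y, max_iter, tol=1e-6):
    """Generic orthogonal matching pursuit sketch: at each step select the column
    most correlated with the current residual, then re-fit by least squares on the
    selected support. (The paper's explicit stopping rules are not reproduced here;
    a simple residual-norm tolerance is used instead.)"""
    n, p = X.shape
    support = []
    coef = np.zeros(0)
    residual = y.astype(float).copy()
    for _ in range(max_iter):
        j = int(np.argmax(np.abs(X.T @ residual)))   # most correlated column
        if j in support:
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ coef          # residual after orthogonal projection
        if np.linalg.norm(residual) < tol:
            break
    beta = np.zeros(p)
    beta[support] = coef
    return beta, support
```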

1,093 citations


Journal ArticleDOI
TL;DR: This paper shows that reformulating that step as a constrained flow optimization results in a convex problem and takes advantage of its particular structure to solve it using the k-shortest paths algorithm, which is very fast.
Abstract: Multi-object tracking can be achieved by detecting objects in individual frames and then linking detections across frames. Such an approach can be made very robust to the occasional detection failure: If an object is not detected in a frame but is in previous and following ones, a correct trajectory will nevertheless be produced. By contrast, a false-positive detection in a few frames will be ignored. However, when dealing with a multiple target problem, the linking step results in a difficult optimization problem in the space of all possible families of trajectories. This is usually dealt with by sampling or greedy search based on variants of Dynamic Programming which can easily miss the global optimum. In this paper, we show that reformulating that step as a constrained flow optimization results in a convex problem. We take advantage of its particular structure to solve it using the k-shortest paths algorithm, which is very fast. This new approach is far simpler formally and algorithmically than existing techniques and lets us demonstrate excellent performance in two very different contexts.

1,076 citations


Proceedings ArticleDOI
20 Jun 2011
TL;DR: A near-optimal algorithm based on dynamic programming which runs in time linear in the number of objects and linear in the sequence length is given, which results in state-of-the-art performance.
Abstract: We analyze the computational problem of multi-object tracking in video sequences. We formulate the problem using a cost function that requires estimating the number of tracks, as well as their birth and death states. We show that the global solution can be obtained with a greedy algorithm that sequentially instantiates tracks using shortest path computations on a flow network. Greedy algorithms allow one to embed pre-processing steps, such as nonmax suppression, within the tracking algorithm. Furthermore, we give a near-optimal algorithm based on dynamic programming which runs in time linear in the number of objects and linear in the sequence length. Our algorithms are fast, simple, and scalable, allowing us to process dense input data. This results in state-of-the-art performance.

904 citations


Proceedings ArticleDOI
20 Jun 2011
TL;DR: An efficient greedy algorithm for superpixel segmentation is developed by exploiting submodular and monotonic properties of the objective function and proving an approximation bound of ½ for the optimality of the solution.
Abstract: We propose a new objective function for superpixel segmentation. This objective function consists of two components: entropy rate of a random walk on a graph and a balancing term. The entropy rate favors formation of compact and homogeneous clusters, while the balancing function encourages clusters with similar sizes. We present a novel graph construction for images and show that this construction induces a matroid, a combinatorial structure that generalizes the concept of linear independence in vector spaces. The segmentation is then given by the graph topology that maximizes the objective function under the matroid constraint. By exploiting submodular and monotonic properties of the objective function, we develop an efficient greedy algorithm. Furthermore, we prove an approximation bound of ½ for the optimality of the solution. Extensive experiments on the Berkeley segmentation benchmark show that the proposed algorithm outperforms the state of the art in all the standard evaluation metrics.

894 citations


Proceedings ArticleDOI
28 Mar 2011
TL;DR: This work proposes CELF++, which builds on the CELF algorithm for tackling the second major source of inefficiency of the basic greedy algorithm, and empirically shows that it is 35-55% faster than CELF.
Abstract: Kempe et al. [4] (KKT) showed the problem of influence maximization is NP-hard and a simple greedy algorithm guarantees the best possible approximation factor in PTIME. However, it has two major sources of inefficiency. First, finding the expected spread of a node set is #P-hard. Second, the basic greedy algorithm is quadratic in the number of nodes. The first source is tackled by estimating the spread using Monte Carlo simulation or by using heuristics[4, 6, 2, 5, 1, 3]. Leskovec et al. proposed the CELF algorithm for tackling the second. In this work, we propose CELF++ and empirically show that it is 35-55% faster than CELF.
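As a rough illustration of the lazy-greedy idea that CELF builds on (and CELF++ further optimizes), here is a sketch using a max-heap of possibly stale marginal gains; `spread` is assumed to be a monotone submodular influence estimator (e.g. via Monte Carlo simulation) and is not part of the paper's code:

```python
import heapq

def lazy_greedy(candidates, spread, k):
    """CELF-style lazy greedy seed selection (sketch). `candidates` are node ids,
    `spread(S)` an assumed monotone submodular estimate of the influence of seed set S."""
    # initial marginal gains, stored as a max-heap via negated values
    heap = [(-spread({v}), v, 0) for v in candidates]
    heapq.heapify(heap)
    seeds, cur = set(), 0.0
    while len(seeds) < k and heap:
        neg_gain, v, stamp = heapq.heappop(heap)
        if stamp == len(seeds):          # gain is current w.r.t. the present seed set
            seeds.add(v)
            cur += -neg_gain
        else:                            # stale entry: recompute the marginal gain lazily
            gain = spread(seeds | {v}) - cur
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds, cur
```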

778 citations


Proceedings ArticleDOI
28 Mar 2011
TL;DR: This work studies the notion of competing campaigns in a social network and addresses the problem of influence limitation, where a "bad" campaign starts propagating from a certain node in the network and a limiting campaign is used to counteract the effect of misinformation.
Abstract: In this work, we study the notion of competing campaigns in a social network and address the problem of influence limitation, where a "bad" campaign starts propagating from a certain node in the network and a limiting campaign is used to counteract the effect of misinformation. The problem can be summarized as identifying a subset of individuals that need to be convinced to adopt the competing (or "good") campaign so as to minimize the number of people that adopt the "bad" campaign at the end of both propagation processes. We show that this optimization problem is NP-hard and provide approximation guarantees for a greedy solution for various definitions of this problem by proving that they are submodular. We experimentally compare the performance of the greedy method to various heuristics. The experiments reveal that in most cases inexpensive heuristics such as degree centrality compare well with the greedy approach. We also study the influence limitation problem in the presence of missing data, where the current states of nodes in the network are only known with a certain probability, and show that prediction in this setting is a supermodular problem. We propose a prediction algorithm that is based on generating random spanning trees and evaluate the performance of this approach. The experiments reveal that using the prediction algorithm, we are able to tolerate about 90% missing data before the performance of the algorithm starts degrading, and even with large amounts of missing data the performance degrades only to 75% of the performance that would be achieved with complete data.

761 citations


Journal ArticleDOI
TL;DR: A randomized (1-1/e)-approximation is given for maximizing any monotone submodular function subject to a matroid constraint in the value oracle model, combining a continuous greedy process with pipage rounding.
Abstract: Let $f:2^X \rightarrow \cal R_+$ be a monotone submodular set function, and let $(X,\cal I)$ be a matroid. We consider the problem ${\rm max}_{S \in \cal I} f(S)$. It is known that the greedy algorithm yields a $1/2$-approximation [M. L. Fisher, G. L. Nemhauser, and L. A. Wolsey, Math. Programming Stud., no. 8 (1978), pp. 73-87] for this problem. For certain special cases, e.g., ${\rm max}_{|S| \leq k} f(S)$, the greedy algorithm yields a $(1-1/e)$-approximation. It is known that this is optimal both in the value oracle model (where the only access to $f$ is through a black box returning $f(S)$ for a given set $S$) [G. L. Nemhauser and L. A. Wolsey, Math. Oper. Res., 3 (1978), pp. 177-188] and for explicitly posed instances assuming $P \neq NP$ [U. Feige, J. ACM, 45 (1998), pp. 634-652]. In this paper, we provide a randomized $(1-1/e)$-approximation for any monotone submodular function and an arbitrary matroid. The algorithm works in the value oracle model. Our main tools are a variant of the pipage rounding technique of Ageev and Sviridenko [J. Combin. Optim., 8 (2004), pp. 307-328], and a continuous greedy process that may be of independent interest. As a special case, our algorithm implies an optimal approximation for the submodular welfare problem in the value oracle model [J. Vondrak, Proceedings of the $38$th ACM Symposium on Theory of Computing, 2008, pp. 67-74]. As a second application, we show that the generalized assignment problem (GAP) is also a special case; although the reduction requires $|X|$ to be exponential in the original problem size, we are able to achieve a $(1-1/e-o(1))$-approximation for GAP, simplifying previously known algorithms. Additionally, the reduction enables us to obtain approximation algorithms for variants of GAP with more general constraints.
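For context, the classic 1/2-approximation greedy that this result improves on can be sketched as below; `f` (a monotone submodular set function) and `is_independent` (a matroid independence oracle) are assumed callables, and the paper's continuous greedy / pipage rounding algorithm itself is not reproduced here:

```python
def matroid_greedy(ground_set, f, is_independent):
    """Classic 1/2-approximation greedy for max f(S) subject to S independent in a
    matroid (the baseline the paper improves on). Sketch only; O(n^2) oracle calls."""
    S = set()
    remaining = set(ground_set)
    while remaining:
        best, best_gain = None, 0.0
        for e in remaining:
            if is_independent(S | {e}):            # keep feasibility in the matroid
                gain = f(S | {e}) - f(S)           # marginal gain of adding e
                if best is None or gain > best_gain:
                    best, best_gain = e, gain
        if best is None:                            # nothing can be added any more
            break
        S.add(best)
        remaining.discard(best)
    return S
```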

761 citations


Journal ArticleDOI
TL;DR: In this article, the concept of adaptive submodularity is introduced, which generalizes submodular set functions to adaptive policies and provides performance guarantees for both stochastic maximization and coverage, and can be exploited to speed up the greedy algorithm by using lazy evaluations.
Abstract: Many problems in artificial intelligence require adaptively making a sequence of decisions with uncertain outcomes under partial observability. Solving such stochastic optimization problems is a fundamental but notoriously difficult challenge. In this paper, we introduce the concept of adaptive submodularity, generalizing submodular set functions to adaptive policies. We prove that if a problem satisfies this property, a simple adaptive greedy algorithm is guaranteed to be competitive with the optimal policy. In addition to providing performance guarantees for both stochastic maximization and coverage, adaptive submodularity can be exploited to drastically speed up the greedy algorithm by using lazy evaluations. We illustrate the usefulness of the concept by giving several examples of adaptive submodular objectives arising in diverse AI applications including management of sensing resources, viral marketing and active learning. Proving adaptive submodularity for these problems allows us to recover existing results in these applications as special cases, improve approximation guarantees and handle natural generalizations.

570 citations


Journal ArticleDOI
TL;DR: A fast greedy algorithm to select a subset of measurements to be protected is proposed and another greedy algorithm that facilitates the placement of secure phasor measurement units (PMUs) to defend against data injection attacks is developed.
Abstract: Data injection attacks to manipulate system state estimators on power grids are considered. A unified formulation for the problem of constructing attacking vectors is developed for linearized measurement models. Based on this formulation, a new low-complexity attacking strategy is shown to significantly outperform naive l1 relaxation. It is demonstrated that it is possible to defend against malicious data injection if a small subset of measurements can be made immune to the attacks. However, selecting such subsets is a high-complexity combinatorial problem given the typically large size of electrical grids. To address the complexity issue, a fast greedy algorithm to select a subset of measurements to be protected is proposed. Another greedy algorithm that facilitates the placement of secure phasor measurement units (PMUs) to defend against data injection attacks is also developed. Simulations on the IEEE test systems demonstrate the benefits of the proposed algorithms.

555 citations


Journal ArticleDOI
TL;DR: A total cost minimization is formulated that allows for a flexible tradeoff between flow-level performance and energy consumption, and simple greedy-on and greedy-off algorithms are proposed that are inspired by the mathematical background of the submodularity maximization problem.
Abstract: Energy-efficiency, one of the major design goals in wireless cellular networks, has received much attention lately, due to increased awareness of environmental and economic issues for network operators. In this paper, we develop a theoretical framework for BS energy saving that encompasses dynamic BS operation and the related problem of user association together. Specifically, we formulate a total cost minimization that allows for a flexible tradeoff between flow-level performance and energy consumption. For the user association problem, we propose an optimal energy-efficient user association policy and further present a distributed implementation with provable convergence. For the BS operation problem (i.e., BS switching on/off), which is a challenging combinatorial problem, we propose simple greedy-on and greedy-off algorithms that are inspired by the mathematical background of the submodularity maximization problem. Moreover, we propose other heuristic algorithms based on the distances between BSs or the utilizations of BSs that do not impose any additional signaling overhead and thus are easy to implement in practice. Extensive simulations under various practical configurations demonstrate that the proposed user association and BS operation algorithms can significantly reduce energy consumption.

479 citations


Proceedings ArticleDOI
11 Dec 2011
TL;DR: This paper proposes Simpath, an efficient and effective algorithm for influence maximization under the linear threshold model that addresses these drawbacks by incorporating several clever optimizations, and shows that Simpath consistently outperforms the state of the art w.r.t. running time, memory consumption and the quality of the seed set chosen.
Abstract: There is significant current interest in the problem of influence maximization: given a directed social network with influence weights on edges and a number k, find k seed nodes such that activating them leads to the maximum expected number of activated nodes, according to a propagation model. Kempe et al. showed, among other things, that under the Linear Threshold Model, the problem is NP-hard, and that a simple greedy algorithm guarantees the best possible approximation factor in PTIME. However, this algorithm suffers from various major performance drawbacks. In this paper, we propose Simpath, an efficient and effective algorithm for influence maximization under the linear threshold model that addresses these drawbacks by incorporating several clever optimizations. Through a comprehensive performance study on four real data sets, we show that Simpath consistently outperforms the state of the art w.r.t. running time, memory consumption and the quality of the seed set chosen, measured in terms of expected influence spread achieved.

Posted Content
TL;DR: The submodularity ratio is introduced as a key quantity to help understand why greedy algorithms perform well even when the variables are highly correlated, and is a stronger predictor of the performance of greedy algorithms than other spectral parameters.
Abstract: We study the problem of selecting a subset of k random variables from a large set, in order to obtain the best linear prediction of another variable of interest. This problem can be viewed in the context of both feature selection and sparse approximation. We analyze the performance of widely used greedy heuristics, using insights from the maximization of submodular functions and spectral analysis. We introduce the submodularity ratio as a key quantity to help understand why greedy algorithms perform well even when the variables are highly correlated. Using our techniques, we obtain the strongest known approximation guarantees for this problem, both in terms of the submodularity ratio and the smallest k-sparse eigenvalue of the covariance matrix. We further demonstrate the wide applicability of our techniques by analyzing greedy algorithms for the dictionary selection problem, and significantly improve the previously known guarantees. Our theoretical analysis is complemented by experiments on real-world and synthetic data sets; the experiments show that the submodularity ratio is a stronger predictor of the performance of greedy algorithms than other spectral parameters.
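A hedged sketch of the forward-selection greedy analyzed in this setting, choosing at each step the column that most reduces the residual sum of squares; the function name and the RSS criterion are illustrative choices, not taken verbatim from the paper:

```python
import numpy as np

def greedy_subset_regression(X, y, k):
    """Forward-selection greedy for subset selection in linear regression (sketch):
    at each step add the column that most reduces the residual sum of squares,
    i.e. most increases the variance explained."""
    p = X.shape[1]
    support = []
    for _ in range(min(k, p)):
        best_j, best_rss = None, float("inf")
        for j in range(p):
            if j in support:
                continue
            cols = support + [j]
            coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            rss = float(np.sum((y - X[:, cols] @ coef) ** 2))
            if rss < best_rss:
                best_j, best_rss = j, rss
        support.append(best_j)
    coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
    return support, coef
```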

Journal ArticleDOI
TL;DR: It is proved that, if constraints in the SP problem are optimally removed (i.e., one deletes those constraints leading to the largest possible cost improvement), then a precise optimality link to the original chance-constrained problem CCP also holds.
Abstract: In this paper, we study the link between a Chance-Constrained optimization Problem (CCP) and its sample counterpart (SP). SP has a finite number, say N, of sampled constraints. Further, some of these sampled constraints, say k, are discarded, and the final solution is indicated by $x^{\ast}_{N,k}$ . Extending previous results on the feasibility of sample convex optimization programs, we establish the feasibility of $x^{\ast}_{N,k}$ for the initial CCP problem. Constraints removal allows one to improve the cost function at the price of a decreased feasibility. The cost improvement can be inspected directly from the optimization result, while the theory here developed permits to keep control on the other side of the coin, the feasibility of the obtained solution. In this way, trading feasibility for performance is put on solid mathematical grounds in this paper. The feasibility result here obtained applies to a vast class of chance-constrained optimization problems, and has the distinctive feature that it holds true irrespective of the algorithm used to discard k constraints in the SP problem. For constraints discarding, one can thus, e.g., resort to one of the many methods introduced in the literature to solve chance-constrained problems with discrete distribution, or even use a greedy algorithm, which is computationally very low-demanding, and the feasibility result remains intact. We further prove that, if constraints in the SP problem are optimally removed—i.e., one deletes those constraints leading to the largest possible cost improvement—, then a precise optimality link to the original chance-constrained problem CCP in addition holds.

Journal ArticleDOI
TL;DR: This paper proposes a new sparsity-based algorithm for automatic target detection in hyperspectral imagery (HSI) based on the concept that a pixel in HSI lies in a low-dimensional subspace and thus can be represented as a sparse linear combination of the training samples.
Abstract: In this paper, we propose a new sparsity-based algorithm for automatic target detection in hyperspectral imagery (HSI). This algorithm is based on the concept that a pixel in HSI lies in a low-dimensional subspace and thus can be represented as a sparse linear combination of the training samples. The sparse representation (a sparse vector corresponding to the linear combination of a few selected training samples) of a test sample can be recovered by solving an l0-norm minimization problem. With the recent development of compressed sensing theory, such a minimization problem can be recast as a standard linear programming problem or efficiently approximated by greedy pursuit algorithms. Once the sparse vector is obtained, the class of the test sample can be determined by the characteristics of the sparse vector on reconstruction. In addition to the constraints on sparsity and reconstruction accuracy, we also exploit the fact that in HSI the neighboring pixels have a similar spectral characteristic (smoothness). In our proposed algorithm, a smoothness constraint is also imposed by forcing the vector Laplacian at each reconstructed pixel to be minimal at all times within the minimization process. The proposed sparsity-based algorithm is applied to several hyperspectral images to detect targets of interest. Simulation results show that our algorithm outperforms the classical hyperspectral target detection algorithms, such as the popular spectral matched filters, matched subspace detectors, adaptive subspace detectors, as well as binary classifiers such as support vector machines.

Proceedings ArticleDOI
27 Jun 2011
TL;DR: This paper investigates the application of CS to data collection in wireless sensor networks, and aims at minimizing the network energy consumption through joint routing and compressed aggregation, and proposes a mixed-integer programming formulation along with a greedy heuristic.
Abstract: As a burgeoning technique for signal processing, compressed sensing (CS) is being increasingly applied to wireless communications. However, little work is done to apply CS to multihop networking scenarios. In this paper, we investigate the application of CS to data collection in wireless sensor networks, and we aim at minimizing the network energy consumption through joint routing and compressed aggregation. We first characterize the optimal solution to this optimization problem, then we prove its NP-completeness. We further propose a mixed-integer programming formulation along with a greedy heuristic, from which both the optimal (for small scale problems) and the near-optimal (for large scale problems) aggregation trees are obtained. Our results validate the efficacy of the greedy heuristics, as well as the great improvement in energy efficiency through our joint routing and aggregation scheme.

Journal Article
TL;DR: In this article, a general theory is developed for learning with structured sparsity, based on the notion of coding complexity associated with the structure, which is a natural extension of the standard sparsity concept in statistical learning and compressive sensing.
Abstract: This paper investigates a learning formulation called structured sparsity, which is a natural extension of the standard sparsity concept in statistical learning and compressive sensing. By allowing arbitrary structures on the feature set, this concept generalizes the group sparsity idea that has become popular in recent years. A general theory is developed for learning with structured sparsity, based on the notion of coding complexity associated with the structure. It is shown that if the coding complexity of the target signal is small, then one can achieve improved performance by using coding complexity regularization methods, which generalize the standard sparse regularization. Moreover, a structured greedy algorithm is proposed to efficiently solve the structured sparsity problem. It is shown that the greedy algorithm approximately solves the coding complexity optimization problem under appropriate conditions. Experiments are included to demonstrate the advantage of structured sparsity over standard sparsity on some real applications.

Proceedings ArticleDOI
21 Sep 2011
TL;DR: This work models the workload consolidation problem as an instance of the multi-dimensional bin-packing (MDBP) problem and designs a novel, nature-inspired workload consolidation algorithm based on Ant Colony Optimization (ACO), which outperforms the evaluated greedy algorithm.
Abstract: With increasing numbers of energy-hungry data centers, energy conservation has now become a major design constraint. One traditional approach to conserve energy in virtualized data centers is to perform workload (i.e., VM) consolidation. Thereby, workload is packed on the least number of physical machines and over-provisioned resources are transitioned into a lower power state. However, most of the workload consolidation approaches applied until now are limited to a single resource (e.g., CPU) and rely on simple greedy algorithms such as First-Fit Decreasing (FFD), which perform resource-dissipative workload placement. Moreover, they are highly centralized and known to be hard to distribute. In this work, we model the workload consolidation problem as an instance of the multi-dimensional bin-packing (MDBP) problem and design a novel, nature-inspired workload consolidation algorithm based on Ant Colony Optimization (ACO). We evaluate the ACO-based approach by comparing it with one frequently applied greedy algorithm (i.e., FFD). Our simulation results demonstrate that ACO outperforms the evaluated greedy algorithm as it achieves superior energy gains through better server utilization and requires fewer machines. Moreover, it computes solutions which are nearly optimal. Finally, the autonomous nature of the approach allows it to be implemented in a fully distributed environment.
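For reference, the First-Fit Decreasing baseline mentioned above, sketched in one dimension (the paper's consolidation setting is multi-dimensional, so this only illustrates the greedy skeleton):

```python
def first_fit_decreasing(items, capacity):
    """First-Fit Decreasing bin packing (sketch): sort item sizes (e.g. VM resource
    demands) in decreasing order and place each into the first bin (machine) that
    still has enough remaining capacity, opening a new bin otherwise."""
    bins = []  # each bin is [remaining_capacity, [items placed in it]]
    for item in sorted(items, reverse=True):
        for b in bins:
            if b[0] >= item:
                b[0] -= item
                b[1].append(item)
                break
        else:                                  # no existing bin fits: open a new one
            bins.append([capacity - item, [item]])
    return [contents for _, contents in bins]
```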

Proceedings ArticleDOI
06 Nov 2011
TL;DR: CoSand is proposed, a distributed cosegmentation approach for a highly variable large-scale image collection that takes advantage of a strong theoretic property in that the temperature under linear anisotropic diffusion is a submodular function; therefore, a greedy algorithm guarantees at least a constant factor approximation to the optimal solution for temperature maximization.
Abstract: The saliency of regions or objects in an image can be significantly boosted if they recur in multiple images. Leveraging this idea, cosegmentation jointly segments common regions from multiple images. In this paper, we propose CoSand, a distributed cosegmentation approach for a highly variable large-scale image collection. The segmentation task is modeled by temperature maximization on anisotropic heat diffusion, of which the temperature maximization with finite K heat sources corresponds to a K-way segmentation that maximizes the segmentation confidence of every pixel in an image. We show that our method takes advantage of a strong theoretic property in that the temperature under linear anisotropic diffusion is a submodular function; therefore, a greedy algorithm guarantees at least a constant factor approximation to the optimal solution for temperature maximization. Our theoretic result is successfully applied to scalable cosegmentation as well as diversity ranking and single-image segmentation. We evaluate CoSand on MSRC and ImageNet datasets, and show its competence both in competitive performance over previous work, and in much superior scalability.

Proceedings ArticleDOI
22 Oct 2011
TL;DR: This work presents a new unified continuous greedy algorithm which finds approximate fractional solutions for both the non-monotone and monotone cases, and improves on the approximation ratio for many applications.
Abstract: The study of combinatorial problems with a submodular objective function has attracted much attention in recent years, and is partly motivated by the importance of such problems to economics, algorithmic game theory and combinatorial optimization. Classical works on these problems are mostly combinatorial in nature. Recently, however, many results based on continuous algorithmic tools have emerged. The main bottleneck of such continuous techniques is how to approximately solve a non-convex relaxation for the submodular problem at hand. Thus, the efficient computation of better fractional solutions immediately implies improved approximations for numerous applications. A simple and elegant method, called "continuous greedy", successfully tackles this issue for monotone submodular objective functions, however, only much more complex tools are known to work for general non-monotone submodular objectives. In this work we present a new unified continuous greedy algorithm which finds approximate fractional solutions for both the non-monotone and monotone cases, and improves on the approximation ratio for many applications. For general non-monotone submodular objective functions, our algorithm achieves an improved approximation ratio of about $1/e$. For monotone submodular objective functions, our algorithm achieves an approximation ratio that depends on the density of the polytope defined by the problem at hand, which is always at least as good as the previously known best approximation ratio of $1 - 1/e$. Some notable immediate implications are an improved $1/e$-approximation for maximizing a non-monotone submodular function subject to a matroid or $O(1)$-knapsack constraints, and information-theoretic tight approximations for Submodular Max-SAT and Submodular Welfare with $k$ players, for any number of players $k$. A framework for submodular optimization problems, called the contention resolution framework, was introduced recently by Chekuri et al. The improved approximation ratio of the unified continuous greedy algorithm implies improved approximation ratios for many problems through this framework. Moreover, via a parameter called stopping time, our algorithm merges the relaxation solving and re-normalization steps of the framework, and achieves, for some applications, further improvements. We also describe new monotone balanced contention resolution schemes for various matching, scheduling and packing problems, thus improving the approximations achieved for these problems via the framework.

Proceedings ArticleDOI
06 Jun 2011
TL;DR: This paper studies the ranking algorithm in the random arrivals model, and shows that it has a competitive ratio of at least 0.696, beating the 1-1/e ≈ 0.632 barrier in the adversarial model.
Abstract: In a seminal paper, Karp, Vazirani, and Vazirani show that a simple ranking algorithm achieves a competitive ratio of 1-1/e for the online bipartite matching problem in the standard adversarial model, where the ratio of 1-1/e is also shown to be optimal. Their result also implies that in the random arrivals model defined by Goel and Mehta, where the online nodes arrive in a random order, a simple greedy algorithm achieves a competitive ratio of 1-1/e. In this paper, we study the ranking algorithm in the random arrivals model, and show that it has a competitive ratio of at least 0.696, beating the 1-1/e ≈ 0.632 barrier in the adversarial model. Our result also extends to the i.i.d. distribution model of Feldman et al., removing the assumption that the distribution is known. Our analysis has two main steps. First, we exploit certain dominance and monotonicity properties of the ranking algorithm to derive a family of factor-revealing linear programs (LPs). In particular, by symmetry of the ranking algorithm in the random arrivals model, we have the monotonicity property on both sides of the bipartite graph, giving good "strength" to the LPs. Second, to obtain a good lower bound on the optimal values of all these LPs and hence on the competitive ratio of the algorithm, we introduce the technique of strongly factor-revealing LPs. In particular, we derive a family of modified LPs with similar strength such that the optimal value of any single one of these new LPs is a lower bound on the competitive ratio of the algorithm. This enables us to leverage the power of computer LP solvers to solve for large instances of the new LPs to establish bounds that would otherwise be difficult to attain by human analysis.
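A minimal sketch of the RANKING algorithm analyzed in the paper; the dictionary-based graph representation (`neighbors[u]` listing the offline nodes adjacent to online node u) is an assumption for illustration:

```python
import random

def ranking_matching(offline_nodes, online_arrivals, neighbors, seed=None):
    """Karp-Vazirani-Vazirani RANKING (sketch): fix one random permutation (ranking)
    of the offline side, then match each arriving online node to its highest-ranked
    still-unmatched neighbor."""
    rng = random.Random(seed)
    order = list(offline_nodes)
    rng.shuffle(order)
    rank = {v: r for r, v in enumerate(order)}   # lower rank = higher priority
    matched = {}                                 # offline node -> online node
    matching = []
    for u in online_arrivals:
        free = [v for v in neighbors.get(u, []) if v not in matched]
        if free:
            v = min(free, key=lambda x: rank[x])
            matched[v] = u
            matching.append((u, v))
    return matching
```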

Journal ArticleDOI
Tong Zhang
TL;DR: This work proposes a novel combination that is based on the forward greedy algorithm but takes backward steps adaptively whenever beneficial, and develops strong theoretical results for the new procedure showing that it can effectively solve the problem of learning a sparse target function.
Abstract: Given a large number of basis functions that can be potentially more than the number of samples, we consider the problem of learning a sparse target function that can be expressed as a linear combination of a small number of these basis functions. We are interested in two closely related themes: feature selection, or identifying the basis functions with nonzero coefficients; and estimation accuracy, or reconstructing the target function from noisy observations. Two heuristics that are widely used in practice are forward and backward greedy algorithms. First, we show that neither idea is adequate. Second, we propose a novel combination that is based on the forward greedy algorithm but takes backward steps adaptively whenever beneficial. For least squares regression, we develop strong theoretical results for the new procedure showing that it can effectively solve this problem under some assumptions. Experimental results support our theory.
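A hedged sketch of an adaptive forward-backward greedy in the spirit of the procedure described above; the `backward_factor` threshold and iteration cap are illustrative simplifications, not the paper's exact adaptive conditions:

```python
import numpy as np

def foba(X, y, max_features, backward_factor=0.5):
    """Forward-backward greedy sketch: forward step adds the feature that most reduces
    the residual sum of squares; backward steps drop a selected feature whenever doing
    so costs less than backward_factor times the gain of the last forward step."""
    y = y.astype(float)

    def rss(cols):
        if not cols:
            return float(np.sum(y ** 2))
        coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        return float(np.sum((y - X[:, cols] @ coef) ** 2))

    support = []
    for _ in range(2 * max_features):            # simple cap to guarantee termination
        if len(support) >= min(max_features, X.shape[1]):
            break
        rest = [j for j in range(X.shape[1]) if j not in support]
        best_j = min(rest, key=lambda j: rss(support + [j]))
        gain = rss(support) - rss(support + [best_j])
        if gain <= 1e-12:                        # no useful forward step left
            break
        support.append(best_j)
        while len(support) > 1:                  # adaptive backward steps
            worst = min(support, key=lambda j: rss([c for c in support if c != j]))
            loss = rss([c for c in support if c != worst]) - rss(support)
            if loss < backward_factor * gain:
                support.remove(worst)
            else:
                break
    coef = np.linalg.lstsq(X[:, support], y, rcond=None)[0] if support else np.zeros(0)
    return support, coef
```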

Proceedings ArticleDOI
16 Jul 2011
TL;DR: This work develops a general framework for social choice problems in which a limited number of alternatives can be recommended to an agent population, and generalizes certain multiwinner election schemes.
Abstract: We develop a general framework for social choice problems in which a limited number of alternatives can be recommended to an agent population. In our budgeted social choice model, this limit is determined by a budget, capturing problems that arise naturally in a variety of contexts, and spanning the continuum from pure consensus decision making (i.e., standard social choice) to fully personalized recommendation. Our approach applies a form of segmentation to social choice problems, requiring the selection of diverse options tailored to different agent types, and generalizes certain multiwinner election schemes. We show that standard rank aggregation methods perform poorly, and that optimization in our model is NP-complete; but we develop fast greedy algorithms with some theoretical guarantees. Experiments on real-world datasets demonstrate the effectiveness of our algorithms.

Proceedings Article
12 Dec 2011
TL;DR: This work proposes a natural optimization problem for signal recovery under this model and develops a new greedy algorithm called SpaRCS to solve it, which inherits a number of desirable properties from the state-of-the-art CoSaMP and ADMiRA algorithms.
Abstract: We consider the problem of recovering a matrix M that is the sum of a low-rank matrix L and a sparse matrix S from a small set of linear measurements of the form y = A(M)= A(L + S). This model subsumes three important classes of signal recovery problems: compressive sensing, affine rank minimization, and robust principal component analysis. We propose a natural optimization problem for signal recovery under this model and develop a new greedy algorithm called SpaRCS to solve it. Empirically, SpaRCS inherits a number of desirable properties from the state-of-the-art CoSaMP and ADMiRA algorithms, including exponential convergence and efficient implementation. Simulation results with video compressive sensing, hyperspectral imaging, and robust matrix completion data sets demonstrate both the accuracy and efficacy of the algorithm.

Journal ArticleDOI
TL;DR: One meta-heuristic search algorithm for constructing CIT samples is reformulated to more efficiently incorporate constraints; the new version compares favorably with greedy algorithms on real-world problems and, though the modifications were aimed at constrained problems, shows similar advantages when feature constraints are absent.
Abstract: Combinatorial interaction testing (CIT) is a cost-effective sampling technique for discovering interaction faults in highly-configurable systems. Constrained CIT extends the technique to situations where some features cannot coexist in a configuration, and is therefore more applicable to real-world software. Recent work on greedy algorithms to build CIT samples now efficiently supports these feature constraints. But when testing a single system configuration is expensive, greedy techniques perform worse than meta-heuristic algorithms, because greedy algorithms generally need larger samples to exercise the same set of interactions. On the other hand, current meta-heuristic algorithms have long run times when feature constraints are present. Neither class of algorithm is suitable when both constraints and the cost of testing configurations are important factors. Therefore, we reformulate one meta-heuristic search algorithm for constructing CIT samples, simulated annealing, to more efficiently incorporate constraints. We identify a set of algorithmic changes and experiment with our modifications on 35 realistic constrained problems and on a set of unconstrained problems from the literature to isolate the factors that improve performance. Our evaluation determines that the optimizations reduce run time by a factor of 90 and accomplish the same coverage objectives with even fewer system configurations. Furthermore, the new version compares favorably with greedy algorithms on real-world problems, and, though our modifications were aimed at constrained problems, it shows similar advantages when feature constraints are absent.

Proceedings ArticleDOI
Feiping Nie, Heng Huang, Chris Ding, Dijun Luo, Hua Wang
16 Jul 2011
TL;DR: A robust principal component analysis with non-greedy l1-norm maximization is proposed, and experimental results on real-world datasets show that the non-greedy method always obtains a much better solution than the greedy method.
Abstract: Principal Component Analysis (PCA) is one of the most important methods to handle high-dimensional data. However, the high computational complexity makes it hard to apply to large scale data with high dimensionality, and the l2-norm used makes it sensitive to outliers. A recent work proposed principal component analysis based on l1-norm maximization, which is efficient and robust to outliers. In that work, a greedy strategy was applied due to the difficulty of directly solving the l1-norm maximization problem, but it easily gets stuck in a local solution. In this paper, we first propose an efficient optimization algorithm to solve a general l1-norm maximization problem, and then propose a robust principal component analysis with non-greedy l1-norm maximization. Experimental results on real world datasets show that the non-greedy method always obtains a much better solution than that of the greedy method.
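For context, the greedy per-component l1-norm PCA baseline that this paper improves on is commonly implemented as the fixed-point iteration sketched below (a sketch of the baseline, not the paper's non-greedy method; the convergence tolerance is an illustrative choice):

```python
import numpy as np

def l1_pca_single(X, n_iter=100, seed=0):
    """Single-component PCA by l1-norm maximization (greedy baseline, sketch).
    Fixed-point iteration: w <- normalize(sum_i sign(w . x_i) x_i).
    X is an n-by-d matrix whose rows are assumed to be centered samples."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        signs = np.sign(X @ w)
        signs[signs == 0] = 1.0            # tie-break to avoid a zero update
        w_new = X.T @ signs
        w_new /= np.linalg.norm(w_new)
        if np.allclose(w_new, w):          # converged to a fixed point
            break
        w = w_new
    return w
```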

Proceedings Article
07 Aug 2011
TL;DR: This paper proposes a totally different approach based on Simulated Annealing for the influence maximization problem, the first SA-based algorithm for the problem, along with two heuristic methods to accelerate the convergence of SA and a new method of computing influence to speed up the proposed algorithm.
Abstract: The problem of influence maximization, i.e., mining top-k influential nodes from a social network such that the spread of influence in the network is maximized, is NP-hard. Most of the existing algorithms for the problem are based on the greedy algorithm. Although the greedy algorithm can achieve a good approximation, it is computationally expensive. In this paper, we propose a totally different approach based on Simulated Annealing (SA) for the influence maximization problem. This is the first SA-based algorithm for the problem. Additionally, we propose two heuristic methods to accelerate the convergence process of SA, and a new method of computing influence to speed up the proposed algorithm. Experimental results on four real networks show that the proposed algorithms run faster than the state-of-the-art greedy algorithm by 2-3 orders of magnitude while also improving on the accuracy of the greedy algorithm.

Journal ArticleDOI
TL;DR: A new mutation operator, called Greedy Sub Tour Mutation (GSTM), has been developed to increase Genetic Algorithm performance in finding the shortest tour in the well-known Traveling Salesman Problem (TSP).
Abstract: In this study, a new mutation operator has been developed to increase Genetic Algorithm (GA) performance in finding the shortest tour in the well-known Traveling Salesman Problem (TSP). We call this method Greedy Sub Tour Mutation (GSTM). The new operator combines two different greedy search methods with a component that introduces a distortion. The developed GSTM operator was tested against simple GA mutation operators on 14 different TSP instances selected from TSPLIB. The GSTM operator gives much more effective results with respect to the best and average error values: used with simple GAs, it decreases the best error values relative to the other mutation operators by between 74.24% and 88.32%, and the average error values by between 59.42% and 79.51%.

Journal ArticleDOI
01 Jun 2011
TL;DR: This paper proposes an effective local search algorithm based on simulated annealing and greedy search techniques to solve the traveling salesman problem and shows that the proposed algorithm provides a better compromise between CPU time and accuracy than some recent algorithms for the TSP.
Abstract: The traveling salesman problem (TSP) is a classical problem in discrete or combinatorial optimization and belongs to the class of NP-complete problems, which means that it may require an infeasible processing time to be solved by an exhaustive search method; therefore, heuristics that are less expensive with respect to processing time are commonly used to obtain satisfactory solutions in a short running time. This paper proposes an effective local search algorithm based on simulated annealing and greedy search techniques to solve the TSP. To obtain more accurate solutions, the proposed algorithm, built on the standard simulated annealing algorithm, adopts a combination of three kinds of mutations with different probabilities during its search. A greedy search technique is then used to speed up the convergence rate of the proposed algorithm. Finally, parameters such as the cooling coefficient of the temperature, the number of greedy search steps, the number of compulsive accepts, and the probability of accepting a new solution are adapted according to the size of the TSP instance. Experimental results show that the proposed algorithm provides a better compromise between CPU time and accuracy than some recent algorithms for the TSP.
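A minimal sketch of combining greedy search with simulated annealing for the TSP: a nearest-neighbour greedy start followed by annealing over 2-opt moves. The cooling schedule, move type, and parameter values are illustrative defaults, not the paper's adaptive settings or its three mutation operators:

```python
import math, random

def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def sa_tsp(dist, t0=100.0, cooling=0.995, iters_per_temp=100, t_min=1e-3, seed=0):
    """Simulated annealing for the TSP (sketch): greedy nearest-neighbour start,
    then 2-opt segment reversals accepted with probability exp(-delta / T)."""
    rng = random.Random(seed)
    n = len(dist)
    tour, unvisited = [0], set(range(1, n))      # greedy nearest-neighbour construction
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: dist[last][j])
        tour.append(nxt)
        unvisited.remove(nxt)
    best, best_len = tour[:], tour_length(tour, dist)
    cur_len, T = best_len, t0
    while T > t_min:
        for _ in range(iters_per_temp):
            i, j = sorted(rng.sample(range(n), 2))
            cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]   # 2-opt reversal
            delta = tour_length(cand, dist) - cur_len
            if delta < 0 or rng.random() < math.exp(-delta / T):
                tour, cur_len = cand, cur_len + delta
                if cur_len < best_len:
                    best, best_len = tour[:], cur_len
        T *= cooling
    return best, best_len
```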

Proceedings ArticleDOI
10 Apr 2011
TL;DR: A novel greedy matching pursuit algorithm (GMP) is proposed that complements the well-known signal recovery algorithms in CS theory, and it is proved that GMP can accurately recover a sparse signal with high probability.
Abstract: In this paper, we propose a novel compressive sensing (CS) based approach for sparse target counting and positioning in wireless sensor networks. While this is not the first work on applying CS to count and localize targets, it is the first to rigorously justify the validity of the problem formulation. Moreover, we propose a novel greedy matching pursuit algorithm (GMP) that complements the well-known signal recovery algorithms in CS theory and prove that GMP can accurately recover a sparse signal with a high probability. We also propose a framework for counting and positioning targets from multiple categories, a novel problem that has never been addressed before. Finally, we perform a comprehensive set of simulations whose results demonstrate the superiority of our approach over the existing CS and non-CS based techniques.

Journal ArticleDOI
TL;DR: In this paper, an iterated greedy algorithm for solving the blocking flow shop scheduling problem for makespan minimization is proposed, and an improved NEH-based heuristic is used as the initial solution procedure.
Abstract: This paper proposes an iterated greedy algorithm for solving the blocking flowshop scheduling problem for makespan minimization. Moreover, it presents an improved NEH-based heuristic, which is used as the initial solution procedure for the iterated greedy algorithm. The effectiveness of both procedures was tested on some of Taillard’s benchmark instances that are considered to be blocking flowshop instances. The experimental evaluation showed the efficiency of the proposed algorithm, in spite of its simple structure, in comparison with a state-of-the-art algorithm. In addition, new best solutions for Taillard’s instances are reported for this problem, which can be used as a basis of comparison in future studies.
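A hedged sketch of the iterated greedy loop (destruction, NEH-style reinsertion, acceptance of improving sequences). For brevity it uses the standard permutation-flowshop makespan recursion rather than the blocking-constraint variant studied in the paper, and `d` and the iteration count are illustrative; it assumes at least `d` jobs:

```python
import random

def makespan(seq, p):
    """Permutation flowshop makespan (standard recursion; the blocking variant in the
    paper requires a different recursion). p[job][machine] are processing times."""
    m = len(p[0])
    c = [0.0] * m
    for job in seq:
        c[0] += p[job][0]
        for k in range(1, m):
            c[k] = max(c[k], c[k - 1]) + p[job][k]
    return c[-1]

def iterated_greedy(p, d=4, iterations=500, seed=0):
    """Iterated greedy sketch: remove d random jobs, greedily reinsert each at its best
    position (NEH-style insertion), and keep the new sequence if the makespan improves."""
    rng = random.Random(seed)
    # NEH-like start: order jobs by decreasing total processing time, insert at best spot
    jobs = sorted(range(len(p)), key=lambda j: -sum(p[j]))
    seq = []
    for j in jobs:
        seq = min((seq[:i] + [j] + seq[i:] for i in range(len(seq) + 1)),
                  key=lambda s: makespan(s, p))
    best, best_val = seq[:], makespan(seq, p)
    for _ in range(iterations):
        removed = rng.sample(best, d)                       # destruction phase
        partial = [j for j in best if j not in removed]
        for j in removed:                                   # greedy reconstruction
            partial = min((partial[:i] + [j] + partial[i:] for i in range(len(partial) + 1)),
                          key=lambda s: makespan(s, p))
        val = makespan(partial, p)
        if val < best_val:                                  # accept only improvements
            best, best_val = partial[:], val
    return best, best_val
```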