scispace - formally typeset
Topic

Greedy algorithm

About: Greedy algorithm is a research topic. Over its lifetime, 15,347 publications have been published within this topic, receiving 393,945 citations.


Papers
Journal Article
TL;DR: This paper tests two fast algorithms for estimating regression coefficients with a lasso penalty: a previously known l2 algorithm based on cyclic coordinate descent, and a new l1 algorithm based on greedy coordinate descent and Edgeworth's algorithm for ordinary l1 regression.
Abstract: Imposition of a lasso penalty shrinks parameter estimates toward zero and performs continuous model selection. Lasso penalized regression is capable of handling linear regression problems where the number of predictors far exceeds the number of cases. This paper tests two exceptionally fast algorithms for estimating regression coefficients with a lasso penalty. The previously known l2 algorithm is based on cyclic coordinate descent. Our new l1 algorithm is based on greedy coordinate descent and Edgeworth’s algorithm for ordinary l1 regression. Each algorithm relies on a tuning constant that can be chosen by cross-validation. In some regression problems it is natural to group parameters and penalize parameters group by group rather than separately. If the group penalty is proportional to the Euclidean norm of the parameters of the group, then it is possible to majorize the norm and reduce parameter estimation to l2 regression with a lasso penalty. Thus, the existing algorithm can be extended to novel settings. Each of the algorithms discussed is tested via either simulated or real data or both. The Appendix proves that a greedy form of the l2 algorithm converges to the minimum value of the objective function.

821 citations
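
The abstract above mentions a greedy form of the l2 algorithm. As a rough illustration only, the sketch below minimizes 0.5 * ||y - X beta||^2 + lam * ||beta||_1 by updating one coordinate per iteration. The selection rule used here (pick the coordinate whose exact update lowers the objective most) is an assumption made for the example and is not necessarily the rule analyzed in the paper; the names `greedy_lasso` and `soft_threshold` are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t (proximal map of t * |.|)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def greedy_lasso(X, y, lam, max_iter=1000, tol=1e-12):
    """Greedy coordinate descent for 0.5*||y - X b||^2 + lam*||b||_1.

    X is a list of rows, y a list of responses. Assumption: at each step
    we update only the coordinate with the largest objective decrease.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]

    for _ in range(max_iter):
        best_gain, best_j, best_new = 0.0, None, 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot change the fit
            g = sum(X[i][j] * resid[i] for i in range(n))  # x_j . resid
            rho = g + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]  # exact 1-D minimizer
            d = new - beta[j]
            # Exact decrease in the objective when beta[j] moves by d.
            gain = d * g - 0.5 * col_sq[j] * d * d - lam * (abs(new) - abs(beta[j]))
            if gain > best_gain:
                best_gain, best_j, best_new = gain, j, new
        if best_j is None or best_gain <= tol:
            break  # no coordinate improves the objective
        d = best_new - beta[best_j]
        beta[best_j] = best_new
        for i in range(n):
            resid[i] -= d * X[i][best_j]
    return beta


# Orthonormal toy design: the solution is the soft-thresholded response.
beta_hat = greedy_lasso([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0)
```

In this toy case the columns are orthonormal, so each coefficient is simply the response shrunk by `lam`; the tuning constant would in practice be chosen by cross-validation, as the abstract notes.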

Journal Article
TL;DR: This work proves that adding hidden units yields strictly improved modeling power, while a second theorem shows that RBMs are universal approximators of discrete distributions and suggests a new and less greedy criterion for training RBMs within DBNs.
Abstract: Deep belief networks (DBN) are generative neural network models with many layers of hidden explanatory factors, recently introduced by Hinton, Osindero, and Teh (2006) along with a greedy layer-wise unsupervised learning algorithm. The building block of a DBN is a probabilistic model called a restricted Boltzmann machine (RBM), used to represent one layer of the model. Restricted Boltzmann machines are interesting because inference is easy in them and because they have been successfully used as building blocks for training deeper models. We first prove that adding hidden units yields strictly improved modeling power, while a second theorem shows that RBMs are universal approximators of discrete distributions. We then study the question of whether DBNs with more layers are strictly more powerful in terms of representational power. This suggests a new and less greedy criterion for training RBMs within DBNs.

800 citations

Journal Article
TL;DR: An on-line algorithm for learning preference functions that is based on Freund and Schapire's "Hedge" algorithm is considered, and it is shown that the problem of finding the ordering that agrees best with a learned preference function is NP-complete.
Abstract: There are many applications in which it is desirable to order rather than classify instances. Here we consider the problem of learning how to order instances given feedback in the form of preference judgments, i.e., statements to the effect that one instance should be ranked ahead of another. We outline a two-stage approach in which one first learns by conventional means a binary preference function indicating whether it is advisable to rank one instance before another. Here we consider an on-line algorithm for learning preference functions that is based on Freund and Schapire's "Hedge" algorithm. In the second stage, new instances are ordered so as to maximize agreement with the learned preference function. We show that the problem of finding the ordering that agrees best with a learned preference function is NP-complete. Nevertheless, we describe simple greedy algorithms that are guaranteed to find a good approximation. Finally, we show how metasearch can be formulated as an ordering problem, and present experimental results on learning a combination of "search experts," each of which is a domain-specific query expansion strategy for a web search engine.

779 citations
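
The second stage described above, ordering instances to agree with a learned preference function, can be sketched with one simple greedy rule: give each instance a potential equal to its total outgoing preference minus its total incoming preference, repeatedly emit the instance with the highest potential, and update the rest. This is an illustrative sketch, not a reproduction of the paper's algorithm; the function name `greedy_order` and the toy preference function are hypothetical.

```python
def greedy_order(items, pref):
    """Order items greedily using a pairwise preference function.

    pref(u, v) returns a number in [0, 1]; larger values mean u should
    be ranked ahead of v. Ties are broken by input order.
    """
    remaining = list(items)
    # Potential: how strongly v is preferred over the others, net of
    # how strongly the others are preferred over v.
    potential = {
        v: sum(pref(v, u) - pref(u, v) for u in remaining if u != v)
        for v in remaining
    }
    ordering = []
    while remaining:
        best = max(remaining, key=lambda v: potential[v])
        ordering.append(best)
        remaining.remove(best)
        # Drop the removed item's contribution from every other potential.
        for v in remaining:
            potential[v] += pref(best, v) - pref(v, best)
    return ordering


# Toy preference function derived from made-up scores.
scores = {"a": 3, "b": 2, "c": 1}


def toy_pref(u, v):
    return 1.0 if scores[u] > scores[v] else 0.0


ranking = greedy_order(["b", "c", "a"], toy_pref)
```

With a consistent preference function such as `toy_pref`, the greedy pass recovers the underlying order; the interesting case in the abstract is an inconsistent learned function, where finding the best ordering is NP-complete and the greedy pass only approximates it.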

Proceedings Article
28 Mar 2011
TL;DR: This work proposes CELF++ and empirically shows that it is 35-55% faster than CELF, the algorithm Leskovec et al. proposed to tackle the second major source of inefficiency of the basic greedy algorithm.
Abstract: Kempe et al. [4] (KKT) showed the problem of influence maximization is NP-hard and a simple greedy algorithm guarantees the best possible approximation factor in PTIME. However, it has two major sources of inefficiency. First, finding the expected spread of a node set is #P-hard. Second, the basic greedy algorithm is quadratic in the number of nodes. The first source is tackled by estimating the spread using Monte Carlo simulation or by using heuristics [4, 6, 2, 5, 1, 3]. Leskovec et al. proposed the CELF algorithm for tackling the second. In this work, we propose CELF++ and empirically show that it is 35-55% faster than CELF.

778 citations
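
The abstract does not spell out how CELF avoids the quadratic cost. The sketch below shows the lazy-evaluation idea commonly associated with it: because the spread function is submodular, a node's previously computed marginal gain is an upper bound on its current gain, so only the top entry of a priority queue needs re-evaluating. This is a sketch of plain lazy greedy, not CELF++; the name `lazy_greedy` is hypothetical, and a deterministic coverage count stands in for the Monte Carlo spread estimate.

```python
import heapq


def lazy_greedy(candidates, spread, k):
    """Pick up to k seeds, re-evaluating marginal gains lazily.

    spread(list_of_seeds) must be monotone and submodular. Candidates
    must be mutually comparable (e.g. strings) so heap ties can break.
    """
    selected = []
    current = spread(selected)
    # Heap entries: (-marginal_gain, node, number of seeds when computed).
    heap = [(-(spread([v]) - current), v, 0) for v in candidates]
    heapq.heapify(heap)
    while heap and len(selected) < k:
        neg_gain, v, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # Gain is fresh for the current seed set. Every other entry
            # holds an upper bound no larger than this, so v is the best.
            selected.append(v)
            current -= neg_gain
        else:
            # Stale gain: recompute against the current seeds and reinsert.
            gain = spread(selected + [v]) - current
            heapq.heappush(heap, (-gain, v, len(selected)))
    return selected, current


# Stand-in spread: number of distinct nodes reached by the seed set.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}


def coverage(seeds):
    return len(set().union(*(reach[s] for s in seeds)))


seeds, total = lazy_greedy(list(reach), coverage, k=2)
```

On this toy input, only one stale gain has to be recomputed before the second seed is chosen, whereas the basic greedy algorithm would re-evaluate every remaining candidate in each round.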

Proceedings Article
28 Mar 2011
TL;DR: This work studies competing campaigns in a social network and addresses the problem of influence limitation, where a "bad" campaign starts propagating from a certain node in the network and limiting campaigns are used to counteract the effect of misinformation.
Abstract: In this work, we study the notion of competing campaigns in a social network and address the problem of influence limitation where a "bad" campaign starts propagating from a certain node in the network and use the notion of limiting campaigns to counteract the effect of misinformation. The problem can be summarized as identifying a subset of individuals that need to be convinced to adopt the competing (or "good") campaign so as to minimize the number of people that adopt the "bad" campaign at the end of both propagation processes. We show that this optimization problem is NP-hard and provide approximation guarantees for a greedy solution for various definitions of this problem by proving that they are submodular. We experimentally compare the performance of the greedy method to various heuristics. The experiments reveal that in most cases inexpensive heuristics such as degree centrality compare well with the greedy approach. We also study the influence limitation problem in the presence of missing data where the current states of nodes in the network are only known with a certain probability and show that prediction in this setting is a supermodular problem. We propose a prediction algorithm that is based on generating random spanning trees and evaluate the performance of this approach. The experiments reveal that using the prediction algorithm, we are able to tolerate about 90% missing data before the performance of the algorithm starts degrading and even with large amounts of missing data the performance degrades only to 75% of the performance that would be achieved with complete data.

761 citations


Network Information
Related Topics (5)
Optimization problem: 96.4K papers, 2.1M citations, 92% related
Wireless network: 122.5K papers, 2.1M citations, 88% related
Network packet: 159.7K papers, 2.2M citations, 88% related
Wireless sensor network: 142K papers, 2.4M citations, 87% related
Node (networking): 158.3K papers, 1.7M citations, 87% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    350
2022    690
2021    809
2020    939
2019    1,006
2018    967