
Showing papers on "Time complexity published in 2009"


Journal ArticleDOI
TL;DR: This paper explores how cutting-plane methods can provide fast training not only for classification SVMs, but also for structural SVMs and presents an extensive empirical evaluation of the method applied to binary classification, multi-class classification, HMM sequence tagging, and CFG parsing.
Abstract: Discriminative training approaches like structural SVMs have shown much promise for building highly complex and accurate models in areas like natural language processing, protein structure prediction, and information retrieval. However, current training algorithms are computationally expensive or intractable on large datasets. To overcome this bottleneck, this paper explores how cutting-plane methods can provide fast training not only for classification SVMs, but also for structural SVMs. We show that for an equivalent "1-slack" reformulation of the linear SVM training problem, our cutting-plane method has time complexity linear in the number of training examples. In particular, the number of iterations does not depend on the number of training examples, and it is linear in the desired precision and the regularization parameter. Furthermore, we present an extensive empirical evaluation of the method applied to binary classification, multi-class classification, HMM sequence tagging, and CFG parsing. The experiments show that the cutting-plane algorithm is broadly applicable and fast in practice. On large datasets, it is typically several orders of magnitude faster than conventional training methods derived from decomposition methods like SVM-light, or conventional cutting-plane methods. Implementations of our methods are available at www.joachims.org .
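Since the abstract turns on the 1-slack reformulation, a minimal sketch may help make the cutting-plane loop concrete. The sketch below is for a plain binary linear SVM only (not structural SVMs), is not the authors' SVM-perf/SVM-struct implementation, and solves the small restricted QP with scipy's generic SLSQP solver; all names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def one_slack_cutting_plane(X, y, C=1.0, eps=1e-3, max_iters=50):
    """Binary linear SVM via a 1-slack cutting-plane loop (illustrative sketch)."""
    n, d = X.shape
    cuts = []                                  # each cut (a, b) encodes: xi >= b - a.w
    w, xi = np.zeros(d), 0.0
    for _ in range(max_iters):
        # most violated 1-slack constraint: include every example with margin < 1
        viol = (y * (X @ w)) < 1.0
        a = (y[viol][:, None] * X[viol]).sum(axis=0) / n
        b = viol.sum() / n
        if b - a @ w <= xi + eps:              # no constraint violated by more than eps
            break
        cuts.append((a, b))
        # re-solve the restricted QP: min 0.5||w||^2 + C*xi  s.t.  xi >= b_j - a_j.w, xi >= 0
        obj = lambda z: 0.5 * z[:d] @ z[:d] + C * z[d]
        cons = [{'type': 'ineq', 'fun': (lambda z, aj=aj, bj=bj: z[d] - bj + aj @ z[:d])}
                for (aj, bj) in cuts]
        cons.append({'type': 'ineq', 'fun': lambda z: z[d]})
        res = minimize(obj, np.concatenate([w, [xi]]), constraints=cons, method='SLSQP')
        w, xi = res.x[:d], res.x[d]
    return w
```

Because each cut aggregates the whole training set into a single constraint, the working set typically stays small, which is the intuition behind the iteration count being independent of the number of training examples.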

1,134 citations


Journal ArticleDOI
TL;DR: It is shown that finding a Nash equilibrium in three-player games is indeed PPAD-complete; the result is obtained by a reduction from Brouwer's problem, thus establishing that the two problems are computationally equivalent.
Abstract: In 1951, John F. Nash proved that every game has a Nash equilibrium [Ann. of Math. (2), 54 (1951), pp. 286-295]. His proof is nonconstructive, relying on Brouwer's fixed point theorem, thus leaving open the questions, Is there a polynomial-time algorithm for computing Nash equilibria? And is this reliance on Brouwer inherent? Many algorithms have since been proposed for finding Nash equilibria, but none known to run in polynomial time. In 1991 the complexity class PPAD (polynomial parity arguments on directed graphs), for which Brouwer's problem is complete, was introduced [C. Papadimitriou, J. Comput. System Sci., 48 (1994), pp. 489-532], motivated largely by the classification problem for Nash equilibria; but whether the Nash problem is complete for this class remained open. In this paper we resolve these questions: We show that finding a Nash equilibrium in three-player games is indeed PPAD-complete; and we do so by a reduction from Brouwer's problem, thus establishing that the two problems are computationally equivalent. Our reduction simulates a (stylized) Brouwer function by a graphical game [M. Kearns, M. Littman, and S. Singh, Graphical model for game theory, in 17th Conference in Uncertainty in Artificial Intelligence (UAI), 2001], relying on “gadgets,” graphical games performing various arithmetic and logical operations. We then show how to simulate this graphical game by a three-player game, where each of the three players is essentially a color class in a coloring of the underlying graph. Subsequent work [X. Chen and X. Deng, Setting the complexity of 2-player Nash-equilibrium, in 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2006] established, by improving our construction, that even two-player games are PPAD-complete; here we show that this result follows easily from our proof.

981 citations


Journal ArticleDOI
TL;DR: This paper proposes a new scheme based on adaptive critics for finding online the state feedback, infinite horizon, optimal control solution of linear continuous-time systems using only partial knowledge regarding the system dynamics.

716 citations


Journal ArticleDOI
TL;DR: It is shown that the PSPACE upper bounds cannot be substantially improved without a breakthrough on long standing open problems: the square-root sum problem and an arithmetic circuit decision problem that captures P-time on the unit-cost rational arithmetic RAM model.
Abstract: We define Recursive Markov Chains (RMCs), a class of finitely presented denumerable Markov chains, and we study algorithms for their analysis. Informally, an RMC consists of a collection of finite-state Markov chains with the ability to invoke each other in a potentially recursive manner. RMCs offer a natural abstract model for probabilistic programs with procedures. They generalize, in a precise sense, a number of well-studied stochastic models, including Stochastic Context-Free Grammars (SCFG) and Multi-Type Branching Processes (MT-BP). We focus on algorithms for reachability and termination analysis for RMCs: what is the probability that an RMC started from a given state reaches another target state, or that it terminates? These probabilities are in general irrational, and they arise as (least) fixed point solutions to certain (monotone) systems of nonlinear equations associated with RMCs. We address both the qualitative problem of determining whether the probabilities are 0, 1 or in-between, and the quantitative problems of comparing the probabilities with a given bound, or approximating them to desired precision. We show that all these problems can be solved in PSPACE using a decision procedure for the Existential Theory of Reals. We provide a more practical algorithm, based on a decomposed version of multi-variate Newton's method, and prove that it always converges monotonically to the desired probabilities. We show this method applies more generally to any monotone polynomial system. We obtain polynomial-time algorithms for various special subclasses of RMCs. Among these: for SCFGs and MT-BPs (equivalently, for 1-exit RMCs) the qualitative problem can be solved in P-time; for linearly recursive RMCs the probabilities are rational and can be computed exactly in P-time. We show that our PSPACE upper bounds cannot be substantially improved without a breakthrough on long standing open problems: the square-root sum problem and an arithmetic circuit decision problem that captures P-time on the unit-cost rational arithmetic RAM model. We show that these problems reduce to the qualitative problem and to the approximation problem (to within any nontrivial error) for termination probabilities of general RMCs, and to the quantitative decision problem for termination (extinction) of SCFGs (MT-BPs).
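A worked single-equation example may help convey the kind of monotone nonlinear system involved. The sketch below is not the paper's decomposed multivariate algorithm; it simply runs Newton's method from 0 on one equation x = P(x), whose least nonnegative solution is an extinction/termination probability (the function names and the numeric example are illustrative).

```python
def newton_least_fixed_point(P, dP, iters=30):
    """Approximate the least nonnegative solution of x = P(x), where P is a monotone
    polynomial with nonnegative coefficients, by Newton iteration started at x = 0
    (for such systems the iterates increase monotonically toward the least fixed point)."""
    x = 0.0
    for _ in range(iters):
        f, df = P(x) - x, dP(x) - 1.0   # solve f(x) = 0 with f(x) = P(x) - x
        x = x - f / df
    return x

# Example: a process that branches into two copies w.p. 0.6 and dies w.p. 0.4; its
# extinction probability is the least root of x = 0.6 x^2 + 0.4.
p = newton_least_fixed_point(lambda x: 0.6 * x * x + 0.4,
                             lambda x: 1.2 * x)
print(p)   # ~0.6667: the least fixed point; the equation's other root, 1.0, is not minimal
```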

632 citations


Book ChapterDOI
27 Aug 2009
TL;DR: Empirical evaluation over a broad range of multi-label datasets with a variety of evaluation metrics demonstrates the competitiveness of the chaining method against related and state-of-the-art methods, both in terms of predictive performance and time complexity.
Abstract: The widely known binary relevance method for multi-label classification, which considers each label as an independent binary problem, has been sidelined in the literature due to the perceived inadequacy of its label-independence assumption. Instead, most current methods invest considerable complexity to model interdependencies between labels. This paper shows that binary relevance-based methods have much to offer, especially in terms of scalability to large datasets. We exemplify this with a novel chaining method that can model label correlations while maintaining acceptable computational complexity. Empirical evaluation over a broad range of multi-label datasets with a variety of evaluation metrics demonstrates the competitiveness of our chaining method against related and state-of-the-art methods, both in terms of predictive performance and time complexity.
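For readers unfamiliar with the chaining idea, a minimal sketch follows. It is one possible instantiation (using scikit-learn's LogisticRegression as the base binary learner), not the authors' implementation, and it omits label ordering and ensemble refinements; all class and variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

class ClassifierChain:
    """Train one binary model per label, feeding earlier labels forward as features."""
    def fit(self, X, Y):                            # Y: (n_samples, n_labels) 0/1 matrix
        self.models = []
        X_aug = X
        for j in range(Y.shape[1]):
            m = LogisticRegression(max_iter=1000).fit(X_aug, Y[:, j])
            self.models.append(m)
            X_aug = np.hstack([X_aug, Y[:, [j]]])   # chain: append the true label j
        return self

    def predict(self, X):
        preds = []
        X_aug = X
        for m in self.models:
            p = m.predict(X_aug)
            preds.append(p)
            X_aug = np.hstack([X_aug, p[:, None]])  # chain: append the predicted label
        return np.column_stack(preds)
```

Training and prediction both make a single pass over the labels, which is why the approach keeps the near-linear scaling of plain binary relevance while still letting later labels depend on earlier ones.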

586 citations


Journal ArticleDOI
TL;DR: It is shown that a simple adaptation of a consensus algorithm leads to an averaging algorithm, and lower bounds on the worst-case convergence time for various classes of linear, time-invariant, distributed consensus methods are proved.
Abstract: We study the convergence speed of distributed iterative algorithms for the consensus and averaging problems, with emphasis on the latter. We first consider the case of a fixed communication topology. We show that a simple adaptation of a consensus algorithm leads to an averaging algorithm. We prove lower bounds on the worst-case convergence time for various classes of linear, time-invariant, distributed consensus methods, and provide an algorithm that essentially matches those lower bounds. We then consider the case of a time-varying topology, and provide a polynomial-time averaging algorithm.
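As background for the fixed-topology case, the sketch below runs a standard linear averaging iteration with Metropolis weights, one common choice of symmetric, doubly stochastic weights; it illustrates the setting only and is not the paper's worst-case-optimal construction.

```python
import numpy as np

def metropolis_weights(adj):
    """Symmetric, doubly stochastic weight matrix for an undirected adjacency matrix."""
    n = len(adj)
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                W[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()
    return W

adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])              # a path graph on 4 nodes
x = np.array([1.0, 5.0, 3.0, 7.0])          # initial values held by the nodes
W = metropolis_weights(adj)
for _ in range(200):
    x = W @ x                               # each node mixes with its neighbours only
print(x, x.mean())                          # every entry approaches the initial average, 4.0
```

Because W is doubly stochastic, the iteration preserves the sum of the values, so the common limit is exactly the average; the convergence speed is governed by the second-largest eigenvalue modulus of W, which is the quantity the worst-case bounds in the paper concern.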

563 citations


Journal ArticleDOI
TL;DR: Finding a Nash equilibrium in a two-player game is shown to be complete for the complexity class PPAD (Polynomial Parity Argument, Directed version) introduced by Papadimitriou in 1991, as discussed by the authors.
Abstract: We prove that Bimatrix, the problem of finding a Nash equilibrium in a two-player game, is complete for the complexity class PPAD (Polynomial Parity Argument, Directed version) introduced by Papadimitriou in 1991. Our result, building upon the work of Daskalakis et al. [2006a] on the complexity of four-player Nash equilibria, settles a long standing open problem in algorithmic game theory. It also serves as a starting point for a series of results concerning the complexity of two-player Nash equilibria. In particular, we prove the following theorems: Bimatrix does not have a fully polynomial-time approximation scheme unless every problem in PPAD is solvable in polynomial time; and the smoothed complexity of the classic Lemke-Howson algorithm and, in fact, of any algorithm for Bimatrix is not polynomial unless every problem in PPAD is solvable in randomized polynomial time. Our results also have a complexity implication in mathematical economics: Arrow-Debreu market equilibria are PPAD-hard to compute.

497 citations


Journal ArticleDOI
TL;DR: For a noisy linear observation model based on random measurement matrices drawn from general Gaussian measurement matrices, this paper derives both a set of sufficient conditions for exact support recovery using an exhaustive search decoder, as well as a set of necessary conditions that any decoder must satisfy for exact support set recovery.
Abstract: The problem of sparsity pattern or support set recovery refers to estimating the set of nonzero coefficients of an unknown vector β* ∈ ℝ^p based on a set of n noisy observations. It arises in a variety of settings, including subset selection in regression, graphical model selection, signal denoising, compressive sensing, and constructive approximation. The sample complexity of a given method for subset recovery refers to the scaling of the required sample size n as a function of the signal dimension p, the sparsity index k (number of non-zeroes in β*), as well as the minimum value β_min of β* over its support and other parameters of the measurement matrix. This paper studies the information-theoretic limits of sparsity recovery: in particular, for a noisy linear observation model based on random measurement matrices drawn from general Gaussian measurement matrices, we derive both a set of sufficient conditions for exact support recovery using an exhaustive search decoder, as well as a set of necessary conditions that any decoder, regardless of its computational complexity, must satisfy for exact support recovery. This analysis of fundamental limits complements our previous work on sharp thresholds for support set recovery over the same set of random measurement ensembles using the polynomial-time Lasso method (ℓ1-constrained quadratic programming).

491 citations


Proceedings Article
18 Jun 2009
TL;DR: In this paper, the authors consider the ℓ2,1-norm regularized regression model for joint feature selection from multiple tasks, which can be derived in the probabilistic framework by assuming a suitable prior from the exponential family, and propose to accelerate the computation by reformulating it as two equivalent smooth convex optimization problems which are then solved via Nesterov's method.
Abstract: The problem of joint feature selection across a group of related tasks has applications in many areas including biomedical informatics and computer vision. We consider the ℓ2,1-norm regularized regression model for joint feature selection from multiple tasks, which can be derived in the probabilistic framework by assuming a suitable prior from the exponential family. One appealing feature of the ℓ2,1-norm regularization is that it encourages multiple predictors to share similar sparsity patterns. However, the resulting optimization problem is challenging to solve due to the non-smoothness of the ℓ2,1-norm regularization. In this paper, we propose to accelerate the computation by reformulating it as two equivalent smooth convex optimization problems which are then solved via Nesterov's method, an optimal first-order black-box method for smooth convex optimization. A key building block in solving the reformulations is the Euclidean projection. We show that the Euclidean projection for the first reformulation can be analytically computed, while the Euclidean projection for the second one can be computed in linear time. Empirical evaluations on several data sets verify the efficiency of the proposed algorithms.
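To illustrate the kind of per-row Euclidean computation such reformulations reduce to, the sketch below implements row-wise (group) soft-thresholding, the proximal operator of the ℓ2,1 norm. It is shown for intuition about the shared-sparsity effect and is not the paper's exact projection; names are illustrative.

```python
import numpy as np

def l21_prox(W, lam):
    """Row-wise shrinkage: each row of W is scaled by max(0, 1 - lam / ||row||_2)."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * W

W = np.array([[3.0, 4.0],
              [0.1, 0.1],
              [-2.0, 0.0]])
print(l21_prox(W, 1.0))
# the small middle row is zeroed out entirely, which is the shared-sparsity effect
# that makes the l2,1 penalty attractive for joint feature selection across tasks
```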

474 citations


Journal ArticleDOI
TL;DR: An iterative sampling procedure to improve the uniform sampling strategy, an automatic scheme of inferring the tuning parameter from the data, a precise initialization procedure for K-means, as well as a simple strategy for isolating outliers are suggested.
Abstract: This paper presents novel techniques for improving the performance of a multi-way spectral clustering framework (Govindu in Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), vol. 1, pp. 1150-1157, 2005; Chen and Lerman, 2007, preprint in the supplementary webpage) for segmenting affine subspaces. Specifically, it suggests an iterative sampling procedure to improve the uniform sampling strategy, an automatic scheme of inferring the tuning parameter from the data, a precise initialization procedure for K-means, as well as a simple strategy for isolating outliers. The resulting algorithm, Spectral Curvature Clustering (SCC), requires only linear storage and takes linear running time in the size of the data. It is supported by theory which both justifies its successful performance and guides our practical choices. We compare it with other existing methods on a few artificial instances of affine subspaces. Application of the algorithm to several real-world problems is also discussed.

428 citations


Journal ArticleDOI
01 Aug 2009
TL;DR: Three novel methods to compute the upper and lower bounds for the edit distance between two graphs in polynomial time are introduced, and results show that these methods achieve good scalability in terms of both the number of graphs and the size of graphs.
Abstract: Graph data have become ubiquitous and manipulating them based on similarity is essential for many applications. Graph edit distance is one of the most widely accepted measures to determine similarities between graphs and has extensive applications in the fields of pattern recognition, computer vision, etc. Unfortunately, the problem of graph edit distance computation is NP-Hard in general. Accordingly, in this paper we introduce three novel methods to compute the upper and lower bounds for the edit distance between two graphs in polynomial time. Applying these methods, two algorithms AppFull and AppSub are introduced to perform different kinds of graph search on graph databases. Comprehensive experimental studies are conducted on both real and synthetic datasets to examine various aspects of the methods for bounding graph edit distance. Results show that these methods achieve good scalability in terms of both the number of graphs and the size of graphs. The effectiveness of these algorithms also confirms the usefulness of using our bounds in filtering and searching of graphs.
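As a hedged illustration of cheap polynomial-time bounds on graph edit distance (not the paper's specific methods), the sketch below computes a simple lower bound from vertex- and edge-label multisets under unit edit costs; all names and the toy molecule example are illustrative.

```python
from collections import Counter

def multiset_edit_lb(labels1, labels2):
    """Minimum number of unit-cost insert/delete/relabel operations needed to turn one
    label multiset into the other: max(|L1|, |L2|) minus the multiset intersection size."""
    c1, c2 = Counter(labels1), Counter(labels2)
    common = sum((c1 & c2).values())
    return max(len(labels1), len(labels2)) - common

def ged_lower_bound(vlabels1, elabels1, vlabels2, elabels2):
    # vertex-fixing and edge-fixing operations are disjoint, so the two bounds add up
    return multiset_edit_lb(vlabels1, vlabels2) + multiset_edit_lb(elabels1, elabels2)

# toy example: a triangle of carbons vs. a two-edge path C-C-O
print(ged_lower_bound(['C', 'C', 'C'], ['-', '-', '-'],
                      ['C', 'C', 'O'], ['-', '-']))
# prints 2: at least one vertex relabel and one edge deletion are needed
```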

Journal ArticleDOI
TL;DR: A combinatorial algorithm that runs in O(n^5 EO + n^6) time, where EO is the time to evaluate f(S) for some submodular function f defined on a set V with n elements, is given.
Abstract: We consider the problem of minimizing a submodular function f defined on a set V with n elements. We give a combinatorial algorithm that runs in O(n^5 EO + n^6) time, where EO is the time to evaluate f(S) for some S ⊆ V. This improves the previous best strongly polynomial running time by more than a factor of n. We also extend our result to ring families.

Journal ArticleDOI
TL;DR: The proposed method analyzes video volumes as inputs avoiding the difficult problem of explicit motion estimation required in traditional methods and provides a way of spatiotemporal pattern matching that is robust to intraclass variations of actions.
Abstract: This paper addresses a spatiotemporal pattern recognition problem. The main purpose of this study is to find a right representation and matching of action video volumes for categorization. A novel method is proposed to measure video-to-video volume similarity by extending Canonical Correlation Analysis (CCA), a principled tool to inspect linear relations between two sets of vectors, to that of two multiway data arrays (or tensors). The proposed method analyzes video volumes as inputs, avoiding the difficult problem of explicit motion estimation required in traditional methods, and provides a way of spatiotemporal pattern matching that is robust to intraclass variations of actions. The proposed matching is demonstrated for action classification by a simple Nearest Neighbor classifier. We, moreover, propose an automatic action detection method, which performs 3D window search over an input video with action exemplars. The search is speeded up by dynamic learning of subspaces in the proposed CCA. Experiments on a public action data set (KTH) and a self-recorded hand gesture data set showed that the proposed method is significantly better than various state-of-the-art methods with respect to accuracy. Our method has low time complexity and does not require any major tuning parameters.

Proceedings ArticleDOI
Joakim Nivre1
02 Aug 2009
TL;DR: A novel transition system for dependency parsing, which constructs arcs only between adjacent words but can parse arbitrary non-projective trees by swapping the order of words in the input, shows state-of-the-art accuracy.
Abstract: We present a novel transition system for dependency parsing, which constructs arcs only between adjacent words but can parse arbitrary non-projective trees by swapping the order of words in the input. Adding the swapping operation changes the time complexity for deterministic parsing from linear to quadratic in the worst case, but empirical estimates based on treebank data show that the expected running time is in fact linear for the range of data attested in the corpora. Evaluation on data from five languages shows state-of-the-art accuracy, with especially good results for the labeled exact match score.
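The transition mechanics described above can be sketched compactly. The snippet below only applies a given transition sequence to a configuration (stack, buffer, arc set); the classifier that chooses transitions and the handling of dependency labels are omitted, and all names are illustrative.

```python
def parse(words, transitions):
    """words: token ids 1..n; transitions: a sequence of transition names to apply."""
    stack, buffer, arcs = [0], list(words), []      # 0 is the artificial root
    for t in transitions:
        if t == "SHIFT":
            stack.append(buffer.pop(0))
        elif t == "LEFT-ARC":                       # top becomes head of second-top
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif t == "RIGHT-ARC":                      # second-top becomes head of top
            dep = stack.pop()
            arcs.append((stack[-1], dep))
        elif t == "SWAP":                           # second-top goes back onto the buffer,
            buffer.insert(0, stack.pop(-2))         # reordering the input (non-projectivity)
    return arcs

print(parse([1, 2, 3],
            ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC", "RIGHT-ARC"]))
# [(2, 1), (2, 3), (0, 2)] : word 2 heads words 1 and 3, and the root heads word 2
```

Without SWAP each word is pushed and popped a constant number of times, giving the linear bound; SWAP can push a word back at most O(n) times, which is where the quadratic worst case mentioned above comes from.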

Journal ArticleDOI
TL;DR: A lower bound of Ω(n log n) is derived for the complexity of computing the hypervolume indicator in any number of dimensions d > 1 by reducing the so-called UNIFORMGAP problem to it.
Abstract: The goal of multiobjective optimization is to find a set of best compromise solutions for typically conflicting objectives. Due to the complex nature of most real-life problems, only an approximation to such an optimal set can be obtained within reasonable (computing) time. To compare such approximations, and thereby the performance of multiobjective optimizers providing them, unary quality measures are usually applied. Among these, the hypervolume indicator (or S-metric) is of particular relevance due to its favorable properties. Moreover, this indicator has been successfully integrated into stochastic optimizers, such as evolutionary algorithms, where it serves as a guidance criterion for finding good approximations to the Pareto front. Recent results show that computing the hypervolume indicator can be seen as solving a specialized version of Klee's measure problem. In general, Klee's measure problem can be solved with O(n log n + n^{d/2} log n) comparisons for an input instance of size n in d dimensions; as of this writing, it is unknown whether a lower bound higher than Ω(n log n) can be proven. In this paper, we derive a lower bound of Ω(n log n) for the complexity of computing the hypervolume indicator in any number of dimensions d > 1 by reducing the so-called UNIFORMGAP problem to it. For the 3-D case, we also present a matching upper bound of O(n log n) comparisons that is obtained by extending an algorithm for finding the maxima of a point set.
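As a side note on the upper-bound side, in two dimensions the hypervolume (for minimization, with respect to a reference point) can be computed after a single sort, in line with the O(n log n) bounds discussed above; the sketch below is illustrative and is not the paper's 3-D algorithm.

```python
def hypervolume_2d(points, ref):
    """points: nondominated (x, y) pairs to be minimized; ref: reference point.
    Sweeps the points by increasing x (hence decreasing y) and sums rectangle slabs."""
    pts = sorted(points)
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        hv += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return hv

print(hypervolume_2d([(1, 3), (2, 2), (3, 1)], ref=(4, 4)))   # prints 6.0
```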

Book ChapterDOI
19 Aug 2009
TL;DR: In this paper, a related-key attack on the full 256-bit key AES was presented, which works for one out of every 2^{35} keys with 2^{120} data and time complexity and negligible memory.
Abstract: In this paper we construct a chosen-key distinguisher and a related-key attack on the full 256-bit key AES. We define a notion of differential q-multicollision and show that for AES-256 q-multicollisions can be constructed in time q·2^{67} and with negligible memory, while we prove that the same task for an ideal cipher of the same block size would require at least O(q·2^{128(q-1)/(q+1)}) time. Using similar approach and with the same complexity we can also construct q-pseudo collisions for AES-256 in Davies-Meyer mode, a scheme which is provably secure in the ideal-cipher model. We have also computed partial q-multicollisions in time q·2^{37} on a PC to verify our results. These results show that AES-256 can not model an ideal cipher in theoretical constructions. Finally we extend our results to find the first publicly known attack on the full 14-round AES-256: a related-key distinguisher which works for one out of every 2^{35} keys with 2^{120} data and time complexity and negligible memory. This distinguisher is translated into a key-recovery attack with total complexity of 2^{131} time and 2^{65} memory.

Proceedings ArticleDOI
14 Jun 2009
TL;DR: A simple algorithm is presented, and it is proved that with high probability it is able to perform ε-close to the true (intractable) optimal Bayesian policy after some small (polynomial in quantities describing the system) number of time steps.
Abstract: We consider the exploration/exploitation problem in reinforcement learning (RL). The Bayesian approach to model-based RL offers an elegant solution to this problem, by considering a distribution over possible models and acting to maximize expected reward; unfortunately, the Bayesian solution is intractable for all but very restricted cases. In this paper we present a simple algorithm, and prove that with high probability it is able to perform ε-close to the true (intractable) optimal Bayesian policy after some small (polynomial in quantities describing the system) number of time steps. The algorithm and analysis are motivated by the so-called PAC-MDP approach, and extend such results into the setting of Bayesian RL. In this setting, we show that we can achieve lower sample complexity bounds than existing algorithms, while using an exploration strategy that is much greedier than the (extremely cautious) exploration of PAC-MDP algorithms.

Journal ArticleDOI
TL;DR: In this paper, the authors consider a spiked covariance model in which a base matrix is perturbed by adding a k-sparse maximal eigenvector, and analyze two computationally tractable methods for recovering the support set of this maximal eigenvector, as follows: (a) a simple diagonal thresholding method, which transitions from success to failure as a function of the rescaled sample size θ_dia(n, p, k) = n/[k^2 log(p−k)]; and (b) a more sophisticated semidefinite programming (SDP) relaxation, which succeeds once the rescaled sample size θ_sdp(n, p, k) = n/[k log(p−k)] exceeds a critical threshold.
Abstract: Principal component analysis (PCA) is a classical method for dimensionality reduction based on extracting the dominant eigenvectors of the sample covariance matrix. However, PCA is well known to behave poorly in the “large p, small n” setting, in which the problem dimension p is comparable to or larger than the sample size n. This paper studies PCA in this high-dimensional regime, but under the additional assumption that the maximal eigenvector is sparse, say, with at most k nonzero components. We consider a spiked covariance model in which a base matrix is perturbed by adding a k-sparse maximal eigenvector, and we analyze two computationally tractable methods for recovering the support set of this maximal eigenvector, as follows: (a) a simple diagonal thresholding method, which transitions from success to failure as a function of the rescaled sample size θ_dia(n, p, k) = n/[k^2 log(p−k)]; and (b) a more sophisticated semidefinite programming (SDP) relaxation, which succeeds once the rescaled sample size θ_sdp(n, p, k) = n/[k log(p−k)] is larger than a critical threshold. In addition, we prove that no method, including the best method which has exponential-time complexity, can succeed in recovering the support if the order parameter θ_sdp(n, p, k) is below a threshold. Our results thus highlight an interesting trade-off between computational and statistical efficiency in high-dimensional inference.
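Method (a), diagonal thresholding, is simple enough to sketch: estimate the support by the k coordinates with the largest sample variances. The snippet below is illustrative only (the toy spiked-covariance data and all parameter values are assumptions, not the paper's experimental setup).

```python
import numpy as np

def diagonal_thresholding_support(X, k):
    """X: (n, p) data matrix with (approximately) centered columns.
    Returns the indices of the k columns with the largest sample variances."""
    sample_var = (X ** 2).mean(axis=0)            # diagonal of the sample covariance
    return np.sort(np.argsort(sample_var)[-k:])   # k largest diagonal entries

# spiked-covariance toy example: identity base matrix plus a k-sparse spike
rng = np.random.default_rng(0)
p, n, k, beta = 200, 4000, 5, 2.0
v = np.zeros(p); v[:k] = 1 / np.sqrt(k)           # k-sparse leading eigenvector
X = rng.normal(size=(n, p)) + np.sqrt(beta) * rng.normal(size=(n, 1)) * v
print(diagonal_thresholding_support(X, k))        # recovers {0, ..., 4} for this sample size
```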

Proceedings ArticleDOI
01 Sep 2009
TL;DR: The model proposed here bypasses direct measurement of the histogram differences, which enables efficient solutions to the underlying optimization model: it can be solved to optimality in polynomial time using a maximum flow procedure on an appropriately constructed graph.
Abstract: This paper is focused on the Co-segmentation problem [1] - where the objective is to segment a similar object from a pair of images. The background in the two images may be arbitrary; therefore, simultaneous segmentation of both images must be performed with a requirement that the appearance of the two sets of foreground pixels in the respective images are consistent. Existing approaches [1, 2] cast this problem as a Markov Random Field (MRF) based segmentation of the image pair with a regularized difference of the two histograms - assuming a Gaussian prior on the foreground appearance [1] or by calculating the sum of squared differences [2]. Both are interesting formulations but lead to difficult optimization problems, due to the presence of the second (histogram difference) term. The model proposed here bypasses measurement of the histogram differences in a direct fashion; we show that this enables obtaining efficient solutions to the underlying optimization model. Our new algorithm is similar to the existing methods in spirit, but differs substantially in that it can be solved to optimality in polynomial time using a maximum flow procedure on an appropriately constructed graph. We discuss our ideas and present promising experimental results.

01 Jan 2009
TL;DR: For any integer d ≥ 3 and positive real ε, it was shown in this article that if satisfiability for n-variable d-CNF formulas has a communication protocol of cost O(n^{d−ε}), where the cost is the number of bits communicated from the first player to the second player, then coNP is in NP/poly and the polynomial-time hierarchy collapses to its third level.
Abstract: Consider the following two-player communication process to decide a language L: The first player holds the entire input x but is polynomially bounded; the second player is computationally unbounded but does not know any part of x; their goal is to decide cooperatively whether x belongs to L at small cost, where the cost measure is the number of bits of communication from the first player to the second player. For any integer d ≥ 3 and positive real ε, we show that, if satisfiability for n-variable d-CNF formulas has a protocol of cost O(n^{d−ε}), then coNP is in NP/poly, which implies that the polynomial-time hierarchy collapses to its third level. The result even holds when the first player is conondeterministic, and is tight as there exists a trivial protocol for ε = 0. Under the hypothesis that coNP is not in NP/poly, our result implies tight lower bounds for parameters of interest in several areas, namely sparsification, kernelization in parameterized complexity, lossy compression, and probabilistically checkable proofs. By reduction, similar results hold for other NP-complete problems. For the vertex cover problem on n-vertex d-uniform hypergraphs, this statement holds for any integer d ≥ 2. The case d = 2 implies that no NP-hard vertex deletion problem based on a graph property that is inherited by subgraphs can have kernels consisting of O(k^{2−ε}) edges unless coNP is in NP/poly, where k denotes the size of the deletion set. Kernels consisting of O(k^2) edges are known for several problems in the class, including vertex cover, feedback vertex set, and bounded-degree deletion.

Proceedings ArticleDOI
19 Apr 2009
TL;DR: It is shown that maximizing the number of supported connections is NP-hard, even when there is no background noise; this is in contrast to the problem of determining whether or not a given set of connections is feasible, which can be solved via linear programming.
Abstract: In this paper we consider the problem of maximizing the number of supported connections in arbitrary wireless networks where a transmission is supported if and only if the signal-to-interference-plus-noise ratio at the receiver is greater than some threshold. The aim is to choose transmission powers for each connection so as to maximize the number of connections for which this threshold is met. We believe that analyzing this problem is important both in its own right and also because it arises as a subproblem in many other areas of wireless networking. We study both the complexity of the problem and also present some game theoretic results regarding capacity that is achieved by completely distributed algorithms. We also feel that this problem is intriguing since it involves both continuous aspects (i.e. choosing the transmission powers) as well as discrete aspects (i.e. which connections should be supported). Our results are as follows. We show that maximizing the number of supported connections is NP-hard, even when there is no background noise. This is in contrast to the problem of determining whether or not a given set of connections is feasible since that problem can be solved via linear programming. We present a number of approximation algorithms for the problem. All of these approximation algorithms run in polynomial time and have an approximation ratio that is independent of the number of connections. We examine a completely distributed algorithm and analyze it as a game in which a connection receives a positive payoff if it is successful and a negative payoff if it is unsuccessful while transmitting with nonzero power. We show that in this game there is not necessarily a pure Nash equilibrium but if such an equilibrium does exist the corresponding price of anarchy is independent of the number of connections. We also show that a mixed Nash equilibrium corresponds to a probabilistic transmission strategy and in this case such an equilibrium always exists and has a price of anarchy that is independent of the number of connections. This work was supported by NSF contract CCF-0728980 and was performed while the second author was visiting Bell Labs in Summer, 2008.
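For the feasibility subproblem mentioned above (deciding whether a given set of connections can all meet the SINR threshold), one classical check for the case without per-link power caps is sketched below; it uses the normalized interference matrix and a spectral-radius test rather than the linear-programming formulation referenced in the abstract, and all names and gain values are illustrative.

```python
import numpy as np

def feasible_powers(G, noise, beta):
    """G[i][j]: gain from transmitter j at receiver i; noise[i]: noise at receiver i;
    beta: SINR threshold. Returns a power vector meeting SINR >= beta on every link,
    or None if no finite positive power assignment exists (uncapped-power model)."""
    G = np.asarray(G, dtype=float)
    m = len(G)
    F = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            if i != j:
                F[i, j] = beta * G[i, j] / G[i, i]       # normalized interference
    u = beta * np.asarray(noise, dtype=float) / np.diag(G)
    if max(abs(np.linalg.eigvals(F))) >= 1.0:            # spectral-radius feasibility test
        return None
    return np.linalg.solve(np.eye(m) - F, u)             # componentwise-minimal feasible powers

G = [[1.0, 0.1, 0.1],
     [0.2, 1.0, 0.1],
     [0.1, 0.2, 1.0]]
print(feasible_powers(G, noise=[0.1, 0.1, 0.1], beta=2.0))
```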

Book ChapterDOI
12 Feb 2009
TL;DR: The main result is that dalks can be approximated efficiently, even for web-scale graphs: a (1/3)-approximation algorithm for dalks is given that is based on the core decomposition of a graph and runs in time O(m + n), where n is the number of nodes and m is the number of edges.
Abstract: We consider the problem of finding dense subgraphs with specified upper or lower bounds on the number of vertices. We introduce two optimization problems: the densest at-least-k-subgraph problem (dalks), which is to find an induced subgraph of highest average degree among all subgraphs with at least k vertices, and the densest at-most-k-subgraph problem (damks), which is defined similarly. These problems are relaxed versions of the well-known densest k-subgraph problem (dks), which is to find the densest subgraph with exactly k vertices. Our main result is that dalks can be approximated efficiently, even for web-scale graphs. We give a (1/3)-approximation algorithm for dalks that is based on the core decomposition of a graph, and that runs in time O(m + n), where n is the number of nodes and m is the number of edges. In contrast, we show that damks is nearly as hard to approximate as the densest k-subgraph problem, for which no good approximation algorithm is known. In particular, we show that if there exists a polynomial time approximation algorithm for damks with approximation ratio γ, then there is a polynomial time approximation algorithm for dks with approximation ratio γ^2/8. In the experimental section, we test the algorithm for dalks on large publicly available web graphs. We observe that, in addition to producing near-optimal solutions for dalks, the algorithm also produces near-optimal solutions for dks for nearly all values of k.
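A minimal sketch of the greedy peeling / core-decomposition idea behind the dalks algorithm follows: repeatedly delete a minimum-degree vertex and, among the intermediate subgraphs with at least k vertices, keep the densest. A heap with lazy deletion is used here for clarity; the linear-time O(m + n) variant in the paper uses bucketed degrees. Names are illustrative.

```python
import heapq

def densest_at_least_k(adj, k):
    """adj: dict vertex -> set of neighbours (undirected graph).
    Returns (density, vertex set), where density = |E|/|V| (proportional to average degree)."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}      # work on a copy
    edges = sum(len(s) for s in adj.values()) // 2
    heap = [(len(nbrs), v) for v, nbrs in adj.items()]
    heapq.heapify(heap)
    best = (-1.0, set())
    while adj:
        if len(adj) >= k:
            density = edges / len(adj)
            if density > best[0]:
                best = (density, set(adj))
        while True:                                      # lazy deletion: skip stale entries
            d, v = heapq.heappop(heap)
            if v in adj and d == len(adj[v]):
                break
        for u in adj.pop(v):                             # peel the min-degree vertex v
            adj[u].discard(v)
            edges -= 1
            heapq.heappush(heap, (len(adj[u]), u))
    return best
```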

Journal ArticleDOI
TL;DR: This paper improves upon the result shown earlier by considering expander graphs with expansion coefficient beyond 3/4 and shows that, with the same number of measurements, only 2k recovery iterations are required, which is a significant improvement when n is large.
Abstract: Expander graphs have been recently proposed to construct efficient compressed sensing algorithms. In particular, it has been shown that any n-dimensional vector that is k-sparse can be fully recovered using O(k log n) measurements and only O(k log n) simple recovery iterations. In this paper, we improve upon this result by considering expander graphs with expansion coefficient beyond 3/4 and show that, with the same number of measurements, only O(k) recovery iterations are required, which is a significant improvement when n is large. In fact, full recovery can be accomplished by at most 2k very simple iterations. The number of iterations can be reduced arbitrarily close to k, and the recovery algorithm can be implemented very efficiently using a simple priority queue with total recovery time O(n log(n/k)). We also show that by tolerating a small penalty on the number of measurements, and not on the number of recovery iterations, one can use the efficient construction of a family of expander graphs to come up with explicit measurement matrices for this method. We compare our result with other recently developed expander-graph-based methods and argue that it compares favorably both in terms of the number of required measurements and in terms of the time complexity and the simplicity of recovery. Finally, we will show how our analysis extends to give a robust algorithm that finds the position and sign of the k significant elements of an almost k-sparse signal and then, using very simple optimization techniques, finds a k-sparse signal which is close to the best k-term approximation of the original signal.

Proceedings ArticleDOI
14 Jun 2009
TL;DR: This paper proposes to cast both Euclidean projections as root finding problems associated with specific auxiliary functions, which can be solved in linear time via bisection, and makes use of the special structure of the auxiliary functions.
Abstract: We consider the problem of computing the Euclidean projection of a vector of length n onto a closed convex set including the ℓ1 ball and the specialized polyhedra employed in (Shalev-Shwartz & Singer, 2006). These problems have played building block roles in solving several ℓ1-norm based sparse learning problems. Existing methods have a worst-case time complexity of O(n log n). In this paper, we propose to cast both Euclidean projections as root finding problems associated with specific auxiliary functions, which can be solved in linear time via bisection. We further make use of the special structure of the auxiliary functions, and propose an improved bisection algorithm. Empirical studies demonstrate that the proposed algorithms are much more efficient than the competing ones for computing the projections.
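The bisection idea for the ℓ1-ball case is easy to sketch: the projection is soft-thresholding with a threshold t, and the residual as a function of t is monotone, so its root can be bracketed and bisected. The snippet below is a plain bisection without the paper's improved variant; names and tolerances are illustrative.

```python
import numpy as np

def project_l1_ball(v, z, tol=1e-10):
    """Euclidean projection of v onto {x : ||x||_1 <= z} via bisection on the threshold."""
    if np.abs(v).sum() <= z:                      # already inside the ball
        return v.copy()
    lo, hi = 0.0, np.abs(v).max()                 # the root threshold lies in [0, max|v_i|]
    while hi - lo > tol:
        t = (lo + hi) / 2
        if np.maximum(np.abs(v) - t, 0.0).sum() > z:
            lo = t                                # threshold too small: shrink more
        else:
            hi = t
    t = (lo + hi) / 2
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

x = project_l1_ball(np.array([3.0, -1.0, 0.5]), z=2.0)
print(x, np.abs(x).sum())                         # projection is [2, 0, 0], l1 norm is 2
```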

Book ChapterDOI
TL;DR: The aim is to focus on general areas rather than particular open questions as such: the reader who has followed the earlier exposition will have noted that a number of specific open issues have already been raised in the text.
Abstract: This section considers ways in which the complexity-theoretic treatment of abstract argumentation may be further advanced. We stress that our aim is to focus on general areas rather than particular open questions as such: the reader who has followed the earlier exposition will have noted that a number of specific open issues have already been raised in the text.

6.1 Average case properties

As discussed in Section 5.2, the lower bounds on problem complexity are worst-case, leaving open the possibility that feasible algorithms may be available in suitable contexts. In addition to the use of restrictions on the form of instances, one other approach that has been widely considered in the theory of algorithms is the study of average-case complexity. Underpinning this approach, one considers a probability distribution, μ, on instances of a decision problem (often, but not invariably, μ is the uniform distribution whereby each instance is equally likely), and defines the average-case run time of an algorithm P on instances of size n of L as ∑_{x ∈ I(n)} μ(x) ρ(P, x), where ρ(P, x) is the run-time of P on instance x. Formal definitions of average-case complexity classes may be found in [36]. To date, surprisingly little work has been carried out concerning the application of average-case methods to decision problems in AFs, either in terms of algorithmic development or in considering the limitations of such approaches. It remains open to what extent techniques such as those applied to other intractable problems, e.g., [1] for the NP-complete Hamiltonian cycle problem, or [46] for CNF satisfiability, could be replicated in AF settings. Of some relevance to such approaches are so-called “phase-transition” effects, which received much attention in the mid-late 1990s as potential indicators of factors separating tractable and intractable classes of problem instances, e.g., the studies of random CNF-SAT from [37, 40]. Analytic studies of such effects appear to indicate connections between suitable witnessing structures, e.g., a satisfying assignment, being present “almost certainly” and the performance of algorithms to identify such structures. Of some interest in the context of AF semantics are the results of [41, 17], which give conditions ensuring that a random AF “almost certainly” has a stable extension. There has as yet, however, been no detailed study of the implications of these results for fast-on-average methods for identifying or enumerating stable extensions. In the same way that the analyses of [41, 17] relate to the existence of stable extensions in AFs, it would be of some interest to consider existence properties of other solution structures in random AFs and their algorithmic consequences.

6.2 Approaches to dynamic updates

An important feature of the argumentation forms discussed so far is that, in practice, these are not static systems: typically an AF, 〈A,R〉, represents only a “snapshot” of the environment, and, as further facts, information and opinions emerge, the form of the initial view may change significantly in order to accommodate these. For example, additional arguments may have to be considered, so changing A; existing attacks may cease to apply and new attacks (arising from changes to A) come into force. It is clear that accounting for such dynamic aspects raises a number of issues in terms of assessing the acceptability status of individual arguments.
As with the study of average-case properties, the treatment of algorithms and complexity issues relating to determining argument status in dynamically changing environments has been somewhat neglected. Thus, given 〈A,R〉 and S ⊆ A for which S ∈ Es(〈A,R〉) according to some semantics s, natural decision questions are: does x ∈ S continue to be credulously accepted (w.r.t. semantics s) in the AF 〈B,S〉, where B results by removing some arguments from A and replacing these; and similarly when T modifies the attack relation R.

Journal ArticleDOI
TL;DR: Based on the frequency of attribute values, the average density of an object is defined and a novel initialization method for categorical data is proposed, in which both the distance between objects and the density of the objects are considered.

Abstract: In clustering algorithms, choosing a subset of representative examples from the data set is very important. Such "exemplars" can be found by randomly choosing an initial subset of data objects and then iteratively refining it, but this works well only if that initial choice is close to a good solution. In this paper, based on the frequency of attribute values, the average density of an object is defined. Furthermore, a novel initialization method for categorical data is proposed, in which both the distance between objects and the density of the object are considered. We also apply the proposed initialization method to the k-modes algorithm and the fuzzy k-modes algorithm. Experimental results illustrate that the proposed initialization method is superior to the random initialization method and can be applied to large data sets for its linear time complexity with respect to the number of data objects.
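A hedged sketch of the initialization idea (not the paper's exact density and distance formulas): score each object by the average frequency of its attribute values, take the densest object as the first mode, and pick later modes that combine high density with large Hamming distance from the modes chosen so far.

```python
import numpy as np
from collections import Counter

def density_init(X, k):
    """X: (n, m) array of categorical values; returns indices of k initial modes."""
    n, m = X.shape
    freq = [Counter(X[:, j]) for j in range(m)]
    # "density" of an object: average frequency of its attribute values (illustrative score)
    density = np.array([sum(freq[j][X[i, j]] for j in range(m)) / (n * m) for i in range(n)])
    chosen = [int(np.argmax(density))]
    while len(chosen) < k:
        dist = np.array([min(np.sum(X[i] != X[c]) for c in chosen) for i in range(n)])
        score = density * dist                    # dense objects far from the chosen modes
        chosen.append(int(np.argmax(score)))
    return chosen

X = np.array([['a', 'x'], ['a', 'x'], ['a', 'y'], ['b', 'y'], ['b', 'y'], ['c', 'z']])
print(density_init(X, 2))
```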

Journal ArticleDOI
TL;DR: In this article, structural and connectivity-related properties of the space of solutions of Boolean satisfiability problems were studied and various dichotomies in Schaefer's framework were established for the kinds of subgraphs of the hypercube that can be induced by the solutions of Boolean formulas.
Abstract: Boolean satisfiability problems are an important benchmark for questions about complexity, algorithms, heuristics, and threshold phenomena. Recent work on heuristics and the satisfiability threshold has centered around the structure and connectivity of the solution space. Motivated by this work, we study structural and connectivity-related properties of the space of solutions of Boolean satisfiability problems and establish various dichotomies in Schaefer's framework. On the structural side, we obtain dichotomies for the kinds of subgraphs of the hypercube that can be induced by the solutions of Boolean formulas, as well as for the diameter of the connected components of the solution space. On the computational side, we establish dichotomy theorems for the complexity of the connectivity and st-connectivity questions for the graph of solutions of Boolean formulas. Our results assert that the intractable side of the computational dichotomies is PSPACE-complete, while the tractable side (which includes but is not limited to all problems with polynomial-time algorithms for satisfiability) is in P for the st-connectivity question, and in coNP for the connectivity question. The diameter of components can be exponential for the PSPACE-complete cases, whereas in all other cases it is linear; thus, diameter and complexity of the connectivity problems are remarkably aligned. The crux of our results is an expressibility theorem showing that in the tractable cases, the subgraphs induced by the solution space possess certain good structural properties, whereas in the intractable cases, the subgraphs can be arbitrary.

Proceedings ArticleDOI
16 Mar 2009
TL;DR: The experimental results demonstrate that this newly proposed algorithm yields noticeably better time and space efficiencies than all the currently published linear time algorithms for SA construction.
Abstract: We present a linear time and space suffix array (SA) construction algorithm called the SA-IS algorithm. The SA-IS algorithm is novel because of the LMS-substrings used for the problem reduction and the pure induced-sorting (specially coined for this algorithm) used to propagate the order of suffixes as well as that of LMS-substrings, which makes the algorithm rely almost purely on induced sorting at both of its crucial steps. The pure induced-sorting renders the algorithm an elegant design and, in turn, a surprisingly compact implementation which consists of less than 100 lines of C code. The experimental results demonstrate that this newly proposed algorithm yields noticeably better time and space efficiencies than all the currently published linear time algorithms for SA construction.
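For reference on what the output looks like (emphatically not SA-IS): a naive construction that sorts explicit suffixes, which is quadratic in time and space and is exactly the cost that induced-sorting algorithms avoid.

```python
def naive_suffix_array(s):
    # sort suffix start positions by the suffixes themselves (reference implementation only)
    return sorted(range(len(s)), key=lambda i: s[i:])

print(naive_suffix_array("banana"))   # [5, 3, 1, 0, 4, 2]
```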

Journal ArticleDOI
TL;DR: This work describes how to consider the S-metric as a special case of a more general geometric problem called Klee's measure problem (KMP), for which an algorithm exists with runtime O(n log n + n^{d/2} log n) for n points in d ≥ 3 dimensions.
Abstract: The dominated hypervolume (or S-metric) is a commonly accepted quality measure for comparing approximations of Pareto fronts generated by multi-objective optimizers. Since optimizers exist, namely evolutionary algorithms, that use the S-metric internally several times per iteration, a fast determination of the S-metric value is of essential importance. This work describes how to consider the S-metric as a special case of a more general geometric problem called Klee's measure problem (KMP). For KMP, an algorithm exists with runtime O(n log n + n^{d/2} log n), for n points of d ≥ 3 dimensions. This complex algorithm is adapted to the special case of calculating the S-metric. Conceptual simplifications realize the algorithm without complex data structures and establish an upper bound of O(n^{d/2} log n) for the S-metric calculation for d ≥ 3. The performance of the new algorithm is studied in comparison to another state of the art algorithm on a set of academic test functions.

Journal ArticleDOI
TL;DR: The result gives the first polynomial time algorithm for the minimum node multiway cut problem when the separator size is bounded by O(log n), significantly improving the previous algorithm, which runs in O(4^{k^3} n^5) time.
Abstract: The parameterized node multiway cut problem is for a given graph to find a separator of size bounded by k whose removal separates a collection of terminal sets in the graph. In this paper, we develop an O(k 4^k n^3) time algorithm for this problem, significantly improving the previous algorithm of time O(4^{k^3} n^5) for the problem. Our result gives the first polynomial time algorithm for the minimum node multiway cut problem when the separator size is bounded by O(log n).