
Showing papers on "Time complexity published in 2019"


Journal ArticleDOI
TL;DR: This paper proposes a parallel diffusion method that maximizes the parallelism of the diffusion stage and achieves a qualitative improvement in efficiency over traditional streaming diffusion methods.

301 citations


Proceedings ArticleDOI
30 Jan 2019
TL;DR: This work proposes SimGNN, a novel neural-network-based approach to the classic yet challenging problem of graph similarity computation, aiming to alleviate the computational burden while preserving good performance, and suggests a new direction for future research on graph similarity computation and graph similarity search.
Abstract: Graph similarity search is among the most important graph-based applications, e.g., finding the chemical compounds that are most similar to a query compound. Graph similarity/distance computation, such as Graph Edit Distance (GED) and Maximum Common Subgraph (MCS), is the core operation of graph similarity search and many other applications, but is very costly to compute in practice. Inspired by the recent success of neural network approaches to several graph applications, such as node or graph classification, we propose a novel neural network based approach to address this classic yet challenging graph problem, aiming to alleviate the computational burden while preserving a good performance. The proposed approach, called SimGNN, combines two strategies. First, we design a learnable embedding function that maps every graph into an embedding vector, which provides a global summary of a graph. A novel attention mechanism is proposed to emphasize the important nodes with respect to a specific similarity metric. Second, we design a pairwise node comparison method to supplement the graph-level embeddings with fine-grained node-level information. Our model achieves better generalization on unseen graphs, and in the worst case runs in quadratic time with respect to the number of nodes in the two graphs. Taking GED computation as an example, experimental results on three real graph datasets demonstrate the effectiveness and efficiency of our approach. Specifically, our model achieves a smaller error rate and a large time reduction compared with a series of baselines, including several approximation algorithms for GED computation and many existing graph neural network based models. Our study suggests that SimGNN provides a new direction for future research on graph similarity computation and graph similarity search.
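As a hedged illustration of SimGNN's two strategies, the NumPy sketch below shows an attention-pooled graph-level embedding and the quadratic-time pairwise node comparison. The weight matrix W, the sigmoid attention, and the histogram binning are simplified stand-ins for the learned components described above, not the paper's exact model.

```python
# Minimal sketch of SimGNN's two strategies; all shapes and parameters are
# illustrative assumptions, not the reference implementation.
import numpy as np

def graph_embedding(node_embs: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Attention-pooled graph-level embedding from node embeddings (n x d)."""
    context = np.tanh(node_embs.mean(axis=0) @ W)      # nonlinear global context (d,)
    att = 1.0 / (1.0 + np.exp(-node_embs @ context))   # sigmoid attention per node
    return (att[:, None] * node_embs).sum(axis=0)      # weighted sum -> (d,) vector

def pairwise_node_features(h1: np.ndarray, h2: np.ndarray, bins: int = 16) -> np.ndarray:
    """Histogram over all n1*n2 node-pair similarities (the worst-case
    quadratic-time part); assumes unit-normalized node embeddings."""
    sims = h1 @ h2.T
    hist, _ = np.histogram(sims, bins=bins, range=(-1.0, 1.0))
    return hist / sims.size
```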

203 citations


Posted Content
TL;DR: This paper determines the amortized computational complexity of the planar dynamic convex hull problem, presenting a data structure with O(log n) amortized time per operation and a matching lower bound on the amortized asymptotic time complexity.
Abstract: In this article, we determine the amortized computational complexity of the planar dynamic convex hull problem by querying. We present a data structure that maintains a set of n points in the plane under the insertion and deletion of points in amortized O(log n) time per operation. The space usage of the data structure is O(n). The data structure supports extreme point queries in a given direction, tangent queries through a given point, and queries for the neighboring points on the convex hull in O(log n) time. The extreme point queries can be used to decide whether or not a given line intersects the convex hull, and the tangent queries to determine whether a given point is inside the convex hull. We give a lower bound on the amortized asymptotic time complexity that matches the performance of this data structure.

184 citations


Journal ArticleDOI
TL;DR: Simulation and performance analysis verify that the new 2D-SLIM modulation map based on the improved two-dimensional closed-loop modulation coupling model has acceptable compression, high security and low time complexity.

171 citations


Journal ArticleDOI
01 Jan 2019
TL;DR: Wang et al. propose a novel graph structure called the Monotonic Relative Neighborhood Graph (MRNG), which guarantees very low search complexity (close to logarithmic time).
Abstract: Approximate nearest neighbor search (ANNS) is a fundamental problem in databases and data mining. A scalable ANNS algorithm should be both memory-efficient and fast. Some early graph-based approaches have shown attractive theoretical guarantees on search time complexity, but they all suffer from the problem of high indexing time complexity. Recently, some graph-based methods have been proposed to reduce indexing complexity by approximating the traditional graphs; these methods have achieved revolutionary performance on million-scale datasets. Yet, they still cannot scale to billion-node databases. In this paper, to further improve the search efficiency and scalability of graph-based methods, we start by introducing four aspects: (1) ensuring the connectivity of the graph; (2) lowering the average out-degree of the graph for fast traversal; (3) shortening the search path; and (4) reducing the index size. Then, we propose a novel graph structure called Monotonic Relative Neighborhood Graph (MRNG) which guarantees very low search complexity (close to logarithmic time). To further lower the indexing complexity and make it practical for billion-node ANNS problems, we propose a novel graph structure named Navigating Spreading-out Graph (NSG) by approximating the MRNG. The NSG takes the four aspects into account simultaneously. Extensive experiments show that NSG outperforms all the existing algorithms significantly. In addition, NSG shows superior performance in the e-commerce scenario of Taobao (Alibaba Group) and has been integrated into their billion-scale search engine.
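The search side of such graph-based indices is a greedy best-first traversal. The sketch below is a hedged approximation of that traversal, not the NSG reference implementation; it assumes a prebuilt out-edge adjacency list `graph`, a data matrix `data`, and a fixed entry point `start` (the NSG starts from its navigating node).

```python
import heapq
import numpy as np

def greedy_search(graph, data, query, start, k=10, pool=64):
    """Best-first traversal of a proximity graph (adjacency list of out-edges).
    Expands the closest unexpanded candidate until it cannot improve the pool."""
    dist = lambda i: float(np.linalg.norm(data[i] - query))
    visited = {start}
    cand = [(dist(start), start)]          # min-heap: candidates to expand
    best = [(-dist(start), start)]         # max-heap (negated): current result pool
    while cand:
        d, u = heapq.heappop(cand)
        if len(best) >= pool and d > -best[0][0]:
            break                          # nearest candidate is farther than the
                                           # worst pooled result: stop searching
        for v in graph[u]:
            if v not in visited:
                visited.add(v)
                dv = dist(v)
                heapq.heappush(cand, (dv, v))
                heapq.heappush(best, (-dv, v))
                if len(best) > pool:
                    heapq.heappop(best)    # evict the current farthest result
    return sorted((-nd, v) for nd, v in best)[:k]
```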

156 citations


Proceedings ArticleDOI
23 Jun 2019
TL;DR: A framework for reducing the security of protocols based on the learning with errors (LWE) problem to qualitatively simpler and weaker computational hardness assumptions is presented.
Abstract: We give new instantiations of the Fiat-Shamir transform using explicit, efficiently computable hash functions. We improve over prior work by reducing the security of these protocols to qualitatively simpler and weaker computational hardness assumptions. As a consequence of our framework, we obtain the following concrete results. 1) There exists a succinct publicly verifiable non-interactive argument system for log-space uniform computations, under the assumption that any one of a broad class of fully homomorphic encryption (FHE) schemes has almost optimal security against polynomial-time adversaries. The class includes all FHE schemes in the literature that are based on the learning with errors (LWE) problem. 2) There exists a non-interactive zero-knowledge argument system for NP in the common reference string model, under either of the following two assumptions: (i) Almost optimal hardness of search-LWE against polynomial-time adversaries, or (ii) The existence of a circular-secure FHE scheme with a standard (polynomial time, negligible advantage) level of security. 3) The classic quadratic residuosity protocol of [Goldwasser, Micali, and Rackoff, SICOMP '89] is not zero knowledge when repeated in parallel, under any of the hardness assumptions above.

142 citations


Journal ArticleDOI
TL;DR: Using statistical analyses, it is shown that this approach can protect the image against statistical attacks; the entropy test results illustrate that the entropy values are close to ideal, and hence the proposed algorithm is secure against entropy attacks.
Abstract: In this paper, a novel image encryption algorithm is proposed based on the combination of a chaos sequence and a modified AES algorithm. In this method, the encryption key is generated by an Arnold chaos sequence. Then, the original image is encrypted using the modified AES algorithm with the round keys produced by the chaos system. The proposed approach not only reduces the time complexity of the algorithm but also adds diffusion ability, which makes images encrypted by the proposed algorithm resistant to differential attacks. The key space of the proposed method is large enough to resist brute-force attacks. The method is so sensitive to the initial values and the input image that small changes in these values lead to significant changes in the encrypted image. Using statistical analyses, we show that this approach can protect the image against statistical attacks. The entropy test results illustrate that the entropy values are close to ideal; hence, the proposed algorithm is secure against entropy attacks. The simulation results confirm that small changes in the original image or the key result in significant changes in the encrypted image, and that the original image cannot be recovered.
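As a hedged sketch of the key-generation step only, the snippet below iterates a generalized Arnold cat map and quantizes the trajectory into key bytes. The parameters (a, b, x0, y0) and the byte-extraction rule are illustrative assumptions, not the paper's exact construction, and the modified AES rounds themselves are omitted.

```python
def arnold_key_stream(x0: float, y0: float, a: int = 1, b: int = 1, n: int = 16) -> bytes:
    """Iterate the generalized Arnold cat map on the unit square and quantize
    the x-coordinate of each state into one key byte (illustrative rule)."""
    x, y = x0, y0
    out = []
    for _ in range(n):
        # generalized cat map: classic Arnold map when a = b = 1
        x, y = (x + a * y) % 1.0, (b * x + (a * b + 1) * y) % 1.0
        out.append(int(x * 256) % 256)
    return bytes(out)

round_key = arnold_key_stream(0.3, 0.7)  # 16 bytes, the size of one AES-128 round key
```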

124 citations


Proceedings ArticleDOI
23 Jun 2019
TL;DR: A non-elementary lower bound is established, i.e., the reachability problem requires a tower of exponentials of time and space, which implies that a plethora of problems from formal languages, logic, concurrent systems, process calculi and other areas, that are known to admit reductions from the Petri nets reachability problem, are also not elementary.
Abstract: Petri nets, also known as vector addition systems, are a long established model of concurrency with extensive applications in modelling and analysis of hardware, software and database systems, as well as chemical, biological and business processes. The central algorithmic problem for Petri nets is reachability: whether from the given initial configuration there exists a sequence of valid execution steps that reaches the given final configuration. The complexity of the problem has remained unsettled since the 1960s, and it is one of the most prominent open questions in the theory of verification. Decidability was proved by Mayr in his seminal STOC 1981 work, and the currently best published upper bound, due to Leroux and Schmitz from LICS 2019, is non-primitive recursive (Ackermannian). We establish a non-elementary lower bound, i.e. that the reachability problem needs a tower of exponentials of time and space. Until this work, the best lower bound had been exponential space, due to Lipton in 1976. The new lower bound is a major breakthrough for several reasons. Firstly, it shows that the reachability problem is much harder than the coverability (i.e., state reachability) problem, which is also ubiquitous but has been known to be complete for exponential space since the late 1970s. Secondly, it implies that a plethora of problems from formal languages, logic, concurrent systems, process calculi and other areas, that are known to admit reductions from the Petri nets reachability problem, are also not elementary. Thirdly, it makes obsolete the currently best lower bounds for the reachability problems for two key extensions of Petri nets: with branching and with a pushdown stack. At the heart of our proof is a novel gadget, the so-called factorial amplifier, that, assuming availability of counters that are zero testable and bounded by k, guarantees to produce arbitrarily large pairs of values whose ratio is exactly the factorial of k. We also develop a novel construction that uses arbitrarily large pairs of values with ratio R to provide zero testable counters that are bounded by R. Repeatedly composing the factorial amplifier with itself by means of the construction then enables us to compute in linear time Petri nets that simulate Minsky machines whose counters are bounded by a tower of exponentials, which yields the non-elementary lower bound. By refining this scheme further, we in fact establish hardness for h-exponential space already for Petri nets with h + 13 counters.

119 citations


Journal ArticleDOI
01 Jul 2019
TL;DR: This paper defines the "relative value of data" via the Shapley value, as it uniquely possesses properties with appealing real-world interpretations, such as fairness, rationality and decentralizability, and develops an algorithm based on Locality Sensitive Hashing (LSH) with only sublinear complexity.
Abstract: Given a data set D containing millions of data points and a data consumer who is willing to pay $X to train a machine learning (ML) model over D, how should we distribute this $X to each data point to reflect its "value"? In this paper, we define the "relative value of data" via the Shapley value, as it uniquely possesses properties with appealing real-world interpretations, such as fairness, rationality and decentralizability. For general, bounded utility functions, the Shapley value is known to be challenging to compute: to get Shapley values for all N data points, it requires O(2^N) model evaluations for exact computation and O(N log N) for (ϵ, δ)-approximation. In this paper, we focus on one popular family of ML models relying on K-nearest neighbors (KNN). The most surprising result is that for unweighted KNN classifiers and regressors, the Shapley value of all N data points can be computed, exactly, in O(N log N) time, an exponential improvement on computational complexity! Moreover, for (ϵ, δ)-approximation, we are able to develop an algorithm based on Locality Sensitive Hashing (LSH) with only sublinear complexity O(N^{h(ϵ, K)} log N) when ϵ is not too small and K is not too large. We empirically evaluate our algorithms on up to 10 million data points, and even our exact algorithm is up to three orders of magnitude faster than the baseline approximation algorithm. The LSH-based approximation algorithm can accelerate the value calculation process even further. We then extend our algorithm to other scenarios such as (1) weighted KNN classifiers, (2) different data points being clustered by different data curators, and (3) data analysts providing computation who also require proper valuation. Some of these extensions, although also exponentially improved, are less practical for exact computation (e.g., O(N^K) complexity for weighted KNN). We thus propose a Monte Carlo approximation algorithm, which is O(N(log N)^2/(log K)^2) times more efficient than the baseline approximation algorithm.
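The exact O(N log N) result for unweighted KNN rests on a simple recursion over training points sorted by distance to the test point. The sketch below follows the closed-form recursion reported in the paper for a single test point; variable names, indexing, and tie-breaking are our own illustrative choices.

```python
import numpy as np

def knn_shapley(X, y, x_test, y_test, K):
    """Exact Shapley value of each training point for an unweighted KNN
    classifier and a single test point. Sorting dominates: O(N log N)."""
    N = len(y)
    order = np.argsort(np.linalg.norm(X - x_test, axis=1))  # nearest first
    match = (y[order] == y_test).astype(float)              # 1 if label agrees
    s = np.zeros(N)
    s[order[N - 1]] = match[N - 1] / N                      # farthest point first
    for i in range(N - 2, -1, -1):                          # 0-based rank i
        s[order[i]] = (s[order[i + 1]]
                       + (match[i] - match[i + 1]) / K * min(K, i + 1) / (i + 1))
    return s
```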

115 citations


Journal ArticleDOI
TL;DR: In this article, the authors show that most of the results relating submodularity and convexity for set-functions can be extended to all submodular functions and provide a new interpretation of existing results for set functions.
Abstract: Submodular set-functions have many applications in combinatorial optimization, as they can be minimized and approximately maximized in polynomial time. A key element in many of the algorithms and analyses is the possibility of extending the submodular set-function to a convex function, which opens up tools from convex optimization. Submodularity goes beyond set-functions and has naturally been considered for problems with multiple labels or for functions defined on continuous domains, where it corresponds essentially to cross second-derivatives being nonpositive. In this paper, we show that most results relating submodularity and convexity for set-functions can be extended to all submodular functions. In particular, (a) we naturally define a continuous extension in a set of probability measures, (b) show that the extension is convex if and only if the original function is submodular, (c) prove that the problem of minimizing a submodular function is equivalent to a typically non-smooth convex optimization problem, and (d) propose another convex optimization problem with better computational properties (e.g., a smooth dual problem). Most of these extensions from the set-function situation are obtained by drawing links with the theory of multi-marginal optimal transport, which also provides a new interpretation of existing results for set-functions. We then provide practical algorithms to minimize generic submodular functions on discrete domains, with associated convergence rates.
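For the classic set-function case that this paper generalizes, the convex extension is the Lovász extension, computable with a single sort. A minimal sketch, assuming F maps a frozenset of indices to a float with F(∅) = 0:

```python
import numpy as np

def lovasz_extension(F, w: np.ndarray) -> float:
    """Lovász extension of a set function F at w in R^n: sort coordinates
    decreasingly and accumulate the marginal gains of F along the resulting
    chain of sets. The extension is convex iff F is submodular."""
    order = np.argsort(-w)
    val, prev, S = 0.0, 0.0, set()
    for j in order:
        S.add(int(j))
        cur = F(frozenset(S))      # value on the next set in the chain
        val += w[j] * (cur - prev) # weight times marginal gain
        prev = cur
    return val
```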

114 citations


Journal ArticleDOI
TL;DR: New criteria for globally uniformly exponential stability and globally uniformly asymptotic stability of the corresponding closed-loop system are derived and an algorithm is presented to solve the gain synthesis problem.

Posted Content
TL;DR: This work introduces a systematic technique for minimizing requisite state preparations by exploiting the simultaneous measurability of partitions of commuting Pauli strings, and encompasses algorithms for efficiently approximating a MIN-COMMUTING-PARTITION, as well as a synthesis tool for compiling simultaneous measurement circuits.
Abstract: Variational quantum eigensolver (VQE) is a promising algorithm suitable for near-term quantum machines. VQE aims to approximate the lowest eigenvalue of an exponentially sized matrix in polynomial time. It minimizes quantum resource requirements both by co-processing with a classical processor and by structuring computation into many subproblems. Each quantum subproblem involves a separate state preparation terminated by the measurement of one Pauli string. However, the number of such Pauli strings scales as $N^4$ for typical problems of interest, a daunting growth rate that poses a serious limitation for emerging applications such as quantum computational chemistry. We introduce a systematic technique for minimizing requisite state preparations by exploiting the simultaneous measurability of partitions of commuting Pauli strings. Our work encompasses algorithms for efficiently approximating a MIN-COMMUTING-PARTITION, as well as a synthesis tool for compiling simultaneous measurement circuits. For representative problems, we achieve 8-30x reductions in state preparations, with minimal overhead in measurement circuit cost. We demonstrate experimental validation of our techniques by estimating the ground state energy of deuteron on an IBM Q 20-qubit machine. We also investigate the underlying statistics of simultaneous measurement and devise an adaptive strategy for mitigating harmful covariance terms.
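A hedged sketch of the partitioning idea: strings that commute qubit-wise can share one measurement circuit, so even a first-fit greedy pass shrinks the number of state preparations. The paper's tools use stronger commutativity criteria and better partition heuristics than this toy version.

```python
def qubitwise_commute(p: str, q: str) -> bool:
    """True if two Pauli strings commute qubit-wise: at every position the
    operators are equal or one of them is the identity 'I'."""
    return all(a == b or a == 'I' or b == 'I' for a, b in zip(p, q))

def greedy_commuting_partition(paulis):
    """First-fit greedy approximation to MIN-COMMUTING-PARTITION: place each
    string into the first group whose members it pairwise commutes with."""
    groups = []
    for p in paulis:
        for g in groups:
            if all(qubitwise_commute(p, q) for q in g):
                g.append(p)
                break
        else:
            groups.append([p])
    return groups

# e.g. greedy_commuting_partition(['XXI', 'XIZ', 'ZZI', 'IIZ'])
# -> [['XXI', 'XIZ', 'IIZ'], ['ZZI']]: two state preparations instead of four
```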

Proceedings ArticleDOI
06 Jan 2019
TL;DR: In this paper, the authors studied the problem of high-dimensional linear regression in a robust model where an e-fraction of the samples can be adversarially corrupted and gave a sample near-optimal and computationally efficient algorithm that draws O(d/e2) labeled examples and outputs a candidate hypothesis vector that approximates the unknown regression vector β within e2-norm O(e log(1/e)σ, where σ is the standard deviation of the random observation noise.
Abstract: We study the prototypical problem of high-dimensional linear regression in a robust model where an e-fraction of the samples can be adversarially corrupted. We focus on the fundamental setting where the covariates of the uncorrupted samples are drawn from a Gaussian distribution N(0, Σ) on Rd. We give nearly tight upper bounds and computational lower bounds for this problem. Specifically our main contributions are as follows:• For the case that the covariance matrix is known to be the identity we give a sample near-optimal and computationally efficient algorithm that draws O(d/e2) labeled examples and outputs a candidate hypothesis vector [MATH HERE] that approximates the unknown regression vector β within e2-norm O(e log(1/e)σ), where σ is the standard deviation of the random observation noise. An error of Ω(eσ) is information-theoretically necessary even with infinite sample size. Hence, the error guarantee of our algorithm is optimal, up to a logarithmic factor in 1/e. Prior work gave an algorithm for this problem with sample complexity [MATH HERE] whose error guarantee scales with the e2-norm of β.• For the case of unknown covariance Σ, we show that we can efficiently achieve the same error guarantee of O(e log(1/e)σ), as in the known covariance case, using an additional O(d2 / e2) unlabeled examples. On the other hand, an error of O(eσ) can be information-theoretically attained with O(d/e2) samples. We prove a Statistical Query (SQ) lower bound providing evidence that this quadratic tradeoff in the sample size is inherent. More specifically, we show that any polynomial time SQ learning algorithm for robust linear regression (in Huber's contamination model) with estimation complexity O(d2−c), where c > 0 is an arbitrarily small constant, must incur an error of [MATH HERE].

Proceedings ArticleDOI
15 Oct 2019
TL;DR: This work proposes a novel Multi-modal Tensor Fusion Network (MTFN) to explicitly learn an accurate image-text similarity function with rank-based tensor fusion rather than seeking a common embedding space for each image- text instance.
Abstract: A major challenge in matching images and text is that they have intrinsically different data distributions and feature representations. Most existing approaches are based either on embedding or classification, the first one mapping image and text instances into a common embedding space for distance measuring, and the second one regarding image-text matching as a binary classification problem. Neither of these approaches can, however, balance the matching accuracy and model complexity well. We propose a novel framework that achieves remarkable matching performance with acceptable model complexity. Specifically, in the training stage, we propose a novel Multi-modal Tensor Fusion Network (MTFN) to explicitly learn an accurate image-text similarity function with rank-based tensor fusion rather than seeking a common embedding space for each image-text instance. Then, during testing, we deploy a generic Cross-modal Re-ranking (RR) scheme for refinement without requiring additional training procedure. Extensive experiments on two datasets demonstrate that our MTFN-RR consistently achieves the state-of-the-art matching performance with much lower time complexity.

Proceedings ArticleDOI
06 Jan 2019
TL;DR: This work gives the first nearly-linear time algorithms for high-dimensional robust mean estimation, for distributions with (i) known covariance and sub-gaussian tails or (ii) unknown bounded covariance, and exploits the special structure of the corresponding SDPs to show that they are approximately solvable in nearly-linear time.
Abstract: We study the fundamental problem of high-dimensional mean estimation in a robust model where a constant fraction of the samples are adversarially corrupted. Recent work gave the first polynomial time algorithms for this problem with dimension-independent error guarantees for several families of structured distributions. In this work, we give the first nearly-linear time algorithms for high-dimensional robust mean estimation. Specifically, we focus on distributions with (i) known covariance and subgaussian tails, and (ii) unknown bounded covariance. Given N samples on R^d, an ε-fraction of which may be arbitrarily corrupted, our algorithms run in time O(Nd)/poly(ε) and approximate the true mean within the information-theoretically optimal error, up to constant factors. Previous robust algorithms with comparable error guarantees have running times [MATH HERE], for ε = Ω(1). Our algorithms rely on a natural family of SDPs parameterized by our current guess v for the unknown mean μ*. We give a win-win analysis establishing the following: either a near-optimal solution to the primal SDP yields a good candidate for μ*, independent of our current guess v, or a near-optimal solution to the dual SDP yields a new guess v' whose distance from μ* is smaller by a constant factor. We exploit the special structure of the corresponding SDPs to show that they are approximately solvable in nearly-linear time. Our approach is quite general, and we believe it can also be applied to obtain nearly-linear time algorithms for other high-dimensional robust learning problems.
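As a hedged, simplified stand-in for the SDP-based algorithm (in the known, near-identity covariance case), the sketch below performs spectral filtering: while the top eigenvalue of the empirical covariance is suspiciously large, it discards the points projecting farthest onto the top eigenvector. The 1.5 eigenvalue threshold and the trimming quantile are heuristic assumptions, not the paper's choices.

```python
import numpy as np

def filtered_mean(X, eps, iters=10):
    """Spectral-filtering heuristic for robust mean estimation under
    near-identity inlier covariance; the paper instead solves a sequence of
    primal/dual SDPs in nearly-linear time."""
    X = np.asarray(X, dtype=float).copy()
    for _ in range(iters):
        mu = X.mean(axis=0)
        w, V = np.linalg.eigh(np.cov(X, rowvar=False))
        if w[-1] <= 1.5:                     # covariance near-isotropic: accept
            break
        proj = np.abs((X - mu) @ V[:, -1])   # magnitude along the top direction
        X = X[proj <= np.quantile(proj, 1.0 - eps / 2)]
    return X.mean(axis=0)
```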

Journal ArticleDOI
TL;DR: A novel approach is proposed that prunes the data points by removing those with an extremely small chance of becoming support vectors; this can reduce the grid-search time both for the standard SVM and for approximate methods that first search heuristically over a small range of the kernel parameters.

Proceedings ArticleDOI
06 Jan 2019
TL;DR: A remarkably simple meta-algorithm for the (Δ + 1) coloring problem: sample O(log n) colors for each vertex independently and uniformly at random from the Δ + 1 colors; find a proper coloring of the graph using only the sampled colors of each vertex.
Abstract: Any graph with maximum degree Δ admits a proper vertex coloring with Δ + 1 colors that can be found via a simple sequential greedy algorithm in linear time and space. But can one find such a coloring via a sublinear algorithm? We answer this fundamental question in the affirmative for several canonical classes of sublinear algorithms including graph streaming, sublinear time, and massively parallel computation (MPC) algorithms. In particular, we design: • A single-pass semi-streaming algorithm in dynamic streams using O(n) space. The only known semi-streaming algorithm prior to our work was a folklore O(log n)-pass algorithm obtained by simulating classical distributed algorithms in the streaming model. • A sublinear-time algorithm in the standard query model that allows neighbor queries and pair queries using [MATH HERE] time. We further show that any algorithm that outputs a valid coloring with sufficiently large constant probability requires [MATH HERE] time. No non-trivial sublinear time algorithms were known prior to our work. • A parallel algorithm in the massively parallel computation (MPC) model using O(n) memory per machine and O(1) MPC rounds. Our number of rounds significantly improves upon the recent O(log log Δ · log* (n))-round algorithm of Parter [ICALP 2018]. At the core of our results is a remarkably simple meta-algorithm for the (Δ + 1) coloring problem: Sample O(log n) colors for each vertex independently and uniformly at random from the Δ + 1 colors; find a proper coloring of the graph using only the sampled colors of each vertex. As our main result, we prove that the sampled set of colors with high probability contains a proper coloring of the input graph. The sublinear algorithms are then obtained by designing efficient algorithms for finding a proper coloring of the graph from the sampled colors in each model. We note that all our upper bound results for (Δ + 1) coloring are either optimal or close to best possible in each model studied. We also establish new lower bounds that rule out the possibility of achieving similar results in these models for the closely related problems of maximal independent set and maximal matching. Collectively, our results highlight a sharp contrast between the complexity of (Δ + 1) coloring vs maximal independent set and maximal matching in various models of sublinear computation even though all three problems are solvable by a simple greedy algorithm in the classical setting.
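The meta-algorithm is short enough to state in code. The sketch below samples the palettes and then attempts a simple greedy list-coloring pass; the paper proves the sampled lists admit a proper coloring with high probability and designs model-specific algorithms to find it, whereas the greedy pass here is only a heuristic that may need resampling.

```python
import math
import random

def palette_sparsification_coloring(adj, Delta, c=4):
    """Meta-algorithm sketch: each vertex samples ~c*log(n) colors from
    {0, ..., Delta}; a greedy pass then colors vertices from their sampled
    lists only. The constant c = 4 is an illustrative assumption."""
    n = len(adj)
    k = min(Delta + 1, max(1, int(c * math.log(max(n, 2)))))
    palette = {v: random.sample(range(Delta + 1), k) for v in range(n)}
    color = {}
    for v in range(n):
        used = {color[u] for u in adj[v] if u in color}
        free = [col for col in palette[v] if col not in used]
        if not free:
            return None              # heuristic pass failed; caller can resample
        color[v] = free[0]
    return color
```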

Journal ArticleDOI
TL;DR: An attack-resilient, provably correct state estimation algorithm is developed that admits a fully distributed implementation, and a notion of "strong-robustness" that captures both measurement and communication redundancy is introduced.

Proceedings ArticleDOI
01 Jan 2019
TL;DR: The main objective of this paper is the application of a Particle Swarm Optimization (PSO) based algorithm to the MC-CDMA framework, evaluated in terms of space and time.
Abstract: A design for achieving high data rates is proposed, addressing one of the key study areas for upcoming-generation systems. The strongest structures for satisfying the demand for high data rates in next-generation systems are multi-carrier transmission designs such as Multi-Carrier Code Division Multiple Access (MC-CDMA) and Orthogonal Frequency Division Multiple Access (OFDMA). MC-CDMA spreads every user in the frequency domain. The main objective of this paper is the application of the proposed Particle Swarm Optimization (PSO) algorithm to the MC-CDMA framework, evaluated in terms of space and time.

Proceedings Article
10 Feb 2019
TL;DR: In this article, the authors proposed an approximate fairlet decomposition algorithm that runs in nearly linear time for the fair variant of the classic $k$-median problem, where the points are colored and the goal is to minimize the same average distance objective while ensuring that all clusters have an "approximately equal" number of points of each color.
Abstract: We study the fair variant of the classic $k$-median problem introduced by Chierichetti et al. [2017]. In the standard $k$-median problem, given an input pointset $P$, the goal is to find $k$ centers $C$ and assign each input point to one of the centers in $C$ such that the average distance of points to their cluster center is minimized. In the fair variant of $k$-median, the points are colored, and the goal is to minimize the same average distance objective while ensuring that all clusters have an "approximately equal" number of points of each color. Chierichetti et al. proposed a two-phase algorithm for fair $k$-clustering. In the first step, the pointset is partitioned into subsets called fairlets that satisfy the fairness requirement and approximately preserve the $k$-median objective. In the second step, fairlets are merged into $k$ clusters by one of the existing $k$-median algorithms. The running time of this algorithm is dominated by the first step, which takes super-quadratic time. In this paper, we present a practical approximate fairlet decomposition algorithm that runs in nearly linear time. Our algorithm additionally allows for finer control over the balance of resulting clusters than the original work. We complement our theoretical bounds with empirical evaluation.
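For intuition on what a fairlet is, the toy sketch below builds a (1,1)-balanced decomposition by greedily pairing each red point with its nearest unused blue point. This quadratic-time pairing is purely illustrative (and assumes equal color counts); the paper's contribution is performing the decomposition in nearly linear time via approximate nearest-neighbor structures.

```python
import numpy as np

def fairlets_11(reds, blues):
    """Toy fairlet decomposition for exact balance (1,1): each fairlet is one
    red point paired with its nearest still-unused blue point."""
    reds, blues = np.asarray(reds), np.asarray(blues)
    assert len(reds) == len(blues), "assumes equal numbers of each color"
    used = np.zeros(len(blues), dtype=bool)
    pairs = []
    for i, r in enumerate(reds):
        d = np.linalg.norm(blues - r, axis=1)
        d[used] = np.inf                     # skip already-paired blue points
        j = int(np.argmin(d))
        used[j] = True
        pairs.append((i, j))                 # one fairlet = {red i, blue j}
    return pairs
```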

Proceedings ArticleDOI
01 Jun 2019
TL;DR: This paper designs a unified objective without accessing the source domain data and adopts an alternating minimization scheme to iteratively discover the pseudo target labels, invariant subspace, and target centroids; the resulting method could be regarded as a well-performing baseline for domain adaptation tasks.
Abstract: Conventional domain adaptation methods usually resort to deep neural networks or subspace learning to find invariant representations across domains. However, most deep learning methods highly rely on large-size source domains and are computationally expensive to train, while subspace learning methods always have a quadratic time complexity that suffers from large domain sizes. This paper provides a simple and efficient solution, which could be regarded as a well-performing baseline for domain adaptation tasks. Our method is built upon the nearest centroid classifier, seeking a subspace where the centroids in the target domain are moderately shifted from those in the source domain. Specifically, we design a unified objective without accessing the source domain data and adopt an alternating minimization scheme to iteratively discover the pseudo target labels, invariant subspace, and target centroids. Besides its privacy-preserving property (distant supervision), the algorithm is provably convergent and has a promising linear time complexity. In addition, the proposed method can be readily extended to the multi-source setting and domain generalization, and it remarkably enhances popular deep adaptation methods by borrowing the learned transferable features. Extensive experiments on several benchmarks including object, digit, and face recognition datasets validate that our methods yield state-of-the-art results in various domain adaptation tasks.

Journal ArticleDOI
TL;DR: This work presents the first RNA folding algorithm to achieve linear runtime (and linear space) without imposing constraints on the output structure, and it leads to significantly more accurate predictions on the longest sequence families in the test database, as well as improved accuracies for long-range base pairs.
Abstract: Motivation Predicting the secondary structure of a ribonucleic acid (RNA) sequence is useful in many applications. Existing algorithms (based on dynamic programming) suffer from a major limitation: their runtimes scale cubically with the RNA length, and this slowness limits their use in genome-wide applications. Results We present a novel alternative O(n^3)-time dynamic programming algorithm for RNA folding that is amenable to heuristics that make it run in O(n) time and O(n) space, while producing a high-quality approximation to the optimal solution. Inspired by incremental parsing for context-free grammars in computational linguistics, our alternative dynamic programming algorithm scans the sequence in a left-to-right (5'-to-3') direction rather than in a bottom-up fashion, which allows us to employ the effective beam pruning heuristic. Our work, though inexact, is the first RNA folding algorithm to achieve linear runtime (and linear space) without imposing constraints on the output structure. Surprisingly, our approximate search results in even higher overall accuracy on a diverse database of sequences with known structures. More interestingly, it leads to significantly more accurate predictions on the longest sequence families in that database (16S and 23S Ribosomal RNAs), as well as improved accuracies for long-range base pairs (500+ nucleotides apart), both of which are well known to be challenging for the current models. Availability and implementation Our source code is available at https://github.com/LinearFold/LinearFold, and our webserver is at http://linearfold.org (sequence limit: 100,000 nt). Supplementary information Supplementary data are available at Bioinformatics online.

Proceedings Article
22 Jun 2019
TL;DR: TEASER as mentioned in this paper uses a Truncated Least Squares (TLS) cost that makes the estimation insensitive to a large fraction of spurious point-to-point correspondences.
Abstract: We propose a robust approach for the registration of two sets of 3D points in the presence of a large amount of outliers. Our first contribution is to reformulate the registration problem using a Truncated Least Squares (TLS) cost that makes the estimation insensitive to a large fraction of spurious point-to-point correspondences. The second contribution is a general framework to decouple rotation, translation, and scale estimation, which allows solving in cascade for the three transformations. Since each subproblem (scale, rotation, and translation estimation) is still non-convex and combinatorial in nature, our third contribution is to show that (i) TLS scale and (component-wise) translation estimation can be solved exactly and in polynomial time via an adaptive voting scheme, (ii) TLS rotation estimation can be relaxed to a semidefinite program and the relaxation is tight in practice, even in the presence of an extreme amount of outliers. We validate the proposed algorithm, named TEASER (Truncated least squares Estimation And SEmidefinite Relaxation), in standard registration benchmarks showing that the algorithm outperforms RANSAC and robust local optimization techniques, and favorably compares with Branch-and-Bound methods, while being a polynomial-time algorithm. TEASER can tolerate up to 99% outliers and returns highly-accurate solutions.
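The adaptive voting idea for the scalar (component-wise) TLS subproblems can be sketched as an interval-stabbing sweep: each measurement votes for an interval around its value, and the point covered by the most intervals maximizes the consensus set. This is a hedged reconstruction rather than TEASER's exact scheme; `beta` stands in for the inlier noise bound.

```python
import numpy as np

def tls_estimate_1d(vals, beta):
    """Interval-stabbing sweep: each measurement votes for [v - beta, v + beta];
    find the point covered by the most intervals, then average those inliers."""
    events = sorted([(v - beta, +1) for v in vals] + [(v + beta, -1) for v in vals],
                    key=lambda e: (e[0], -e[1]))  # at ties, open before close
    best = cur = 0
    best_x = vals[0]
    for x, delta in events:
        cur += delta
        if cur > best:
            best, best_x = cur, x                 # deepest point of the sweep
    vals = np.asarray(vals, dtype=float)
    inliers = vals[np.abs(vals - best_x) <= beta]
    return float(inliers.mean())
```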

Proceedings ArticleDOI
01 Nov 2019
TL;DR: In this article, a message-passing algorithm was proposed to find the ground state of the Sherrington-Kirkpatrick model of spin glasses, which is a special case of the problem of maximizing the quadratic form associated to A over binary vectors.
Abstract: Let A be a symmetric random matrix with independent and identically distributed Gaussian entries above the diagonal. We consider the problem of maximizing the quadratic form associated to A over binary vectors. In the language of statistical physics, this amounts to finding the ground state of the Sherrington-Kirkpatrick model of spin glasses. The asymptotic value of this optimization problem was characterized by Parisi via a celebrated variational principle, subsequently proved by Talagrand. We give an algorithm that, for any ε > 0, outputs a feasible solution whose value is at least (1 − ε) of the optimum, with probability converging to one as the dimension n of the matrix diverges. The algorithm's time complexity is of order n^2. It is a message-passing algorithm, but the specific structure of its update rules is new. As a side result, we prove that, at (low) non-zero temperature, the algorithm constructs approximate solutions of the Thouless-Anderson-Palmer equations.

Journal ArticleDOI
TL;DR: A novel ranking algorithm is proposed that improves closeness centrality by taking advantage of the local structure of nodes, while aiming to decrease the computational complexity of the method.

Journal ArticleDOI
TL;DR: Geometric inhomogeneous random graphs (GIRGs) are a generalization of hyperbolic random graphs, and they have small separators, i.e., it suffices to delete a sublinear number of edges to break the giant component into two large pieces.

Posted Content
TL;DR: This method is the first attempt to make Gromov-Wasserstein discrepancy applicable to large-scale graph analysis and to unify graph partitioning and matching in the same framework, and it outperforms state-of-the-art graph partitioning and matching methods, achieving a trade-off between accuracy and efficiency.
Abstract: We propose a scalable Gromov-Wasserstein learning (S-GWL) method and establish a novel and theoretically-supported paradigm for large-scale graph analysis. The proposed method is based on the fact that Gromov-Wasserstein discrepancy is a pseudometric on graphs. Given two graphs, the optimal transport associated with their Gromov-Wasserstein discrepancy provides the correspondence between their nodes and achieves graph matching. When one of the graphs has isolated but self-connected nodes ($i.e.$, a disconnected graph), the optimal transport indicates the clustering structure of the other graph and achieves graph partitioning. Using this concept, we extend our method to multi-graph partitioning and matching by learning a Gromov-Wasserstein barycenter graph for multiple observed graphs; the barycenter graph plays the role of the disconnected graph, and since it is learned, so is the clustering. Our method combines a recursive $K$-partition mechanism with a regularized proximal gradient algorithm, whose time complexity is $\mathcal{O}(K(E+V)\log_K V)$ for graphs with $V$ nodes and $E$ edges. To our knowledge, our method is the first attempt to make Gromov-Wasserstein discrepancy applicable to large-scale graph analysis and unify graph partitioning and matching into the same framework. It outperforms state-of-the-art graph partitioning and matching methods, achieving a trade-off between accuracy and efficiency.

Journal ArticleDOI
TL;DR: It is noteworthy that both U-SPEC and U-SENC have nearly linear time and space complexity, and are capable of robustly and efficiently partitioning 10-million-level nonlinearly-separable datasets on a PC with 64 GB memory.
Abstract: This paper focuses on scalability and robustness of spectral clustering for extremely large-scale datasets with limited resources. Two novel algorithms are proposed, namely, ultra-scalable spectral clustering (U-SPEC) and ultra-scalable ensemble clustering (U-SENC). In U-SPEC, a hybrid representative selection strategy and a fast approximation method for K-nearest representatives are proposed for the construction of a sparse affinity sub-matrix. By interpreting the sparse sub-matrix as a bipartite graph, the transfer cut is then utilized to efficiently partition the graph and obtain the clustering result. In U-SENC, multiple U-SPEC clusterers are further integrated into an ensemble clustering framework to enhance the robustness of U-SPEC while maintaining high efficiency. Based on the ensemble generation via multiple U-SPECs, a new bipartite graph is constructed between objects and base clusters and then efficiently partitioned to achieve the consensus clustering result. It is noteworthy that both U-SPEC and U-SENC have nearly linear time and space complexity, and are capable of robustly and efficiently partitioning ten-million-level nonlinearly-separable datasets on a PC with 64 GB memory. Experiments on various large-scale datasets have demonstrated the scalability and robustness of our algorithms. The MATLAB code and experimental data are available at this https URL.

Journal ArticleDOI
TL;DR: An optimal offline algorithm is proposed that leverages dynamic and linear programming techniques under the assumption of exact knowledge of the workload on objects, along with two online algorithms that make a trade-off between residential and migration costs and dynamically select storage classes across CSPs.
Abstract: Cloud Storage Providers (CSPs) offer geographically distributed data stores providing several storage classes with different prices. An important problem faced by cloud users is how to exploit these storage classes to serve an application with a time-varying workload on its objects at minimum cost. This cost consists of residential cost (i.e., storage, Put and Get costs) and potential migration cost (i.e., network cost). To address this problem, we first propose an optimal offline algorithm that leverages dynamic and linear programming techniques under the assumption of exact knowledge of the workload on objects. Due to the high time complexity of this algorithm and its requirement for a priori knowledge, we propose two online algorithms that make a trade-off between residential and migration costs and dynamically select storage classes across CSPs. The first online algorithm is deterministic, needs no knowledge of the workload, and incurs no more than 2γ − 1 times the minimum cost obtained by the optimal offline algorithm, where γ is the ratio of the residential cost in the most expensive data store to the cheapest one in either network or storage cost. The second online algorithm is randomized and leverages "Receding Horizon Control" (RHC) with the exploitation of available future workload information for w time slots. This algorithm incurs at most 1 + γ/w times the optimal cost. The effectiveness of the proposed algorithms is demonstrated through simulations using a workload synthesized based on characteristics of the Facebook workload.
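The deterministic online algorithm's trade-off can be illustrated with a ski-rental-style rule: stay in the current storage class and migrate only once the accumulated extra residential cost relative to the alternative exceeds the migration cost. The toy sketch below is our own simplification to two classes with a symmetric migration cost, not the paper's algorithm.

```python
def online_storage_cost(costs_a, costs_b, mig):
    """Ski-rental-style rule for two storage classes: accumulate the regret
    (extra residential cost paid versus the alternative class) and migrate once
    it reaches the migration cost, so every move is amortized against regret."""
    total, cur, regret = 0.0, 'a', 0.0
    for ca, cb in zip(costs_a, costs_b):          # per-time-slot residential costs
        mine, other = (ca, cb) if cur == 'a' else (cb, ca)
        total += mine
        regret = max(0.0, regret + (mine - other))
        if regret >= mig:                         # time to switch classes
            cur = 'b' if cur == 'a' else 'a'
            total += mig
            regret = 0.0
    return total
```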

Journal ArticleDOI
TL;DR: An analog all-optical implementation of a coherent Ising machine (CIM) is presented, based on a network of injection-locked multicore fiber lasers that uses spatial light modulators (SLMs) to solve several Ising Hamiltonians.
Abstract: Combinatorial optimization problems over large and complex systems have many applications in social networks, image processing, artificial intelligence, computational biology and a variety of other areas. Finding the optimal solution to such problems is, in general, NP-hard. Some NP-hard problems can be easily mapped to minimizing an Ising energy function. Here, we present an analog all-optical implementation of a coherent Ising machine (CIM) based on a network of injection-locked multicore fiber (MCF) lasers. The Zeeman terms and the mutual couplings appearing in the Ising Hamiltonians are implemented using spatial light modulators (SLMs). As a proof-of-principle, we demonstrate the use of optics to solve several Ising Hamiltonians for up to thirteen nodes. Overall, the average accuracy of the CIM in finding the ground state energy was ~90% over 120 trials. The fundamental bottlenecks for the scalability and programmability of the presented CIM are discussed as well. For specific computational problems that electronic digital processors struggle to simulate, an analog optical system may be a solution. Here, the authors present an analog all-optical implementation of a coherent Ising machine based on a network of injection-locked multicore fiber lasers.
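At the thirteen-node scale demonstrated here, the Ising ground state the CIM searches for can be verified by brute force. A small sketch of the energy being minimized (J holds the mutual couplings, assumed symmetric with zero diagonal; h holds the Zeeman terms):

```python
import itertools
import numpy as np

def ising_ground_state(J, h):
    """Brute-force the minimum of E(s) = -1/2 s^T J s - h^T s over s in {-1,+1}^n.
    Feasible at the experiment's scale: n = 13 gives only 2^13 = 8192 states."""
    n = len(h)
    best_e, best_s = np.inf, None
    for bits in itertools.product((-1, 1), repeat=n):
        s = np.array(bits)
        e = -0.5 * s @ J @ s - h @ s    # 0.5 avoids double-counting i < j pairs
        if e < best_e:
            best_e, best_s = e, s
    return best_e, best_s
```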