
Showing papers on "Communication complexity published in 2018"


Proceedings ArticleDOI
20 May 2018
TL;DR: Hyrax is a zero-knowledge argument for NP with low communication complexity, low concrete cost for both the prover and the verifier, and no trusted setup, based on standard cryptographic assumptions.
Abstract: We present a zero-knowledge argument for NP with low communication complexity, low concrete cost for both the prover and the verifier, and no trusted setup, based on standard cryptographic assumptions. Communication is proportional to d log G (for d the depth and G the width of the verifying circuit) plus the square root of the witness size. When applied to batched or data-parallel statements, the prover's runtime is linear and the verifier's is sub-linear in the verifying circuit size, both with good constants. In addition, witness-related communication can be reduced, at the cost of increased verifier runtime, by leveraging a new commitment scheme for multilinear polynomials, which may be of independent interest. These properties represent a new point in the tradeoffs among setup, complexity assumptions, proof size, and computational cost. We apply the Fiat-Shamir heuristic to this argument to produce a zero-knowledge succinct non-interactive argument of knowledge (zkSNARK) in the random oracle model, based on the discrete log assumption, which we call Hyrax. We implement Hyrax and evaluate it against five state-of-the-art baseline systems. Our evaluation shows that, even for modest problem sizes, Hyrax gives smaller proofs than all but the most computationally costly baseline, and that its prover and verifier are each faster than three of the five baselines.

199 citations


Posted Content
TL;DR: This work presents HotStuff, a leader-based Byzantine fault-tolerant replication protocol for the partially synchronous model that enables a correct leader to drive the protocol to consensus at the pace of actual network delay with communication complexity that is linear in the number of replicas.
Abstract: We present HotStuff, a leader-based Byzantine fault-tolerant replication protocol for the partially synchronous model. Once network communication becomes synchronous, HotStuff enables a correct leader to drive the protocol to consensus at the pace of actual (vs. maximum) network delay--a property called responsiveness--and with communication complexity that is linear in the number of replicas. To our knowledge, HotStuff is the first partially synchronous BFT replication protocol exhibiting these combined properties. HotStuff is built around a novel framework that forms a bridge between classical BFT foundations and blockchains. It allows the expression of other known protocols (DLS, PBFT, Tendermint, Casper), and ours, in a common framework. Our deployment of HotStuff over a network with over 100 replicas achieves throughput and latency comparable to that of BFT-SMaRt, while enjoying linear communication footprint during leader failover (vs. quadratic with BFT-SMaRt).
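The linear-versus-quadratic gap during leader failover can be illustrated with a toy message-count model (an illustrative sketch, not the protocols' actual message flows):

```python
def hotstuff_failover_msgs(n):
    # leader-based: each replica sends one message to the new leader
    return n

def pbft_style_failover_msgs(n):
    # all-to-all view change: every replica broadcasts to every other replica
    return n * n

for n in (4, 100):
    print(n, hotstuff_failover_msgs(n), pbft_style_failover_msgs(n))
```

At 100 replicas the toy model already shows a 100x difference in failover traffic, which matches the linear-vs-quadratic footprint the abstract highlights.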

125 citations


Posted Content
TL;DR: GossipGraD can achieve perfect efficiency for these datasets and their associated neural network topologies, such as GoogLeNet and ResNet50; for ResNet50 it reaches ~100% compute efficiency using 128 NVIDIA Pascal P100 GPUs while matching the top-1 classification accuracy published in the literature.
Abstract: In this paper, we present GossipGraD - a gossip communication protocol based Stochastic Gradient Descent (SGD) algorithm for scaling Deep Learning (DL) algorithms on large-scale systems. The salient features of GossipGraD are: 1) reduction in overall communication complexity from Θ(log(p)) for p compute nodes in well-studied SGD to O(1), 2) model diffusion such that compute nodes exchange their updates (gradients) indirectly after every log(p) steps, 3) rotation of communication partners for facilitating direct diffusion of gradients, 4) asynchronous distributed shuffle of samples during the feedforward phase in SGD to prevent over-fitting, and 5) asynchronous communication of gradients for further reducing the communication cost of SGD and GossipGraD. We implement GossipGraD for GPU and CPU clusters and use NVIDIA GPUs (Pascal P100) connected with InfiniBand, and Intel Knights Landing (KNL) connected with the Aries network. We evaluate GossipGraD using the well-studied ImageNet-1K dataset (~250GB) and widely studied neural network topologies such as GoogLeNet and ResNet50 (winner of the ImageNet Large Scale Visual Recognition Challenge, ILSVRC). Our performance evaluation using both KNL and Pascal GPUs indicates that GossipGraD can achieve perfect efficiency for these datasets and their associated neural network topologies. Specifically, for ResNet50, GossipGraD is able to achieve ~100% compute efficiency using 128 NVIDIA Pascal P100 GPUs while matching the top-1 classification accuracy published in the literature.
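The partner-rotation idea (point 3) can be sketched with a hypercube-style schedule in which node i pairs with node i XOR 2^t at step t; after log2(p) steps, every node's update has diffused (directly or indirectly) to all others. This is a hypothetical schedule for illustration, not the paper's exact one:

```python
import math

def gossip_diffusion(p):
    """Simulate which nodes' updates each node has mixed in after log2(p) steps."""
    known = [{i} for i in range(p)]  # node i starts with only its own gradient
    for t in range(int(math.log2(p))):
        # rotate partners: at step t, node i exchanges with node i XOR 2^t
        known = [known[i] | known[i ^ (1 << t)] for i in range(p)]
    return known

print(all(len(s) == 8 for s in gossip_diffusion(8)))  # True: full diffusion in 3 steps
```

Each step doubles the set of updates a node has seen (2, 4, then 8 for p = 8), which is why log(p) rotation steps suffice for full (indirect) diffusion.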

79 citations


Proceedings ArticleDOI
11 Jul 2018
TL;DR: In this paper, the authors propose a new integrated method of exploiting model, batch and domain parallelism for the training of deep neural networks (DNNs) on large distributed-memory computers using minibatch stochastic gradient descent (SGD).
Abstract: We propose a new integrated method of exploiting model, batch and domain parallelism for the training of deep neural networks (DNNs) on large distributed-memory computers using minibatch stochastic gradient descent (SGD). Our goal is to find an efficient parallelization strategy for a fixed batch size using P processes. Our method is inspired by the communication-avoiding algorithms in numerical linear algebra. We see P processes as logically divided into a P_r x P_c grid where the P_r dimension is implicitly responsible for model/domain parallelism and the P_c dimension is implicitly responsible for batch parallelism. In practice, the integrated matrix-based parallel algorithm encapsulates these types of parallelism automatically. We analyze the communication complexity and analytically demonstrate that the lowest communication costs are often achieved neither with pure model nor with pure data parallelism. We also show how the domain parallel approach can help in extending the theoretical scaling limit of the typical batch parallel method.
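The claim that the minimum is often neither pure model nor pure data parallelism can be reproduced with a toy cost model over P_r x P_c grids (the cost function and its coefficients below are made up for illustration, not the paper's actual analysis):

```python
def comm_cost(p_r, p_c, model_bytes=1e6, batch_bytes=4e6):
    # toy model: model/domain-parallel traffic shrinks with P_r,
    # batch-parallel (gradient all-reduce) traffic shrinks with P_c
    return model_bytes / p_r + batch_bytes / p_c

def best_grid(P):
    # enumerate all factorizations P = P_r * P_c and pick the cheapest
    grids = [(r, P // r) for r in range(1, P + 1) if P % r == 0]
    return min(grids, key=lambda g: comm_cost(*g))

print(best_grid(64))  # an interior grid, not (1, 64) or (64, 1)
```

With these coefficients the optimum for P = 64 is the interior grid (4, 16): both extremes leave one of the two traffic terms unamortized.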

77 citations


Proceedings ArticleDOI
15 Oct 2018
TL;DR: This work builds upon the unbalanced PSI protocol of Chen, Laine, and Rindal in several ways: it adds efficient support for arbitrary-length items, constructs and implements an unbalanced Labeled PSI protocol with small communication complexity, and strengthens the security model using an Oblivious Pseudo-Random Function (OPRF) in a pre-processing phase.
Abstract: Private Set Intersection (PSI) allows two parties, the sender and the receiver, to compute the intersection of their private sets without revealing extra information to each other. We are interested in the unbalanced PSI setting, where (1) the receiver's set is significantly smaller than the sender's, and (2) the receiver (with the smaller set) has a low-power device. Also, in a Labeled PSI setting, the sender holds a label per each item in its set, and the receiver obtains the labels from the items in the intersection. We build upon the unbalanced PSI protocol of Chen, Laine, and Rindal (CCS 2017) in several ways: we add efficient support for arbitrary length items, we construct and implement an unbalanced Labeled PSI protocol with small communication complexity, and also strengthen the security model using Oblivious Pseudo-Random Function (OPRF) in a pre-processing phase. Our protocols outperform previous ones: for an intersection of sets of sizes 2^20 and 512 with arbitrary length items, our protocol has a total online running time of just 1 second (single thread), and a total communication cost of 4 MB. For a larger example, an intersection of sets of sizes 2^28 and 1024 with arbitrary length items has an online running time of 12 seconds (multi-threaded), with less than 18 MB of total communication.

74 citations


Journal Article
TL;DR: New protocols for Byzantine agreement in the synchronous and authenticated setting, tolerating the optimal number of f faults among \(n=2f+1\) parties are presented, achieving an expected O(1) round complexity and an expected \(O(n^2)\) communication complexity.
Abstract: We present new protocols for Byzantine agreement in the synchronous and authenticated setting, tolerating the optimal number of f faults among \(n=2f+1\) parties. Our protocols achieve an expected O(1) round complexity and an expected \(O(n^2)\) communication complexity. The exact round complexity in expectation is 10 for a static adversary and 16 for a strongly rushing adaptive adversary. For comparison, previous protocols in the same setting require expected 29 rounds.

66 citations


Posted Content
TL;DR: In this paper, the authors show that disallowing after-the-fact removal is necessary for achieving subquadratic-communication Byzantine agreement (BA) protocols with near-optimal resilience and expected constant rounds under standard cryptographic assumptions and a public-key infrastructure.
Abstract: As Byzantine Agreement (BA) protocols find application in large-scale decentralized cryptocurrencies, an increasingly important problem is to design BA protocols with improved communication complexity. A few existing works have shown how to achieve subquadratic BA under an adaptive adversary. Intriguingly, they all make a common relaxation about the adaptivity of the attacker: if an honest node sends a message and then gets corrupted in some round, the adversary cannot erase the message that was already sent; henceforth we say that such an adversary cannot perform "after-the-fact removal". By contrast, many (super-)quadratic BA protocols in the literature can tolerate after-the-fact removal. In this paper, we first prove that disallowing after-the-fact removal is necessary for achieving subquadratic-communication BA. Next, we show new subquadratic binary BA constructions (of course, assuming no after-the-fact removal) that achieve near-optimal resilience and expected constant rounds under standard cryptographic assumptions and a public-key infrastructure (PKI) in both synchronous and partially synchronous settings. In comparison, all known subquadratic protocols make additional strong assumptions, such as random oracles or the ability of honest nodes to erase secrets from memory, and even with these strong assumptions, no prior work can achieve the above properties. Lastly, we show that some setup assumption is necessary for achieving subquadratic multicast-based BA.

65 citations


Posted Content
TL;DR: A unified analysis framework for distributed gradient methods operating with stale and compressed gradients is presented, and non-asymptotic bounds on convergence rates and information exchange are derived for several optimization algorithms.
Abstract: Asynchronous computation and gradient compression have emerged as two key techniques for achieving scalability in distributed optimization for large-scale machine learning. This paper presents a unified analysis framework for distributed gradient methods operating with stale and compressed gradients. Non-asymptotic bounds on convergence rates and information exchange are derived for several optimization algorithms. These bounds give explicit expressions for step-sizes and characterize how the amount of asynchrony and the compression accuracy affect iteration and communication complexity guarantees. Numerical results highlight convergence properties of different gradient compression algorithms and confirm that fast convergence under limited information exchange is indeed possible.
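A minimal example of the kind of gradient compression such a framework covers is top-k sparsification (a generic sketch, not the specific compression operators analyzed in the paper):

```python
import numpy as np

def top_k_compress(grad, k):
    """Keep only the k largest-magnitude entries; the rest are zeroed
    (in practice the zeroed residual is often carried over via error feedback)."""
    idx = np.argsort(np.abs(grad))[-k:]
    sparse = np.zeros_like(grad)
    sparse[idx] = grad[idx]
    return sparse

g = np.array([0.1, -3.0, 0.5, 2.0])
print(top_k_compress(g, 2))  # only -3.0 and 2.0 survive
```

Only k index/value pairs need to be communicated per worker, which is exactly the accuracy-versus-communication knob the bounds in the abstract quantify.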

62 citations


Proceedings ArticleDOI
11 Jul 2018
TL;DR: In this article, Klauck et al. showed a lower bound of Ω(n/k^1/3 ) for the round complexity of triangle enumeration in the message-passing model.
Abstract: Motivated by the increasing need to understand the distributed algorithmic foundations of large-scale graph computations, we study some fundamental graph problems in a message-passing model for distributed computing where $k \geq 2$ machines jointly perform computations on graphs with n nodes (typically, $n \gg k$). The input graph is assumed to be initially randomly partitioned among the k machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication rounds of the computation. Our main contribution is the General Lower Bound Theorem, a theorem that can be used to show non-trivial lower bounds on the round complexity of distributed large-scale data computations. The General Lower Bound Theorem is established via an information-theoretic approach that relates the round complexity to the minimal amount of information required by machines to solve the problem. Our approach is generic and this theorem can be used in a "cookbook" fashion to show distributed lower bounds in the context of several problems, including non-graph problems. We present two applications by showing (almost) tight lower bounds for the round complexity of two fundamental graph problems, namely PageRank computation and triangle enumeration. Our approach, as demonstrated in the case of PageRank, can yield tight lower bounds for problems (including, and especially, under a stochastic partition of the input) where communication complexity techniques are not obvious. Our approach, as demonstrated in the case of triangle enumeration, can yield stronger round lower bounds as well as message-round tradeoffs compared to approaches that use communication complexity techniques.
We then present distributed algorithms for PageRank and triangle enumeration with a round complexity that (almost) matches the respective lower bounds; these algorithms exhibit a round complexity which scales superlinearly in k, improving significantly over previous results for these problems [Klauck et al., SODA 2015]. Specifically, we show the following results. PageRank: we show a lower bound of $\tilde{\Omega}(n/k^2)$ rounds, and present a distributed algorithm that computes the PageRank of all the nodes of a graph in $\tilde{O}(n/k^2)$ rounds. Triangle enumeration: we show that there exist graphs with m edges where any distributed algorithm requires $\tilde{\Omega}(m/k^{5/3})$ rounds. This result also implies the first non-trivial lower bound of $\tilde{\Omega}(n^{1/3})$ rounds for the congested clique model, which is tight up to logarithmic factors. We then present a distributed algorithm that enumerates all the triangles of a graph in $\tilde{O}(m/k^{5/3} + n/k^{4/3})$ rounds.
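One way to read the congested-clique corollary: with k = n machines and a dense graph with m = Θ(n^2) edges, the k-machine bound specializes as follows (a sketch of the arithmetic consistent with the claims above):

```latex
\tilde{\Omega}\!\left(\frac{m}{k^{5/3}}\right)
\;\xrightarrow{\;k = n,\; m = \Theta(n^2)\;}\;
\tilde{\Omega}\!\left(\frac{n^{2}}{n^{5/3}}\right)
= \tilde{\Omega}\!\left(n^{1/3}\right).
```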

59 citations


Posted Content
TL;DR: Building upon the unbalanced PSI protocol of Chen, Laine, and Rindal, this paper adds efficient support for arbitrary-length items, constructs an unbalanced Labeled PSI protocol with small communication complexity, and strengthens the security model using an Oblivious Pseudo-Random Function (OPRF) in a pre-processing phase.
Abstract: Private Set Intersection (PSI) allows two parties, the sender and the receiver, to compute the intersection of their private sets without revealing extra information to each other. We are interested in the unbalanced PSI setting, where (1) the receiver's set is significantly smaller than the sender's, and (2) the receiver (with the smaller set) has a low-power device. Also, in a Labeled PSI setting, the sender holds a label per each item in its set, and the receiver obtains the labels from the items in the intersection. We build upon the unbalanced PSI protocol of Chen, Laine, and Rindal (CCS 2017) in several ways: we add efficient support for arbitrary length items, we construct and implement an unbalanced Labeled PSI protocol with small communication complexity, and also strengthen the security model using Oblivious Pseudo-Random Function (OPRF) in a pre-processing phase. Our protocols outperform previous ones: for an intersection of sets of sizes 2^20 and 512 with arbitrary length items, our protocol has a total online running time of just 1 second (single thread), and a total communication cost of 4 MB. For a larger example, an intersection of sets of sizes 2^28 and 1024 with arbitrary length items has an online running time of 12 seconds (multi-threaded), with less than 18 MB of total communication.

58 citations


Book ChapterDOI
26 Feb 2018
TL;DR: This paper builds on modern cryptographic engineering techniques and proposes optimizations for a promising one-way PSI protocol based on public-key cryptography that outperforms the communication complexity and the run time of previous proposals by around one thousand times.
Abstract: Protocols for Private Set Intersection (PSI) are important cryptographic primitives that perform joint operations on datasets in a privacy-preserving way. They allow two parties to compute the intersection of their private sets without revealing any additional information beyond the intersection itself. Unfortunately, PSI implementations in the literature do not usually employ the best possible cryptographic implementation techniques. This results in protocols presenting computational and communication complexities that are prohibitive, particularly in the case when one of the participants is a low-powered device and there are bandwidth restrictions. This paper builds on modern cryptographic engineering techniques and proposes optimizations for a promising one-way PSI protocol based on public-key cryptography. For the case when one of the parties holds a set much smaller than the other (a realistic assumption in many scenarios) we show that our improvements and optimizations yield a protocol that outperforms the communication complexity and the run time of previous proposals by around one thousand times.

Journal ArticleDOI
TL;DR: It is shown that deterministic communication complexity can be superlogarithmic in the partition number of the associated communication matrix and near-optimal deterministic lower bounds for the Clique vs. Independent Set problem are obtained.
Abstract: We show that deterministic communication complexity can be superlogarithmic in the partition number of the associated communication matrix. We also obtain near-optimal deterministic lower bounds for the Clique vs. Independent Set problem.

Book ChapterDOI
19 Aug 2018
TL;DR: The Franklin-Yung paradigm for Shamir-secret-sharing based general n-player MPC trades the adversary threshold t against amortized communication complexity via a packed version of Shamir's scheme; with active security, if t + 2k - 2 < n/3, then k parallel evaluations of the same circuit cost a single BGW execution.
Abstract: A fundamental and widely-applied paradigm due to Franklin and Yung (STOC 1992) on Shamir-secret-sharing based general n-player MPC shows how one may trade the adversary threshold t against amortized communication complexity, by using a so-called packed version of Shamir's scheme. For the BGW protocol (with active security), for example, this trade-off means that if t + 2k - 2 < n/3, then k parallel evaluations of the same arithmetic circuit on different inputs can be performed at the overall cost corresponding to a single BGW execution.
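A toy illustration of packed Shamir sharing (with a hypothetical field and evaluation-point layout; real protocols choose these more carefully): the k secrets sit at dedicated points, t random anchor points hide them, and any t + k shares reconstruct all k secrets at once.

```python
import random

P = 2**61 - 1  # a Mersenne prime; all arithmetic is in the field Z_P

def interpolate(points, x):
    """Evaluate, at x, the unique polynomial through `points` (Lagrange, mod P)."""
    acc = 0
    for xi, yi in points:
        num = den = 1
        for xj, _ in points:
            if xj != xi:
                num = num * ((x - xj) % P) % P
                den = den * ((xi - xj) % P) % P
        acc = (acc + yi * num * pow(den, P - 2, P)) % P  # Fermat inverse of den
    return acc

def packed_share(secrets, t, n):
    """Share k secrets among n parties via one degree-(t+k-1) polynomial."""
    defining = [((-(j + 1)) % P, s) for j, s in enumerate(secrets)]  # secret points
    defining += [(n + 1 + i, random.randrange(P)) for i in range(t)]  # random anchors
    return [(i, interpolate(defining, i)) for i in range(1, n + 1)]

def packed_open(shares, k):
    """Reconstruct the k secrets from any t+k shares."""
    return [interpolate(shares, (-(j + 1)) % P) for j in range(k)]

shares = packed_share([5, 9], t=2, n=7)
print(packed_open(shares[:4], k=2))  # [5, 9]
```

Note the amortization: one polynomial (one share per party) carries k = 2 secrets, which is exactly the packing that lets k circuit evaluations ride on a single execution.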

Journal ArticleDOI
TL;DR: This work focuses on a family of CCPs, based on facet Bell inequalities, and finds that the advantages are tied to the use of measurements that are not rank-one projective, and provides an experimental semi-device-independent falsification of such measurements in Hilbert space dimension six.
Abstract: Quantum resources can improve communication complexity problems (CCPs) beyond their classical constraints. One quantum approach is to share entanglement and create correlations violating a Bell inequality, which can then assist classical communication. A second approach is to resort solely to the preparation, transmission, and measurement of a single quantum system, in other words, quantum communication. Here, we show the advantages of the latter over the former in high-dimensional Hilbert space. We focus on a family of CCPs, based on facet Bell inequalities, study the advantage of high-dimensional quantum communication, and realize such quantum communication strategies using up to ten-dimensional systems. The experiment demonstrates, for growing dimension, an increasing advantage over quantum strategies based on Bell inequality violation. For sufficiently high dimensions, quantum communication also surpasses the limitations of the postquantum Bell correlations obeying only locality in the macroscopic limit. We find that the advantages are tied to the use of measurements that are not rank-one projective, and provide an experimental semi-device-independent falsification of such measurements in Hilbert space dimension six.

Journal ArticleDOI
TL;DR: Two measures of communication complexity of dual decomposition methods are introduced and explored to identify the most efficient communication among these algorithms to demonstrate a tradeoff between fast convergence and primal feasibility.
Abstract: Dual decomposition methods are among the most prominent approaches for finding primal/dual saddle point solutions of resource allocation optimization problems. To deploy these methods in the emerging Internet of things networks, which will often have limited data rates, it is important to understand the communication overhead they require. Motivated by this, we introduce and explore two measures of communication complexity of dual decomposition methods to identify the most efficient communication among these algorithms. The first measure is $\epsilon$-complexity, which quantifies the minimal number of bits needed to find an $\epsilon$-accurate solution. The second measure is $b$-complexity, which quantifies the best possible solution accuracy that can be achieved from communicating $b$ bits. We find the exact $\epsilon$- and $b$-complexity of a class of resource allocation problems where a single supplier allocates resources to multiple users. For both the primal and dual problems, the $\epsilon$-complexity grows proportionally to $\log_2(1/\epsilon)$ and the $b$-complexity proportionally to $1/2^b$. We also introduce a variant of the $\epsilon$- and $b$-complexity measures where only algorithms that ensure primal feasibility of the iterates are allowed. Such algorithms are often desirable because overuse of the resources can overload the respective systems, e.g., by causing blackouts in power systems. We provide upper and lower bounds on the convergence rate of these primal feasible complexity measures. In particular, we show that the $b$-complexity cannot converge at a faster rate than $\mathcal{O}(1/b)$. Therefore, the results demonstrate a tradeoff between fast convergence and primal feasibility. We illustrate the result by numerical studies.
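The two growth rates can be written down as simple functions (the constants here are illustrative placeholders; the paper derives the exact ones for its problem class):

```python
import math

def eps_complexity(eps, c=1.0):
    # bits needed for an eps-accurate solution: proportional to log2(1/eps)
    return c * math.log2(1 / eps)

def b_complexity(b, c=1.0):
    # best accuracy achievable from b communicated bits: proportional to 1/2^b
    return c / 2 ** b

print(eps_complexity(1 / 1024), b_complexity(10))  # 10.0 0.0009765625
```

The two functions are inverses of each other up to constants: halving the target error costs about one extra bit, and each extra bit halves the achievable error.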

Posted Content
TL;DR: This work presents a hashing-based algorithm for Private Set Intersection (PSI) in the honest-but-curious setting that is generic, modular and provides both asymptotic and concrete efficiency improvements over existing PSI protocols.
Abstract: This work presents a hashing-based algorithm for Private Set Intersection (PSI) in the honest-but-curious setting. The protocol is generic, modular and provides both asymptotic and concrete efficiency improvements over existing PSI protocols. If each player has m elements, our scheme requires only $O(m \lambda)$ communication between the parties, where $\lambda$ is a security parameter. Our protocol builds on the hashing-based PSI protocol of Pinkas et al. (USENIX 2014, USENIX 2015), but we replace one of the sub-protocols (handling the cuckoo "stash") with a special-purpose PSI protocol that is optimized for comparing sets of unbalanced size. This brings the asymptotic communication complexity of the overall protocol down from $\omega(m \lambda)$ to $O(m \lambda)$, and provides concrete performance improvements (10-15% reduction in communication costs) over Kolesnikov et al. (CCS 2016) under real-world parameter choices. Our protocol is simple, generic and benefits from the permutation-hashing optimizations of Pinkas et al. (USENIX 2015) and the Batched, Relaxed Oblivious Pseudo Random Functions of Kolesnikov et al. (CCS 2016).
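The cuckoo "stash" the protocol handles arises from cuckoo hashing: an item that keeps getting evicted between tables is parked in a small overflow stash. A generic sketch with two tables and illustrative (non-cryptographic) hash functions:

```python
def cuckoo_insert(tables, hashes, item, stash, max_kicks=32):
    """Two-table cuckoo hashing; items evicted too many times go to the stash."""
    t = 0
    for _ in range(max_kicks):
        slot = hashes[t](item)
        if tables[t][slot] is None:
            tables[t][slot] = item
            return
        tables[t][slot], item = item, tables[t][slot]  # evict the current occupant
        t ^= 1                                          # retry it in the other table
    stash.append(item)  # give up: park the item in the stash

tables = [[None] * 8, [None] * 8]
hashes = [lambda x: x % 8, lambda x: (x // 8) % 8]
stash = []
for v in (0, 8, 16):          # all three collide in table 0, slot 0
    cuckoo_insert(tables, hashes, v, stash)
print(tables[0][0], stash)    # 16 []
```

In PSI, comparing the small stash against the other party's whole set is exactly the unbalanced sub-problem the special-purpose sub-protocol targets.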

Proceedings ArticleDOI
20 Jun 2018
TL;DR: The authors' scheme is privately verifiable, where the verifier needs the corresponding secret key in order to verify proofs, and yields succinct non-interactive arguments based on sub-exponential LWE, for many natural languages believed to be outside of P.
Abstract: We construct a delegation scheme for verifying non-deterministic computations, with complexity proportional only to the non-deterministic space of the computation. Specifically, letting n denote the input length, we construct a delegation scheme for any language verifiable in non-deterministic time and space (T(n), S(n)) with communication complexity poly(S(n)), verifier runtime n.polylog(T(n))+poly(S(n)), and prover runtime poly(T(n)). Our scheme consists of only two messages and has adaptive soundness, assuming the existence of a sub-exponentially secure private information retrieval (PIR) scheme, which can be instantiated under standard (albeit, sub-exponential) cryptographic assumptions, such as the sub-exponential LWE assumption. Specifically, the verifier publishes a (short) public key ahead of time, and this key can be used by any prover to non-interactively prove the correctness of any adaptively chosen non-deterministic computation. Such a scheme is referred to as a non-interactive delegation scheme. Our scheme is privately verifiable, where the verifier needs the corresponding secret key in order to verify proofs. Prior to our work, such results were known only in the Random Oracle Model, or under knowledge assumptions. Our results yield succinct non-interactive arguments based on sub-exponential LWE, for many natural languages believed to be outside of P.

Journal ArticleDOI
TL;DR: It is shown that randomized communication complexity can be superlogarithmic in the partition number of the associated communication matrix, and near-optimal randomized lower bounds for the Clique versus Independent Set problem are obtained.
Abstract: We show that randomized communication complexity can be superlogarithmic in the partition number of the associated communication matrix, and we obtain near-optimal randomized lower bounds for the Clique versus Independent Set problem. These results strengthen the deterministic lower bounds obtained in prior work (Goos, Pitassi, and Watson, FOCS’15). One of our main technical contributions states that information complexity when the cost is measured with respect to only 1-inputs (or only 0-inputs) is essentially equivalent to information complexity with respect to all inputs.

Proceedings ArticleDOI
19 Jul 2018
TL;DR: In this paper, a communication and computation-efficient distributed optimization algorithm using second-order information for solving ERM problems with a nonsmooth regularization term is proposed, which enjoys global linear convergence for a broad range of non-strongly convex problems.
Abstract: We propose a communication- and computation-efficient distributed optimization algorithm using second-order information for solving ERM problems with a nonsmooth regularization term. Current second-order and quasi-Newton methods for this problem either do not work well in the distributed setting or work only for specific regularizers. Our algorithm uses successive quadratic approximations, and we describe how to maintain an approximation of the Hessian and solve subproblems efficiently in a distributed manner. The proposed method enjoys global linear convergence for a broad range of non-strongly convex problems that includes the most commonly used ERMs, thus requiring lower communication complexity. It also converges on non-convex problems, so has the potential to be used on applications such as deep learning. Initial computational results on convex problems demonstrate that our method significantly improves on communication cost and running time over the current state-of-the-art methods.

Proceedings ArticleDOI
20 Jun 2018
TL;DR: A new technique for proving lower bounds in the setting of asymmetric communication is developed, and a deterministic lower bound is obtained that is provably better than any lower bound obtainable by the classical Richness Method.
Abstract: We develop a new technique for proving lower bounds in the setting of asymmetric communication, a model that was introduced in the famous works of Miltersen (STOC '94) and Miltersen, Nisan, Safra and Wigderson (STOC '95). At the core of our technique is the first simulation theorem in the asymmetric setting, where Alice gets a p × n matrix x over F_2 and Bob gets a vector y ∈ F_2^n. Alice and Bob need to evaluate f(x · y) for a Boolean function f: {0,1}^p → {0,1}. Our simulation theorems show that a deterministic/randomized communication protocol exists for this problem, with cost C · n for Alice and C for Bob, if and only if there exists a deterministic/randomized *parity decision tree* of cost Θ(C) for evaluating f. As applications of this technique, we obtain the following results: 1. The first strong lower bounds against randomized data-structure schemes for the Vector-Matrix-Vector product problem over F_2. Moreover, our method yields strong lower bounds even when the data-structure scheme has tiny advantage over random guessing. 2. The first lower bounds against randomized data-structure schemes for two natural Boolean variants of Orthogonal Vector Counting. 3. We construct an asymmetric communication problem and obtain a deterministic lower bound for it which is provably better than any lower bound that may be obtained by the classical Richness Method of Miltersen et al. (STOC '95). This seems to be the first known limitation of the Richness Method in the context of proving deterministic lower bounds.

Journal ArticleDOI
TL;DR: This work gives an example of a boolean function whose information complexity is exponentially smaller than its communication complexity, and simplifies recent work of Ganor, Kol and Raz.
Abstract: We give an example of a boolean function whose information complexity is exponentially smaller than its communication complexity. Our result simplifies recent work of Ganor, Kol and Raz [GKR14a, GKR14b].

Posted Content
13 Feb 2018
TL;DR: Hadamard Response (HR) is proposed: a local, non-interactive privatization mechanism with order-optimal sample complexity (for all privacy regimes) and a communication complexity of $\log k + 2$ bits, which runs in nearly linear time.
Abstract: We consider discrete distribution estimation over $k$ elements under $\varepsilon$-local differential privacy from $n$ samples. The samples are distributed across users who send privatized versions of their sample to the server. All previously known sample optimal algorithms require linear (in $k$) communication complexity in the high privacy regime $(\varepsilon<1)$, and have a running time that grows as $n\cdot k$, which can be prohibitive for large domain size $k$. We study the task simultaneously under four resource constraints: privacy, sample complexity, computational complexity, and communication complexity. We propose Hadamard Response (HR), a local non-interactive privatization mechanism with order optimal sample complexity (for all privacy regimes) and a communication complexity of $\log k+2$ bits, which runs in nearly linear time. Our encoding and decoding mechanisms are based on Hadamard matrices, and are simple to implement. The gain in sample complexity comes from the large Hamming distance between rows of Hadamard matrices, and the gain in time complexity is achieved by using the Fast Walsh-Hadamard transform. We compare our approach with Randomized Response (RR), RAPPOR, and subset-selection mechanisms (SS), theoretically and experimentally. For $k=10000$, our algorithm runs about 100x faster than SS and RAPPOR.
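The "large Hamming distance between rows of Hadamard matrices" that drives the sample-complexity gain is easy to verify with the Sylvester construction (a sketch of the underlying structure only; the actual HR mechanism adds the privatized encoding on top):

```python
import math
import numpy as np

def sylvester_hadamard(m):
    """Build the m x m Sylvester Hadamard matrix (m must be a power of two)."""
    H = np.array([[1]])
    while H.shape[0] < m:
        H = np.block([[H, H], [H, -H]])  # H_{2m} = [[H, H], [H, -H]]
    return H

H = sylvester_hadamard(8)
# any two distinct rows disagree in exactly half of the coordinates
dists = {int(np.sum(H[i] != H[j])) for i in range(8) for j in range(8) if i != j}
print(dists)                        # {4}

k = 10000                           # domain size from the abstract's example
print(math.ceil(math.log2(k)) + 2)  # log k + 2 bits of communication: 16
```

Since each report fits in roughly log k + 2 bits, the per-user communication is exponentially smaller than the linear-in-k cost of the earlier sample-optimal mechanisms.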

Posted Content
TL;DR: A tight trade-off between the memory/communication complexity and the sample complexity is proved, implying (for example) that to detect pairwise correlations with optimal sample complexity, the number of required memory/ communication bits is at least quadratic in the dimension.
Abstract: We study the problem of identifying correlations in multivariate data, under information constraints: Either on the amount of memory that can be used by the algorithm, or the amount of communication when the data is distributed across several machines. We prove a tight trade-off between the memory/communication complexity and the sample complexity, implying (for example) that to detect pairwise correlations with optimal sample complexity, the number of required memory/communication bits is at least quadratic in the dimension. Our results substantially improve those of Shamir [2014], which studied a similar question in a much more restricted setting. To the best of our knowledge, these are the first provable sample/memory/communication trade-offs for a practical estimation problem, using standard distributions, and in the natural regime where the memory/communication budget is larger than the size of a single data point. To derive our theorems, we prove a new information-theoretic result, which may be relevant for studying other information-constrained learning problems.

Proceedings ArticleDOI
01 Jan 2018
TL;DR: In this article, the authors introduce a monotone variant of Xor-Sat and show that it has exponential circuit complexity, which can be interpreted as separating subclasses of TFNP in communication complexity.
Abstract: Separations: We introduce a monotone variant of Xor-Sat and show it has exponential monotone circuit complexity. Since Xor-Sat is in NC^2, this improves qualitatively on the monotone vs. non-monotone separation of Tardos (1988). We also show that monotone span programs over R can be exponentially more powerful than over finite fields. These results can be interpreted as separating subclasses of TFNP in communication complexity. Characterizations: We show that the communication (resp. query) analogue of PPA (subclass of TFNP) captures span programs over F_2 (resp. Nullstellensatz degree over F_2). Previously, it was known that communication FP captures formulas (Karchmer-Wigderson, 1988) and that communication PLS captures circuits (Razborov, 1995).
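A monotone span program can be made concrete with a toy sketch. Rows of a matrix over F_2 are labeled by input variables, and an input is accepted iff the target vector lies in the span of the rows whose variables are set to 1. The AND/OR programs below only illustrate the model; they are not the separating construction from the paper, and the helper names are ours.

```python
def in_span_gf2(rows, target):
    """Is target in the F_2-span of rows? Forward Gaussian elimination over GF(2)."""
    rows = [r[:] for r in rows]
    t = target[:]
    for col in range(len(t)):
        piv = next((r for r in rows if r[col] == 1), None)
        if piv is None:
            continue
        rows.remove(piv)
        # Eliminate this column from the remaining rows and from the target.
        rows = [[a ^ b for a, b in zip(r, piv)] if r[col] else r for r in rows]
        if t[col]:
            t = [a ^ b for a, b in zip(t, piv)]
    return all(v == 0 for v in t)  # fully reduced => target was in the span

def msp_accepts(program, target, x):
    """Monotone span program: keep only rows whose labeling variable is set to 1."""
    live = [row for (var, row) in program if x[var]]
    return in_span_gf2(live, target)

# AND(x0, x1): target (1,1); x0 contributes row (1,0), x1 contributes row (0,1),
# so both variables are needed to span the target.
AND = [(0, [1, 0]), (1, [0, 1])]
# OR(x0, x1): each variable alone contributes the target itself.
OR = [(0, [1, 1]), (1, [1, 1])]
```

Monotonicity is visible in the model: turning a variable from 0 to 1 only adds rows, so the span can only grow.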

Book ChapterDOI
Peter Scholl1
25 Mar 2018
TL;DR: This work presents a new approach to extending oblivious transfer with communication complexity that is logarithmic in the security parameter, and results in the first oblivious transfer protocol with sublinear communication and active security, which does not require any non-black-box use of cryptographic primitives.
Abstract: We present a new approach to extending oblivious transfer with communication complexity that is logarithmic in the security parameter. Our method only makes black-box use of the underlying cryptographic primitives, and can achieve security against an active adversary with almost no overhead on top of passive security. This results in the first oblivious transfer protocol with sublinear communication and active security, which does not require any non-black-box use of cryptographic primitives.

Journal ArticleDOI
TL;DR: This work uses critical block sensitivity, a new complexity measure introduced by Huynh and Nordstrom [STOC 2012], to study the communication complexity of search problems.
Abstract: We use critical block sensitivity, a new complexity measure introduced by Huynh and Nordstrom [STOC 2012, ACM, NY, 2012, pp. 233--248], to study the communication complexity of search problems. To ...

Posted Content
TL;DR: A cloud-based protocol for a constrained quadratic optimization problem involving multiple parties, each holding private data, that exploits partially homomorphic encryption and secure communication techniques is developed and shown to achieve computational privacy.
Abstract: The development of large-scale distributed control systems has led to the outsourcing of costly computations to cloud-computing platforms, as well as to concerns about privacy of the collected sensitive data. This paper develops a cloud-based protocol for a quadratic optimization problem involving multiple parties, each holding information it seeks to keep private. The protocol is based on projected gradient ascent on the Lagrange dual problem and exploits partially homomorphic encryption and secure multi-party computation techniques. Using formal cryptographic definitions of indistinguishability, the protocol is shown to achieve computational privacy, i.e., there is no computationally efficient algorithm that any involved party can employ to obtain private information beyond what can be inferred from the party's inputs and outputs alone. In order to reduce the communication complexity of the proposed protocol, we introduce a variant that achieves this objective at the expense of weaker privacy guarantees. We discuss in detail the computational and communication complexity properties of both algorithms, theoretically and through implementations. We conclude the paper with a discussion on computational privacy and other notions of privacy, such as the non-unique retrieval of the private information from the protocol outputs.
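The optimization core of such a protocol, projected gradient ascent on the Lagrange dual, can be sketched in the clear; in the actual protocol these updates would be carried out over homomorphically encrypted values. The one-dimensional example below is purely illustrative and not taken from the paper.

```python
def projected_dual_ascent(steps=200, alpha=0.5):
    """Solve  min x^2  s.t.  x >= 1  via gradient ascent on the Lagrange dual.

    Lagrangian: L(x, mu) = x^2 + mu * (1 - x), with multiplier mu >= 0.
    Inner minimization has the closed form x = mu / 2, and the dual
    gradient is dg/dmu = 1 - x; the ascent step is projected onto [0, inf).
    """
    mu = 0.0
    for _ in range(steps):
        x = mu / 2.0                           # primal minimizer for fixed mu
        mu = max(0.0, mu + alpha * (1.0 - x))  # projected dual ascent step
    return mu / 2.0, mu

x_star, mu_star = projected_dual_ascent()  # converges to x = 1, mu = 2
```

Each iteration needs only the constraint residual, which is what makes the update amenable to evaluation under a partially homomorphic scheme.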

Proceedings ArticleDOI
01 Jul 2018
TL;DR: The Musch protocol is presented: the first BFT-based blockchain protocol which efficiently addresses simultaneously the issues of communication complexity and latency under the presence of failures.
Abstract: There is a surge of interest in blockchain technology, not only in the scientific community but in the business community as well. Proof of Work (PoW) and Byzantine Fault Tolerant (BFT) protocols are the two main classes of consensus protocols used in the blockchain consensus layer. PoW is highly scalable but very slow, with a throughput of about 7 transactions/second. BFT-based protocols are highly efficient, but their scalability is limited to only tens of nodes. One of the main reasons for this limitation is the quadratic $O(n^{2})$ communication complexity of BFT-based protocols for $n$ nodes, which requires $n \times n$ broadcasting. In this paper, we present the Musch protocol, which is BFT-based and provides communication complexity $O(fn+n)$ for $f$ failures and $n$ nodes, where $f < n$, without compromising latency. Hence, the performance adjusts to $f$, so that for constant $f$ the communication complexity is linear. Musch achieves this by introducing the notion of exponentially increasing windows of nodes to which complaints are reported, instead of broadcasting to all the nodes. To our knowledge, this is the first BFT-based blockchain protocol that efficiently addresses simultaneously the issues of communication complexity and latency under the presence of failures.
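The window idea can be caricatured in a few lines. This is not the Musch protocol itself: the exact window sizes and the escalation/stopping rule below are illustrative assumptions, chosen only to show why escalating through geometrically growing windows costs O(f) messages per complaint instead of the O(n) of a full broadcast.

```python
def windows(n):
    """Partition nodes 0..n-1 into windows of sizes 1, 2, 4, ... (last may be smaller)."""
    out, start, size = [], 0, 1
    while start < n:
        out.append(list(range(start, min(start + size, n))))
        start += size
        size *= 2
    return out

def report_cost(n, f):
    """Messages to escalate one complaint window by window until the window is
    large enough that f faulty nodes cannot outvote it (threshold assumed here
    to be > 2f).  The geometric growth caps the total at O(f), versus O(n) for
    broadcasting the complaint to everyone."""
    cost = 0
    for w in windows(n):
        cost += len(w)
        if len(w) > 2 * f:
            break
    return cost
```

With n = 1000 and f = 4, escalation stops at the first window of size 16, after only 31 messages; n complaints therefore cost O(fn), matching the stated bound's shape.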

Book ChapterDOI
29 Jan 2018
TL;DR: In this paper, the authors prove lower bounds for quantum memoryless communication protocols and apply them to obtain lower bounds and width hierarchies for quantum ordered read-k-times branching programs (k-QOBDDs), where k is the number of times the input may be read.
Abstract: We explore multi-round quantum memoryless communication protocols. These are a restricted version of multi-round quantum communication protocols: the "memoryless" term means that players forget the history of previous rounds, so their behavior is determined only by their input and the message received from the other player. The model is interesting because it allows us to obtain lower bounds for models like automata, Ordered Binary Decision Diagrams and streaming algorithms; at the same time, the restriction lets us prove stronger results. We present a lower bound for quantum memoryless protocols, and additionally a lower bound for the Disjointness function in this model. As an application of these communication complexity results, we consider Quantum Ordered Read-k-times Branching Programs (k-QOBDDs). Our communication complexity result yields a lower bound for k-QOBDDs and lets us prove hierarchies for sublinear-width bounded-error k-QOBDDs, where \(k=o(\sqrt{n})\). Furthermore, we prove a hierarchy for polynomial-size bounded-error k-QOBDDs for constant k. This differs from the unbounded-error situation, where it is known that increasing k gives no advantage.
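The Disjointness function at the center of the lower bound is simple to state: on characteristic vectors x and y, it outputs 1 iff the corresponding sets share no element. A minimal sketch of the function and the trivial n-bit protocol (Alice sends her whole input) makes the benchmark concrete; the lower bounds say restricted protocols cannot do much better than this upper bound.

```python
def disj(x, y):
    """DISJ(x, y) = 1 iff there is no index i with x[i] = y[i] = 1."""
    return int(not any(a and b for a, b in zip(x, y)))

def trivial_protocol(x, y):
    """Alice sends all n bits of x; Bob computes the answer locally.
    Returns (answer, bits_communicated)."""
    message = list(x)          # n bits over the channel
    return disj(message, y), len(message)
```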

Posted Content
TL;DR: This work revisits the fundamental question of the amount of communication needed to securely evaluate a circuit of size s in the setting of information-theoretically secure \(\mathsf {MPC}\) in the correlated randomness model, where a trusted dealer distributes correlated random coins to all parties before the start of the protocol.
Abstract: Secure multiparty computation (\(\mathsf {MPC}\)) addresses the challenge of evaluating functions on secret inputs without compromising their privacy. A central question in multiparty computation is to understand the amount of communication needed to securely evaluate a circuit of size s. In this work, we revisit this fundamental question in the setting of information-theoretically secure \(\mathsf {MPC}\) in the correlated randomness model, where a trusted dealer distributes correlated random coins, independent of the inputs, to all parties before the start of the protocol. This setting is of strong theoretical interest, and has led to the most practically efficient \(\mathsf {MPC}\) protocols known to date.
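The role of dealer-supplied correlated randomness can be illustrated with the classic Beaver multiplication triple, a standard example of this model rather than the construction studied in the paper: the dealer hands two parties additive shares of random a, b and of c = a*b before the protocol starts, and one round of communication then suffices to multiply secret-shared inputs.

```python
import random

P = 2**61 - 1  # prime modulus for additive secret sharing over Z_P

def share(v, rng):
    """Split v into two additive shares summing to v mod P."""
    s0 = rng.randrange(P)
    return s0, (v - s0) % P

def beaver_multiply(x_sh, y_sh, triple_sh):
    """Multiply secret-shared x and y using a dealer-supplied triple (a, b, ab).

    The parties open d = x - a and e = y - b (the one round of communication);
    since x*y = c + d*b + e*a + d*e, the product shares are then local."""
    a_sh, b_sh, c_sh = triple_sh
    d = (x_sh[0] - a_sh[0] + x_sh[1] - a_sh[1]) % P  # both parties learn d
    e = (y_sh[0] - b_sh[0] + y_sh[1] - b_sh[1]) % P  # both parties learn e
    z = []
    for i in range(2):
        zi = (c_sh[i] + d * b_sh[i] + e * a_sh[i]) % P
        if i == 0:
            zi = (zi + d * e) % P  # the public constant d*e is added once
        z.append(zi)
    return tuple(z)

rng = random.Random(1)
a, b = rng.randrange(P), rng.randrange(P)
triple = (share(a, rng), share(b, rng), share(a * b % P, rng))
x_sh, y_sh = share(21, rng), share(2, rng)
z_sh = beaver_multiply(x_sh, y_sh, triple)  # sum of shares reconstructs 42
```

Opening d and e leaks nothing because a and b are uniform, which is exactly why input-independent correlated randomness from a trusted dealer is so effective, and why the communication cost of circuit evaluation in this model is a natural object of study.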