
Showing papers on "Communication complexity published in 2010"


Journal ArticleDOI
TL;DR: Develops algorithms to train support vector machines when training data are distributed across different nodes and their communication to a centralized processing unit is prohibited due to, for example, communication complexity, scalability, or privacy reasons.
Abstract: This paper develops algorithms to train support vector machines when training data are distributed across different nodes, and their communication to a centralized processing unit is prohibited due to, for example, communication complexity, scalability, or privacy reasons. To accomplish this goal, the centralized linear SVM problem is cast as a set of decentralized convex optimization sub-problems (one per node) with consensus constraints on the wanted classifier parameters. Using the alternating direction method of multipliers, fully distributed training algorithms are obtained without exchanging training data among nodes. Different from existing incremental approaches, the overhead associated with inter-node communications is fixed and solely dependent on the network topology rather than the size of the training sets available per node. Important generalizations to train nonlinear SVMs in a distributed fashion are also developed along with sequential variants capable of online processing. Simulated tests illustrate the performance of the novel algorithms.
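As a rough illustration of the consensus formulation described above, here is a minimal Python/NumPy sketch of ADMM-based training of a linear SVM across nodes that never exchange raw data. The local subgradient solver, step sizes, and the way the regularizer is split across nodes are simplifying assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def local_svm_update(X, y, z, u, rho, C, lam, iters=50, lr=0.01):
    # Approximately solve the node's subproblem
    #   min_w  lam/2 * ||w||^2 + C * sum_i hinge(y_i, x_i @ w) + rho/2 * ||w - z + u||^2
    # with plain subgradient descent (a stand-in for an exact local solver).
    w = z.copy()
    for _ in range(iters):
        margins = y * (X @ w)
        viol = margins < 1                       # points violating the margin
        grad = lam * w + rho * (w - z + u)
        if viol.any():
            grad -= C * (y[viol, None] * X[viol]).sum(axis=0)
        w -= lr * grad
    return w

def consensus_admm_svm(node_data, rho=1.0, C=1.0, rounds=50):
    # node_data: list of (X_j, y_j) per node, labels in {-1, +1}.
    # Nodes exchange only their local iterates, never the training data.
    d = node_data[0][0].shape[1]
    J = len(node_data)
    us = [np.zeros(d) for _ in range(J)]
    z = np.zeros(d)
    for _ in range(rounds):
        ws = [local_svm_update(X, y, z, u, rho, C, lam=1.0 / J)
              for (X, y), u in zip(node_data, us)]
        z = np.mean([w + u for w, u in zip(ws, us)], axis=0)   # consensus step
        us = [u + w - z for u, w in zip(us, ws)]               # dual updates
    return z
```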

420 citations


Proceedings ArticleDOI
05 Jun 2010
TL;DR: New ways to simulate 2-party communication protocols to obtain protocols with potentially smaller communication are described, leading to a direct sum theorem for randomized communication complexity.
Abstract: We describe new ways to simulate 2-party communication protocols to get protocols with potentially smaller communication. We show that every communication protocol that communicates C bits and reveals I bits of information about the inputs to the participating parties can be simulated by a new protocol involving at most ~O(√CI) bits of communication. If the protocol reveals I bits of information about the inputs to an observer that watches the communication in the protocol, we show how to carry out the simulation with ~O(I) bits of communication. These results lead to a direct sum theorem for randomized communication complexity. Ignoring polylogarithmic factors, we show that for worst case computation, computing n copies of a function requires √n times the communication required for computing one copy of the function. For average case complexity, given any distribution μ on inputs, computing n copies of the function on n inputs sampled independently according to μ requires √n times the communication for computing one copy. If μ is a product distribution, computing n copies on n independent inputs sampled according to μ requires n times the communication required for computing the function. We also study the complexity of computing the sum (or parity) of n evaluations of f, and obtain results analogous to those above.
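In symbols, the statements above read roughly as follows (a paraphrase of the abstract, with C the communication of the original protocol, I the information revealed, R and D^μ the randomized and μ-distributional communication complexities, and ~O, ~Ω hiding polylogarithmic factors):

```latex
% Protocol compression (paraphrasing the abstract)
\mathrm{CC}(\pi') \le \tilde{O}\big(\sqrt{C\cdot I}\big)
  \quad\text{(information revealed to the parties)},\qquad
\mathrm{CC}(\pi') \le \tilde{O}(I)
  \quad\text{(information revealed to an external observer)}.

% Resulting direct sum statements for n copies of f
R(f^n) \ge \tilde{\Omega}\big(\sqrt{n}\, R(f)\big),\qquad
D^{\mu^n}(f^n) \ge \tilde{\Omega}\big(\sqrt{n}\, D^{\mu}(f)\big),\qquad
D^{\mu^n}(f^n) \ge \tilde{\Omega}\big(n\, D^{\mu}(f)\big)\ \text{if $\mu$ is a product distribution}.
```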

182 citations


Proceedings ArticleDOI
23 May 2010
TL;DR: This paper proposes a simple online selection algorithm that reduces the mean completion delay of a frame of broadcast packets relative to random and greedy selection algorithms of similar computational complexity.
Abstract: In this paper, we consider the problem of minimizing the mean completion delay in wireless broadcast for instantly decodable network coding. We first formulate the problem as a stochastic shortest path (SSP) problem. Although finding the packet selection policy using SSP is intractable, we use this formulation to derive the theoretical properties of efficient selection algorithms. Based on these properties, we propose a simple online selection algorithm that reduces the mean completion delay of a frame of broadcast packets relative to random and greedy selection algorithms of similar computational complexity. Simulation results show that our proposed algorithm indeed outperforms these random and greedy selection algorithms.
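For context, a toy version of the greedy comparator mentioned above might look like the following Python sketch; the wants/has bookkeeping and the tie-breaking rule are illustrative assumptions, not the paper's SSP-derived selection policy.

```python
def instantly_decodable(coded_set, wants_r, has_r):
    # Receiver r can decode XOR(coded_set) iff it wants exactly one packet in the
    # set and already holds every other packet in the set.
    wanted = [p for p in coded_set if p in wants_r]
    return len(wanted) == 1 and all(p in has_r for p in coded_set if p not in wants_r)

def greedy_idnc_selection(packets, wants, has):
    # wants/has: dicts mapping receiver id -> set of packet ids.
    # Greedily grow the coded set while the number of served receivers increases.
    coded, served = [], set()
    by_demand = sorted(packets, key=lambda p: -sum(p in w for w in wants.values()))
    for p in by_demand:
        trial = coded + [p]
        now_served = {r for r in wants if instantly_decodable(trial, wants[r], has[r])}
        if len(now_served) > len(served):
            coded, served = trial, now_served
    return coded, served

# Example: three receivers, four packets in the frame.
wants = {1: {0}, 2: {1}, 3: {2}}
has = {1: {1, 2, 3}, 2: {0, 2, 3}, 3: {0, 1, 3}}
print(greedy_idnc_selection([0, 1, 2, 3], wants, has))   # XOR of {0,1,2} serves all three
```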

154 citations


Journal ArticleDOI
TL;DR: Computer simulations demonstrate that the proposed algorithm, Multiple Output Selection-LAS, which has the same complexity order as that of conventional LAS algorithms, is superior in bit error rate (BER) performance to conventional LAS algorithms.
Abstract: We present a low-complexity algorithm for detection in large MIMO systems based on the likelihood ascent search (LAS) algorithm. The key idea in our work is to generate multiple possible solutions or outputs from which we select the best one. We propose two possible approaches to achieve this goal and both are investigated. Computer simulations demonstrate that the proposed algorithm, Multiple Output Selection-LAS (MOS-LAS), which has the same complexity order as that of conventional LAS algorithms, is superior in bit error rate (BER) performance to conventional LAS algorithms. For example, with 20 antennas at both the transmitter and receiver, the proposed MOS-LAS algorithm needs about 4 dB less SNR to achieve a target BER of 10^-4 for 4-QAM.
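A schematic, deliberately unoptimized Python rendering of likelihood ascent search and the multiple-output-selection idea described above; the starting vectors, symbol alphabet, and stopping rule are assumptions for illustration.

```python
import numpy as np

def las_detect(H, y, start, alphabet=(-1, 1), max_iter=200):
    # Likelihood ascent search: repeatedly apply the single-symbol change that
    # most reduces the ML metric ||y - H s||^2, until no change improves it.
    s = np.array(start, dtype=complex)
    cost = np.linalg.norm(y - H @ s) ** 2
    for _ in range(max_iter):
        best = None
        for i in range(len(s)):
            for v in alphabet:
                if v == s[i]:
                    continue
                t = s.copy(); t[i] = v
                c = np.linalg.norm(y - H @ t) ** 2
                if c < cost and (best is None or c < best[0]):
                    best = (c, t)
        if best is None:
            break
        cost, s = best
    return s, cost

def mos_las(H, y, starts, alphabet=(-1, 1)):
    # Multiple Output Selection: run LAS from several starting vectors
    # (e.g. MMSE, ZF, or perturbed estimates) and keep the best local optimum.
    outputs = [las_detect(H, y, s0, alphabet) for s0 in starts]
    return min(outputs, key=lambda r: r[1])[0]
```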

154 citations


Journal ArticleDOI
TL;DR: A direct-sum theorem in communication complexity is derived that substantially improves the previous such result shown by Jain, Radhakrishnan, and Sen, via a rejection sampling procedure that relates the relative entropy between two distributions to the communication complexity of generating one distribution from the other.
Abstract: Let X and Y be finite nonempty sets and (X,Y) a pair of random variables taking values in X × Y. We consider communication protocols between two parties, Alice and Bob, for generating X and Y. Alice is provided an x ∈ X generated according to the distribution of X, and is required to send a message to Bob in order to enable him to generate y ∈ Y, whose distribution is the same as that of Y|X=x. Both parties have access to a shared random string generated in advance. Let T[X:Y] be the minimum (over all protocols) of the expected number of bits Alice needs to transmit to achieve this. We show that I[X:Y] ≤ T[X:Y] ≤ I[X:Y] + 2 log2(I[X:Y]+1) + O(1). We also consider the worst case communication required for this problem, where we seek to minimize the average number of bits Alice must transmit for the worst case x ∈ X. We show that the communication required in this case is related to the capacity C(E) of the channel E, derived from (X,Y), that maps x ∈ X to the distribution of Y|X=x. We also show that the required communication T(E) satisfies C(E) ≤ T(E) ≤ C(E) + 2 log2(C(E)+1) + O(1). Using the first result, we derive a direct-sum theorem in communication complexity that substantially improves the previous such result shown by Jain, Radhakrishnan, and Sen [In Proc. 30th International Colloquium on Automata, Languages and Programming (ICALP), ser. Lecture Notes in Computer Science, vol. 2719, 2003, pp. 300-315]. These results are obtained by employing a rejection sampling procedure that relates the relative entropy between two distributions to the communication complexity of generating one distribution from the other.
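To make the rejection sampling idea concrete, here is a generic Python sketch of how shared randomness lets Alice communicate only the index of an accepted sample; this is a textbook rejection-sampling illustration under assumed finite alphabets, not the paper's exact protocol (which achieves the tighter bound stated above).

```python
import numpy as np

def generate_via_rejection(p, q, rng, max_draws=10**6):
    # Shared randomness: an i.i.d. stream of samples from the marginal q of Y.
    # Alice, who knows the conditional p = P(Y | X = x), runs rejection sampling
    # over that stream and sends Bob only the index of the first accepted sample;
    # Bob outputs the sample at that index.  Encoding the index costs roughly
    # log2(index) bits, which ties the communication to a divergence between p and q.
    p, q = np.asarray(p, float), np.asarray(q, float)
    M = np.max(p / q)                       # envelope constant, M >= p[y]/q[y] for all y
    stream = rng.choice(len(q), size=max_draws, p=q)
    for index, y in enumerate(stream):
        if rng.random() < p[y] / (M * q[y]):
            return index, y                 # Alice sends `index`; Bob reads stream[index]
    raise RuntimeError("no sample accepted within max_draws")

rng = np.random.default_rng(0)
print(generate_via_rejection(p=[0.7, 0.2, 0.1], q=[1/3, 1/3, 1/3], rng=rng))
```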

119 citations


Journal ArticleDOI
TL;DR: In this paper, a chunk-based resource allocation approach is proposed for single-antenna base stations, guaranteeing an average bit error rate constraint per chunk, and is compared to subcarrier-based allocation.
Abstract: The aim of this tutorial article is to present low-complexity resource allocation approaches that rely on chunks of subcarriers for downlink distributed antenna systems. The chunk-based resource allocation approach is first introduced for single-antenna base stations with the consideration of guaranteeing an average bit error rate constraint per chunk and is compared to subcarrier-based allocation. How it can be combined with maximal ratio transmission and zero-forcing beamforming for base stations with many antennas is then described. Finally, we discuss how the techniques can be applied to DASs. It is shown that in typical wireless environments chunk based resource allocation coupled with MRT and ZFB in the DAS can reduce the complexity of resource allocation significantly at the cost of negligible performance loss compared to subcarrier-based allocation.

116 citations


Book ChapterDOI
12 Aug 2010
TL;DR: This paper proposes a 5-pass code-based protocol with a lower communication complexity, allowing an impersonator to succeed with only a probability of 1/2, and proposes to use double-circulant construction in order to dramatically reduce the size of the public key.
Abstract: At CRYPTO'93, Stern proposed a 3-pass code-based identification scheme with a cheating probability of 2/3. In this paper, we propose a 5-pass code-based protocol with a lower communication complexity, allowing an impersonator to succeed with only a probability of 1/2. Furthermore, we propose to use double-circulant construction in order to dramatically reduce the size of the public key. The proposed scheme is zero-knowledge and relies on an NP-complete coding theory problem (namely the q-ary Syndrome Decoding problem). The parameters we suggest for the instantiation of this scheme take into account a recent study of (a generalization of) Stern's information set decoding algorithm, applicable to linear codes over arbitrary fields Fq; the public data of our construction is then 4 Kbytes, whereas that of Stern's scheme is 15 Kbytes for the same level of security. This provides a very practical identification scheme which is especially attractive for light-weight cryptography.

112 citations


Posted Content
TL;DR: An optimal $\Omega(n)$ lower bound on the randomized communication complexity of the much-studied Gap-Hamming-Distance problem is proved, and essentially optimal multi-pass space lower bounds in the data stream model are obtained for a number of fundamental problems, including the estimation of frequency moments.
Abstract: We prove an optimal $\Omega(n)$ lower bound on the randomized communication complexity of the much-studied Gap-Hamming-Distance problem. As a consequence, we obtain essentially optimal multi-pass space lower bounds in the data stream model for a number of fundamental problems, including the estimation of frequency moments. The Gap-Hamming-Distance problem is a communication problem, wherein Alice and Bob receive $n$-bit strings $x$ and $y$, respectively. They are promised that the Hamming distance between $x$ and $y$ is either at least $n/2+\sqrt{n}$ or at most $n/2-\sqrt{n}$, and their goal is to decide which of these is the case. Since the formal presentation of the problem by Indyk and Woodruff (FOCS, 2003), it had been conjectured that the naive protocol, which uses $n$ bits of communication, is asymptotically optimal. The conjecture was shown to be true in several special cases, e.g., when the communication is deterministic, or when the number of rounds of communication is limited. The proof of our aforementioned result, which settles this conjecture fully, is based on a new geometric statement regarding correlations in Gaussian space, related to a result of C. Borell (1985). To prove this geometric statement, we show that random projections of not-too-small sets in Gaussian space are close to a mixture of translated normal variables.
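To make the problem concrete, here is the naive n-bit protocol that the lower bound shows to be asymptotically optimal (a toy Python sketch; checking the promise is left to the caller).

```python
def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def ghd_naive(x, y):
    # Alice sends her entire n-bit string to Bob (n bits of communication).
    # Bob answers +1 if the Hamming distance is at least n/2 + sqrt(n),
    # and -1 if it is at most n/2 - sqrt(n); the promise guarantees one of the two.
    n = len(x)
    return +1 if hamming(x, y) >= n / 2 else -1

print(ghd_naive([0, 1, 1, 0], [1, 0, 0, 1]))   # distance 4 >= n/2 + sqrt(n), so +1
```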

107 citations


Proceedings ArticleDOI
05 Jun 2010
TL;DR: A strong direct product theorem is established, showing that if one wants to compute k independent instances of a function using less than k times the resources needed for one instance, then the overall success probability will be exponentially small in k; this solves an open problem of [KSW07, LSS08].
Abstract: A strong direct product theorem states that if we want to compute k independent instances of a function, using less than k times the resources needed for one instance, then the overall success probability will be exponentially small in k. We establish such a theorem for the randomized communication complexity of the Disjointness problem, i.e., with communication const·kn the success probability of solving k instances of size n can only be exponentially small in k. This solves an open problem of [KSW07, LSS08]. We also show that this bound even holds for AM-communication protocols with limited ambiguity. The main result implies a new lower bound for Disjointness in a restricted 3-player NOF protocol, and optimal communication-space tradeoffs for Boolean matrix product. Our main result follows from a solution to the dual of a linear programming problem, whose feasibility comes from a so-called Intersection Sampling Lemma that generalizes a result by Razborov [Raz92].

106 citations


Journal ArticleDOI
TL;DR: A simple algorithmic model is introduced for massive, unordered, distributed (mud) computation, as implemented by Google's MapReduce and Apache's Hadoop, and it is shown that in principle, mud algorithms are equivalent in power to symmetric streaming algorithms.
Abstract: A common approach for dealing with large datasets is to stream over the input in one pass, and perform computations using sublinear resources. For truly massive datasets, however, even making a single pass over the data is prohibitive. Therefore, streaming computations must be distributed over many machines. In practice, obtaining significant speedups using distributed computation has numerous challenges including synchronization, load balancing, overcoming processor failures, and data distribution. Successful systems in practice such as Google's MapReduce and Apache's Hadoop address these problems by only allowing a certain class of highly distributable tasks defined by local computations that can be applied in any order to the input. The fundamental question that arises is: How does the class of computational tasks supported by these systems differ from the class for which streaming solutions exist? We introduce a simple algorithmic model for massive, unordered, distributed (mud) computation, as implemented by these systems. We show that in principle, mud algorithms are equivalent in power to symmetric streaming algorithms. More precisely, we show that any symmetric (order-invariant) function that can be computed by a streaming algorithm can also be computed by a mud algorithm, with comparable space and communication complexity. Our simulation uses Savitch's theorem and therefore has superpolynomial time complexity. We extend our simulation result to some natural classes of approximate and randomized streaming algorithms. We also give negative results, using communication complexity arguments to prove that extensions to private randomness, promise problems, and indeterminate functions are impossible. We also introduce an extension of the mud model to multiple keys and multiple rounds.
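The mud model described above can be illustrated with a short Python sketch: an algorithm is a triple of a per-record map, an order-insensitive merge, and a post-processing step (the shard layout and the mean example are illustrative choices, not from the paper).

```python
from functools import reduce

def run_mud(phi, oplus, eta, shards):
    # A mud algorithm is a triple (phi, oplus, eta): phi maps each record to a
    # message, oplus merges two messages (and may be applied in any order, over
    # any tree of machines), and eta produces the final answer.
    partials = [reduce(oplus, map(phi, shard)) for shard in shards]   # per machine
    return eta(reduce(oplus, partials))                               # final merge

# Example: the mean of the input, a symmetric (order-invariant) function.
phi   = lambda x: (x, 1)
oplus = lambda a, b: (a[0] + b[0], a[1] + b[1])
eta   = lambda m: m[0] / m[1]
print(run_mud(phi, oplus, eta, [[1, 2, 3], [4, 5], [6]]))   # -> 3.5
```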

103 citations


Journal ArticleDOI
TL;DR: A new algorithm is proposed which attains the ML performance with significantly reduced complexity based on the minimum mean square error (MMSE) criterion, together with an efficient way of generating the log likelihood ratio (LLR) values which can be used for coded systems.
Abstract: For multiple-input multiple-output (MIMO) systems, the optimum maximum likelihood (ML) detection requires tremendous complexity as the number of antennas or modulation level increases. This paper proposes a new algorithm which attains the ML performance with significantly reduced complexity. Based on the minimum mean square error (MMSE) criterion, the proposed scheme reduces the search space by excluding unreliable candidate symbols in data streams. Utilizing the probability metric which evaluates the reliability with the normalized likelihood functions of each symbol candidate, near optimal ML detection is made possible. Also we derive the performance analysis which supports the validity of our proposed method. A threshold parameter is introduced to balance a tradeoff between complexity and performance. Besides, we propose an efficient way of generating the log likelihood ratio (LLR) values which can be used for coded systems. Simulation results show that the proposed scheme achieves almost the same performance as the ML detection at a bit error rate (BER) of 10^-4 while requiring only 28% and 15% of the real multiplications of the conventional QR decomposition with M-algorithm (QRD-M) in 4-QAM and 16-QAM, respectively. Also we confirm that the proposed scheme achieves the near-optimal performance for all ranges of code rates with much reduced complexity. For instance, our scheme exhibits 74% and 46% multiplication reduction in 4-QAM and 16-QAM, respectively, compared to the sphere decoding based soft-output scheme with rate-1/2 convolutional code.

Proceedings ArticleDOI
09 Jun 2010
TL;DR: New lower bounds for randomized communication complexity and query complexity, called the partition bounds, are described; they are expressed as the optimum value of linear programs.
Abstract: We describe new lower bounds for randomized communication complexity and query complexity which we call the partition bounds. They are expressed as the optimum value of linear programs. For communication complexity we show that the partition bound is stronger than both the rectangle/corruption bound and the γ2/generalized discrepancy bounds. In the model of query complexity we show that the partition bound is stronger than the approximate polynomial degree and classical adversary bounds. We also exhibit an example where the partition bound is quadratically larger than the approximate polynomial degree and adversary bounds.
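For reference, the ε-error partition bound for f : X × Y → Z is usually written as the following linear program (notation as commonly stated in the literature; R ranges over combinatorial rectangles), with the lower bound being, roughly, R_ε(f) ≥ log₂ prt_ε(f):

```latex
\mathrm{prt}_{\varepsilon}(f) \;=\; \min \sum_{z,\,R} w_{z,R}
\quad\text{subject to}\quad
\sum_{R \,\ni\, (x,y)} w_{f(x,y),\,R} \;\ge\; 1-\varepsilon \quad \forall (x,y),
\qquad
\sum_{z}\sum_{R \,\ni\, (x,y)} w_{z,R} \;=\; 1 \quad \forall (x,y),
\qquad
w_{z,R} \;\ge\; 0 .
```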

Posted Content
TL;DR: In this article, the authors show that the complexity of performing nearest neighbor search (NNS) on a metric space is related to the expansion of the graph obtained by connecting every pair of points within a certain distance.
Abstract: In this paper we show how the complexity of performing nearest neighbor search (NNS) on a metric space is related to the expansion of the metric space. Given a metric space we look at the graph obtained by connecting every pair of points within a certain distance $r$. We then look at various notions of expansion in this graph, relating them to the cell probe complexity of NNS for randomized and deterministic, exact and approximate algorithms. For example, if the graph has node expansion $\Phi$ then we show that any deterministic $t$-probe data structure for $n$ points must use space $S$ where $(St/n)^t > \Phi$. We show similar results for randomized algorithms as well. These relationships can be used to derive most of the known lower bounds in well known metric spaces such as $l_1$, $l_2$, $l_\infty$ by simply computing their expansion. In the process, we strengthen and generalize our previous results (FOCS 2008). Additionally, we unify the approach in that work and the communication complexity based approach. Our work reduces the problem of proving cell probe lower bounds for near neighbor search to computing the appropriate expansion parameter. In our results, as in all previous results, the dependence on $t$ is weak; that is, the bound drops exponentially in $t$. We show a much stronger (tight) time-space tradeoff for the class of dynamic low-contention data structures. These are data structures that support updates in the data set and that do not look up any single cell too often.

Journal ArticleDOI
TL;DR: This work proposes a nonmyopic rule, which is based on not only the target state prediction but also its future tendency, and shows that the variational filtering algorithm is capable of precise tracking even in the highly nonlinear case.
Abstract: The prime motivation of our work is to balance the inherent trade-off between the resource consumption and the accuracy of the target tracking in wireless sensor networks. Toward this objective, the study goes through three phases. First, a cluster-based scheme is exploited. At every sampling instant, only one cluster of sensors, located in the proximity of the target, is activated, whereas the other sensors are inactive. To activate the most appropriate cluster, we propose a nonmyopic rule, which is based on not only the target state prediction but also its future tendency. Second, a variational filtering algorithm is employed, which is capable of precise tracking even in the highly nonlinear case. Furthermore, since the measurement incorporation and the approximation of the filtering distribution are jointly performed by variational calculus, an effective and lossless compression is achieved. The intercluster information exchange is thus reduced to one single Gaussian statistic, dramatically cutting down the resource consumption. Third, a binary proximity observation model is employed by the activated slave sensors to reduce the energy consumption and to minimize the intracluster communication. Finally, the effectiveness of the proposed approach is evaluated and compared with the state-of-the-art algorithms in terms of tracking accuracy, internode communication, and computation complexity.

Journal ArticleDOI
TL;DR: A theoretical framework for delay functions is developed and it is shown that with a function of angle and distance the authors can reduce the number of protests by a factor of 2 compared to a simple angle-based delay function.
Abstract: Beaconless georouting algorithms are fully reactive and work without prior knowledge of their neighbors. However, existing approaches can either not guarantee delivery or they require the exchange of complete neighborhood information. We describe two general methods for completely reactive face routing with guaranteed delivery. The beaconless forwarder planarization (BFP) scheme determines correct edges of a local planar subgraph without hearing from all neighbors. Face routing then continues properly. Angular relaying determines directly the next hop of a face traversal. Both schemes are based on the select-and-protest principle. Neighbors respond according to a delay function, but only if they do not violate a planar subgraph condition. Protest messages are used to remove falsely selected neighbors that are not in the planar subgraph. We show that a correct beaconless planar subgraph construction is not possible without protests. We also show the impact of the chosen planar subgraph on the message complexity. With the new circlunar neighborhood graph (CNG) we can bound the worst case message complexity of BFP, which is not possible when using the Gabriel graph (GG) for planarization. Simulation results show similar message complexities in the average case when using CNG and GG. Angular relaying uses a delay function that is based on the angular distance to the previous hop. We develop a theoretical framework for delay functions and show both theoretically and in simulations that with a function of angle and distance we can reduce the number of protests by a factor of 2 compared to a simple angle-based delay function.

Book ChapterDOI
25 Jan 2010
TL;DR: In this article, the authors consider secure function evaluation (SFE) in the client-server setting where the server issues a secure token to the client; the token is not trusted by the client and is not a trusted third party.
Abstract: We consider Secure Function Evaluation (SFE) in the client-server setting where the server issues a secure token to the client. The token is not trusted by the client and is not a trusted third party. We show how to take advantage of the token to drastically reduce the communication complexity of SFE and computation load of the server. Our main contribution is the detailed consideration of design decisions, optimizations, and trade-offs, associated with the setting and its strict hardware requirements for practical deployment. In particular, we model the token as a computationally weak device with small constant-size memory and limit communication between client and server. We consider semi-honest, covert, and malicious adversaries. We show the feasibility of our protocols based on a FPGA implementation.

01 Jan 2010
TL;DR: This note proposes a distributed architecture (based on cell-like P systems, with their skin membranes communicating through channels as in tissue-like P systems, according to specified rules of the antiport type), where parts of a problem can be introduced as inputs in various components and then processed in parallel.
Abstract: Although P systems are distributed parallel computing devices, no explicit way of handling the input in a distributed way in this framework was considered so far. This note proposes a distributed architecture (based on cell-like P systems, with their skin membranes communicating through channels as in tissue-like P systems, according to specified rules of the antiport type), where parts of a problem can be introduced as inputs in various components and then processed in parallel. The respective devices are called dP systems, with the case of accepting strings called dP automata. The communication complexity can be evaluated in various ways: statically (counting the communication rules in a dP system which solves a given problem), or dynamically (counting the number of communication steps, of communication rules used in a computation, or the number of objects communicated). For each measure, two notions of "parallelizability" can be introduced. Besides (informal) definitions, some illustrations of these ideas are provided for dP automata: each regular language is "weakly parallelizable" (i.e., it can be recognized in this framework, using a constant number of communication steps), and there are languages of various types with respect to Chomsky hierarchy which are "efficiently parallelizable" (they are parallelizable and, moreover, are accepted in a faster way by a dP automaton than by a single P automaton). Several suggestions for further research are made.

Journal ArticleDOI
TL;DR: In this article, a distributed architecture is considered, based on cell-like P systems with their skin membranes communicating through channels according to specified rules of the antiport type, where parts of a problem can be introduced as inputs in various components and then processed in parallel.
Abstract: Although P systems are distributed parallel computing devices, no explicit way of handling the input in a distributed way in this framework was considered so far. This note proposes a distributed architecture (based on cell-like P systems, with their skin membranes communicating through channels as in tissue-like P systems, according to specified rules of the antiport type), where parts of a problem can be introduced as inputs in various components and then processed in parallel. The respective devices are called dP systems, with the case of accepting strings called dP automata. The communication complexity can be evaluated in various ways: statically (counting the communication rules in a dP system which solves a given problem), or dynamically (counting the number of communication steps, of communication rules used in a computation, or the number of objects communicated). For each measure, two notions of "parallelizability" can be introduced. Besides (informal) definitions, some illustrations of these ideas are provided for dP automata: each regular language is "weakly parallelizable" (i.e., it can be recognized in this framework, using a constant number of communication steps), and there are languages of various types with respect to Chomsky hierarchy which are "efficiently parallelizable" (they are parallelizable and, moreover, are accepted in a faster way by a dP automaton than by a single P automaton). Several suggestions for further research are made.

Journal ArticleDOI
TL;DR: The first nontrivial lower bounds on time-space trade-offs for the selection problem are established, and deterministic lower bounds for I/O-efficient algorithms are obtained as well.
Abstract: We establish the first nontrivial lower bounds on time-space trade-offs for the selection problem. We prove that any comparison-based randomized algorithm for finding the median requires Ω(n log log_S n) expected time in the RAM model (or more generally in the comparison branching program model), if we have S bits of extra space besides the read-only input array. This bound is tight for all S > log n, and remains true even if the array is given in a random order. Our result thus answers a 16-year-old question of Munro and Raman [1996], and also complements recent lower bounds that are restricted to sequential access, as in the multipass streaming model [Chakrabarti et al. 2008b]. We also prove that any comparison-based, deterministic, multipass streaming algorithm for finding the median requires Ω(n log*(n/s) + n log_s n) worst-case time (in scanning plus comparisons), if we have s cells of space. This bound is also tight for all s > log² n. We get deterministic lower bounds for I/O-efficient algorithms as well. The proofs in this article are self-contained and do not rely on communication complexity techniques.

Journal ArticleDOI
Chao Zhang, Zhaocheng Wang, Zhixing Yang, Jun Wang, Jian Song
TL;DR: An equalizer structure based on frequency-domain decision feedback is proposed for multi-user SC-FDMA systems, and simulation results show that good performance is achieved in all the considered multi-user scenarios.
Abstract: Single carrier frequency division multiple access (SC-FDMA) is one well-known scheme, which has recently become a preferred choice for uplink channels. Due to the usage of a single carrier, the performance of SC-FDMA systems degrades in deep frequency-selective fading channels. In this paper, we propose an equalizer structure based on frequency-domain decision feedback which can be used for multi-user SC-FDMA systems. Specific parameters of the equalizer are analyzed as well. This algorithm is applicable to various carrier allocations in multi-user systems such as localized allocation, distributed allocation, and frequency-hopping (FH) allocation. To reduce the complexity, it is not necessary to derive the inversion of a matrix, which is required in the traditional decision feedback equalizer for single carrier frequency domain equalization (SC-FDE-DFE). Simulation results show that good performance is achieved in all the considered multi-user scenarios. This structure can be used in broadcasting uplink channels with the SC-FDMA scheme.

Proceedings ArticleDOI
21 Jun 2010
TL;DR: This paper employs threshold cryptography and distributed key generation to define two protocols, both of which are more efficient than existing solutions and are practical for deployment under significant levels of churn and adversarial behaviour.
Abstract: There are several analytical results on distributed hash tables (DHTs) that can tolerate Byzantine faults. Unfortunately, in such systems, operations such as data retrieval and message sending incur significant communication costs. For example, a simple scheme used in many Byzantine fault-tolerant DHT constructions of $n$ nodes requires $O(\log^3 n)$ messages, which is likely impractical for real-world applications. The previous best known message complexity is $O(\log^2 n)$ in expectation; however, the corresponding protocol suffers from prohibitive costs owing to hidden constants in the asymptotic notation and setup costs. In this paper, we focus on reducing the communication costs against a computationally bounded adversary. We employ threshold cryptography and distributed key generation to define two protocols, both of which are more efficient than existing solutions. In comparison, our first protocol is deterministic with $O(\log^2 n)$ message complexity and our second protocol is randomized with expected $O(\log n)$ message complexity. Further, both the hidden constants and setup costs for our protocols are small and no trusted third party is required. Finally, we present results from micro benchmarks conducted over PlanetLab showing that our protocols are practical for deployment under significant levels of churn and adversarial behaviour.

Proceedings ArticleDOI
09 Jun 2010
TL;DR: Restricting the number of time steps to be polynomial in the input length, this work gives what it believes to be the first lower bounds for this class, separating P^NP from Σ₂ ∩ Π₂ in the communication complexity setting.
Abstract: We consider two natural extensions of the communication complexity model that are inspired by distributed computing. In both models, two parties are equipped with synchronized discrete clocks, and we assume that a bit can be sent from one party to another in one step of time. Both models allow implicit communication, by allowing the parties to choose whether to send a bit during each step. We examine trade-offs between time (total number of possible time steps elapsed) and communication (total number of bits actually sent). In the synchronized bit model, we measure the total number of bits sent between the two parties (e.g., email). We show that, in this model, communication costs can differ from the usual communication complexity by a factor roughly logarithmic in the number of time steps, and no more than such a factor. In the synchronized connection model, both parties choose whether or not to open their end of the communication channel at each time step. An exchange of bits takes place only when both ends of the channel are open (e.g., instant messaging), in which case we say that a connection has occurred. If a party does not open its end, it does not learn whether the other party opened its channel. When we restrict the number of time steps to be polynomial in the input length, and the number of connections to be polylogarithmic in the input length, the class of problems solved with this model turns out to be roughly equivalent to the communication complexity analogue of P^NP. Using our new model, we give what we believe to be the first lower bounds for this class, separating P^NP from Σ₂ ∩ Π₂ in the communication complexity setting. Although these models are both quite natural, they have unexpected power, and lead to a refinement of problem classifications in communication complexity.

Proceedings ArticleDOI
05 Jun 2010
TL;DR: It is proved that this one-pass algorithm for Dyck(2) is optimal, up to a log(n) factor, even when two-sided error is allowed; a similar bound is conjectured to hold for any constant number of passes over the input.
Abstract: Motivated by a concrete problem and with the goal of understanding the relationship between the complexity of streaming algorithms and the computational complexity of formal languages, we investigate the problem Dyck(s) of checking matching parentheses, with s different types of parentheses. We present a one-pass randomized streaming algorithm for Dyck(2) with space O(√n log(n)) bits, time per letter polylog(n), and one-sided error. We prove that this one-pass algorithm is optimal, up to a log(n) factor, even when two-sided error is allowed, and conjecture that a similar bound holds for any constant number of passes over the input. Surprisingly, the space requirement shrinks drastically if we have access to the input stream "in reverse". We present a two-pass randomized streaming algorithm for Dyck(2) with space O((log n)^2), time polylog(n) and one-sided error, where the second pass is in the reverse direction. Both algorithms can be extended to Dyck(s) since this problem is reducible to Dyck(2) for a suitable notion of reduction in the streaming model. Except for an extra O(√(log s)) multiplicative overhead in the space required in the one-pass algorithm, the resource requirements are of the same order. For the lower bound, we exhibit hard instances Ascension(m) of Dyck(2) with length Θ(mn). We embed these in what we call a "one-pass" communication problem with 2m players, where m = ~O(n). To establish the hardness of Ascension(m), we prove a direct sum result by following the "information cost" approach, but with a few twists. Indeed, we play a subtle game between public and private coins for Mountain, which corresponds to a primitive instance Ascension(1). This mixture between public and private coins for m results from a balancing act between the direct sum result and a combinatorial lower bound for m.
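As a reference point for what the streaming algorithm must accomplish with sublinear memory, here is the plain stack-based checker for Dyck(2) in Python; it uses linear space in the worst case, whereas the one-pass algorithm above gets by with O(√n log n) bits.

```python
def dyck2_ok(s):
    # Reference checker for Dyck(2) over '(', ')', '[', ']' using a stack.
    match = {')': '(', ']': '['}
    stack = []
    for c in s:
        if c in '([':
            stack.append(c)
        elif not stack or stack.pop() != match[c]:
            return False
    return not stack

print(dyck2_ok("([()[]])"), dyck2_ok("([)]"))   # True False
```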

Journal ArticleDOI
TL;DR: A reduced-complexity partial transmit sequences approach based on the quantum-inspired evolutionary algorithm (QEA) is proposed for reducing the peak-to-average power ratio (PAPR) in an orthogonal frequency division multiplexing (OFDM) system.
Abstract: This paper proposes a reduced-complexity partial transmit sequences (PTS) approach based on the quantum-inspired evolutionary algorithm (QEA) for the reduction of peak-to-average power ratio (PAPR) in an orthogonal frequency division multiplexing (OFDM) system. The conventional PTS technique improves the PAPR statistics for OFDM signals, but the considerable computational complexity for an exhaustive search over all combinations of allowed phase factors is a potential problem for practical implementation. To reduce the computational complexity while still obtaining the desirable PAPR reduction, we introduce the QEA, an effective algorithm that solves various combinatorial optimization problems, to search the optimal phase factors. The simulation results show that the proposed QEA achieves significant PAPR reduction with low computational complexity.
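For orientation, the conventional exhaustive-search PTS that the QEA is meant to replace can be sketched as follows in Python/NumPy; the sub-block partition, phase alphabet, and absence of oversampling are simplifying assumptions.

```python
import numpy as np
from itertools import product

def papr_db(x):
    # Peak-to-average power ratio of a time-domain signal, in dB.
    p = np.abs(x) ** 2
    return 10 * np.log10(p.max() / p.mean())

def pts_exhaustive(subblocks, phases=(1, -1, 1j, -1j)):
    # subblocks: list of V frequency-domain sub-blocks of length N with disjoint
    # supports (their sum is the full OFDM symbol).  Conventional PTS tries every
    # combination of phase factors and transmits the candidate with lowest PAPR;
    # the QEA in the paper searches this space evolutionarily instead.
    time_blocks = [np.fft.ifft(b) for b in subblocks]
    best, best_papr = None, np.inf
    for combo in product(phases, repeat=len(time_blocks)):
        x = sum(c * t for c, t in zip(combo, time_blocks))
        if papr_db(x) < best_papr:
            best, best_papr = x, papr_db(x)
    return best, best_papr
```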

Proceedings ArticleDOI
17 Jan 2010
TL;DR: A new randomized consensus algorithm is presented that achieves optimal communication efficiency, using only O(n) bits of communication, and terminates in (almost optimal) time O(log n), with high probability.
Abstract: We consider the problem of fault-tolerant agreement in a crash-prone synchronous system. We present a new randomized consensus algorithm that achieves optimal communication efficiency, using only O(n) bits of communication, and terminates in (almost optimal) time O(log n), with high probability. The same protocol, with minor modifications, can also be used in partially synchronous networks, guaranteeing correct behavior even in asynchronous executions, while maintaining efficient performance in synchronous executions. Finally, the same techniques also yield a randomized, fault-tolerant gossip protocol that terminates in O(log* n) rounds using O(n) messages (with bit complexity that depends on the data being gossiped).

Journal ArticleDOI
TL;DR: In this article, an O(n/p)-time parallel algorithm is presented whose communication complexity equals that of parallel sorting and is not sensitive to Σ.
Abstract: Assembling genomic sequences from a set of overlapping reads is one of the most fundamental problems in computational biology. Algorithms addressing the assembly problem fall into two broad categories - based on the data structures which they employ. The first class uses an overlap/string graph and the second type uses a de Bruijn graph. However with the recent advances in short read sequencing technology, de Bruijn graph based algorithms seem to play a vital role in practice. Efficient algorithms for building these massive de Bruijn graphs are very essential in large sequencing projects based on short reads. In an earlier work, an O(n/p) time parallel algorithm has been given for this problem. Here n is the size of the input and p is the number of processors. This algorithm enumerates all possible bi-directed edges which can overlap with a node and ends up generating Θ(n Σ) messages (Σ being the size of the alphabet). In this paper we present a Θ(n/p) time parallel algorithm with a communication complexity that is equal to that of parallel sorting and is not sensitive to Σ. The generality of our algorithm makes it very easy to extend it even to the out-of-core model and in this case it has an optimal I/O complexity of O((n/B) log_{M/B}(n/B)) (M being the main memory size and B being the size of the disk block). We demonstrate the scalability of our parallel algorithm on a SGI/Altix computer. A comparison of our algorithm with the previous approaches reveals that our algorithm is faster - both asymptotically and practically. We demonstrate the scalability of our sequential out-of-core algorithm by comparing it with the algorithm used by VELVET to build the bi-directed de Bruijn graph. Our experiments reveal that our algorithm can build the graph with a constant amount of memory, which clearly outperforms VELVET. We also provide efficient algorithms for the bi-directed chain compaction problem. The bi-directed de Bruijn graph is a fundamental data structure for any sequence assembly program based on Eulerian approach. Our algorithms for constructing Bi-directed de Bruijn graphs are efficient in parallel and out of core settings. These algorithms can be used in building large scale bi-directed de Bruijn graphs. Furthermore, our algorithms do not employ any all-to-all communications in a parallel setting and perform better than the prior algorithms. Finally our out-of-core algorithm is extremely memory efficient and can replace the existing graph construction algorithm in VELVET.
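As a point of reference for the data structure being built, here is a serial Python sketch of enumerating bi-directed de Bruijn edges from reads via canonical k-mers; the parallel algorithm above essentially sorts and deduplicates such an edge list, and alphabet handling, error filtering, and the chain compaction step are omitted here.

```python
def revcomp(s):
    comp = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
    return ''.join(comp[c] for c in reversed(s))

def canonical(kmer):
    # One representative per bi-directed node: a k-mer and its reverse complement
    # are identified, and the lexicographically smaller string is kept.
    return min(kmer, revcomp(kmer))

def debruijn_edges(reads, k):
    # Consecutive k-mers of a read overlap in k-1 characters and induce an edge.
    # Sorting/deduplicating this edge list is the step whose cost matches sorting.
    edges = set()
    for r in reads:
        for i in range(len(r) - k):
            u, v = r[i:i + k], r[i + 1:i + 1 + k]
            edges.add((canonical(u), canonical(v)))
    return sorted(edges)
```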

Journal Article
TL;DR: The first nontrivial communication complexity lower bound for the problem of estimating the edit distance (aka Levenshtein distance) between two strings is proved, and it holds not only for strings over a binary alphabet but also for strings that are permutations (aka the Ulam metric).
Abstract: We prove the first nontrivial communication complexity lower bound for the problem of estimating the edit distance (aka Levenshtein distance) between two strings. To the best of our knowledge, this is the first computational setting in which the complexity of estimating the edit distance is provably larger than that of Hamming distance. Our lower bound exhibits a trade-off between approximation and communication, asserting, for example, that protocols with $O(1)$ bits of communication can obtain only approximation $\alpha\geq\Omega(\log d/\log\log d)$, where $d$ is the length of the input strings. This case of $O(1)$ communication is of particular importance since it captures constant-size sketches as well as embeddings into spaces like $l_1$ and squared-$l_2$, two prevailing algorithmic approaches for dealing with edit distance. Indeed, the known nontrivial communication upper bounds are all derived from embeddings into $l_1$. By excluding low-communication protocols for edit distance, we rule out a strictly richer class of algorithms than previous results. Furthermore, our lower bound holds not only for strings over a binary alphabet but also for strings that are permutations (aka the Ulam metric). For this case, our bound nearly matches an upper bound known via embedding the Ulam metric into $l_1$. Our proof uses a new technique that relies on Fourier analysis in a rather elementary way.

Journal ArticleDOI
TL;DR: This paper presents a suite of two algorithms, simple_tree and hypercube, that are both fast and require a small number of messages, which makes them highly scalable.
Abstract: Large-scale distributed systems such as supercomputers and peer-to-peer systems typically have a fully connected logical topology over a large number of processors. Existing snapshot algorithms in such systems have high response time and/or require a large number of messages, typically O(n2), where n is the number of processes. In this paper, we present a suite of two algorithms: simple_tree, and hypercube, that are both fast and require a small number of messages. This makes the algorithms highly scalable. Simple_tree requires O(n) messages and has O(log n) response time. Hypercube requires O(n log n) messages and has O(log n) response time, in addition to having the property that the roles of all the processes are symmetrical. Process symmetry implies greater potential for balanced workload and congestion-freedom. All the algorithms assume non-FIFO channels.

Proceedings ArticleDOI
01 Sep 2010
TL;DR: This work studies the problem of synchronization of two remotely located data sources, which are mis-synchronized due to deletions and insertions, and proposes an interactive algorithm which is computationally simple and has near-optimal communication complexity.
Abstract: We study the problem of synchronization of two remotely located data sources, which are mis-synchronized due to deletions and insertions. This is an important problem since a small number of synchronization errors can induce a large Hamming distance between the two sources. The goal is to effect synchronization with the rate-efficient use of lossless bidirectional links between the two sources. In this work, we focus on the following model. A binary sequence X of length n is edited to generate the sequence at the remote end, say Y, where the editing involves random deletions and insertions, possibly in small bursts. The problem is to synchronize Y with X with minimal exchange of information (in terms of both the average communication rate and the average number of interactive rounds of communication). We focus here on the case where the number of edits is much smaller than n, and propose an interactive algorithm which is computationally simple and has near-optimal communication complexity. Our algorithm works by efficiently splitting the source sequence into pieces containing either just a single deletion/insertion or a single burst deletion/insertion. Each of these pieces is then synchronized using an optimal one-way synchronization code, based on the single-deletion correcting channel codes of Varshamov and Tenengolts (VT codes).
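To illustrate the VT-code building block mentioned above, here is a small Python sketch of the syndrome computation and a brute-force single-deletion decoder; the closed-form VT decoding rule is replaced by exhaustive reinsertion purely for clarity.

```python
def vt_syndrome(bits):
    # Varshamov-Tenengolts syndrome: sum_i i * x_i (1-indexed), modulo n + 1.
    n = len(bits)
    return sum(i * b for i, b in enumerate(bits, start=1)) % (n + 1)

def vt_correct_single_deletion(received, n, a):
    # Recover the length-n codeword from which `received` (length n-1) was obtained
    # by one deletion, given that the codeword lies in VT_a(n).  Any reinsertion
    # that lands back in VT_a(n) must equal the original codeword, so the first
    # match can be returned.
    for pos in range(n):
        for bit in (0, 1):
            candidate = received[:pos] + [bit] + received[pos:]
            if vt_syndrome(candidate) == a:
                return candidate
    return None

codeword = [1, 0, 1, 1, 0]                      # syndrome: 1 + 3 + 4 = 8, mod 6 = 2
a = vt_syndrome(codeword)
received = codeword[:2] + codeword[3:]          # delete the third bit
print(vt_correct_single_deletion(received, len(codeword), a) == codeword)   # True
```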

Journal ArticleDOI
TL;DR: This work proves the first non-trivial communication complexity lower bound for the problem of estimating the edit distance (aka Levenshtein distance) between two strings, and provides the first setting in which the complexity of computing the edit distance is provably larger than that of Hamming distance.
Abstract: We prove the first nontrivial communication complexity lower bound for the problem of estimating the edit distance (aka Levenshtein distance) between two strings. To the best of our knowledge, this is the first computational setting in which the complexity of estimating the edit distance is provably larger than that of Hamming distance. Our lower bound exhibits a trade-off between approximation and communication, asserting, for example, that protocols with $O(1)$ bits of communication can obtain only approximation $\alpha\geq\Omega(\log d/\log\log d)$, where $d$ is the length of the input strings. This case of $O(1)$ communication is of particular importance since it captures constant-size sketches as well as embeddings into spaces like $l_1$ and squared-$l_2$, two prevailing algorithmic approaches for dealing with edit distance. Indeed, the known nontrivial communication upper bounds are all derived from embeddings into $l_1$. By excluding low-communication protocols for edit distance, we rule out a strictly richer class of algorithms than previous results. Furthermore, our lower bound holds not only for strings over a binary alphabet but also for strings that are permutations (aka the Ulam metric). For this case, our bound nearly matches an upper bound known via embedding the Ulam metric into $l_1$. Our proof uses a new technique that relies on Fourier analysis in a rather elementary way.