Showing papers in "IEEE Transactions on Information Theory in 2016"

PDF

Open Access

Journal Article•DOI•

On the Age of Information in Status Update Systems With Packet Management

[...]

Maice Costa¹, Marian Codreanu², Anthony Ephremides³•Institutions (3)

Trinity College, Dublin¹, University of Oulu², University of Maryland, College Park³

25 Feb 2016-IEEE Transactions on Information Theory

TL;DR: A communication system in which status updates arrive at a source node, and should be transmitted through a network to the intended destination node, using the queuing theory, and it is assumed that the time it takes to successfully transmit a packet to the destination is an exponentially distributed service time.

...read moreread less

Abstract: We consider a communication system in which status updates arrive at a source node, and should be transmitted through a network to the intended destination node. The status updates are samples of a random process under observation, transmitted as packets, which also contain the time stamp to identify when the sample was generated. The age of the information available to the destination node is the time elapsed, since the last received update was generated. In this paper, we model the source-destination link using the queuing theory, and we assume that the time it takes to successfully transmit a packet to the destination is an exponentially distributed service time. We analyze the age of information in the case that the source node has the capability to manage the arriving samples, possibly discarding packets in order to avoid wasting network resources with the transmission of stale information. In addition to characterizing the average age, we propose a new metric, called peak age, which provides information about the maximum value of the age, achieved immediately before receiving an update.

...read moreread less

640 citations

Journal Article•DOI•

Fundamental Limits of Caching in Wireless D2D Networks

[...]

Mingyue Ji¹, Giuseppe Caire¹, Andreas F. Molisch¹•Institutions (1)

University of Southern California¹

01 Feb 2016-IEEE Transactions on Information Theory

TL;DR: A caching strategy based on deterministic assignment of subpackets of the library files, and a coded delivery strategy where the users send linearly coded messages to each other in order to collectively satisfy their demands are proposed.

...read moreread less

Abstract: We consider a wireless device-to-device (D2D) network where communication is restricted to be single-hop. Users make arbitrary requests from a finite library of files and have pre-cached information on their devices, subject to a per-node storage capacity constraint. A similar problem has already been considered in an infrastructure setting, where all users receive a common multicast (coded) message from a single omniscient server (e.g., a base station having all the files in the library) through a shared bottleneck link. In this paper, we consider a D2D infrastructureless version of the problem. We propose a caching strategy based on deterministic assignment of subpackets of the library files, and a coded delivery strategy where the users send linearly coded messages to each other in order to collectively satisfy their demands. We also consider a random caching strategy, which is more suitable to a fully decentralized implementation. Under certain conditions, both approaches can achieve the information theoretic outer bound within a constant multiplicative factor. In our previous work, we showed that a caching D2D wireless network with one-hop communication, random caching, and uncoded delivery (direct file transmissions) achieves the same throughput scaling law of the infrastructure-based coded multicasting scheme, in the regime of large number of users and files in the library. This shows that the spatial reuse gain of the D2D network is order-equivalent to the coded multicasting gain of single base station transmission. It is, therefore, natural to ask whether these two gains are cumulative, i.e., if a D2D network with both local communication (spatial reuse) and coded multicasting can provide an improved scaling law. Somewhat counterintuitively, we show that these gains do not cumulate (in terms of throughput scaling law). This fact can be explained by noticing that the coded delivery scheme creates messages that are useful to multiple nodes, such that it benefits from broadcasting to as many nodes as possible, while spatial reuse capitalizes on the fact that the communication is local, such that the same time slot can be reused in space across the network. Unfortunately, these two issues are in contrast with each other.

...read moreread less

598 citations

Journal Article•DOI•

From Denoising to Compressed Sensing

[...]

Christopher A. Metzler¹, Arian Maleki², Richard G. Baraniuk¹•Institutions (2)

Rice University¹, Columbia University²

01 Sep 2016-IEEE Transactions on Information Theory

TL;DR: In this paper, a denoising-based approximate message passing (D-AMP) framework is proposed to integrate a wide class of denoisers within its iterations. But, the performance of D-AMP is limited by the use of an appropriate Onsager correction term in its iterations, which coerces the signal perturbation at each iteration to be very close to the white Gaussian noise that denoisers are typically designed to remove.

...read moreread less

Abstract: A denoising algorithm seeks to remove noise, errors, or perturbations from a signal. Extensive research has been devoted to this arena over the last several decades, and as a result, todays denoisers can effectively remove large amounts of additive white Gaussian noise. A compressed sensing (CS) reconstruction algorithm seeks to recover a structured signal acquired using a small number of randomized measurements. Typical CS reconstruction algorithms can be cast as iteratively estimating a signal from a perturbed observation. This paper answers a natural question: How can one effectively employ a generic denoiser in a CS reconstruction algorithm? In response, we develop an extension of the approximate message passing (AMP) framework, called denoising-based AMP (D-AMP), that can integrate a wide class of denoisers within its iterations. We demonstrate that, when used with a high-performance denoiser for natural images, D-AMP offers the state-of-the-art CS recovery performance while operating tens of times faster than competing methods. We explain the exceptional performance of D-AMP by analyzing some of its theoretical features. A key element in D-AMP is the use of an appropriate Onsager correction term in its iterations, which coerces the signal perturbation at each iteration to be very close to the white Gaussian noise that denoisers are typically designed to remove.

...read moreread less

535 citations

Journal Article•DOI•

Constant Composition Distribution Matching

[...]

Patrick Schulte¹, Georg Böcherer¹•Institutions (1)

Technische Universität München¹

01 Jan 2016-IEEE Transactions on Information Theory

TL;DR: Fixed-to-fixed length, invertible, and low complexity encoders and decoders based on constant composition and arithmetic coding are presented and the encoder achieves the maximum rate of the desired distribution, asymptotically in the blocklength.

...read moreread less

Abstract: Distribution matching transforms independent and Bernoulli(1/2) distributed input bits into a sequence of output symbols with a desired distribution. Fixed-to-fixed length, invertible, and low complexity encoders and decoders based on constant composition and arithmetic coding are presented. The encoder achieves the maximum rate, namely, the entropy of the desired distribution, asymptotically in the blocklength. Furthermore, the normalized divergence of the encoder output and the desired distribution goes to zero in the blocklength.

...read moreread less

510 citations

Journal Article•DOI•

Exact Recovery in the Stochastic Block Model

[...]

Emmanuel Abbe¹, Afonso S. Bandeira¹, Georgina Hall¹•Institutions (1)

Princeton University¹

01 Jan 2016-IEEE Transactions on Information Theory

TL;DR: An efficient algorithm based on a semidefinite programming relaxation of ML is proposed, which is proved to succeed in recovering the communities close to the threshold, while numerical experiments suggest that it may achieve the threshold.

...read moreread less

Abstract: The stochastic block model with two communities, or equivalently the planted bisection model, is a popular model of random graph exhibiting a cluster behavior. In the symmetric case, the graph has two equally sized clusters and vertices connect with probability $p$ within clusters and $q$ across clusters. In the past two decades, a large body of literature in statistics and computer science has focused on providing lower bounds on the scaling of $|p-q|$ to ensure exact recovery. In this paper, we identify a sharp threshold phenomenon for exact recovery: if $\alpha =pn/\log (n)$ and $\beta =qn/\log (n)$ are constant (with $\alpha >\beta $ ), recovering the communities with high probability is possible if $({\alpha +\beta }/{2}) - \sqrt {\alpha \beta }>1$ and is impossible if $({\alpha +\beta }/{2}) - \sqrt {\alpha \beta } . In particular, this improves the existing bounds. This also sets a new line of sight for efficient clustering algorithms. While maximum likelihood (ML) achieves the optimal threshold (by definition), it is in the worst case NP-hard. This paper proposes an efficient algorithm based on a semidefinite programming relaxation of ML, which is proved to succeed in recovering the communities close to the threshold, while numerical experiments suggest that it may achieve the threshold. An efficient algorithm that succeeds all the way down to the threshold is also obtained using a partial recovery algorithm combined with a local improvement procedure.

...read moreread less

474 citations

Journal Article•DOI•

Covert Communication Over Noisy Channels: A Resolvability Perspective

[...]

Matthieu R. Bloch¹•Institutions (1)

Georgia Institute of Technology¹

01 May 2016-IEEE Transactions on Information Theory

TL;DR: A coding scheme based on the principle of channel resolvability is developed, which proves that if the receiver's channel is better than the warden's channel, it is possible to communicate on the order of √n reliable and covert bits over n channel uses without a secret key.

...read moreread less

Abstract: We consider the situation in which a transmitter attempts to communicate reliably over a discrete memoryless channel, while simultaneously ensuring covertness (low probability of detection) with respect to a warden, who observes the signals through another discrete memoryless channel. We develop a coding scheme based on the principle of channel resolvability, which generalizes and extends prior work in several directions. First, it shows that irrespective of the quality of the channels, it is possible to communicate on the order of $\sqrt {n}$ reliable and covert bits over $n$ channel uses if the transmitter and the receiver share on the order of $\sqrt {n}$ key bits. This improves upon earlier results requiring on the order of $\sqrt {n}\log n$ key bits. Second, it proves that if the receiver’s channel is better than the warden’s channel in a sense that we make precise, it is possible to communicate on the order of $\sqrt {n}$ reliable and covert bits over $n$ channel uses without a secret key. This generalizes earlier results established for binary symmetric channels. We also identify the fundamental limits of covert and secret communications in terms of the optimal asymptotic scaling of the message size and key size, and we extend the analysis to Gaussian channels. The main technical problem that we address is how to develop concentration inequalities for low-weight sequences. The crux of our approach is to define suitably modified typical sets that are amenable to concentration inequalities.

...read moreread less

357 citations

Journal Article•DOI•

Fundamental Limits of Communication With Low Probability of Detection

[...]

Ligong Wang¹, Gregory W. Wornell², Lizhong Zheng²•Institutions (2)

Cergy-Pontoise University¹, Massachusetts Institute of Technology²

05 Apr 2016-IEEE Transactions on Information Theory

TL;DR: In this article, the authors considered the problem of communication over a discrete memoryless channel (DMC) or an additive white Gaussian noise (AWGN) channel subject to the constraint that the probability that an adversary who observes the channel outputs can detect the communication is low.

...read moreread less

Abstract: This paper considers the problem of communication over a discrete memoryless channel (DMC) or an additive white Gaussian noise (AWGN) channel subject to the constraint that the probability that an adversary who observes the channel outputs can detect the communication is low. In particular, the relative entropy between the output distributions when a codeword is transmitted and when no input is provided to the channel must be sufficiently small. For a DMC whose output distribution induced by the “off” input symbol is not a mixture of the output distributions induced by other input symbols, it is shown that the maximum amount of information that can be transmitted under this criterion scales like the square root of the blocklength. The same is true for the AWGN channel. Exact expressions for the scaling constant are also derived.

...read moreread less

326 citations

Journal Article•DOI•

Guaranteed Matrix Completion via Non-Convex Factorization

[...]

Ruoyu Sun¹, Zhi-Quan Luo¹•Institutions (1)

University of Minnesota¹

01 Nov 2016-IEEE Transactions on Information Theory

TL;DR: In this article, the authors established a theoretical guarantee for the matrix factorization-based formulation to correctly recover the underlying low-rank matrix and showed that under similar conditions to those in previous works, many standard optimization algorithms converge to the global optima of a factorizationbased formulation and recover the true lowrank matrix.

...read moreread less

Abstract: Matrix factorization is a popular approach for large-scale matrix completion. The optimization formulation based on matrix factorization, even with huge size, can be solved very efficiently through the standard optimization algorithms in practice. However, due to the non-convexity caused by the factorization model, there is a limited theoretical understanding of whether these algorithms will generate a good solution. In this paper, we establish a theoretical guarantee for the factorization-based formulation to correctly recover the underlying low-rank matrix. In particular, we show that under similar conditions to those in previous works, many standard optimization algorithms converge to the global optima of a factorization-based formulation and recover the true low-rank matrix. We study the local geometry of a properly regularized objective and prove that any stationary point in a certain local region is globally optimal. A major difference of this paper from the existing results is that we do not need resampling (i.e., using independent samples at each iteration) in either the algorithm or its analysis.

...read moreread less

299 citations

Journal Article•DOI•

Secure Massive MIMO Transmission With an Active Eavesdropper

[...]

Yongpeng Wu¹, Robert Schober¹, Derrick Wing Kwan Ng², Chengshan Xiao³, Giuseppe Caire⁴ - Show less +1 more•Institutions (4)

University of Erlangen-Nuremberg¹, University of New South Wales², Missouri University of Science and Technology³, Technical University of Berlin⁴

01 Jul 2016-IEEE Transactions on Information Theory

TL;DR: This paper considers a time-division duplex system where uplink training is required and an active eavesdropper can attack the training phase to cause pilot contamination at the transmitter, and derives an asymptotic achievable secrecy rate when the number of transmit antennas approaches infinity.

...read moreread less

Abstract: In this paper, we investigate secure and reliable transmission strategies for multi-cell multi-user massive multiple-input multiple-output systems with a multi-antenna active eavesdropper. We consider a time-division duplex system where uplink training is required and an active eavesdropper can attack the training phase to cause pilot contamination at the transmitter. This forces the precoder used in the subsequent downlink transmission phase to implicitly beamform toward the eavesdropper, thus increasing its received signal power. Assuming matched filter precoding and artificial noise (AN) generation at the transmitter, we derive an asymptotic achievable secrecy rate when the number of transmit antennas approaches infinity. For the case of a single-antenna active eavesdropper, we obtain a closed-form expression for the optimal power allocation policy for the transmit signal and the AN, and find the minimum transmit power required to ensure reliable secure communication. Furthermore, we show that the transmit antenna correlation diversity of the intended users and the eavesdropper can be exploited in order to improve the secrecy rate. In fact, under certain orthogonality conditions of the channel covariance matrices, the secrecy rate loss introduced by the eavesdropper can be completely mitigated.

...read moreread less

272 citations

Journal Article•DOI•

Multi-Server Coded Caching

[...]

Seyed Pooya Shariatpanahi, Seyed Abolfazl Motahari¹, Babak Hossein Khalaj¹•Institutions (1)

Sharif University of Technology¹

01 Dec 2016-IEEE Transactions on Information Theory

TL;DR: The results suggest that, in the case of networks with multiple servers, type of network topology can be exploited to reduce service delay.

...read moreread less

Abstract: In this paper, we consider multiple cache-enabled clients connected to multiple servers through an intermediate network. We design several topology-aware coding strategies for such networks. Based on the topology richness of the intermediate network, and types of coding operations at internal nodes, we define three classes of networks, namely, dedicated, flexible, and linear networks. For each class, we propose an achievable coding scheme, analyze its coding delay, and also compare it with an information theoretic lower bound. For flexible networks, we show that our scheme is order-optimal in terms of coding delay and, interestingly, the optimal memory-delay curve is achieved in certain regimes. In general, our results suggest that, in the case of networks with multiple servers, type of network topology can be exploited to reduce service delay.

...read moreread less

255 citations

Journal Article•DOI•

Vandermonde Decomposition of Multilevel Toeplitz Matrices With Application to Multidimensional Super-Resolution

[...]

Zai Yang¹, Lihua Xie², Petre Stoica³•Institutions (3)

Nanjing University of Science and Technology¹, Nanyang Technological University², Uppsala University³

12 Apr 2016-IEEE Transactions on Information Theory

TL;DR: In this paper, a constructive method for finding the Vandermonde decomposition is provided when the matrix rank is lower than the dimension of each Toeplitz block, and a numerical method for searching for a decomposition was also proposed when the Matrix rank is higher.

...read moreread less

Abstract: The Vandermonde decomposition of Toeplitz matrices, discovered by Caratheodory and Fejer in the 1910s and rediscovered by Pisarenko in the 1970s, forms the basis of modern subspace methods for 1-D frequency estimation. Many related numerical tools have also been developed for multidimensional (MD), especially 2-D, frequency estimation; however, a fundamental question has remained unresolved as to whether an analog of the Vandermonde decomposition holds for multilevel Toeplitz matrices in the MD case. In this paper, an affirmative answer to this question and a constructive method for finding the decomposition are provided when the matrix rank is lower than the dimension of each Toeplitz block. A numerical method for searching for a decomposition is also proposed when the matrix rank is higher. The new results are applied to study the MD frequency estimation within the recent super-resolution framework. A precise formulation of the atomic $\ell _{0}$ norm is derived using the Vandermonde decomposition. Practical algorithms for frequency estimation are proposed based on the relaxation techniques. Extensive numerical simulations are provided to demonstrate the effectiveness of these algorithms compared with the existing atomic norm and subspace methods.

...read moreread less

Journal Article•DOI•

Effect of Message Transmission Path Diversity on Status Age

[...]

Clement Kam¹, Sastry Kompella¹, Gam D. Nguyen¹, Anthony Ephremides²•Institutions (2)

United States Naval Research Laboratory¹, University of Maryland, College Park²

01 Mar 2016-IEEE Transactions on Information Theory

TL;DR: A system in which a sensor sends random status updates over a dynamic network to a monitor is studied, and an approximation that is shown to be close to the simulated age of the status age is provided.

...read moreread less

Abstract: This paper focuses on status age, which is a metric for measuring the freshness of a continually updated piece of information (i.e., status) as observed at a remote monitor. In paper, we study a system in which a sensor sends random status updates over a dynamic network to a monitor. For this system, we consider the impact of having messages take different routes through the network on the status age. First, we consider a network with plentiful resources (i.e., many nodes that can provide numerous alternate paths), so that packets need not wait in queues at each node in a multihop path. This system is modeled as a single queue with an infinite number of servers, specifically as an $M/M/\infty $ queue. Packets routed over a dynamic network may arrive at the monitor out of order, which we account for in our analysis for the $M/M/\infty $ model. We then consider a network with somewhat limited resources, so that packets can arrive out of order but also must wait in a queue. This is modeled as a single queue with two servers, specifically an $M/M/2$ queue. We present the exact approach to computing the analytical status age, and we provide an approximation that is shown to be close to the simulated age. We also compare both models with $M/M/1$ , which corresponds to severely limited network resources, and we demonstrate the tradeoff between the status age and the unnecessary network resource consumption.

...read moreread less

Journal Article•DOI•

The Generalized Lasso With Non-Linear Observations

[...]

Yaniv Plan¹, Roman Vershynin²•Institutions (2)

University of British Columbia¹, University of Michigan²

01 Mar 2016-IEEE Transactions on Information Theory

TL;DR: The first theoretical accuracy guarantee for 1-b compressed sensing with unknown covariance matrix of the measurement vectors is given, and the single-index model of non-linearity is considered, allowing the non- linearity to be discontinuous, not one-to-one and even unknown.

...read moreread less

Abstract: We study the problem of signal estimation from non-linear observations when the signal belongs to a low-dimensional set buried in a high-dimensional space. A rough heuristic often used in practice postulates that the non-linear observations may be treated as noisy linear observations, and thus, the signal may be estimated using the generalized Lasso. This is appealing because of the abundance of efficient, specialized solvers for this program. Just as noise may be diminished by projecting onto the lower dimensional space, the error from modeling non-linear observations with linear observations will be greatly reduced when using the signal structure in the reconstruction. We allow general signal structure, only assuming that the signal belongs to some set $K \subset \mathbb {R} ^{n}$ . We consider the single-index model of non-linearity. Our theory allows the non-linearity to be discontinuous, not one-to-one and even unknown. We assume a random Gaussian model for the measurement matrix, but allow the rows to have an unknown covariance matrix. As special cases of our results, we recover near-optimal theory for noisy linear observations, and also give the first theoretical accuracy guarantee for 1-b compressed sensing with unknown covariance matrix of the measurement vectors.

...read moreread less

Journal Article•DOI•

$f$ -Divergence Inequalities

[...]

Igal Sason¹, Sergio Verdu²•Institutions (2)

Technion – Israel Institute of Technology¹, Princeton University²

01 Nov 2016-IEEE Transactions on Information Theory

TL;DR: In this paper, the Renyi divergence in terms of the relative information spectrum is derived, leading to a bound on the total variation distance and its relation to the relative entropy, including reverse Pinsker inequalities.

...read moreread less

Abstract: This paper develops systematic approaches to obtain $f$ -divergence inequalities, dealing with pairs of probability measures defined on arbitrary alphabets. Functional domination is one such approach, where special emphasis is placed on finding the best possible constant upper bounding a ratio of $f$ -divergences. Another approach used for the derivation of bounds among $f$ -divergences relies on moment inequalities and the logarithmic-convexity property, which results in tight bounds on the relative entropy and Bhattacharyya distance in terms of $\chi ^{2}$ divergences. A rich variety of bounds are shown to hold under boundedness assumptions on the relative information. Special attention is devoted to the total variation distance and its relation to the relative information and relative entropy, including “reverse Pinsker inequalities,” as well as on the $E_\gamma $ divergence, which generalizes the total variation distance. Pinsker’s inequality is extended for this type of $f$ -divergence, a result which leads to an inequality linking the relative entropy and relative information spectrum. Integral expressions of the Renyi divergence in terms of the relative information spectrum are derived, leading to bounds on the Renyi divergence in terms of either the variational distance or relative entropy.

...read moreread less

Journal Article•DOI•

Achieving Exact Cluster Recovery Threshold via Semidefinite Programming

[...]

Bruce Hajek¹, Yihong Wu¹, Jiaming Xu²•Institutions (2)

University of Illinois at Urbana–Champaign¹, University of Pennsylvania²

01 May 2016-IEEE Transactions on Information Theory

TL;DR: It is shown that the semidefinite programming relaxation of the maximum likelihood estimator achieves the optimal threshold for exactly recovering the partition from the graph with probability tending to one in the planted dense subgraph model containing a single cluster of size proportional to $n$.

...read moreread less

Abstract: The binary symmetric stochastic block model deals with a random graph of $n$ vertices partitioned into two equal-sized clusters, such that each pair of vertices is independently connected with probability $p$ within clusters and $q$ across clusters. In the asymptotic regime of $p=a \log n/n$ and $q=b \log n/n$ for fixed $a,b$ , and $n \to \infty $ , we show that the semidefinite programming relaxation of the maximum likelihood estimator achieves the optimal threshold for exactly recovering the partition from the graph with probability tending to one, resolving a conjecture of Abbe et al. Furthermore, we show that the semidefinite programming relaxation also achieves the optimal recovery threshold in the planted dense subgraph model containing a single cluster of size proportional to $n$ .

...read moreread less

Journal Article•DOI•

Hierarchical Coded Caching

[...]

Nikhil Karamchandani¹, Urs Niesen², Mohammad Ali Maddah-Ali², Suhas Diggavi³•Institutions (3)

Indian Institute of Technology Bombay¹, Bell Labs², University of California, Los Angeles³

22 Apr 2016-IEEE Transactions on Information Theory

TL;DR: A new caching scheme that combines two basic approaches to provide coded multicasting opportunities within each layer and across multiple layers is proposed, which achieves the optimal communication rates to within a constant multiplicative and additive gap.

...read moreread less

Abstract: caching of popular content during off-peak hours is a strategy to reduce network loads during peak hours. Recent work has shown significant benefits of designing such caching strategies not only to locally deliver the part of the content, but also to provide coded multicasting opportunities even among users with different demands. Exploiting both of these gains was shown to be approximately optimal for caching systems with a single layer of caches. Motivated by practical scenarios, we consider, in this paper, a hierarchical content delivery network with two layers of caches. We propose a new caching scheme that combines two basic approaches. The first approach provides coded multicasting opportunities within each layer; the second approach provides coded multicasting opportunities across multiple layers. By striking the right balance between these two approaches, we show that the proposed scheme achieves the optimal communication rates to within a constant multiplicative and additive gap. We further show that there is no tension between the rates in each of the two layers up to the aforementioned gap. Thus, both the layers can simultaneously operate at approximately the minimum rate.

...read moreread less

Journal Article•DOI•

Minimax Rates of Entropy Estimation on Large Alphabets via Best Polynomial Approximation

[...]

Yihong Wu¹, Pengkun Yang¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

30 Mar 2016-IEEE Transactions on Information Theory

TL;DR: It is shown that the minimax mean-square error is within the universal multiplicative constant factors of (k/n log k)2 t log2 k/n if n exceeds a constant factor of ( k/log k); otherwise, there exists no consistent estimator.

...read moreread less

Abstract: Consider the problem of estimating the Shannon entropy of a distribution over $k$ elements from $n$ independent samples. We show that the minimax mean-square error is within the universal multiplicative constant factors of $\big ({k }/{n \log k}\big )^{2} + {\log ^{2} k}/{n}$ if $n$ exceeds a constant factor of $({k}/{\log k})$ ; otherwise, there exists no consistent estimator. This refines the recent result of Valiant and Valiant that the minimal sample size for consistent entropy estimation scales according to $\Theta ({k}/{\log k})$ . The apparatus of the best polynomial approximation plays a key role in both the construction of optimal estimators and, by a duality argument, the minimax lower bound.

...read moreread less

Journal Article•DOI•

Finite-Length Analysis of Caching-Aided Coded Multicasting

[...]

Karthikeyan Shanmugam¹, Mingyue Ji¹, Antonia M. Tulino², Jaime Llorca¹, Alexandros G. Dimakis³ - Show less +1 more•Institutions (3)

Bell Labs¹, University of Naples Federico II², University of Texas at Austin³

01 Oct 2016-IEEE Transactions on Information Theory

TL;DR: This paper designs a new random placement and an efficient clique cover-based delivery scheme that achieves this lower bound approximately and provides tight concentration results that show that the average number of transmissions concentrates very well requiring only a polynomial number of packets in the rest of the system parameters.

...read moreread less

Abstract: We study a noiseless broadcast link serving $K$ users whose requests arise from a library of $N$ files. Every user is equipped with a cache of size $M$ files each. It has been shown that by splitting all the files into packets and placing individual packets in a random independent manner across all the caches prior to any transmission, at most $N/M$ file transmissions are required for any set of demands from the library. The achievable delivery scheme involves linearly combining packets of different files following a greedy clique cover solution to the underlying index coding problem. This remarkable multiplicative gain of random placement and coded delivery has been established in the asymptotic regime when the number of packets per file $F$ scales to infinity. The asymptotic coding gain obtained is roughly $t=KM/N$ . In this paper, we initiate the finite-length analysis of random caching schemes when the number of packets $F$ is a function of the system parameters $M,N$ , and $K$ . Specifically, we show that the existing random placement and clique cover delivery schemes that achieve optimality in the asymptotic regime can have at most a multiplicative gain of 2 even if the number of packets is exponential in the asymptotic gain $t=K({M}/{N})$ . Furthermore, for any clique cover-based coded delivery and a large class of random placement schemes that include the existing ones, we show that the number of packets required to get a multiplicative gain of $({4}/{3})g$ is at least $O(({g}/{K})(N/M)^{g-1})$ . We design a new random placement and an efficient clique cover-based delivery scheme that achieves this lower bound approximately. We also provide tight concentration results that show that the average (over the random placement involved) number of transmissions concentrates very well requiring only a polynomial number of packets in the rest of the system parameters.

...read moreread less

Journal Article•DOI•

Random Forests and Kernel Methods

[...]

Erwan Scornet¹•Institutions (1)

Pierre-and-Marie-Curie University¹

01 Mar 2016-IEEE Transactions on Information Theory

TL;DR: In this article, a connection between the random forests and the kernel methods is made, and it is shown empirically that the KeRF estimates compare favourably to the random forest estimates.

...read moreread less

Abstract: Random forests are ensemble methods which grow trees as base learners and combine their predictions by averaging. Random forests are known for their good practical performance, particularly in high-dimensional settings. On the theoretical side, several studies highlight the potentially fruitful connection between the random forests and the kernel methods. In this paper, we work out this connection in detail. In particular, we show that by slightly modifying their definition, random forests can be rewritten as kernel methods (called KeRF for kernel based on random forests) which are more interpretable and easier to analyze. Explicit expressions of KeRF estimates for some specific random forest models are given, together with upper bounds on their rate of consistency. We also show empirically that the KeRF estimates compare favourably to the random forest estimates.

...read moreread less

Journal Article•DOI•

One-Bit Compressive Sensing With Norm Estimation

[...]

Karin C. Knudson¹, Rayan Saab², Rachel Ward³•Institutions (3)

Phillips Academy¹, University of California, San Diego², University of Texas at Austin³

01 May 2016-IEEE Transactions on Information Theory

TL;DR: For quantized affine affine measurements of the form $ \mathop {\mathrm {sign\kern 0pt}} olimits (\langle {a}_{i, {x} \rangle + b_{i})$, and if the vectors $ {a, i}$ are random, an appropriate choice of the affine shifts $b_{i}$ allows norm recovery to be easily incorporated into existing methods for one-bit compressive sensing.

...read moreread less

Abstract: Consider the recovery of an unknown signal $ {x}$ from quantized linear measurements. In the one-bit compressive sensing setting, one typically assumes that $ {x}$ is sparse, and that the measurements are of the form $ \mathop {\mathrm {sign\kern 0pt}} olimits (\langle {a}_{i}, {x} \rangle ) \in \{\pm 1\}$ . Since such measurements give no information on the norm of $ {x}$ , recovery methods typically assume that $\| {x} \|_{2}=1$ . We show that if one allows more generally for quantized affine measurements of the form $ \mathop {\mathrm {sign\kern 0pt}} olimits (\langle {a}_{i}, {x} \rangle + b_{i})$ , and if the vectors $ {a}_{i}$ are random, an appropriate choice of the affine shifts $b_{i}$ allows norm recovery to be easily incorporated into existing methods for one-bit compressive sensing. In addition, we show that for arbitrary fixed $ {x}$ in the annulus $r \leq \| {x} \|_{2} \leq R$ , one may estimate the norm $\| {x} \|_{2}$ up to additive error $\delta $ from $m {\gtrsim } R^{4} r^{-2} \delta ^{-2}$ such binary measurements through a single evaluation of the inverse Gaussian error function. Finally, all of our recovery guarantees can be made universal over sparse vectors in the sense that with high probability, one set of measurements and thresholds can successfully estimate all sparse vectors $ {x}$ in a Euclidean ball of known radius.

...read moreread less

Journal Article•DOI•

Aligned Image Sets Under Channel Uncertainty: Settling Conjectures on the Collapse of Degrees of Freedom Under Finite Precision CSIT

[...]

Arash Gholami Davoodi¹, Syed A. Jafar¹•Institutions (1)

University of California, Irvine¹

01 Oct 2016-IEEE Transactions on Information Theory

TL;DR: This paper proves that the degrees of freedom (DoF) of a two user broadcast channel must collapse under finite precision channel state information at the transmitter (CSIT) in all non-degenerate settings (e.g., where the probability density function of unknown channel coefficients exists and is bounded).

...read moreread less

Abstract: A conjecture made by Lapidoth et al. at Allerton 2005 (also an open problem presented at ITA 2006) states that the degrees of freedom (DoF) of a two user broadcast channel, where the transmitter is equipped with two antennas and each user is equipped with one antenna, must collapse under finite precision channel state information at the transmitter (CSIT). That this conjecture, which predates interference alignment, has remained unresolved, is emblematic of a pervasive lack of understanding of the DoF of wireless networks—including interference and $X$ networks—under channel uncertainty at the transmitter(s). In this paper, we prove that the conjecture is true in all non-degenerate settings (e.g., where the probability density function of unknown channel coefficients exists and is bounded). The DoF collapse even when perfect channel knowledge for one user is available to the transmitter. This also settles a related recent conjecture by Tandon et al. The key to our proof is a bound on the number of codewords that can cast the same image (within noise distortion) at the undesired receiver whose channel is subject to finite precision CSIT, while remaining resolvable at the desired receiver whose channel is precisely known by the transmitter. We are also able to generalize the result along two directions. First, if the peak of the probability density function is allowed to scale as $O((\sqrt {P})^\alpha )$ , representing the concentration of probability density (improving CSIT) due to, e.g., quantized feedback at rate $({\alpha }/{2})\log (P)$ , then the DoF is bounded above by $1+\alpha $ , which is also achievable under quantized feedback. Second, we generalize the result to arbitrary number of antennas at the transmitter, arbitrary number of single-antenna users, and complex channels. The generalization directly implies a collapse of DoF to unity under non-degenerate channel uncertainty for the general $K$ -user interference and $M\times N$ user $X$ networks as well.

...read moreread less

Journal Article•DOI•

Locality and Availability in Distributed Storage

[...]

Ankit Singh Rawat¹, Dimitris S. Papailiopoulos¹, Alexandros G. Dimakis¹, Sriram Vishwanath¹•Institutions (1)

University of Texas at Austin¹

03 Feb 2016-IEEE Transactions on Information Theory

TL;DR: It is shown that it is possible to construct codes that can support a scaling number of parallel reads while keeping the rate to be an arbitrarily high constant, and that this is possible with the minimum Hamming distance arbitrarily close to the Singleton bound.

...read moreread less

Abstract: This paper studies the problem of information symbol availability in codes: we refer to a systematic code as code with $(r, t)$ -availability if every information (systematic) symbol can be reconstructed from $t$ disjoint groups of other code symbols, each of the sizes at most $r$ . This paper shows that it is possible to construct codes that can support a scaling number of parallel reads while keeping the rate to be an arbitrarily high constant. It further shows that this is possible with the minimum Hamming distance arbitrarily close to the Singleton bound. This paper also presents a bound demonstrating a tradeoff between rate, minimum Hamming distance, and availability parameters. Our codes match the aforementioned bound, and their constructions rely on certain combinatorial structures. Resolvable designs provide one way to realize these required combinatorial structures. The two constructions presented in this paper require field sizes, which are linear and exponential in the code length, respectively. From a practical standpoint, our codes are relevant for distributed storage applications involving hot data, i.e., the information, which is frequently accessed by multiple processes in parallel.

...read moreread less

Journal Article•DOI•

Linear Codes With Two or Three Weights From Weakly Regular Bent Functions

[...]

Chunming Tang¹, Nian Li², Yanfeng Qi³, Zhengchun Zhou⁴, Tor Helleseth² - Show less +1 more•Institutions (4)

China West Normal University¹, University of Bergen², Hangzhou Dianzi University³, Southwest Jiaotong University⁴

01 Mar 2016-IEEE Transactions on Information Theory

TL;DR: This paper first generalizes the method of constructing two-weight and three-weight linear codes of Ding et al. to general weakly regular bent functions and determines the weight distributions of these linear codes.

...read moreread less

Abstract: Linear codes with a few weights have applications in consumer electronics, communication, data storage system, secret sharing, authentication codes, association schemes, and strongly regular graphs. This paper first generalizes the method of constructing two-weight and three-weight linear codes of Ding et al. and Zhou et al. to general weakly regular bent functions and determines the weight distributions of these linear codes. It solves an open problem proposed by Ding et al. Furthermore, this paper constructs new linear codes with two or three weights and presents their weight distributions. They contain some optimal codes meeting certain bound on linear codes.

...read moreread less

Journal Article•DOI•

First- and Second-Order Statistics Characterization of Hawkes Processes and Non-Parametric Estimation

[...]

Emmanuel Bacry¹, Jean-François Muzy²•Institutions (2)

École Polytechnique¹, Centre national de la recherche scientifique²

25 Feb 2016-IEEE Transactions on Information Theory

TL;DR: A systematic study of a fast and efficient method, which main principles were initially sketched by Bacry and Muzy, to perform a non-parametric estimation of the Hawkes kernel matrix in the general framework of marked Hawkes processes.

...read moreread less

Abstract: We show that the jumps correlation matrix of a multivariate Hawkes process is related to the Hawkes kernel matrix through a system of Wiener–Hopf integral equations. A Wiener–Hopf argument allows one to prove that this system (in which the kernel matrix is the unknown) possesses a unique causal solution and consequently that the first- and second-order properties fully characterize a Hawkes process. The numerical inversion of this system of integral equations allows us to propose a fast and efficient method, which main principles were initially sketched by Bacry and Muzy, to perform a non-parametric estimation of the Hawkes kernel matrix. In this paper, we perform a systematic study of this non-parametric estimation procedure in the general framework of marked Hawkes processes. We precisely describe this procedure step by step. We discuss the estimation error and explain how the values for the main parameters should be chosen. Various numerical examples are given in order to illustrate the broad possibilities of this estimation procedure ranging from monovariate (power-law or non-positive kernels) up to three-variate (circular dependence) processes. A comparison with other non-parametric estimation procedures is made. Applications to high-frequency trading events in financial markets and to earthquakes occurrence dynamics are finally considered.

...read moreread less

Journal Article•DOI•

Bounds on the Parameters of Locally Recoverable Codes

[...]

Itzhak Tamo¹, Alexander Barg¹, Alexey Frolov²•Institutions (2)

University of Maryland, College Park¹, Russian Academy of Sciences²

18 Jan 2016-IEEE Transactions on Information Theory

TL;DR: New finite-length and asymptotic bounds on the parameters of LRC codes are derived and an asymPTotic Gilbert-Varshamov type bound is derived for LRC code types and the maximum attainable relative distance is found.

...read moreread less

Abstract: A locally recoverable code (LRC code) is a code over a finite alphabet, such that every symbol in the encoding is a function of a small number of other symbols that form a recovering set. In this paper, we derive new finite-length and asymptotic bounds on the parameters of LRC codes. For LRC codes with a single recovering set for every coordinate, we derive an asymptotic Gilbert–Varshamov type bound for LRC codes and find the maximum attainable relative distance of asymptotically good LRC codes. Similar results are established for LRC codes with two disjoint recovering sets for every coordinate. For the case of multiple recovering sets (the availability problem), we derive a lower bound on the parameters using expander graph arguments. Finally, we also derive finite-length upper bounds on the rate and the distance of LRC codes with multiple recovering sets.

...read moreread less

Journal Article•DOI•

Achieving Exact Cluster Recovery Threshold via Semidefinite Programming: Extensions

[...]

Bruce Hajek¹, Yihong Wu¹, Jiaming Xu²•Institutions (2)

University of Illinois at Urbana–Champaign¹, University of California, Berkeley²

01 Oct 2016-IEEE Transactions on Information Theory

TL;DR: It is shown that SDP relaxations also achieve the sharp recovery threshold in the following cases: 1) binary SBM with two clusters of sizes proportional to network size but not necessarily equal; 2) S BM with a fixed number of equal-sized clusters; and 3) binary censored block model with the background graph being Erdös-Rényi.

...read moreread less

Abstract: Resolving a conjecture of Abbe, Bandeira, and Hall, the authors have recently shown that the semidefinite programming (SDP) relaxation of the maximum likelihood estimator achieves the sharp threshold for exactly recovering the community structure under the binary stochastic block model (SBM) of two equal-sized clusters. The same was shown for the case of a single cluster and outliers. Extending the proof techniques, in this paper, it is shown that SDP relaxations also achieve the sharp recovery threshold in the following cases: 1) binary SBM with two clusters of sizes proportional to network size but not necessarily equal; 2) SBM with a fixed number of equal-sized clusters; and 3) binary censored block model with the background graph being Erdős–Renyi. Furthermore, a sufficient condition is given for an SDP procedure to achieve exact recovery for the general case of a fixed number of clusters plus outliers. These results demonstrate the versatility of SDP relaxation as a simple, general purpose, computationally feasible methodology for community detection.

...read moreread less

Journal Article•DOI•

Optimal Locally Repairable Codes and Connections to Matroid Theory

[...]

Itzhak Tamo¹, Dimitris S. Papailiopoulos², Alexandros G. Dimakis²•Institutions (2)

University of Maryland, College Park¹, University of Texas at Austin²

01 Dec 2016-IEEE Transactions on Information Theory

TL;DR: This paper presents an explicit LRC that is simple to construct and is optimal for a specific set of coding parameters, based on grouping RS symbols and then adding extra simple parities that allow for small repair locality.

...read moreread less

Abstract: Petabyte-scale distributed storage systems are currently transitioning to erasure codes to achieve higher storage efficiency. Classical codes, such as Reed–Solomon (RS), are highly sub-optimal for distributed environments due to their high overhead during single-failure events. Locally repairable codes (LRCs) form a new family of codes that are repair efficient. In particular, LRCs minimize the number of nodes participating in single node repairs. Fundamental bounds and methods for explicitly constructing LRCs suitable for deployment in distributed storage clusters are not fully understood and currently form an active area of research. In this paper, we present an explicit LRC that is simple to construct and is optimal for a specific set of coding parameters. Our construction is based on grouping RS symbols and then adding extra simple parities that allow for small repair locality. For the analysis of the optimality of the code, we derive a new result on the matroid represented by the code’s generator matrix.

...read moreread less

Journal Article•DOI•

On the Relation Between Identifiability, Differential Privacy, and Mutual-Information Privacy

[...]

Weina Wang¹, Lei Ying¹, Junshan Zhang¹•Institutions (1)

Arizona State University¹

23 Jun 2016-IEEE Transactions on Information Theory

TL;DR: The results in this paper reveal some consistency between two worst case notions of privacy, namely, identifiability and differential privacy, and an average notion of Privacy, mutual-information privacy.

...read moreread less

Abstract: This paper investigates the relation between three different notions of privacy: identifiability, differential privacy, and mutual-information privacy. Under a unified privacy-distortion framework, where the distortion is defined to be the expected Hamming distance between the input and output databases, we establish some fundamental connections between these three privacy notions. Given a maximum allowable distortion $D$ , we define the privacy-distortion functions $\epsilon _{\mathrm{ i}}^{*}(D)$ , $\epsilon _{\mathrm{ d}}^{*}(D)$ , and $\epsilon _{\mathrm{ m}}^{*}(D)$ to be the smallest (most private/best) identifiability level, differential privacy level, and mutual information between the input and the output, respectively. We characterize $\epsilon _{\mathrm{ i}}^{*}(D)$ and $\epsilon _{\mathrm{ d}}^{*}(D)$ , and prove that $\epsilon _{\mathrm{ i}}^{*}(D)-\epsilon _{X}\le \epsilon _{\mathrm{ d}}^{*}(D)\le \epsilon _{\mathrm{ i}}^{*}(D)$ for $D$ within certain range, where $\epsilon _{X}$ is a constant determined by the prior distribution of the original database $X$ , and diminishes to zero when $X$ is uniformly distributed. Furthermore, we show that $\epsilon _{\mathrm{ i}}^{*}(D)$ and $\epsilon _{\mathrm{ m}}^{*}(D)$ can be achieved by the same mechanism for $D$ within certain range, i.e., there is a mechanism that simultaneously minimizes the identifiability level and achieves the best mutual-information privacy. Based on these two connections, we prove that this mutual-information optimal mechanism satisfies $\epsilon $ -differential privacy with $\epsilon _{\mathrm{ d}}^{*}(D)\le \epsilon \le \epsilon _{\mathrm{ d}}^{*}(D)+2\epsilon _{X}$ . The results in this paper reveal some consistency between two worst case notions of privacy, namely, identifiability and differential privacy, and an average notion of privacy, mutual-information privacy.

...read moreread less

Journal Article•DOI•

Constructions and Noise Threshold of Hyperbolic Surface Codes

[...]

Nikolas P. Breuckmann¹, Barbara M. Terhal¹•Institutions (1)

RWTH Aachen University¹

21 Apr 2016-IEEE Transactions on Information Theory

TL;DR: In this paper, the authors obtain concrete constructions of homological quantum codes based on tilings of 2D surfaces with constant negative curvature (hyperbolic surfaces) and provide numerical estimates of the value of the noise threshold and logical error probability of these codes against independent $X$ or $Z$ noise.

...read moreread less

Abstract: We show how to obtain concrete constructions of homological quantum codes based on tilings of 2-D surfaces with constant negative curvature (hyperbolic surfaces). This construction results in 2-D quantum codes whose tradeoff of encoding rate versus protection is more favorable than for the surface code. These surface codes would require variable length connections between qubits, as determined by the hyperbolic geometry. We provide numerical estimates of the value of the noise threshold and logical error probability of these codes against independent $X$ or $Z$ noise, assuming noise-free error correction.

...read moreread less

Journal Article•DOI•

The Optimal Noise-Adding Mechanism in Differential Privacy

[...]

Quan Geng¹, Pramod Viswanath¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Feb 2016-IEEE Transactions on Information Theory

TL;DR: In this paper, the authors characterize the fundamental tradeoff between privacy and utility in differential privacy, and derive the optimal staircase mechanism for a single real-valued query function under a very general utility-maximization (or cost-minimization) framework.

...read moreread less

Abstract: Differential privacy is a framework to quantify to what extent individual privacy in a statistical database is preserved while releasing useful aggregate information about the database. In this paper, within the classes of mechanisms oblivious of the database and the queries beyond the global sensitivity, we characterize the fundamental tradeoff between privacy and utility in differential privacy, and derive the optimal $\epsilon $ -differentially private mechanism for a single real-valued query function under a very general utility-maximization (or cost-minimization) framework. The class of noise probability distributions in the optimal mechanism has staircase-shaped probability density functions which are symmetric (around the origin), monotonically decreasing and geometrically decaying. The staircase mechanism can be viewed as a geometric mixture of uniform probability distributions, providing a simple algorithmic description for the mechanism. Furthermore, the staircase mechanism naturally generalizes to discrete query output settings as well as more abstract settings. We explicitly derive the parameter of the optimal staircase mechanism for $\ell _{1}$ and $\ell _{2}$ cost functions. Comparing the optimal performances with those of the usual Laplacian mechanism, we show that in the high privacy regime ( $\epsilon $ is small), the Laplacian mechanism is asymptotically optimal as $\epsilon \to 0$ ; in the low privacy regime ( $\epsilon $ is large), the minimum magnitude and second moment of noise are $\Theta (\Delta e^{(-{\epsilon }/{2})})$ and $\Theta (\Delta ^{2} e^{(-{2\epsilon }/{3})})$ as $\epsilon \to +\infty $ , respectively, while the corresponding figures when using the Laplacian mechanism are ${\Delta }/{\epsilon }$ and ${2\Delta ^{2}}/{\epsilon ^{2}}$ , where $\Delta $ is the sensitivity of the query function. We conclude that the gains of the staircase mechanism are more pronounced in the moderate-low privacy regime.

...read moreread less

Collapse