
Showing papers in "IEEE Transactions on Information Theory in 2019"


Journal ArticleDOI
TL;DR: An age of information (AoI) timeliness metric is formulated, and a general result for the AoI is derived that applies to a wide variety of multiple-source service systems and makes AoI evaluation comparable in complexity to finding the stationary distribution of a finite-state Markov chain.
Abstract: We examine multiple independent sources providing status updates to a monitor through simple queues. We formulate an age of information (AoI) timeliness metric and derive a general result for the AoI that is applicable to a wide variety of multiple source service systems. For first-come first-served and two types of last-come first-served systems with Poisson arrivals and exponential service times, we find the region of feasible average status ages for multiple updating sources. We then use these results to characterize how a service facility can be shared among multiple updating sources. A new simplified technique for evaluating the AoI in finite-state continuous-time queuing systems is also derived. Based on stochastic hybrid systems, this method makes AoI evaluation comparable in complexity to finding the stationary distribution of a finite-state Markov chain.

552 citations
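As a sanity check on results of this kind, a short discrete-event simulation (a sketch under the standard model, not the paper's stochastic-hybrid-systems method; the function name and parameters are mine) estimates the time-average AoI of a single source over an M/M/1 FCFS queue, which for arrival rate λ and service rate μ has the known average age (1/μ)(1 + 1/ρ + ρ²/(1−ρ)) with ρ = λ/μ:

```python
import random

def aoi_fcfs_mm1(lam, mu, n_updates=200_000, seed=1):
    """Simulated time-average age of information for an M/M/1 FCFS queue."""
    rng = random.Random(seed)
    gen = 0.0          # generation time of the current arrival
    g_prev = 0.0       # generation time of the last delivered update
    dep_prev = 0.0     # previous departure time
    server_free = 0.0  # time at which the server becomes free
    area = 0.0         # integral of the age process over time
    for _ in range(n_updates):
        gen += rng.expovariate(lam)            # Poisson arrivals
        start = max(gen, server_free)
        dep = start + rng.expovariate(mu)      # exponential service
        server_free = dep
        # Age grows linearly between departures: trapezoid on [dep_prev, dep]
        area += (dep - dep_prev) * ((dep_prev - g_prev) + (dep - g_prev)) / 2.0
        dep_prev, g_prev = dep, gen
    return area / dep_prev
```

At λ = 0.5 and μ = 1 (ρ = 0.5) the theoretical average age is 3.5, and the simulation should land near it.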


Journal ArticleDOI
TL;DR: In this paper, the problem of learning a shallow neural network that best fits a training data set was studied in the over-parameterized regime, where the number of observations is smaller than the number of parameters in the model.
Abstract: In this paper, we study the problem of learning a shallow artificial neural network that best fits a training data set. We study this problem in the over-parameterized regime where the number of observations is smaller than the number of parameters in the model. We show that with quadratic activations, the optimization landscape of training such shallow neural networks has certain favorable characteristics that allow globally optimal models to be found efficiently using a variety of local search heuristics. This result holds for arbitrary training data of input/output pairs. For differentiable activation functions, we also show that gradient descent, when suitably initialized, converges at a linear rate to a globally optimal model. This result focuses on a realizable model where the inputs are chosen i.i.d. from a Gaussian distribution and the labels are generated according to planted weight coefficients.

425 citations


Journal ArticleDOI
TL;DR: This paper considers a “vector AMP” (VAMP) algorithm and shows that VAMP has a rigorous scalar state-evolution that holds under a much broader class of large random matrices A: those that are right-orthogonally invariant.
Abstract: The standard linear regression (SLR) problem is to recover a vector $\mathrm {x}^{0}$ from noisy linear observations $\mathrm {y}=\mathrm {Ax}^{0}+\mathrm {w}$ . The approximate message passing (AMP) algorithm proposed by Donoho, Maleki, and Montanari is a computationally efficient iterative approach to SLR that has a remarkable property: for large i.i.d. sub-Gaussian matrices A, its per-iteration behavior is rigorously characterized by a scalar state-evolution whose fixed points, when unique, are Bayes optimal. The AMP algorithm, however, is fragile in that even small deviations from the i.i.d. sub-Gaussian model can cause the algorithm to diverge. This paper considers a “vector AMP” (VAMP) algorithm and shows that VAMP has a rigorous scalar state-evolution that holds under a much broader class of large random matrices A: those that are right-orthogonally invariant. After performing an initial singular value decomposition (SVD) of A, the per-iteration complexity of VAMP is similar to that of AMP. In addition, the fixed points of VAMP’s state evolution are consistent with the replica prediction of the minimum mean-squared error derived by Tulino, Caire, Verdu, and Shamai. Numerical experiments are used to confirm the effectiveness of VAMP and its consistency with state-evolution predictions.

263 citations
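To illustrate the baseline that VAMP generalizes, here is a minimal AMP recursion with soft thresholding for noiseless sparse recovery under a large i.i.d. Gaussian A, the regime where plain AMP's scalar state evolution holds. This is a sketch of the Donoho–Maleki–Montanari style iteration, not VAMP; the function name, threshold rule, and problem sizes are my choices:

```python
import numpy as np

def amp_slr(A, y, iters=50, alpha=1.5):
    """AMP with soft thresholding for y = A x0, x0 sparse (sketch only)."""
    m, n = A.shape
    x, z = np.zeros(n), y.copy()
    for _ in range(iters):
        sigma = np.linalg.norm(z) / np.sqrt(m)        # effective noise level
        r = x + A.T @ z                                # pseudo-data
        x = np.sign(r) * np.maximum(np.abs(r) - alpha * sigma, 0.0)
        z = y - A @ x + (z / m) * np.count_nonzero(x)  # Onsager correction
    return x

rng = np.random.default_rng(0)
m, n, k = 250, 500, 15
A = rng.standard_normal((m, n)) / np.sqrt(m)           # i.i.d. Gaussian A
x0 = np.zeros(n)
x0[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
x_hat = amp_slr(A, A @ x0)
```

The Onsager term `(z / m) * count_nonzero(x)` is what distinguishes AMP from plain iterative thresholding; dropping it breaks the scalar state-evolution behavior.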


Journal ArticleDOI
TL;DR: A general formula is derived showing that the stationary distribution of the AoI is given in terms of the stationary distributions of the system delay and the peak AoI; the formula holds for a wide class of information update systems.
Abstract: This paper considers the stationary distribution of the age of information (AoI) in information update systems. We first derive a general formula for the stationary distribution of the AoI, which holds for a wide class of information update systems. The formula indicates that the stationary distribution of the AoI is given in terms of the stationary distributions of the system delay and the peak AoI. To demonstrate its applicability and usefulness, we analyze the AoI in single-server queues with four different service disciplines: first-come first-served (FCFS), preemptive last-come first-served (LCFS), and two variants of non-preemptive LCFS service disciplines. For the FCFS and the preemptive LCFS service disciplines, the GI/GI/1, M/GI/1, and GI/M/1 queues are considered, and for the non-preemptive LCFS service disciplines, the M/GI/1 and GI/M/1 queues are considered. With these results, we further show comparison results for the mean AoI’s in the M/GI/1 and GI/M/1 queues under those service disciplines.

233 citations


Journal ArticleDOI
TL;DR: The Last-Generated, First-Serve (LGFS) scheduling policy, in which the packet with the latest generation time is processed with the highest priority, is proposed, and the age-optimality results of LCFS-type policies are established.
Abstract: In this paper, we investigate scheduling policies that minimize the age of information in single-hop queueing systems. We propose a Last-Generated, First-Serve (LGFS) scheduling policy, in which the packet with the latest generation time is processed with the highest priority. If the service times are i.i.d. exponentially distributed, the preemptive LGFS policy is proven to be age-optimal in a stochastic ordering sense. If the service times are i.i.d. and satisfy a New-Better-than-Used (NBU) distributional property, the non-preemptive LGFS policy is shown to be within a constant gap from the optimum age performance. These age-optimality results are quite general: (i) they hold for arbitrary packet generation times and arrival times (including out-of-order packet arrivals); (ii) they hold for multi-server packet scheduling with the possibility of replicating a packet over multiple servers; and (iii) they hold for minimizing not only the time-average age and mean peak age, but also for minimizing the age stochastic process and any non-decreasing functional of the age stochastic process. If the packet generation time is equal to the packet arrival time, the LGFS policies reduce to the Last-Come, First-Serve (LCFS) policies. Hence, the age optimality results of LCFS-type policies are also established.

196 citations
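The two selection rules can be contrasted in a few lines (a toy sketch; the dict keys and function names are mine, and this is only the packet-selection step, not a full scheduler):

```python
def lgfs_next(queue):
    """LGFS: serve the packet with the latest generation time (the freshest)."""
    return max(queue, key=lambda p: p["gen_time"])

def fcfs_next(queue):
    """FCFS baseline: serve the packet that arrived first."""
    return min(queue, key=lambda p: p["arrival_time"])

# With both packets queued, the two policies pick differently:
queue = [
    {"id": "a", "gen_time": 5.0, "arrival_time": 8.0},
    {"id": "b", "gen_time": 7.0, "arrival_time": 9.0},  # fresher, arrived later
]
```

FCFS serves the staler packet "a" first, while LGFS serves "b", so the monitor's age drops sooner under LGFS.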


Journal ArticleDOI
TL;DR: The capacity of SPIR is shown to be 1-1/N, regardless of the number of messages, if the databases have access to common randomness that is independent of the messages in the amount of at least 1/(N-1) bits per desired message bit; otherwise, the capacity of SPIR is zero.
Abstract: Private information retrieval (PIR) is the problem of retrieving, as efficiently as possible, one out of $K$ messages from $N$ non-communicating replicated databases (each holds all $K$ messages) while keeping the identity of the desired message index a secret from each individual database. Symmetric PIR (SPIR) is a generalization of PIR to include the requirement that beyond the desired message, the user learns nothing about the other $K-1$ messages. The information theoretic capacity of SPIR (equivalently, the reciprocal of minimum download cost) is the maximum number of bits of desired information that can be privately retrieved per bit of downloaded information. We show that the capacity of SPIR is $1-1/N$ regardless of the number of messages $K$ , if the databases have access to common randomness (not available to the user) that is independent of the messages, in the amount that is at least $1/(N-1)$ bits per desired message bit. Otherwise, if the amount of common randomness is less than $1/(N-1)$ bits per message bit, then the capacity of SPIR is zero. Extensions to the capacity region of SPIR and the capacity of finite length SPIR are provided.

178 citations
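The capacity statement above is a simple threshold rule in the amount of common randomness; it can be written down directly (the function name is mine):

```python
def spir_capacity(N, rho):
    """Capacity of symmetric PIR with N databases, given rho bits of
    message-independent common randomness per desired message bit:
    1 - 1/N when rho >= 1/(N-1), and zero otherwise."""
    return 1.0 - 1.0 / N if rho >= 1.0 / (N - 1) else 0.0
```

For example, with N = 2 databases and one bit of common randomness per message bit, the capacity is 1/2, while with less than the 1/(N-1) threshold it collapses to zero no matter how many databases there are.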


Journal ArticleDOI
TL;DR: In this paper, the problem of single-round private information retrieval (PIR) from $N$ replicated databases was considered and the authors derived the information-theoretic capacity of this problem, which is the maximum number of correct symbols that can be retrieved privately (under the $T$ -privacy constraint) for every symbol of the downloaded data.
Abstract: We consider the problem of single-round private information retrieval (PIR) from $N$ replicated databases. We consider the case when $B$ databases are outdated (unsynchronized), or even worse, adversarial (Byzantine), and therefore, can return incorrect answers. In the PIR problem with Byzantine databases (BPIR), a user wishes to retrieve a specific message from a set of $M$ messages with zero-error, irrespective of the actions performed by the Byzantine databases. We consider the $T$ -privacy constraint in this paper, where any $T$ databases can collude, and exchange the queries submitted by the user. We derive the information-theoretic capacity of this problem, which is the maximum number of correct symbols that can be retrieved privately (under the $T$ -privacy constraint) for every symbol of the downloaded data. We determine the exact BPIR capacity to be $C=(N-2B)/N \cdot (1-T/(N-2B))/(1-(T/(N-2B))^{M})$ , if $2B+T < N$ . This capacity expression shows that the effect of Byzantine databases on the retrieval rate is equivalent to removing $2B$ databases from the system, with a penalty factor of $(N-2B)/N$ , which signifies that even though the number of databases needed for PIR is effectively $N-2B$ , the user still needs to access all $N$ databases. The result shows that for the unsynchronized PIR problem, if the user does not have any knowledge about the fraction of the messages that are mis-synchronized, the single-round capacity is the same as the BPIR capacity. Our achievable scheme extends the optimal achievable scheme for the robust PIR (RPIR) problem to correct the errors introduced by the Byzantine databases as opposed to erasures in the RPIR problem. Our converse proof uses the idea of the cut-set bound in the network coding problem against adversarial nodes.

170 citations
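The capacity expression above is easy to evaluate, and a useful consistency check is that with no Byzantine databases (B = 0) and T = 1 it reduces to the classic PIR capacity (1 - 1/N)/(1 - 1/N^M). A direct transcription (the function name is mine):

```python
def bpir_capacity(N, B, T, M):
    """BPIR capacity with N databases, B Byzantine, T colluding, M messages;
    valid when 2B + T < N."""
    assert 2 * B + T < N, "capacity expression requires 2B + T < N"
    ne = N - 2 * B  # effective number of databases after "removing" 2B
    return (ne / N) * (1 - T / ne) / (1 - (T / ne) ** M)
```

For N = 2, B = 0, T = 1, M = 2 this gives 2/3, matching the known replicated-PIR capacity in that setting.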


Journal ArticleDOI
TL;DR: This paper proposes the heterogeneous coded matrix multiplication (HCMM) algorithm for distributed matrix multiplication over heterogeneous clusters, which is provably asymptotically optimal for a broad class of processing time distributions, and develops a heuristic algorithm for HCMM load allocation in the distributed implementation of budget-limited computation tasks.
Abstract: In large-scale distributed computing clusters, such as Amazon EC2, there are several types of “system noise” that can result in major degradation of performance: system failures, bottlenecks due to limited communication bandwidth, latency due to straggler nodes, and so on. There have been recent results that demonstrate the impact of coding for efficient utilization of computation and storage redundancy to alleviate the effect of stragglers and communication bottlenecks in homogeneous clusters. In this paper, we focus on general heterogeneous distributed computing clusters consisting of a variety of computing machines with different capabilities. We propose a coding framework for speeding up distributed computing in heterogeneous clusters by trading redundancy for reduced computation latency. In particular, we propose the heterogeneous coded matrix multiplication (HCMM) algorithm for performing distributed matrix multiplication over heterogeneous clusters, which is provably asymptotically optimal for a broad class of processing time distributions. Moreover, we show that HCMM is unboundedly faster than any uncoded scheme that partitions the total workload among the workers. To demonstrate how the proposed HCMM scheme can be applied in practice, we provide results from numerical studies and Amazon EC2 experiments comparing HCMM with three benchmark load allocation schemes: uniform uncoded, load-balanced uncoded, and uniform coded. In particular, in our numerical studies, HCMM achieves speedups of up to 73%, 56%, and 42%, respectively, over the three benchmark schemes. Furthermore, we carry out experiments over Amazon EC2 clusters and demonstrate how HCMM can be combined with rateless codes with nearly linear decoding complexity. In particular, we show that HCMM combined with Luby transform codes can significantly reduce the overall execution time. HCMM is found to be up to 61%, 46%, and 36% faster than the aforementioned three benchmark schemes, respectively. Additionally, we provide a generalization to the problem of optimal load allocation in heterogeneous settings, where we take into account the monetary costs associated with distributed computing clusters. We argue that HCMM is asymptotically optimal for budget-constrained scenarios as well. In particular, we characterize the minimum possible expected cost associated with a computation task over a given cluster of machines. Furthermore, we develop a heuristic algorithm for HCMM load allocation for the distributed implementation of budget-limited computation tasks.

163 citations
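The straggler effect that motivates coded computing shows up even in a tiny homogeneous simulation (a sketch under a shifted-exponential runtime model, not HCMM's heterogeneous load allocation; all names and parameters are mine): an uncoded scheme must wait for all n workers, while an MDS-coded scheme that needs only any k of n finishers is faster on average despite giving each worker more work.

```python
import random

def finish_time(loads, need, rng):
    """Time until `need` workers finish; worker runtime is its load times a
    shifted-exponential (1 + Exp(1)) straggling factor."""
    times = sorted(l * (1.0 + rng.expovariate(1.0)) for l in loads)
    return times[need - 1]

rng = random.Random(0)
n, k, trials = 20, 15, 2000
# Uncoded: split the job n ways, wait for everyone (load 1/n each).
uncoded = sum(finish_time([1 / n] * n, n, rng) for _ in range(trials)) / trials
# MDS-coded: load 1/k each, wait for the fastest k of n.
coded = sum(finish_time([1 / k] * n, k, rng) for _ in range(trials)) / trials
```

Under this model the coded scheme's average completion time is strictly smaller, which is the redundancy-for-latency trade the abstract describes.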


Journal ArticleDOI
TL;DR: In a practically important case where the number of files (N) is large, the rate-memory tradeoff of the above caching system is exactly characterized for systems with no more than five users, and the tradeoff is characterized within a factor of 2 otherwise.
Abstract: We consider a basic caching system, where a single server with a database of $N$ files (e.g., movies) is connected to a set of $K$ users through a shared bottleneck link. Each user has a local cache memory with a size of $M$ files. The system operates in two phases: a placement phase, where each cache memory is populated up to its size from the database, and a following delivery phase, where each user requests a file from the database, and the server is responsible for delivering the requested contents. The objective is to design the two phases to minimize the load (peak or average) of the bottleneck link. We characterize the rate-memory tradeoff of the above caching system within a factor of 2.00884 for both the peak rate and the average rate (under uniform file popularity), improving on the state of the art, which was within factors of 4 and 4.7, respectively. Moreover, in a practically important case where the number of files ($N$) is large, we exactly characterize the tradeoff for systems with no more than five users and characterize the tradeoff within a factor of 2 otherwise. To establish these results, we develop two new converse bounds that improve over the state of the art.

141 citations
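For context, the achievable side of this tradeoff comes from the well-known Maddah-Ali–Niesen scheme, whose peak delivery rate at cache sizes M = tN/K (t an integer) is R = (K - t)/(1 + t). This is the classic baseline, not this paper's improved bounds; the function name is mine:

```python
def mn_peak_rate(K, t):
    """Peak rate of the classic Maddah-Ali-Niesen coded caching scheme at
    cache size M = t*N/K, for integer t in {0, 1, ..., K}."""
    assert 0 <= t <= K
    return (K - t) / (1 + t)
```

With no caches (t = 0) the server must unicast all K requests (rate K); with full caches (t = K) the rate is zero; in between, the multicast gain divides the uncached load K - t by 1 + t.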


Journal ArticleDOI
TL;DR: In this article, the authors considered the problem of private information retrieval from non-colluding and replicated databases, where the user is equipped with a cache that holds an uncoded fraction from each of the stored messages in the databases.
Abstract: We consider the problem of private information retrieval (PIR) from $N$ non-colluding and replicated databases when the user is equipped with a cache that holds an uncoded fraction $r$ from each of the $K$ stored messages in the databases. We assume that the databases are unaware of the cache content. We investigate $D^{*}(r)$ the optimal download cost normalized with the message size as a function of $K$ , $N$ , and $r$ . For a fixed $K$ and $N$ , we develop an inner bound (converse bound) for the $D^{*}(r)$ curve. The inner bound is a piece-wise linear function in $r$ that consists of $K$ line segments. For the achievability, we develop explicit schemes that exploit the cached bits as side information to achieve $K-1$ non-degenerate corner points. These corner points differ in the number of cached bits that are used to generate the one-side information equation. We obtain an outer bound (achievability) for any caching ratio by memory sharing between these corner points. Thus, the outer bound is also a piece-wise linear function in $r$ that consists of $K$ line segments. The inner and the outer bounds match in general for the cases of very low-caching ratio and very high-caching ratio. As a corollary, we fully characterize the optimal download cost caching ratio tradeoff for $K=3$ . For general $K$ , $N$ , and $r$ , we show that the largest gap between the achievability and the converse bounds is 1/6. Our results show that the download cost can be reduced beyond memory sharing if the databases are unaware of the cached content.

136 citations


Journal ArticleDOI
TL;DR: In this article, the authors introduce a new algorithm for realizing maximum likelihood decoding for arbitrary codebooks in discrete channels with or without memory, in which the receiver rank-orders noise sequences from most likely to least likely.
Abstract: We introduce a new algorithm for realizing maximum likelihood (ML) decoding for arbitrary codebooks in discrete channels with or without memory, in which the receiver rank-orders noise sequences from most likely to least likely. Subtracting noise from the received signal in that order, the first instance that results in a member of the codebook is the ML decoding. We name this algorithm GRAND for Guessing Random Additive Noise Decoding. We establish that GRAND is capacity-achieving when used with random codebooks. For rates below capacity, we identify error exponents, and for rates beyond capacity, we identify success exponents. We determine the scheme’s complexity in terms of the number of computations that the receiver performs. For rates beyond capacity, this reveals thresholds for the number of guesses by which, if a codebook member is identified, it is likely to be the transmitted codeword. We introduce an approximate ML decoding scheme where the receiver abandons the search after a fixed number of queries, an approach we dub GRANDAB, for GRAND with ABandonment. While not an ML decoder, we establish that the algorithm GRANDAB is also capacity-achieving for an appropriate choice of abandonment threshold, and characterize its complexity, error, and success exponents. Worked examples are presented for Markovian noise that indicate these decoding schemes substantially outperform the brute-force decoding approach.
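For a binary symmetric channel, "most likely to least likely" noise order is simply increasing Hamming weight, so GRAND is a few lines of code. Below is a toy version over a [7,4] Hamming codebook (my choice of example code; GRAND itself works with any codebook):

```python
from itertools import combinations, product

# [7,4] Hamming code as the example codebook.
G = [(1,0,0,0,0,1,1), (0,1,0,0,1,0,1), (0,0,1,0,1,1,0), (0,0,0,1,1,1,1)]

def encode(msg):
    """Codeword = msg * G over GF(2)."""
    return tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))

CODEBOOK = {encode(m) for m in product((0, 1), repeat=4)}

def grand(y):
    """Guess BSC noise patterns from most likely (lowest Hamming weight) to
    least likely; the first pattern that lands in the codebook is the ML
    decoding."""
    n = len(y)
    for w in range(n + 1):
        for flips in combinations(range(n), w):
            cand = tuple(b ^ (i in flips) for i, b in enumerate(y))
            if cand in CODEBOOK:
                return cand
```

Since the codebook has minimum distance 3, any single-bit error is corrected: the first weight-1 pattern that reaches a codeword necessarily recovers the transmitted word.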

Journal ArticleDOI
TL;DR: In this paper, the authors proposed three private information retrieval (PIR) protocols for distributed storage systems (DSSs) where data is stored using an arbitrary linear code and provided a necessary and sufficient condition for a code to achieve the maximum distance separable (MDS) PIR capacity.
Abstract: We propose three private information retrieval (PIR) protocols for distributed storage systems (DSSs), where data is stored using an arbitrary linear code. The first two protocols, named Protocol 1 and Protocol 2, achieve privacy for the scenario with noncolluding nodes. Protocol 1 requires a file size that is exponential in the number of files in the system, while Protocol 2 requires a file size that is independent of the number of files and is hence simpler. We prove that, for certain linear codes, Protocol 1 achieves the maximum distance separable (MDS) PIR capacity, i.e., the maximum PIR rate (the ratio of the amount of retrieved stored data per unit of downloaded data) for a DSS that uses an MDS code to store any given (finite and infinite) number of files, and Protocol 2 achieves the asymptotic MDS-PIR capacity (with an infinitely large number of files in the DSS). In particular, we provide a necessary and a sufficient condition for a code to achieve the MDS-PIR capacity with Protocols 1 and 2 and prove that cyclic codes, Reed–Muller (RM) codes, and a class of distance-optimal local reconstruction codes achieve both the finite MDS-PIR capacity (i.e., with any given number of files) and the asymptotic MDS-PIR capacity with Protocols 1 and 2, respectively. Furthermore, we present a third protocol, Protocol 3, for the scenario with multiple colluding nodes, which can be seen as an improvement of a protocol recently introduced by Freij-Hollanti et al. Similar to the noncolluding case, we provide a necessary and a sufficient condition to achieve the maximum possible PIR rate of Protocol 3. Moreover, we provide a particular class of codes that is suitable for this protocol and show that RM codes achieve the maximum possible PIR rate for the protocol. For all three protocols, we present an algorithm to optimize their PIR rates.

Journal ArticleDOI
TL;DR: Improved upper bounds of the Gaussian thermal loss channel capacity are provided, both in energy-constrained and unconstrained scenarios, and a hexagonal GKP code is reported as an optimal encoding in a practically relevant regime.
Abstract: Gaussian thermal loss channels are of particular importance to quantum communication theory since they model realistic optical communication channels. Except for special cases, the quantum capacity of Gaussian thermal loss channels is not yet quantified completely. In this paper, we provide improved upper bounds of the Gaussian thermal loss channel capacity, both in energy-constrained and unconstrained scenarios. We briefly review Gottesman-Kitaev-Preskill (GKP) codes and discuss their experimental implementation. We then prove, in the energy-unconstrained case, that a family of GKP codes achieves the quantum capacity of Gaussian thermal loss channels up to at most a constant gap from the improved upper bound. In the energy-constrained case, we formulate a biconvex encoding and decoding optimization problem to maximize entanglement fidelity. Then, we solve the biconvex optimization heuristically by an alternating semi-definite programming method and report that, starting from Haar random initial codes, our numerical optimization yields a hexagonal GKP code as an optimal encoding in a practically relevant regime.

Journal ArticleDOI
TL;DR: New achievability and converse bounds are derived, which are uniformly tighter than existing bounds, and lead to the tightest bounds on the second-order coding rate for discrete memoryless and Gaussian wiretap channels.
Abstract: This paper investigates the maximal secret communication rate over a wiretap channel subject to reliability and secrecy constraints at a given blocklength. New achievability and converse bounds are derived, which are uniformly tighter than existing bounds, and lead to the tightest bounds on the second-order coding rate for discrete memoryless and Gaussian wiretap channels. The exact second-order coding rate is established for semi-deterministic wiretap channels, which characterizes the optimal tradeoff between reliability and secrecy in the finite-blocklength regime. Underlying our achievability bounds are two new privacy amplification results, which not only refine the classic privacy amplification results, but also achieve secrecy under the stronger semantic-security metric.

Journal ArticleDOI
TL;DR: In this paper, the authors constructed a class of optimal locally repairable codes of distances 3 and 4 with unbounded length (i.e., length of the codes is independent of the code alphabet size).
Abstract: Like classical block codes, a locally repairable code also obeys the Singleton-type bound (we call a locally repairable code optimal if it achieves the Singleton-type bound). In the breakthrough work of Tamo and Barg, several classes of optimal locally repairable codes were constructed via subcodes of Reed–Solomon codes. Thus, the lengths of the codes given by Tamo and Barg are upper bounded by the code alphabet size $q$ . Recently, it was proved by Jin et al., through an extension of the Tamo–Barg construction, that the length of $q$ -ary optimal locally repairable codes can be $q+1$ . Surprisingly, Barg et al. presented a few examples of $q$ -ary optimal locally repairable codes of small distance and locality with code length achieving roughly $q^{2}$ . Very recently, it was further shown in the work of Li et al. that there exist $q$ -ary optimal locally repairable codes with length greater than $q+1$ and minimum distance proportional to the length $n$ . Thus, it becomes an interesting and challenging problem to construct new families of $q$ -ary optimal locally repairable codes of length greater than $q+1$ . In this paper, we construct a class of optimal locally repairable codes of distances 3 and 4 with unbounded length (i.e., the length of the codes is independent of the code alphabet size). Our technique is through cyclic codes with particular generator and parity-check polynomials that are carefully chosen.

Journal ArticleDOI
TL;DR: The key novelty in this work is that in the particular regime where the number of available processing nodes is greater than the total number of dot products, Short-Dot has lower expected computation time under straggling under an exponential model compared to existing strategies.
Abstract: We consider the problem of computing a matrix-vector product $Ax$ using a set of $P$ parallel or distributed processing nodes prone to “straggling,” i.e., unpredictable delays. Every processing node can access only a fraction $({s}/{N})$ of the $N$ -length vector $x$ , and all processing nodes compute an equal number of dot products. We propose a novel error-correcting code, which we call “Short-Dot,” that introduces redundant, shorter dot products such that only a subset of the nodes’ outputs are sufficient to compute $Ax$ . To address the problem of straggling in computing matrix-vector products, prior work uses replication or erasure coding to encode parts of the matrix $A$ , but the length of the dot products computed at each processing node is still $N$ . The key novelty in our work is that instead of computing the long dot products as required in the original matrix-vector product, we construct a larger number of redundant and short dot products that only require a fraction of $x$ to be accessed during the computation. Short-Dot is thus useful in a communication-constrained scenario as it allows for only a fraction of $x$ to be accessed by each processing node. Further, we show that in the particular regime where the number of available processing nodes is greater than the total number of dot products, Short-Dot has lower expected computation time under straggling under an exponential model compared to existing strategies, e.g., replication, in a scaling sense. We also derive fundamental limits on the trade-off between the length of the dot products and the recovery threshold, i.e., the required number of processing nodes, showing that Short-Dot is near-optimal.
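The erasure-coding baseline that Short-Dot builds on can be sketched in a few lines: encode the rows of A with a real-valued Vandermonde code so that any k of P workers' dot products suffice to recover Ax. This illustrates only the recovery-threshold idea, not Short-Dot's shortened dot products; all names and sizes are mine:

```python
import numpy as np

rng = np.random.default_rng(0)
k, P = 3, 5                                   # recover from any k of P workers
A = rng.standard_normal((k, 4))               # k rows of A to distribute
x = rng.standard_normal(4)

# Vandermonde encoding: any k of the P coded rows are linearly independent.
V = np.vander(1.0 + np.arange(P), k, increasing=True)
coded_rows = V @ A                            # worker i stores coded row i
outputs = coded_rows @ x                      # each worker: one dot product

alive = [0, 2, 4]                             # any k finishers suffice
Ax = np.linalg.solve(V[alive], outputs[alive])  # decode Ax from survivors
```

Short-Dot's contribution on top of this picture is making each coded row sparse, so every worker touches only a fraction of x, at the cost of a higher recovery threshold.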

Journal ArticleDOI
Gilad Gour
TL;DR: It is shown that the extended conditional min-entropy can be used to fully characterize when two bipartite quantum channels are related to each other via a superchannel (also known as supermap or a comb) that is acting on one of the subsystems.
Abstract: We extend the definition of the conditional min-entropy from bipartite quantum states to bipartite quantum channels. We show that many of the properties of the conditional min-entropy carry over to the extended version, including an operational interpretation as a guessing probability when one of the subsystems is classical. We then show that the extended conditional min-entropy can be used to fully characterize when two bipartite quantum channels are related to each other via a superchannel (also known as supermap or a comb) that is acting on one of the subsystems. This relation is a pre-order that extends the definition of “quantum majorization” from bipartite states to bipartite channels, and can also be characterized with semidefinite programming. As a special case, our characterization provides necessary and sufficient conditions for when a set of quantum channels is related to another set of channels via a single superchannel. We discuss the applications of our results to channel discrimination, and to resource theories of quantum processes. Along the way we study channel divergences, entropy functions of quantum channels, and noise models of superchannels, including random unitary superchannels and doubly-stochastic superchannels. For the latter we give a physical meaning as being completely-uniformity preserving.

Journal ArticleDOI
TL;DR: This paper studies the hull of generalized Reed–Solomon codes and extended generalized Reed-Solomon code over finite fields with respect to the Euclidean inner product and constructs several new infinite families of entanglement-assisted quantum error-correcting codes with flexible parameters.
Abstract: The hull of linear codes has promising utilization in coding theory and quantum coding theory. In this paper, we study the hull of generalized Reed–Solomon codes and extended generalized Reed–Solomon codes over finite fields with respect to the Euclidean inner product. Several infinite families of MDS codes with hulls of arbitrary dimensions are presented. As an application, using these MDS codes with hulls of arbitrary dimensions, we construct several new infinite families of entanglement-assisted quantum error-correcting codes with flexible parameters.

Journal ArticleDOI
TL;DR: It is shown that, with sufficient damping, the algorithm is guaranteed to converge, although the amount of damping grows with the peak-to-average ratio of the squared singular values of the transform A, which explains the good performance of AMP on i.i.d. Gaussian transforms A.
Abstract: Approximate message passing (AMP) methods and their variants have attracted considerable recent attention for the problem of estimating a random vector x observed through a linear transform A. In the case of large i.i.d. zero-mean Gaussian A, the methods exhibit fast convergence with precise analytic characterizations on the algorithm behavior. However, the convergence of AMP under general transforms A is not fully understood. In this paper, we provide sufficient conditions for the convergence of a damped version of the generalized AMP (GAMP) algorithm in the case of quadratic cost functions (i.e., Gaussian likelihood and prior). It is shown that, with sufficient damping, the algorithm is guaranteed to converge, although the amount of damping grows with the peak-to-average ratio of the squared singular values of the transform A. This result explains the good performance of AMP on i.i.d. Gaussian transforms A, but also its difficulties with ill-conditioned or non-zero-mean transforms A. A related sufficient condition is then derived for the local stability of the damped GAMP method under general cost functions, assuming certain strict convexity conditions.
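The quantity that governs the required damping is easy to compute directly (the helper name is mine): the peak-to-average ratio of the squared singular values, which is 1 for well-conditioned orthogonal-like matrices and grows with ill-conditioning.

```python
import numpy as np

def par_sq_singular(A):
    """Peak-to-average ratio of the squared singular values of A."""
    s2 = np.linalg.svd(A, compute_uv=False) ** 2
    return s2.max() / s2.mean()
```

An identity matrix gives a ratio of exactly 1 (no damping pressure), while a diagonal matrix with one dominant singular value already pushes the ratio well above 1, signalling a harder case for (G)AMP.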

Journal ArticleDOI
TL;DR: In this article, the authors investigated the potentials of applying the coded caching paradigm in wireless networks and investigated physical layer schemes for downlink transmission from a multiantenna transmitter to several cache-enabled users.
Abstract: We investigate the potential of applying the coded caching paradigm in wireless networks. To this end, we study physical layer schemes for downlink transmission from a multi-antenna transmitter to several cache-enabled users. As the baseline scheme, we consider employing coded caching on top of max–min fair multicasting, which is shown to be far from optimal at high SNR values. Our first proposed scheme, which is near-optimal in terms of DoF, is the natural extension of multiserver coded caching to Gaussian channels. As we demonstrate, its finite-SNR performance is not satisfactory, and thus we propose a new scheme in which the linear combination of messages is implemented in the finite field domain, and the one-shot precoding for the MISO downlink is implemented in the complex field. While this modification results in the same near-optimal DoF performance, we show that it leads to significant performance improvement at finite SNR. Finally, we extend our scheme to the previously considered cache-enabled interference channels, and moreover we provide an ergodic rate analysis of our scheme. Our results convey the important message that although directly translating schemes from network coding ideas to wireless networks may work well at high SNR values, careful modifications need to be considered for acceptable finite-SNR performance.

Journal ArticleDOI
TL;DR: The problem of private information retrieval (PIR) from coded storage systems with colluding, Byzantine, and unresponsive servers is considered, and an explicit scheme based on Reed–Solomon storage codes is designed and adapted to symmetric PIR.
Abstract: The problem of private information retrieval (PIR) from coded storage systems with colluding, Byzantine, and unresponsive servers is considered. An explicit scheme using an $[n,k]$ Reed–Solomon storage code is designed, protecting against $t$-collusion and handling up to $b$ Byzantine and $r$ unresponsive servers, when $n>k+t+2b+r-1$. This scheme achieves a PIR rate of $(n-r-(k+2b+t-1))/(n-r)$. In the case where the capacity is known, namely when $k=1$, it is asymptotically capacity achieving as the number of files grows. Finally, the scheme is adapted to symmetric PIR.
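The rate expression is easy to sanity-check numerically. The snippet below (function name is ours, not the paper's) evaluates it and confirms that with $k=1$ and $b=r=0$ it reduces to $(n-t)/n$, the asymptotic capacity of $t$-colluding replicated PIR.

```python
def pir_rate(n, k, t=1, b=0, r=0):
    """Achievable PIR rate of an [n, k] Reed-Solomon-coded scheme with
    t-collusion, b Byzantine, and r unresponsive servers."""
    if n <= k + t + 2 * b + r - 1:
        raise ValueError("requires n > k + t + 2b + r - 1")
    return (n - r - (k + 2 * b + t - 1)) / (n - r)

print(pir_rate(8, 2, t=1, b=1, r=1))  # (8-1-4)/(8-1) = 3/7
# With k = 1 and b = r = 0 this is (n-t)/n, the asymptotic capacity
# of t-colluding replicated PIR:
print(pir_rate(10, 1, t=2))           # 0.8
```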

Journal ArticleDOI
TL;DR: An explicit two-layer architecture with a sum-rank outer code is obtained, having disjoint local groups and achieving maximal recoverability (MR) for all families of local linear codes (MDS or not) simultaneously, up to a specified maximum locality.
Abstract: Locally repairable codes (LRCs) are considered with equal or unequal localities, local distances, and local field sizes. An explicit two-layer architecture with a sum-rank outer code is obtained, having disjoint local groups and achieving maximal recoverability (MR) for all families of local linear codes (MDS or not) simultaneously, up to a specified maximum locality $r$. Furthermore, the local linear codes (thus the localities, local distances, and local fields) can be efficiently and dynamically modified without global recoding or changes in architecture or outer code, while preserving the MR property, easily adapting to new configurations in storage or new hot and cold data. In addition, local groups and file components can be added, removed, or updated without global recoding. The construction requires global fields of size roughly $g^{r}$, for $g$ local groups and maximum or specified locality $r$. For equal localities, these global fields are smaller than those of previous MR-LRCs when $r \leq h$ (global parities). For unequal localities, they provide an exponential field size reduction on all previous best known MR-LRCs. For bounded localities and a large number of local groups, the global erasure-correction complexity of the given construction is comparable to that of Tamo–Barg codes or Reed–Solomon codes with local replication, while local repair is as efficient as for the Cartesian product of the local codes. Reed–Solomon codes with local replication and Cartesian products are recovered from the given construction when $r=1$ and $h=0$, respectively. The given construction can also be adapted to provide hierarchical MR-LRCs for all types of hierarchies and parameters. Finally, subextension subcodes and sum-rank alternant codes are introduced to obtain further exponential field size reductions, at the expense of lower information rates.

Journal ArticleDOI
TL;DR: The main conceptual contribution of this paper is to clarify how the choice of a covertness metric impacts the information-theoretic limits of covert communications.
Abstract: We study the first- and second-order asymptotics of covert communication over binary-input discrete memoryless channels for three different covertness metrics and under a maximum probability of error constraint. When covertness is measured in terms of the relative entropy between the channel output distributions induced with and without communication, we characterize the exact first- and second-order asymptotics of the number of bits that can be reliably transmitted with a maximum probability of error less than $\epsilon$ and a relative entropy less than $\delta$. When covertness is measured in terms of the variational distance between the channel output distributions or in terms of the probability of missed detection for a fixed probability of false alarm, we establish the exact first-order asymptotics and bound the second-order asymptotics. Pulse position modulation achieves the optimal first-order asymptotics for all three metrics, as well as the optimal second-order asymptotics for relative entropy. The main conceptual contribution of this paper is to clarify how the choice of covertness metric impacts the information-theoretic limits of covert communications. The main technical contribution underlying our results is a detailed expurgation argument showing the existence of a code satisfying the reliability and covertness criteria.

Journal ArticleDOI
TL;DR: The capacity of private computation, defined as the maximum number of bits of the desired function that can be retrieved per bit of total download from all servers, matches the capacity of PIR with the same number of servers and messages, and this insight is shown to hold even for arbitrary non-linear computations when the number of datasets is large.
Abstract: We introduce the problem of private computation, comprised of $N$ distributed and non-colluding servers, $K$ independent datasets, and a user who wants to compute a function of the datasets privately, i.e., without revealing which function he wants to compute to any individual server. This private computation problem is a strict generalization of the private information retrieval (PIR) problem, obtained by expanding the PIR message set (which consists of only independent messages) to also include functions of those messages. The capacity of private computation, $C$, is defined as the maximum number of bits of the desired function that can be retrieved per bit of total download from all servers. We characterize the capacity of private computation, for $N$ servers and $K$ independent datasets that are replicated at each server, when the functions to be computed are arbitrary linear combinations of the datasets. Surprisingly, the capacity, $C=\left(1+1/N+\cdots+1/N^{K-1}\right)^{-1}$, matches the capacity of PIR with $N$ servers and $K$ messages. Thus, allowing arbitrary linear computations does not reduce the communication rate compared to pure dataset retrieval. The same insight is shown to hold even for arbitrary non-linear computations when the number of datasets $K\rightarrow \infty$.
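The capacity formula is a truncated geometric series, so it is easy to evaluate and to check its limiting behavior. The snippet below (function name is ours, for illustration) computes $C$ and shows that it approaches $1-1/N$ as $K$ grows.

```python
def private_computation_capacity(N, K):
    """Capacity C = (1 + 1/N + ... + 1/N^(K-1))^(-1) of private
    computation with N servers and K datasets -- the same value as
    the PIR capacity with N servers and K messages."""
    return 1.0 / sum(N ** -j for j in range(K))

print(private_computation_capacity(2, 2))   # 1/(1 + 1/2) = 2/3
# As K grows, the capacity approaches 1 - 1/N:
print(private_computation_capacity(2, 50))  # ~0.5
```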

Journal ArticleDOI
TL;DR: In this article, the authors make use of the rich algebraic structure of elliptic curves to construct a family of $q$-ary optimal locally repairable codes of length up to $q+2\sqrt{q}$.
Abstract: Constructing locally repairable codes achieving the Singleton-type bound (we call them optimal codes in this paper) is a challenging task that has attracted great attention in the last few years. Tamo and Barg gave a breakthrough result in this topic by cleverly considering subcodes of Reed–Solomon codes; thus, the $q$-ary optimal locally repairable codes from subcodes of Reed–Solomon codes given by Tamo and Barg have length upper bounded by $q$. Recently, it was shown by Jin et al., through an extension of the Tamo–Barg construction, that the length of $q$-ary optimal locally repairable codes can be $q+1$. Surprisingly, it was shown by Barg et al. that, unlike classical MDS codes, $q$-ary optimal locally repairable codes can have length bigger than $q+1$. Thus, it becomes an interesting and challenging problem to construct $q$-ary optimal locally repairable codes of length bigger than $q+1$. In this paper, we make use of the rich algebraic structure of elliptic curves to construct a family of $q$-ary optimal locally repairable codes of length up to $q+2\sqrt{q}$. It turns out that the locality of our codes can be as big as 23 and the distance can be linear in the length.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a new capacity-achieving code for the private information retrieval (PIR) problem, and showed that it has the minimum message size and the minimum upload cost (being roughly linear in the number of messages).
Abstract: We propose a new capacity-achieving code for the private information retrieval (PIR) problem, and show that it has the minimum message size (being one less than the number of servers) and the minimum upload cost (being roughly linear in the number of messages) among a general class of capacity-achieving codes, and in particular, among all capacity-achieving linear codes. Different from existing code constructions, the proposed code is asymmetric, and this asymmetry appears to be the key factor leading to the optimal message size and the optimal upload cost. The converse results on the message size and the upload cost are obtained by an analysis of the information theoretic proof of the PIR capacity, from which a set of critical properties of any capacity-achieving code in the code class of interest is extracted. The symmetry structure of the PIR problem is then analyzed, which allows us to construct symmetric codes from asymmetric ones, yielding a meaningful bridge between the proposed code and existing ones in the literature.

Journal ArticleDOI
TL;DR: It is shown that the new design requires roughly 23% fewer tests than a Bernoulli design when paired with the simple decoding algorithms known as combinatorial orthogonal matching pursuit and definite defectives (DD).
Abstract: We consider nonadaptive group testing with $N$ items, of which $K = \Theta(N^\theta)$ are defective. We study a test design in which each item appears in nearly the same number of tests. For each item, we independently pick $L$ tests uniformly at random with replacement and place the item in those tests. We analyze the performance of these designs with simple and practical decoding algorithms in a range of sparsity regimes and show that the performance is consistently improved in comparison with standard Bernoulli designs. We show that our new design requires roughly 23% fewer tests than a Bernoulli design when paired with the simple decoding algorithms known as combinatorial orthogonal matching pursuit (COMP) and definite defectives (DD). This gives the best known nonadaptive group testing performance for $\theta > 0.43$ and the best proven performance with a practical decoding algorithm for all $\theta \in (0,1)$. We also give a converse result showing that the DD algorithm is optimal with respect to our randomized design when $\theta > 1/2$. We complement our theoretical results with simulations that show a notable improvement over Bernoulli designs in both sparse and dense regimes.
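The design and the simplest of the two decoders are straightforward to implement. The sketch below builds the near-constant column weight design as described (each item placed in $L$ tests drawn with replacement) and decodes with COMP, the rule that declares non-defective every item appearing in a negative test. The parameter values are illustrative, not the paper's.

```python
import random

def build_design(N, T, L, rng):
    """Place each of N items into L of T tests, chosen uniformly at random
    with replacement (duplicates collapse, so an item lands in at most L
    distinct tests -- the near-constant column weight design)."""
    tests = [set() for _ in range(T)]
    for item in range(N):
        for _ in range(L):
            tests[rng.randrange(T)].add(item)
    return tests

def comp_decode(tests, outcomes, N):
    """COMP: every item appearing in some negative test is declared
    non-defective; all remaining items are declared defective."""
    candidates = set(range(N))
    for members, positive in zip(tests, outcomes):
        if not positive:
            candidates -= members
    return candidates

rng = random.Random(1)
N, T, L = 1000, 300, 21            # L roughly (ln 2) * T / K placements
defectives = set(range(10))        # K = 10 planted defectives
tests = build_design(N, T, L, rng)
outcomes = [bool(members & defectives) for members in tests]
decoded = comp_decode(tests, outcomes, N)
# A defective item makes all of its tests positive, so COMP can never
# miss a defective; its errors are false positives only:
assert defectives <= decoded
```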

Journal ArticleDOI
TL;DR: The idea of cross subspace alignment, i.e., introducing a subspace dependence between Reed–Solomon code parameters, emerges as the optimal way to align undesired terms while keeping desired terms resolvable.
Abstract: $X$-secure and $T$-private information retrieval (XSTPIR) is a form of private information retrieval where data security is guaranteed against collusion among up to $X$ servers and the user’s privacy is guaranteed against collusion among up to $T$ servers. The capacity of XSTPIR is characterized for an arbitrary number of servers $N$ and arbitrary security and privacy thresholds $X$ and $T$, in the limit as the number of messages $K\rightarrow \infty$. The capacity is also characterized for any number of messages if either $N=3, X=T=1$ or $N\leq X+T$. Insights are drawn from these results about aligning versus decoding noise, the dependence of the PIR rate on field size, and robustness to symmetric security constraints. In particular, the idea of cross subspace alignment, i.e., introducing a subspace dependence between Reed–Solomon code parameters, emerges as the optimal way to align undesired terms while keeping desired terms resolvable.

Journal ArticleDOI
TL;DR: In this article, the problem of estimating a random variable $Y$ under a privacy constraint dictated by another correlated random variable $X$ is investigated, and the underlying privacy-utility tradeoff is expressed in terms of the privacy-constrained guessing probability.
Abstract: We investigate the problem of estimating a random variable $Y$ under a privacy constraint dictated by another correlated random variable $X$. When $X$ and $Y$ are discrete, we express the underlying privacy-utility tradeoff in terms of the privacy-constrained guessing probability ${\mathcal {h}}(P_{XY}, \varepsilon)$, the maximum probability $\mathsf {P}_{\mathsf {c}}(Y|Z)$ of correctly guessing $Y$ given an auxiliary random variable $Z$, where the maximization is taken over all $P_{Z|Y}$ ensuring that $\mathsf {P}_{\mathsf {c}}(X|Z)\leq \varepsilon$ for a given privacy threshold $\varepsilon \geq 0$. We prove that ${\mathcal {h}}(P_{XY}, \cdot)$ is concave and piecewise linear, which allows us to derive its expression in closed form for any $\varepsilon$ when $X$ and $Y$ are binary. In the non-binary case, we derive ${\mathcal {h}}(P_{XY}, \varepsilon)$ in the high-utility regime (i.e., for sufficiently large, but nontrivial, values of $\varepsilon$) under the assumption that $Y$ and $Z$ have the same alphabets. We also analyze the privacy-constrained guessing probability for two scenarios in which $X$, $Y$, and $Z$ are binary vectors. When $X$ and $Y$ are continuous random variables, we formulate the corresponding privacy-utility tradeoff in terms of ${\mathsf {sENSR}}(P_{XY}, \varepsilon)$, the smallest normalized minimum mean squared error (mmse) incurred in estimating $Y$ from a Gaussian perturbation $Z$. Here, the minimization is taken over a family of Gaussian perturbations $Z$ for which the mmse of $f(X)$ given $Z$ is within a factor $1-\varepsilon$ of the variance of $f(X)$ for any non-constant real-valued function $f$. We derive tight upper and lower bounds for ${\mathsf {sENSR}}$ when $Y$ is Gaussian. For general absolutely continuous random variables, we obtain a tight lower bound for ${\mathsf {sENSR}}(P_{XY}, \varepsilon)$ in the high-privacy regime, i.e., for small $\varepsilon$.

Journal ArticleDOI
TL;DR: In this article, a new characterization of binary linear complementary dual (LCD) cyclic codes in terms of their orthogonal or symplectic basis is presented, and a conjecture proposed by Galvez et al. on the minimum distance of binary LCD codes is solved.
Abstract: Linear complementary dual (LCD) cyclic codes were historically referred to as reversible cyclic codes and had applications in data storage. Due to a newly discovered application in cryptography, there has been renewed interest in LCD codes. In particular, it has been shown that binary LCD codes play an important role in implementations against side-channel attacks and fault injection attacks. In this paper, we first present a new characterization of binary LCD codes in terms of their orthogonal or symplectic basis. Using such a characterization, we solve a conjecture proposed by Galvez et al. on the minimum distance of binary LCD codes. Next, we consider the action of the orthogonal group on the set of all LCD codes, determine all possible orbits of this action, derive simple closed formulas for the sizes of the orbits, and present some asymptotic results on the sizes of the corresponding orbits. Our results show that almost all binary LCD codes are odd-like codes with odd-like duals, and about half of $q$-ary LCD codes have an orthonormal basis, where $q$ is a power of an odd prime.