[Chart: papers published on a yearly basis]
Papers
16 Apr 2018
TL;DR: This paper proposes an experience-driven approach that learns to control a communication network from its own experience rather than from an accurate mathematical model. Two new techniques, TE-aware exploration and actor-critic-based prioritized experience replay, are proposed to optimize the general DRL framework specifically for TE.
Abstract: Modern communication networks have become very complicated and highly dynamic, which makes them hard to model, predict, and control. In this paper, we develop a novel experience-driven approach that can learn to control a communication network well from its own experience rather than from an accurate mathematical model, just as a human learns a new skill (such as driving or swimming). Specifically, we, for the first time, propose to leverage emerging Deep Reinforcement Learning (DRL) for enabling model-free control in communication networks, and present a novel and highly effective DRL-based control framework, DRL-TE, for a fundamental networking problem: Traffic Engineering (TE). The proposed framework maximizes a widely used utility function by jointly learning the network environment and its dynamics, and making decisions under the guidance of powerful Deep Neural Networks (DNNs). We propose two new techniques, TE-aware exploration and actor-critic-based prioritized experience replay, to optimize the general DRL framework specifically for TE. To validate and evaluate the proposed framework, we implemented it in ns-3 and tested it comprehensively with both representative and randomly generated network topologies. Extensive packet-level simulation results show that 1) compared to several widely used baseline methods, DRL-TE significantly reduces end-to-end delay and consistently improves the network utility, while offering better or comparable throughput; 2) DRL-TE is robust to network changes; and 3) DRL-TE consistently outperforms a state-of-the-art DRL method for continuous control, Deep Deterministic Policy Gradient (DDPG), which does not offer satisfactory performance.
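The actor-critic-based prioritized replay above is this paper's own variant; the underlying proportional-prioritization idea (sample stored transitions with probability proportional to priority^α) can be sketched generically as follows. All names here are hypothetical, not from the paper:

```python
import random

class PrioritizedReplayBuffer:
    """Toy proportional prioritized experience replay.

    Transitions are sampled with probability p_i^alpha / sum_j p_j^alpha,
    where p_i is a priority (e.g. the magnitude of the TD error).
    """

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha
        self.buffer = []      # stored transitions
        self.priorities = []  # one priority per transition

    def add(self, transition, priority=1.0):
        if len(self.buffer) >= self.capacity:  # evict the oldest entry
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        # Sampling with replacement, weighted by priority^alpha.
        weights = [p ** self.alpha for p in self.priorities]
        return random.choices(self.buffer, weights=weights, k=batch_size)
```

A production buffer would use a sum-tree for O(log n) sampling and importance-sampling weights to correct the induced bias; this list-based version only shows the sampling rule.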
260 citations
01 Dec 1998
TL;DR: It is shown that both Q-learning and the indirect approach enjoy rather rapid convergence to the optimal policy as a function of the number of state transitions observed, and that the amount of memory required by the model-based approach is closer to N than to N².
Abstract: In this paper, we address two issues of long-standing interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning after only a finite number of actions? Second, what quantitative comparisons can be made between Q-learning and model-based (indirect) approaches, which use experience to estimate next-state distributions for off-line value iteration?
We first show that both Q-learning and the indirect approach enjoy rather rapid convergence to the optimal policy as a function of the number of state transitions observed. In particular, on the order of only (N log(1/ε)/ε²)(log N + log log(1/ε)) transitions are sufficient for both algorithms to come within ε of the optimal policy, in an idealized model that assumes the observed transitions are "well-mixed" throughout an N-state MDP. Thus, the two approaches have roughly the same sample complexity. Perhaps surprisingly, this sample complexity is far less than what is required for the model-based approach to actually construct a good approximation to the next-state distribution. The result also shows that the amount of memory required by the model-based approach is closer to N than to N².
For either approach, to remove the assumption that the observed transitions are well-mixed, we consider a model in which the transitions are determined by a fixed, arbitrary exploration policy. Bounds on the number of transitions required in order to achieve a desired level of performance are then related to the stationary distribution and mixing time of this policy.
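For context, the tabular Q-learning rule analyzed above updates one table entry per observed transition; a minimal sketch (hypothetical helper, states and actions as list indices):

```python
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: nudge Q[s][a] toward the
    bootstrapped target r + gamma * max_a' Q[s_next][a']."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q
```

The model-based (indirect) alternative would instead accumulate counts of observed (s, a, s') transitions to estimate next-state distributions and run value iteration offline, which is where the N-versus-N² memory comparison in the abstract comes from.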
260 citations
TL;DR: A geometry-based model is proposed that includes the propagation effects that are critical for MIMO performance: i) single scattering around the base station (BS) and mobile station (MS), ii) scattering by far clusters, iii) double scattering, iv) waveguiding, and v) diffraction by roof edges.
Abstract: This paper derives a generic model for the multiple-input multiple-output (MIMO) wireless channel. The model incorporates important effects, including i) interdependency of directions-of-arrival and directions-of-departure, ii) large delay and angle dispersion by propagation via far clusters, and iii) rank reduction of the transfer function matrix. We propose a geometry-based model that includes the propagation effects that are critical for MIMO performance: i) single scattering around the BS and MS, ii) scattering by far clusters, iii) double-scattering, iv) waveguiding, and v) diffraction by roof edges. The required parameters for the complete definition of the model are enumerated, and typical parameter values in macro and microcellular environments are discussed.
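As a rough illustration of the single-scattering component i) above, here is a toy narrowband ray model. The function name and geometry are hypothetical; antenna patterns, polarization, far clusters, and double scattering, which the paper's model includes, are all ignored:

```python
import cmath
import math

def single_scatter_channel(tx, rx, scatterers, wavelength):
    """Toy narrowband single-bounce MIMO channel matrix H (n_rx x n_tx).

    Each entry sums one contribution per scatterer, with the phase set
    by the Tx -> scatterer -> Rx path length and a 1/d spreading loss.
    Antenna positions and scatterers are 2-D points in meters.
    """
    k = 2 * math.pi / wavelength  # wavenumber
    H = [[0j] * len(tx) for _ in rx]
    for i, r in enumerate(rx):
        for j, t in enumerate(tx):
            for s in scatterers:
                d = math.dist(t, s) + math.dist(s, r)  # bounce path length
                H[i][j] += cmath.exp(-1j * k * d) / d
    return H
```

With only a single scatterer, every column of H is a scaled copy of the others, which is exactly the rank-reduction effect iii) in the abstract; adding scatterers restores rank.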
260 citations
TL;DR: In this paper, the authors analyzed intrachannel nonlinear effects in high-bit-rate transmission systems based on short optical pulses that are dispersion compensated and showed that the magnitude of nonlinear impairments reduces monotonically with the reduction of pulse width and with the increase of the dispersion coefficient.
Abstract: We analyze intrachannel nonlinear effects in high-bit-rate transmission systems based on short optical pulses that are dispersion compensated. We perform an analytical study of a generic example with two pulses, in which case the nonlinearity shifts the pulses in time and results in the generation of leading and trailing pulse echoes. We show that in all the relevant range of parameters, the magnitude of the nonlinear impairments reduces monotonically with the reduction of pulse width and with the increase of the dispersion coefficient.
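The standard propagation model behind this kind of analysis is the lossless nonlinear Schrödinger equation, here in one common convention (A the pulse envelope, β₂ the group-velocity dispersion, γ the nonlinear coefficient; loss and higher-order dispersion omitted):

```latex
\frac{\partial A}{\partial z}
  = -\,\frac{i\beta_2}{2}\,\frac{\partial^2 A}{\partial T^2}
    + i\gamma \lvert A \rvert^2 A
```

The pulse-width and dispersion dependence described in the abstract arises from the interplay of the dispersive and nonlinear terms on the right-hand side.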
259 citations
28 Jun 2011
TL;DR: ProxiMate, a system that allows wireless devices in proximity to securely pair with one another autonomously by generating a common cryptographic key directly from their shared time-varying wireless environment, is presented.
Abstract: Forming secure associations between wireless devices that do not share a prior trust relationship is an important problem. This paper presents ProxiMate, a system that allows wireless devices in proximity to securely pair with one another autonomously by generating a common cryptographic key directly from their shared time-varying wireless environment. The shared key synthesized by ProxiMate can be used by the devices to authenticate each other's physical proximity and then to communicate confidentially. Unlike traditional pairing approaches such as Diffie-Hellman, ProxiMate is secure against a computationally unbounded adversary, and its computational complexity is linear in the size of the key. We evaluate ProxiMate using an experimental prototype built on an open-source software-defined radio platform and demonstrate its effectiveness in generating common secret bits. We further show that it is possible to speed up secret key synthesis by monitoring multiple RF sources simultaneously or by shaking together the devices that need to be paired. Finally, we show that ProxiMate is resistant to even the most powerful attacker, who controls the public RF source used by the legitimate devices for pairing.
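ProxiMate derives bits from the RF waveform itself; the general quantize-the-shared-signal idea behind such schemes can be illustrated with a toy median-threshold quantizer. This is a hypothetical sketch, not the paper's scheme; real systems add guard bands and information reconciliation to correct mismatched bits:

```python
import statistics

def rssi_to_bits(samples):
    """Quantize channel observations into key bits:
    1 above the device's own median, 0 at or below it.
    Two devices seeing correlated samples derive mostly matching bits."""
    med = statistics.median(samples)
    return [1 if s > med else 0 for s in samples]
```

Because each device thresholds against its own median, no threshold value ever crosses the air, which is what lets correlated-but-private observations become a shared secret.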
259 citations
Authors
Showing all 1881 results
Name | H-index | Papers | Citations
---|---|---|---
Yoshua Bengio | 202 | 1033 | 420313 |
Scott Shenker | 150 | 454 | 118017 |
Paul Shala Henry | 137 | 318 | 35971 |
Peter Stone | 130 | 1229 | 79713 |
Yann LeCun | 121 | 369 | 171211 |
Louis E. Brus | 113 | 347 | 63052 |
Jennifer Rexford | 102 | 394 | 45277 |
Andreas F. Molisch | 96 | 777 | 47530 |
Vern Paxson | 93 | 267 | 48382 |
Lorrie Faith Cranor | 92 | 326 | 28728 |
Ward Whitt | 89 | 424 | 29938 |
Lawrence R. Rabiner | 88 | 378 | 70445 |
Thomas E. Graedel | 86 | 348 | 27860 |
William W. Cohen | 85 | 384 | 31495 |
Michael K. Reiter | 84 | 380 | 30267 |