Data streaming algorithms for efficient and accurate estimation of flow size distribution
Abhishek Kumar, Minho Sung, Jun Xu, Jia Wang +3 more
- Vol. 32, Iss: 1, pp 177-188
TLDR
The paper presents a novel data streaming algorithm that provides much more accurate estimates of the flow size distribution, using a "lossy data structure" consisting of an array of counters that fits well into SRAM. The algorithm dramatically improves the accuracy of flow distribution measurement and also contributes to the field of data streaming.
Abstract:
Knowing the distribution of the sizes of traffic flows passing through a network link helps a network operator to characterize network resource usage, infer traffic demands, detect traffic anomalies, and accommodate new traffic demands through better traffic engineering. Previous work on estimating the flow size distribution has been focused on making inferences from sampled network traffic. Its accuracy is limited by the (typically) low sampling rate required to make the sampling operation affordable. In this paper we present a novel data streaming algorithm to provide much more accurate estimates of flow distribution, using a "lossy data structure" which consists of an array of counters fitted well into SRAM. For each incoming packet, our algorithm only needs to increment one underlying counter, making the algorithm fast enough even for 40 Gbps (OC-768) links. The data structure is lossy in the sense that sizes of multiple flows may collide into the same counter. Our algorithm uses Bayesian statistical methods such as Expectation Maximization to infer the most likely flow size distribution that results in the observed counter values after collision. Evaluations of this algorithm on large Internet traces obtained from several sources (including a tier-1 ISP) demonstrate that it has very high measurement accuracy (within 2%). Our algorithm not only dramatically improves the accuracy of flow distribution measurement, but also contributes to the field of data streaming by formalizing an existing methodology and applying it to the context of estimating the flow-distribution.
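The per-packet update described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `CounterArray`, the BLAKE2b-based hash, and the string flow identifiers are all assumptions made for illustration. The sketch covers only the lossy counter array (one increment per packet, with flows possibly colliding into the same counter); the Expectation Maximization step that infers the flow size distribution from the counter values is not shown.

```python
import hashlib
from collections import Counter


class CounterArray:
    """Lossy counter array: each packet increments exactly one counter."""

    def __init__(self, num_counters):
        self.counters = [0] * num_counters

    def _index(self, flow_id):
        # Hypothetical hash choice (an assumption, not from the paper):
        # a stable digest so the same flow always maps to the same counter.
        digest = hashlib.blake2b(flow_id.encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big") % len(self.counters)

    def update(self, flow_id):
        # One increment per packet; flows that hash to the same index
        # are merged, which is what makes the structure lossy.
        self.counters[self._index(flow_id)] += 1

    def value_histogram(self):
        # Maps counter value -> number of non-zero counters holding it.
        # This is the observation an inference step would start from.
        return Counter(v for v in self.counters if v > 0)


if __name__ == "__main__":
    # Three hypothetical flows of sizes 3, 2 and 1 packets.
    packets = ["flow-a"] * 3 + ["flow-b"] * 2 + ["flow-c"]

    # With a single counter, every flow collides into it.
    tiny = CounterArray(num_counters=1)
    for flow_id in packets:
        tiny.update(flow_id)
    print(tiny.counters)  # [6]

    # With many counters, collisions become unlikely for so few flows.
    large = CounterArray(num_counters=1024)
    for flow_id in packets:
        large.update(flow_id)
    print(dict(large.value_histogram()))
```

The single-counter case shows why inference is needed: a counter value of 6 could come from one flow of 6 packets or from several smaller flows that collided, and the counter array alone cannot tell these apart.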
Citations
Journal Article
A Survey on Software-Defined Networking
TL;DR: A generally accepted definition of SDN is presented, including decoupling the control plane from the data plane and providing programmability for network application development, and its three-layer architecture is described, comprising an infrastructure layer, a control layer, and an application layer.
Proceedings Article
Software defined traffic measurement with OpenSketch
Minlan Yu, Lavanya Jose, Rui Miao +2 more
TL;DR: This work proposes a software defined traffic measurement architecture OpenSketch, which separates the measurement data plane from the control plane and provides a measurement library that automatically configures the pipeline and allocates resources for different measurement tasks.
Journal Article
A roadmap for traffic engineering in SDN-OpenFlow networks
TL;DR: This paper surveys the state-of-the-art in traffic engineering for SDNs, and mainly focuses on four thrusts including flow management, fault tolerance, topology update, and traffic analysis/characterization.
Proceedings Article
One Sketch to Rule Them All: Rethinking Network Flow Monitoring with UnivMon
TL;DR: UnivMon is presented, a framework for flow monitoring which leverages recent theoretical advances and demonstrates that it is possible to achieve both generality and high accuracy, and evaluated using a range of trace-driven evaluations to show that it offers comparable (and sometimes better) accuracy relative to custom sketching solutions.
Proceedings Article
Elastic sketch: adaptive and fast network-wide measurements
Tong Yang, Jie Jiang, Peng Liu, Qun Huang, Junzhi Gong, Yang Zhou, Rui Miao, Xiaoming Li, Steve Uhlig +8 more
TL;DR: The Elastic sketch is proposed, which is adaptive to current traffic characteristics, generic to measurement tasks and platforms, and implemented on six platforms to process typical measurement tasks.