Showing papers by "Thomas Clausen" published in 2021


Journal ArticleDOI
TL;DR: This paper proposes a unified, centralized-monitoring-free architecture that achieves both autoscaling and load-balancing, reducing operational overhead while improving response time performance, and that can achieve asymptotic zero-wait time with high (and controllable) probability.
Abstract: Cloud architectures achieve scaling through two main functions: (i) load-balancers, which dispatch queries among replicated virtualized application instances, and (ii) autoscalers, which automatically adjust the number of replicated instances to accommodate variations in load patterns. These functions are often provided through centralized load monitoring, incurring operational complexity. This article introduces a unified and centralized-monitoring-free architecture achieving both autoscaling and load-balancing, reducing operational overhead while improving response time performance. Application instances are virtually ordered in a chain, and new queries are forwarded along this chain until an instance, based on its local load, accepts the query. Autoscaling is triggered by the last application instance, which inspects its average load and infers whether its chain is under- or over-provisioned. An analytical model of the system is derived, proving that the proposed technique can achieve asymptotic zero-wait time with high (and controllable) probability. This result is confirmed by extensive simulations, which highlight close-to-ideal performance in terms of both response time and resource costs.

2 citations
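
The chain-based dispatching described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Instance` class, the `capacity` threshold, the rule that the last instance always accepts, and the `low`/`high` bounds in `scaling_hint` are all assumptions made for illustration, since the abstract does not specify the acceptance policy or the autoscaling thresholds.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    """One replicated application instance in the virtual chain."""
    capacity: int      # hypothetical local acceptance threshold
    load: int = 0      # queries currently being served

    def accepts(self) -> bool:
        # Decision uses local load only; no central monitor is consulted.
        return self.load < self.capacity


def dispatch(chain: List[Instance]) -> int:
    """Forward a query along the chain; return the index of the accepting instance."""
    for index, instance in enumerate(chain[:-1]):
        if instance.accepts():
            instance.load += 1
            return index
    # Assumption: the last instance cannot forward further, so it always accepts.
    chain[-1].load += 1
    return len(chain) - 1


def scaling_hint(last_average_load: float, low: float, high: float) -> int:
    """Return +1 (add an instance), -1 (remove one), or 0 (keep the chain as is)."""
    if last_average_load > high:
        return 1
    if last_average_load < low:
        return -1
    return 0


chain = [Instance(capacity=2) for _ in range(3)]
accepted_by = [dispatch(chain) for _ in range(6)]
print(accepted_by)
```

In this sketch, earlier instances fill up first and only the tail of the chain sees spare capacity, which is why the load of the last instance alone can hint at whether the chain is under- or over-provisioned.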


Proceedings ArticleDOI
13 Jan 2021
TL;DR: The Open Platform for Programmable Precise Packet Timestamping (OP4T) is a hardware architecture targeting Field-Programmable Gate Arrays (FPGAs), integrated into data-centre servers as a Smart Network Interface Card (SmartNIC), and flexible enough to enable advanced latency diagnosis.
Abstract: Because it is very bursty, the microsecond-scale temporal behaviour of network traffic in data-centres is challenging to measure and understand. To bring observability into data-centre networks, this paper introduces the Open Platform for Programmable Precise Packet Timestamping (OP4T), a hardware architecture targeting Field-Programmable Gate Arrays (FPGAs), integrated into data-centre servers as a Smart Network Interface Card (SmartNIC), and flexible enough to enable advanced latency diagnosis. In this paper, OP4T is specified, and an open-source implementation of that architecture is proposed, targeting the NetFPGA SUME prototyping board. By leveraging the P4 programming language and partial reconfiguration, that open-source implementation is experimentally shown to enable in-band, precise packet timestamping without sacrificing the achievable throughput. As an illustration, OP4T is shown to be usable to measure fine-grained properties of a software packet forwarder, e.g., packet batching.

2 citations


Posted Content
TL;DR: Charon is a stateless, load-aware load balancer implemented in P4-NetFPGA with line-rate performance; it passively collects load states from application servers and employs the power-of-2-choices scheme to make data-driven load balancing decisions and improve resource utilization.
Abstract: Load balancers play an important role in data centers, as they distribute network flows across application servers and guarantee per-connection consistency. It is hard, however, to make fair load balancing decisions so that all resources are efficiently occupied yet not overloaded. Tracking connection states makes it possible to infer server load states and make informed decisions, but at the cost of additional memory consumption. This makes such tracking hard to implement on programmable hardware, which has constrained memory but offers line-rate performance. This paper presents Charon, a stateless load-aware load balancer that has line-rate performance, implemented in P4-NetFPGA. Charon passively collects load states from application servers and employs the power-of-2-choices scheme to make data-driven load balancing decisions and improve resource utilization. Per-connection consistency is preserved statelessly by encoding the server ID in a covert channel. The prototype design and implementation details are described in this paper. Simulation results show performance gains in terms of load distribution fairness, quality of service, throughput and processing latency.
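
The power-of-2-choices scheme named in the abstract above can be sketched in a few lines of Python. This is a generic illustration of the technique, not Charon's P4-NetFPGA data plane: the `pick_server` function, the server names, and the load values are hypothetical, and the passive load collection and covert-channel encoding are not modelled.

```python
import random
from typing import Dict, Optional


def pick_server(loads: Dict[str, float], rng: Optional[random.Random] = None) -> str:
    """Sample two distinct servers uniformly and return the less loaded one."""
    rng = rng or random.Random()
    first, second = rng.sample(sorted(loads), 2)  # needs at least two servers
    return first if loads[first] <= loads[second] else second


# Hypothetical load readings, e.g. gathered passively from server replies.
observed_loads = {"server-a": 0.2, "server-b": 0.5, "server-c": 0.9}
choices = [pick_server(observed_loads, random.Random(seed)) for seed in range(100)]
print(sorted(set(choices)))
```

Because each decision compares only two sampled servers, the most loaded server in this sketch is never selected, while the balancer avoids scanning every server's state for each new flow.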

Posted Content
TL;DR: Aquarius is proposed to bridge the gap between machine learning (ML) and networking systems; its use is demonstrated in the context of network load balancers, covering both offline data analysis and online model deployment in realistic systems.
Abstract: Network load balancers are important components in data centers to provide scalable services. Workload distribution algorithms are based on heuristics, e.g., Equal-Cost Multi-Path (ECMP) and Weighted-Cost Multi-Path (WCMP), or on naive machine learning (ML) algorithms, e.g., ridge regression. Advanced ML-based approaches help achieve performance gains in different networking and system problems. However, it is challenging to apply ML algorithms to networking problems in real-life systems. It requires domain knowledge to collect features from low-latency, high-throughput, and scalable networking systems, which are dynamic and heterogeneous. This paper proposes Aquarius to bridge the gap between ML and networking systems and demonstrates its usage in the context of network load balancers. This paper demonstrates its ability to conduct both offline data analysis and online model deployment in realistic systems. The results show that the ML model trained and deployed using Aquarius improves load balancing performance, yet they also reveal further challenges to be resolved before ML can be applied to networking systems.

Posted Content
TL;DR: In this article, a distributed asynchronous reinforcement learning mechanism is proposed to improve the fairness of the workload distribution achieved by a load balancer in dynamic environments with limited monitoring of application server loads.
Abstract: Network load balancers are central components in data centers that distribute workloads across multiple servers and thereby contribute to offering scalable services. However, when load balancers operate in dynamic environments with limited monitoring of application server loads, they rely on heuristic algorithms that require manual configuration for fairness and performance. To alleviate that, this paper proposes a distributed asynchronous reinforcement learning mechanism to improve the fairness of the workload distribution achieved by a load balancer, with no active load balancer state monitoring and limited network observations. The performance of the proposed mechanism is evaluated and compared with state-of-the-art load balancing algorithms in a simulator, under configurations with progressively increasing complexity. Preliminary results show promise in RL-based load balancing algorithms, and identify additional challenges and future research directions, including reward function design and model scalability.