Showing papers by "Thomas Clausen" published in 2021

Open Access

Journal Article•DOI•

Joint Monitorless Load-Balancing and Autoscaling for Zero-Wait-Time in Data Centers

Yoann Desmouceaux¹, Marcel Enguehard, Thomas Clausen²•Institutions (2)

Cisco Systems, Inc.¹, École Polytechnique²

01 Mar 2021-IEEE Transactions on Network and Service Management

TL;DR: This paper proposes a unified, centralized-monitoring-free architecture that achieves both autoscaling and load-balancing, reducing operational overhead while improving response times, and that can achieve asymptotic zero-wait time with high (and controllable) probability.

Abstract: Cloud architectures achieve scaling through two main functions: (i) load-balancers, which dispatch queries among replicated virtualized application instances, and (ii) autoscalers, which automatically adjust the number of replicated instances to accommodate variations in load patterns. These functions are often provided through centralized load monitoring, incurring operational complexity. This article introduces a unified and centralized-monitoring-free architecture achieving both autoscaling and load-balancing, reducing operational overhead while improving response time performance. Application instances are virtually ordered in a chain, and new queries are forwarded along this chain until an instance, based on its local load, accepts the query. Autoscaling is triggered by the last application instance, which inspects its average load and infers whether its chain is under- or over-provisioned. An analytical model of the system is derived, and proves that the proposed technique can achieve asymptotic zero-wait time with high (and controllable) probability. This result is confirmed by extensive simulations, which highlight close-to-ideal performance in terms of both response time and resource costs.
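The chain mechanism described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Instance` class, the `capacity` acceptance threshold, the rule that the last instance always accepts, and the `low`/`high` thresholds in `scaling_hint` are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    capacity: int    # hypothetical local acceptance threshold
    active: int = 0  # queries currently being served

    def accepts(self) -> bool:
        # Decision uses local load only; no central monitor is consulted.
        return self.active < self.capacity


def dispatch(chain):
    """Forward one query along a non-empty chain.

    Returns the index of the instance that ends up serving the query.
    Assumption: the last instance accepts unconditionally.
    """
    for index, instance in enumerate(chain[:-1]):
        if instance.accepts():
            instance.active += 1
            return index
    chain[-1].active += 1
    return len(chain) - 1


def scaling_hint(last_avg_load, low=0.2, high=0.8):
    """Map the last instance's average load to a scaling decision.

    The thresholds are illustrative, not values from the paper.
    """
    if last_avg_load > high:
        return "scale-out"  # chain looks under-provisioned
    if last_avg_load < low:
        return "scale-in"   # chain looks over-provisioned
    return "hold"


chain = [Instance(capacity=1), Instance(capacity=1), Instance(capacity=2)]
served_by = [dispatch(chain) for _ in range(4)]
```

In this sketch the load seen by the last instance acts as an overflow signal: it only receives queries that every earlier instance declined, which is why its average load can hint at whether the chain has too few or too many instances.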

2 citations

Proceedings Article•DOI•

OP4T: Bringing Advanced Network Packet Timestamping into the Field

Mohammed Hawari¹, Thomas Clausen²•Institutions (2)

Cisco Systems, Inc.¹, École Polytechnique²

13 Jan 2021

TL;DR: The Open Platform for Programmable Precise Packet Timestamping (OP4T) is a hardware architecture targeting Field-Programmable Gate Arrays (FPGAs), integrated into data-centre servers as a Smart Network Interface Card (SmartNIC), and flexible enough to enable advanced latency diagnosis.

Abstract: Because it is very bursty, the microsecond-scale temporal behaviour of network traffic in data-centres is challenging to measure and understand. To bring observability into data-centre networks, this paper introduces the Open Platform for Programmable Precise Packet Timestamping (OP4T), a hardware architecture targeting Field-Programmable Gate Arrays (FPGAs), integrated into data-centre servers as a Smart Network Interface Card (SmartNIC), and flexible enough to enable advanced latency diagnosis. In this paper, OP4T is specified, and an open-source implementation of that architecture is proposed, targeting the NetFPGA SUME prototyping board. By leveraging the P4 programming language and partial reconfiguration, that open-source implementation is experimentally shown to enable in-band, precise packet timestamping without sacrificing the achievable throughput. As an illustration, OP4T is shown to be usable to measure fine-grained properties of a software packet forwarder, e.g., packet batching.

2 citations

Posted Content•

Charon: Load-Aware Load-Balancing in P4.

Carmine Rizzi¹, Zhiyuan Yao¹, Yoann Desmouceaux¹, Mark Townsley², Thomas Clausen²•Institutions (2)

École Polytechnique¹, Cisco Systems, Inc.²

27 Oct 2021-arXiv: Hardware Architecture

TL;DR: Charon is a stateless, load-aware load balancer with line-rate performance, implemented in P4-NetFPGA, which passively collects load states from application servers and employs the power-of-2-choices scheme to make data-driven load balancing decisions and improve resource utilization.

Abstract: Load balancers play an important role in data centers as they distribute network flows across application servers and guarantee per-connection consistency. It is hard, however, to make fair load balancing decisions so that all resources are efficiently occupied yet not overloaded. Tracking connection states makes it possible to infer server load states and make informed decisions, but at the cost of additional memory consumption. This makes it hard to implement on programmable hardware, which has constrained memory but offers line-rate performance. This paper presents Charon, a stateless load-aware load balancer that has line-rate performance, implemented in P4-NetFPGA. Charon passively collects load states from application servers and employs the power-of-2-choices scheme to make data-driven load balancing decisions and improve resource utilization. Per-connection consistency is preserved statelessly by encoding the server ID in a covert channel. The prototype design and implementation details are described in this paper. Simulation results show performance gains in terms of load distribution fairness, quality of service, throughput and processing latency.
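The power-of-2-choices scheme mentioned in this abstract can be sketched in a few lines of Python. This is a generic illustration of the scheme, not Charon's P4 implementation: the function name `pick_server` and the `loads` list (standing in for the passively collected load states) are assumptions.

```python
import random


def pick_server(loads, rng=random):
    """Power-of-2-choices: sample two distinct servers uniformly at
    random and return the index of the less-loaded one.

    `loads` is a hypothetical list of per-server load values.
    """
    first, second = rng.sample(range(len(loads)), 2)
    return first if loads[first] <= loads[second] else second


rng = random.Random(0)  # seeded only to make the example repeatable
loads = [1, 2, 9, 3]
picks = [pick_server(loads, rng) for _ in range(200)]
```

Because the more loaded of the two sampled servers is always rejected, the single most loaded server (index 2 here) is never selected, while each decision only needs two load values rather than a full view of the cluster.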

Posted Content•

Towards Intelligent Load Balancing in Data Centers

Zhiyuan Yao, Yoann Desmouceaux, Mark Townsley, Thomas Clausen

27 Oct 2021-arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: Aquarius is proposed to bridge the gap between machine learning and networking systems; its use is demonstrated in the context of network load balancers, where it supports both offline data analysis and online model deployment in realistic systems.

Abstract: Network load balancers are important components in data centers to provide scalable services. Workload distribution algorithms are based on heuristics, e.g., Equal-Cost Multi-Path (ECMP) and Weighted-Cost Multi-Path (WCMP), or on naive machine learning (ML) algorithms, e.g., ridge regression. Advanced ML-based approaches help achieve performance gains in different networking and system problems. However, it is challenging to apply ML algorithms to networking problems in real-life systems. It requires domain knowledge to collect features from low-latency, high-throughput, and scalable networking systems, which are dynamic and heterogeneous. This paper proposes Aquarius to bridge the gap between ML and networking systems and demonstrates its usage in the context of network load balancers. This paper demonstrates its ability to conduct both offline data analysis and online model deployment in realistic systems. The results show that the ML model trained and deployed using Aquarius improves load balancing performance, yet they also reveal more challenges to be resolved to apply ML to networking systems.
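For readers unfamiliar with the heuristics named in this abstract, the following Python sketch shows the general idea of hash-based ECMP and WCMP selection. It is a generic illustration and not part of Aquarius: the function names, the 5-tuple layout, and the use of SHA-256 as the flow hash are assumptions chosen for clarity (real devices typically use cheaper hardware hashes).

```python
import hashlib


def flow_hash(five_tuple):
    """Deterministic hash of a flow identifier (illustrative choice)."""
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def ecmp_pick(five_tuple, servers):
    """ECMP-style choice: every server is equally likely per flow."""
    return servers[flow_hash(five_tuple) % len(servers)]


def wcmp_pick(five_tuple, servers, weights):
    """WCMP-style choice: a server with integer weight w occupies w
    slots, so it receives a proportional share of flows."""
    slots = [s for s, w in zip(servers, weights) for _ in range(w)]
    return slots[flow_hash(five_tuple) % len(slots)]


# Hypothetical flow: (src IP, src port, dst IP, dst port, protocol)
flow = ("10.0.0.1", 40000, "10.0.0.100", 443, "tcp")
servers = ["s1", "s2", "s3"]
choice = ecmp_pick(flow, servers)
```

Both functions ignore server load entirely: the choice depends only on the flow identifier, which keeps all packets of a connection on the same server but cannot react to imbalance. This is the kind of limitation that motivates load-aware or learning-based approaches.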

Posted Content•

Reinforced Workload Distribution Fairness.

Zhiyuan Yao¹, Zihan Ding, Thomas Clausen¹•Institutions (1)

École Polytechnique¹

29 Oct 2021-arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: In this article, a distributed asynchronous reinforcement learning mechanism is proposed to improve the fairness of the workload distribution achieved by a load balancer in dynamic environments with limited monitoring of application server loads.

Abstract: Network load balancers are central components in data centers, which distribute workloads across multiple servers and thereby contribute to offering scalable services. However, when load balancers operate in dynamic environments with limited monitoring of application server loads, they rely on heuristic algorithms that require manual configuration for fairness and performance. To alleviate that, this paper proposes a distributed asynchronous reinforcement learning mechanism that improves the fairness of the workload distribution achieved by a load balancer, with no active load balancer state monitoring and with limited network observations. The performance of the proposed mechanism is evaluated and compared with state-of-the-art load balancing algorithms in a simulator, under configurations with progressively increasing complexity. Preliminary results show promise in RL-based load balancing algorithms, and identify additional challenges and future research directions, including reward function design and model scalability.
