Author

Aravind Menon

Bio: Aravind Menon is an academic researcher from École Polytechnique Fédérale de Lausanne. The author has contributed to research on topics including virtual machines and virtualization, has an h-index of 6, and has co-authored 6 publications that have received 1,230 citations.

Papers
Proceedings ArticleDOI
11 Jun 2005
TL;DR: Xenoprof is presented, a system-wide statistical profiling toolkit implemented for the Xen virtual machine environment; it will facilitate a better understanding of the performance characteristics of Xen's mechanisms, allowing the community to optimize the Xen implementation.
Abstract: Virtual Machine (VM) environments (e.g., VMware and Xen) are experiencing a resurgence of interest for diverse uses including server consolidation and shared hosting. An application's performance in a virtual machine environment can differ markedly from its performance in a non-virtualized environment because of interactions with the underlying virtual machine monitor and other virtual machines. However, few tools are currently available to help debug performance problems in virtual machine environments. In this paper, we present Xenoprof, a system-wide statistical profiling toolkit implemented for the Xen virtual machine environment. The toolkit enables coordinated profiling of multiple VMs in a system to obtain the distribution of hardware events such as clock cycles and cache and TLB misses. The toolkit will facilitate a better understanding of the performance characteristics of Xen's mechanisms, allowing the community to optimize the Xen implementation. We use our toolkit to analyze performance overheads incurred by networking applications running in Xen VMs. We focus on networking applications since virtualizing network I/O devices is relatively expensive. Our experimental results quantify Xen's performance overheads for network I/O device virtualization in uni- and multi-processor systems. With certain Xen configurations, networking workloads in the Xen environment can suffer significant performance degradation. Our results identify the main sources of this overhead, which should be the focus of Xen optimization efforts. We also show how our profiling toolkit was used to uncover and resolve performance bugs that we encountered in our experiments and that caused unexpected application behavior.
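
To make the sampling idea concrete: each hardware-counter overflow yields a sample attributed to the domain that was executing at that instant, and the samples are then aggregated into per-domain event distributions. Below is a minimal Python sketch of just this attribution-and-aggregation step; the domain names, event names, and sample format are hypothetical stand-ins, not Xenoprof's actual interface.

```python
# Toy illustration of system-wide statistical profiling in the spirit of
# Xenoprof: each counter-overflow sample is attributed to the domain (VM)
# that was running at that instant, then aggregated into per-domain
# distributions. Domain/event names and the sample format are hypothetical.
from collections import Counter, defaultdict

def aggregate(samples):
    """samples: iterable of (domain, event, pc) tuples, one per overflow."""
    per_domain = defaultdict(Counter)
    for domain, event, pc in samples:
        per_domain[domain][event] += 1
    return per_domain

samples = [
    ("dom0",   "CPU_CLK_UNHALTED", 0xc0100000),
    ("domU-1", "CPU_CLK_UNHALTED", 0xc0200000),
    ("domU-1", "DTLB_MISS",        0xc0200040),
    ("dom0",   "LLC_MISS",         0xc0100080),
]

for dom, events in aggregate(samples).items():
    total = sum(events.values())
    for event, n in events.items():
        print(f"{dom}: {event} {n}/{total} samples")
```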

571 citations

Proceedings Article
30 May 2006
TL;DR: The overall impact of these optimizations is an improvement in transmit performance of guest domains by a factor of 4.4, and support for guest operating systems to effectively utilize advanced virtual memory features such as superpages and global page mappings.
Abstract: In this paper, we propose and evaluate three techniques for optimizing network performance in the Xen virtualized environment. Our techniques retain the basic Xen architecture of locating device drivers in a privileged 'driver' domain with access to I/O devices, and providing network access to unprivileged 'guest' domains through virtualized network interfaces. First, we redefine the virtual network interfaces of guest domains to incorporate high-level network offload features available in most modern network cards. We demonstrate the performance benefits of high-level offload functionality in the virtual interface, even when such functionality is not supported in the underlying physical interface. Second, we optimize the implementation of the data transfer path between guest and driver domains. The optimization avoids expensive data remapping operations on the transmit path, and replaces page remapping by data copying on the receive path. Finally, we provide support for guest operating systems to effectively utilize advanced virtual memory features such as superpages and global page mappings. The overall impact of these optimizations is an improvement in transmit performance of guest domains by a factor of 4.4. The receive performance of the driver domain is improved by 35% and reaches within 7% of native Linux performance. The receive performance in guest domains improves by 18%, but still trails native Linux performance by 61%. We analyse the performance improvements in detail, and quantify the contribution of each optimization to the overall performance.
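
The first optimization is easiest to see with a toy example: if the guest's virtual interface accepts a single large TSO-style buffer, the expensive per-packet guest/driver crossing is paid once per buffer, and segmentation into MSS-sized frames can happen later in software when the physical NIC lacks offload support. The Python sketch below illustrates that arithmetic; it is an illustration of the idea, not the paper's implementation.

```python
# Minimal sketch (not the paper's code) of why a high-level offload
# interface helps even without hardware support: the guest hands the
# driver domain one large TSO-style buffer, and segmentation into
# MSS-sized frames happens only at the physical interface, so the
# costly per-packet guest/driver crossing is paid once per large buffer.
MSS = 1460  # typical Ethernet TCP payload size

def segment(payload: bytes, mss: int = MSS):
    """Emulate TCP segmentation offload in software."""
    return [payload[off:off + mss] for off in range(0, len(payload), mss)]

large_send = bytes(64 * 1024)          # one 64 KB send from the guest
frames = segment(large_send)
print(f"1 guest/driver crossing instead of {len(frames)}")  # 1 vs 45
```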

353 citations

Proceedings ArticleDOI
10 Feb 2007
TL;DR: Through the use of CDNA, many of the bottlenecks imposed by software multiplexing can be eliminated without sacrificing protection, producing substantial efficiency improvements.
Abstract: This paper presents hardware and software mechanisms to enable concurrent direct network access (CDNA) by operating systems running within a virtual machine monitor. In a conventional virtual machine monitor, each operating system running within a virtual machine must access the network through a software-virtualized network interface. These virtual network interfaces are multiplexed in software onto a physical network interface, incurring significant performance overheads. The CDNA architecture improves networking efficiency and performance by dividing the tasks of traffic multiplexing, interrupt delivery, and memory protection between hardware and software in a novel way. The virtual machine monitor delivers interrupts and provides protection between virtual machines, while the network interface performs multiplexing of the network data. In effect, the CDNA architecture provides the abstraction that each virtual machine is connected directly to its own network interface. Through the use of CDNA, many of the bottlenecks imposed by software multiplexing can be eliminated without sacrificing protection, producing substantial efficiency improvements.
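
A toy model can make the division of labor concrete: the NIC keeps a hardware context (e.g., a queue pair) per VM and multiplexes traffic itself, while the hypervisor's role shrinks to installing DMA memory protection and routing interrupts. The Python sketch below uses hypothetical classes to illustrate this split; it is not the CDNA hardware interface.

```python
# Toy model (hypothetical classes, not the CDNA hardware interface) of
# the division of labor CDNA describes: the NIC holds one context per VM
# and multiplexes traffic itself, while the hypervisor only sets up DMA
# memory protection and routes interrupts.
class NicContext:
    def __init__(self, vm, allowed_pages):
        self.vm = vm
        self.allowed = allowed_pages  # pages the hypervisor validated
        self.tx_queue = []

    def post_tx(self, page, data):
        # This check stands in for hypervisor-installed DMA memory
        # protection; the multiplexing itself happens on the NIC.
        if page not in self.allowed:
            raise PermissionError(f"{self.vm}: page {page:#x} not mapped for DMA")
        self.tx_queue.append(data)

ctx = NicContext("vm1", allowed_pages={0x1000, 0x2000})
ctx.post_tx(0x1000, b"frame")       # direct access, no software mux
# ctx.post_tx(0x9000, b"frame")     # would raise PermissionError
```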

177 citations

Proceedings Article
22 Jun 2008
TL;DR: Two optimizations, receive aggregation and acknowledgment offload, are presented that improve receive-side TCP performance by reducing the number of packets that need to be processed by the TCP/IP stack.
Abstract: The performance of receive side TCP processing has traditionally been dominated by the cost of the 'per-byte' operations, such as data copying and checksumming. We show that architectural trends in modern processors, in particular aggressive prefetching, have resulted in a fundamental shift in the relative overheads of per-byte and per-packet operations in TCP receive processing, making per-packet operations the dominant source of overhead. Motivated by this architectural trend, we present two optimizations, receive aggregation and acknowledgment offload, that improve the receive side TCP performance by reducing the number of packets that need to be processed by the TCP/IP stack. Our optimizations are similar in spirit to the use of TCP Segmentation Offload (TSO) for improving transmit side performance, but without the need for hardware support. With these optimizations, we demonstrate performance improvements of 45-67% for receive processing in native Linux, and of 86% for receive processing in a Linux guest operating system running on Xen.
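
Receive aggregation is straightforward to sketch: consecutive in-order segments of the same TCP flow are merged below the stack, so the per-packet TCP/IP code runs once for the whole aggregate (similar in spirit to what Linux provides as LRO/GRO). The Python sketch below uses toy flow keys and fields, not the paper's implementation.

```python
# Sketch of receive aggregation: consecutive in-order segments of the
# same TCP flow are merged below the stack, so TCP/IP runs its
# per-packet code once for the aggregate. Flow keys/fields are toy ones.
def aggregate_rx(segments):
    """segments: list of (flow, seq, payload) tuples in arrival order."""
    merged = []
    for flow, seq, payload in segments:
        if merged:
            f, s, p = merged[-1]
            if f == flow and s + len(p) == seq:   # in-order continuation
                merged[-1] = (f, s, p + payload)
                continue
        merged.append((flow, seq, payload))
    return merged

rx = [("f1", 0, b"a" * 1460), ("f1", 1460, b"b" * 1460), ("f1", 2920, b"c" * 1460)]
print(len(aggregate_rx(rx)), "packet(s) delivered to the stack for 3 on the wire")
```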

70 citations

Proceedings ArticleDOI
07 Mar 2009
TL;DR: TwinDrivers is presented, a framework for semi-automatically creating safe and efficient hypervisor drivers from guest OS drivers; it improves guest domain networking throughput in Xen by a factor of 2.4 for transmit workloads and 2.1 for receive workloads.
Abstract: In a virtualized environment, device drivers are often run inside a virtual machine (VM) rather than in the hypervisor, for reasons of safety and reduction in software engineering effort. Unfortunately, this approach results in poor performance for I/O-intensive devices such as network cards. The alternative approach of running device drivers directly in the hypervisor yields better performance, but results in the loss of safety guarantees for the hypervisor and incurs additional software engineering costs. In this paper we present TwinDrivers, a framework which allows us to semi-automatically create safe and efficient hypervisor drivers from guest OS drivers. The hypervisor driver runs directly in the hypervisor, but its data resides completely in the driver VM address space. A Software Virtual Memory mechanism allows the driver to access its VM data efficiently from the hypervisor running in any guest context, and also protects the hypervisor from invalid memory accesses from the driver. An upcall mechanism allows the hypervisor to largely reuse the driver support infrastructure present in the VM. The TwinDrivers system thus combines most of the performance benefits of hypervisor-based driver approaches with the safety and software engineering benefits of VM-based driver approaches. Using the TwinDrivers hypervisor driver, we are able to improve the guest domain networking throughput in Xen by a factor of 2.4 for transmit workloads, and 2.1 for receive workloads, both in CPU-scaled units, and achieve close to 64-67% of native Linux throughput.
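
The Software Virtual Memory idea can be illustrated in miniature: driver code running in the hypervisor reaches its data through a translation step that maps driver-VM virtual addresses to backing memory and rejects anything outside the driver VM's address space, so a buggy driver access faults safely instead of corrupting the hypervisor. The Python sketch below is a hypothetical toy, not the TwinDrivers mechanism itself.

```python
# Toy sketch of the Software Virtual Memory idea (names hypothetical):
# the driver runs in the hypervisor, but every access to its data goes
# through a translation that maps a driver-VM virtual address to backing
# memory and rejects anything outside the driver VM's address space.
class SoftwareVM:
    def __init__(self, page_table, memory):
        self.page_table = page_table   # driver-VM vaddr page -> offset
        self.memory = memory           # backing store (bytearray)

    def load(self, vaddr, size):
        page, off = vaddr & ~0xFFF, vaddr & 0xFFF
        if page not in self.page_table or off + size > 0x1000:
            raise MemoryError(f"driver access to {vaddr:#x} blocked")
        base = self.page_table[page]
        return bytes(self.memory[base + off : base + off + size])

svm = SoftwareVM({0x10000: 0}, bytearray(b"ring descriptor" + bytes(4081)))
print(svm.load(0x10000, 4))      # b'ring' -- valid driver-VM data
# svm.load(0xdead000, 4)         # would raise MemoryError; hypervisor safe
```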

39 citations


Cited by
Proceedings ArticleDOI
18 May 2009
TL;DR: This work presents Eucalyptus -- an open-source software framework for cloud computing that implements what is commonly referred to as Infrastructure as a Service (IaaS): systems that give users the ability to run and control entire virtual machine instances deployed across a variety of physical resources.
Abstract: Cloud computing systems fundamentally provide access to large pools of data and computational resources through a variety of interfaces similar in spirit to existing grid and HPC resource management and programming systems. These types of systems offer a new programming target for scalable application developers and have gained popularity over the past few years. However, most cloud computing systems in operation today are proprietary, rely upon infrastructure that is invisible to the research community, or are not explicitly designed to be instrumented and modified by systems researchers. In this work, we present Eucalyptus -- an open-source software framework for cloud computing that implements what is commonly referred to as Infrastructure as a Service (IaaS): systems that give users the ability to run and control entire virtual machine instances deployed across a variety of physical resources. We outline the basic principles of the Eucalyptus design, detail important operational aspects of the system, and discuss architectural trade-offs that we have made in order to allow Eucalyptus to be portable, modular and simple to use on infrastructure commonly found within academic settings. Finally, we provide evidence that Eucalyptus enables users familiar with existing Grid and HPC systems to explore new cloud computing functionality while maintaining access to existing, familiar application development software and Grid middleware.

1,962 citations

Book
29 Sep 2011
TL;DR: The Fifth Edition of Computer Architecture focuses on this dramatic shift in the ways in which software and technology in the "cloud" are accessed by cell phones, tablets, laptops, and other mobile computing devices.
Abstract: The computing world today is in the middle of a revolution: mobile clients and cloud computing have emerged as the dominant paradigms driving programming and hardware innovation today. The Fifth Edition of Computer Architecture focuses on this dramatic shift, exploring the ways in which software and technology in the "cloud" are accessed by cell phones, tablets, laptops, and other mobile computing devices. Each chapter includes two real-world examples, one mobile and one datacenter, to illustrate this revolutionary change.
- Updated to cover the mobile computing revolution
- Emphasizes the two most important topics in architecture today: memory hierarchy and parallelism in all its forms
- Develops common themes throughout each chapter: power, performance, cost, dependability, protection, programming models, and emerging trends ("What's Next")
- Includes three review appendices in the printed text; additional reference appendices are available online
- Includes updated Case Studies and completely new exercises

984 citations

Journal ArticleDOI
TL;DR: The results indicate that the current clouds need an order-of-magnitude performance improvement to be useful to the scientific community, and show which improvements should be considered first to address this discrepancy between supply and demand.
Abstract: Cloud computing is an emerging commercial infrastructure paradigm that promises to eliminate the need for maintaining expensive computing facilities by companies and institutes alike. Through the use of virtualization and resource time sharing, clouds serve a large user base with different needs using a single set of physical resources. Thus, clouds have the potential to provide to their owners the benefits of an economy of scale and, at the same time, become an alternative for scientists to clusters, grids, and parallel production environments. However, the current commercial clouds have been built to support web and small database workloads, which are very different from typical scientific computing workloads. Moreover, the use of virtualization and resource time sharing may introduce significant performance penalties for the demanding scientific computing workloads. In this work, we analyze the performance of cloud computing services for scientific computing workloads. We quantify the presence in real scientific computing workloads of Many-Task Computing (MTC) users, that is, of users who employ loosely coupled applications comprising many tasks to achieve their scientific goals. Then, we perform an empirical evaluation of the performance of four commercial cloud computing services including Amazon EC2, which is currently the largest commercial cloud. Last, we compare through trace-based simulation the performance characteristics and cost models of clouds and other scientific computing platforms, for general and MTC-based scientific computing workloads. Our results indicate that the current clouds need an order-of-magnitude performance improvement to be useful to the scientific community, and show which improvements should be considered first to address this discrepancy between supply and demand.
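
As a rough illustration of the trace-based cost comparison (with entirely hypothetical prices and a deliberately simplified model, not the paper's simulator), one can replay the same job trace against a pay-per-VM-hour cloud model and an amortized fixed-capacity cluster model:

```python
# Toy trace-driven cost comparison (all numbers hypothetical) in the
# spirit of the paper's methodology: replay one job trace against a
# pay-per-VM-hour cloud model and an amortized fixed-size cluster model.
import math

trace = [2.0, 0.5, 3.0, 1.2]                  # job runtimes in hours

cloud_rate = 0.10                             # $/VM-hour, billed per whole hour
cloud_cost = sum(math.ceil(t) * cloud_rate for t in trace)

cluster_hourly = 0.25                         # amortized $/hour of ownership
makespan = sum(trace)                         # serial replay on one node
cluster_cost = makespan * cluster_hourly

print(f"cloud:   ${cloud_cost:.2f} (no queue wait assumed)")
print(f"cluster: ${cluster_cost:.2f} over {makespan:.1f} h makespan")
```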

915 citations

Proceedings ArticleDOI
27 Feb 2013
TL;DR: This work conducted a number of experiments to perform an in-depth performance evaluation of container-based virtualization for HPC, and compared these systems with Xen, a representative of the traditional hypervisor-based virtualization systems used today.
Abstract: The use of virtualization technologies in high performance computing (HPC) environments has traditionally been avoided due to their inherent performance overhead. However, with the rise of container-based virtualization implementations, such as Linux VServer, OpenVZ and Linux Containers (LXC), it is possible to obtain a very low overhead leading to near-native performance. In this work, we conducted a number of experiments in order to perform an in-depth performance evaluation of container-based virtualization for HPC. We also evaluated the trade-off between performance and isolation in container-based virtualization systems and compared them with Xen, which is a representative of the traditional hypervisor-based virtualization systems used today.

445 citations

Proceedings ArticleDOI
17 Jun 2007
TL;DR: This paper presents what the authors believe is the first comprehensive study of proactive fault tolerance in which live migration is actually triggered by health monitoring; the approach makes proactive FT a valuable asset for long-running MPI applications, complementary to reactive FT using full checkpoint/restart schemes.
Abstract: Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current techniques to tolerate faults focus on reactive schemes to recover from faults and generally rely on a checkpoint/restart mechanism. Yet, in today's systems, node failures can often be anticipated by detecting a deteriorating health status. Instead of a reactive scheme for fault tolerance (FT), we are promoting a proactive one where processes automatically migrate from "unhealthy" nodes to healthy ones. Our approach relies on operating system virtualization techniques exemplified by, but not limited to, Xen. This paper contributes an automatic and transparent mechanism for proactive FT for arbitrary MPI applications. It leverages virtualization techniques combined with health monitoring and load-based migration. We exploit Xen's live migration mechanism for a guest operating system (OS) to migrate an MPI task from a health-deteriorating node to a healthy one without stopping the MPI task during most of the migration. Our proactive FT daemon orchestrates the tasks of health monitoring, load determination and initiation of guest OS migration. Experimental results demonstrate that live migration hides migration costs and limits the overhead to only a few seconds, making it an attractive approach to realize FT in HPC systems. Overall, our enhancements make proactive FT a valuable asset for long-running MPI applications that is complementary to reactive FT using full checkpoint/restart schemes, since checkpoint frequencies can be reduced as fewer unanticipated failures are encountered. In the context of OS virtualization, we believe that this is the first comprehensive study of proactive fault tolerance where live migration is actually triggered by health monitoring.
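
The control loop the paper describes can be sketched compactly: a daemon polls node health, and when a node's health deteriorates past a threshold it selects a healthy target by load and triggers live migration of the affected guest. The Python sketch below uses stand-in health readings and a placeholder migrate() call; it is not the authors' daemon or Xen's actual API.

```python
# Minimal sketch of a proactive fault-tolerance loop: watch node health
# and, when a host deteriorates, pick the least-loaded healthy target
# and trigger live migration of the guest. Readings and migrate() are
# stand-ins, not the authors' daemon or Xen's API.
TEMP_LIMIT_C = 80.0

def migrate(guest, src, dst):
    print(f"live-migrating {guest}: {src} -> {dst}")   # placeholder action

def monitor_step(nodes, placement):
    """nodes: {name: {'temp': C, 'load': float}}; placement: {guest: node}."""
    for guest, node in list(placement.items()):
        if nodes[node]["temp"] > TEMP_LIMIT_C:          # deteriorating host
            healthy = [n for n in nodes if nodes[n]["temp"] <= TEMP_LIMIT_C]
            if healthy:
                target = min(healthy, key=lambda n: nodes[n]["load"])
                migrate(guest, node, target)
                placement[guest] = target

nodes = {"n1": {"temp": 85.0, "load": 0.7},
         "n2": {"temp": 55.0, "load": 0.2},
         "n3": {"temp": 60.0, "load": 0.9}}
monitor_step(nodes, {"mpi-task-0": "n1"})   # migrates to n2 (least loaded)
```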

394 citations