Derrick Kondo

French Institute for Research in Computer Science and Automation

Other affiliations include Teradata, University of Paris-Sud, and Stanford University.

Bio: Derrick Kondo is an academic researcher from the French Institute for Research in Computer Science and Automation. The author has contributed to research on grid computing and cloud computing, has an h-index of 29, and has co-authored 63 publications that have received 3454 citations. Previous affiliations of Derrick Kondo include Teradata and University of Paris-Sud.

[Chart: papers published on a yearly basis, covering 2000, 2002, and 2004 to 2016]

Papers

Proceedings Article

Cost-benefit analysis of Cloud Computing versus desktop grids

Derrick Kondo¹, Bahman Javadi¹, Paul Malecot¹, Franck Cappello¹, Dustin Anderson²

French Institute for Research in Computer Science and Automation¹, University of California, Berkeley²

23 May 2009

TL;DR: This work compares and contrasts the performance and monetary cost-benefits of clouds for desktop grid applications ranging in computational size and storage, and examines performance measurements and monetary expenses of real desktop grids and the Amazon Elastic Compute Cloud.

Abstract: Cloud Computing has taken commercial computing by storm. However, adoption of cloud computing platforms and services by the scientific community is in its infancy as the performance and monetary cost-benefits for scientific applications are not perfectly clear. This is especially true for desktop grid (aka volunteer computing) applications. We compare and contrast the performance and monetary cost-benefits of clouds for desktop grid applications, ranging in computational size and storage. We address the following questions: (i) What are the performance tradeoffs in using one platform over the other? (ii) What are the specific resource requirements and monetary costs of creating and deploying applications on each platform? (iii) In light of those monetary and performance cost-benefits, how do these platforms compare? (iv) Can cloud computing platforms be used in combination with desktop grids to improve cost-effectiveness even further? We examine those questions using performance measurements and monetary expenses of real desktop grids and the Amazon Elastic Compute Cloud.

383 citations

Proceedings Article

Reducing Costs of Spot Instances via Checkpointing in the Amazon Elastic Compute Cloud

Sangho Yi¹, Derrick Kondo¹, Artur Andrzejak²

French Institute for Research in Computer Science and Automation¹, Zuse Institute Berlin²

05 Jul 2010

TL;DR: Based on the real price history of EC2 spot instances, this work compares several adaptive checkpointing schemes in terms of monetary costs and improvement of job completion times, and shows that the approach can significantly reduce both price and task completion times.

Abstract: Recently introduced spot instances in the Amazon Elastic Compute Cloud (EC2) offer lower resource costs in exchange for reduced reliability; these instances can be revoked abruptly due to price and demand fluctuations. Mechanisms and tools that deal with the cost-reliability trade-offs under this schema are of great value for users seeking to lessen their costs while maintaining high reliability. We study how one such mechanism, namely checkpointing, can be used to minimize the cost and volatility of resource provisioning. Based on the real price history of EC2 spot instances, we compare several adaptive checkpointing schemes in terms of monetary costs and improvement of job completion times. Trace-based simulations show that our approach can significantly reduce both price and task completion times.

270 citations

Proceedings Article

The Failure Trace Archive: Enabling Comparative Analysis of Failures in Diverse Distributed Systems

Derrick Kondo, Bahman Javadi, Alexandru Iosup¹, Dick Epema¹

Delft University of Technology¹

17 May 2010

TL;DR: The Failure Trace Archive is created as an online public repository of availability traces taken from diverse parallel and distributed systems to facilitate the design, validation, and comparison of fault-tolerant models and algorithms.

Abstract: With the increasing functionality and complexity of distributed systems, resource failures are inevitable. While numerous models and algorithms for dealing with failures exist, the lack of public trace data sets and tools has prevented meaningful comparisons. To facilitate the design, validation, and comparison of fault-tolerant models and algorithms, we have created the Failure Trace Archive (FTA) as an online public repository of availability traces taken from diverse parallel and distributed systems. Our main contributions in this study are the following. First, we describe the design of the archive, in particular the rationale of the standard FTA format, and the design of a toolbox that facilitates automated analysis of trace data sets. Second, applying the toolbox, we present a uniform comparative analysis with statistics and models of failures in nine distributed systems. Third, we show how different interpretations of these data sets can result in different conclusions. This emphasizes the critical need for the public availability of trace data and methods for their analysis.

203 citations

Proceedings Article

Characterizing and evaluating desktop grids: an empirical study

Derrick Kondo¹, Michela Taufer¹, Charles L. Brooks², Henri Casanova¹, Andrew A. Chien¹

University of California, San Diego¹, Scripps Research Institute²

26 Apr 2004

TL;DR: This work utilizes measurements of an enterprise desktop grid with over 220 hosts running the Entropia commercial desktop grid software to characterize CPU availability and develops a performance model for desktop grid applications for various task granularities, showing that there is an optimal task size.

Abstract: Desktop resources are attractive for running compute-intensive distributed applications. Several systems that aggregate these resources in desktop grids have been developed. While these systems have been successfully used for many high-throughput applications, there has been little insight into the detailed temporal structure of CPU availability of desktop grid resources. Yet, this structure is critical to characterize the utility of desktop grid platforms for both task parallel and even data parallel applications. We address the following questions: (i) What are the temporal characteristics of desktop CPU availability in an enterprise setting? (ii) How do these characteristics affect the utility of desktop grids? (iii) Based on these characteristics, can we construct a model of server "equivalents" for the desktop grids, which can be used to predict application performance? We present measurements of an enterprise desktop grid with over 220 hosts running the Entropia commercial desktop grid software. We utilize these measurements to characterize CPU availability and develop a performance model for desktop grid applications for various task granularities, showing that there is an optimal task size. We then use a cluster equivalence metric to quantify the utility of the desktop grid relative to that of a dedicated cluster.

202 citations

Proceedings Article

Decision Model for Cloud Computing under SLA Constraints

Artur Andrzejak¹, Derrick Kondo, Sangho Yi

Zuse Institute Berlin¹

17 Aug 2010

TL;DR: This work proposes a probabilistic model for the optimization of monetary costs, performance, and reliability, given user and application requirements and dynamic conditions, and demonstrates how users should bid optimally on Spot Instances to reach different objectives with desired levels of confidence.

Abstract: With the recent introduction of Spot Instances in the Amazon Elastic Compute Cloud (EC2), users can bid for resources and thus control the balance of reliability versus monetary costs. A critical challenge is to determine bid prices that minimize monetary costs for a user while meeting Service Level Agreement (SLA) constraints (for example, sufficient resource availability to complete a computation within a desired deadline). We propose a probabilistic model for the optimization of monetary costs, performance, and reliability, given user and application requirements and dynamic conditions. Using real instance price traces and workload models, we evaluate our model and demonstrate how users should bid optimally on Spot Instances to reach different objectives with desired levels of confidence.

194 citations

Cited by

A Break in the Clouds: Towards a Cloud Definition

Chris Rose

01 Jan 2011

2,037 citations

Proceedings Article

The Grid 2: Blueprint for a New Computing Infrastructure

R.V. van Nieuwpoort

01 Jan 2003

1,212 citations

Proceedings Article

Large-scale cluster management at Google with Borg

Abhishek Verma¹, Luis Pedrosa¹, Madhukar R. Korupolu¹, David Oppenheimer¹, Eric S. Tune¹, John Wilkes¹

Google¹

17 Apr 2015

TL;DR: A summary of the Borg system architecture and features, important design decisions, a quantitative analysis of some of its policy decisions, and a qualitative examination of lessons learned from a decade of operational experience with it are presented.

Abstract: Google's Borg system is a cluster manager that runs hundreds of thousands of jobs, from many thousands of different applications, across a number of clusters each with up to tens of thousands of machines. It achieves high utilization by combining admission control, efficient task-packing, over-commitment, and machine sharing with process-level performance isolation. It supports high-availability applications with runtime features that minimize fault-recovery time, and scheduling policies that reduce the probability of correlated failures. Borg simplifies life for its users by offering a declarative job specification language, name service integration, real-time job monitoring, and tools to analyze and simulate system behavior. We present a summary of the Borg system architecture and features, important design decisions, a quantitative analysis of some of its policy decisions, and a qualitative examination of lessons learned from a decade of operational experience with it.

1,185 citations

Proceedings Article

Heterogeneity and dynamicity of clouds at scale: Google trace analysis

Charles Reiss¹, Alexey Tumanov², Gregory R. Ganger², Randy H. Katz¹, Michael Kozuch³

University of California, Berkeley¹, Carnegie Mellon University², Intel³

14 Oct 2012

TL;DR: Analysis of the first publicly available trace data from a sizable multi-purpose cluster finds that many longer-running jobs have relatively stable resource utilizations, which can help adaptive resource schedulers.

Abstract: To better understand the challenges in developing effective cloud-based resource schedulers, we analyze the first publicly available trace data from a sizable multi-purpose cluster. The most notable workload characteristic is heterogeneity: in resource types (e.g., cores:RAM per machine) and their usage (e.g., duration and resources needed). Such heterogeneity reduces the effectiveness of traditional slot- and core-based scheduling. Furthermore, some tasks are constrained as to the kind of machine types they can use, increasing the complexity of resource assignment and complicating task migration. The workload is also highly dynamic, varying over time and most workload features, and is driven by many short jobs that demand quick scheduling decisions. While few simplifying assumptions apply, we find that many longer-running jobs have relatively stable resource utilizations, which can help adaptive resource schedulers.

1,051 citations

Green Cloud Computing: Balancing Energy in Processing, Storage, and Transport

For processing large amounts of data, management and switching of communications may contribute significantly to energy consumption, and cloud computing seems to be an alternative to office-based computing.

J. Baliga, Robert Ayre, Kerry Hinton, Rodney S. Tucker

01 Jan 2011

TL;DR: This analysis considers both public and private clouds, includes the energy consumption of the transmission and switching networks, and shows that energy consumption in transport and switching can be a significant percentage of total energy consumption in cloud computing.

Abstract: Network-based cloud computing is rapidly expanding as an alternative to conventional office-based computing. As cloud computing becomes more widespread, the energy consumption of the network and computing resources that underpin the cloud will grow. This is happening at a time when there is increasing attention being paid to the need to manage energy consumption across the entire information and communications technology (ICT) sector. While data center energy use has received much attention recently, there has been less attention paid to the energy consumption of the transmission and switching networks that are key to connecting users to the cloud. In this paper, we present an analysis of energy consumption in cloud computing. The analysis considers both public and private clouds, and includes energy consumption in switching and transmission as well as data processing and data storage. We show that energy consumption in transport and switching can be a significant percentage of total energy consumption in cloud computing. Cloud computing can enable more energy-efficient use of computing power, especially when the computing tasks are of low intensity or infrequent. However, under some circumstances cloud computing can consume more energy than conventional computing where each user performs all computing on their own personal computer (PC).

748 citations
