Author

Hans-Arno Jacobsen

Bio: Hans-Arno Jacobsen is an academic researcher from the University of Toronto. The author has contributed to research in topics including Scalability and Computer science. The author has an h-index of 44 and has co-authored 375 publications that have received 8,779 citations. Previous affiliations of Hans-Arno Jacobsen include CA Technologies and the French Institute for Research in Computer Science and Automation.


Papers
Journal ArticleDOI
01 Aug 2008
TL;DR: PNUTS provides data storage organized as hashed or ordered tables, low latency for large numbers of concurrent requests including updates and queries, and novel per-record consistency guarantees; it uses automated load balancing and failover to reduce operational complexity.
Abstract: We describe PNUTS, a massively parallel and geographically distributed database system for Yahoo!'s web applications. PNUTS provides data storage organized as hashed or ordered tables, low latency for large numbers of concurrent requests including updates and queries, and novel per-record consistency guarantees. It is a hosted, centrally managed, and geographically distributed service, and utilizes automated load-balancing and failover to reduce operational complexity. The first version of the system is currently serving in production. We describe the motivation for PNUTS and the design and implementation of its table storage and replication layers, and then present experimental results.
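
Below is a minimal Java sketch of the per-record, timeline-style consistency idea described above: every write advances a record's version, and a reader can insist on a value at least as fresh as a version it has already observed. The class and method names are illustrative assumptions, not the actual PNUTS API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Minimal sketch of per-record timeline consistency: every write to a key
 * advances that key's version, and a reader may ask for a value at least as
 * new as a version it has previously observed. Hypothetical names; not the
 * actual PNUTS API.
 */
public class PerRecordStore {

    /** A value together with the per-record version that produced it. */
    public record Versioned(String value, long version) {}

    private final Map<String, Versioned> table = new ConcurrentHashMap<>();

    /** Write: advances the record's version and returns it as a freshness token. */
    public synchronized long put(String key, String value) {
        long next = table.containsKey(key) ? table.get(key).version() + 1 : 1;
        table.put(key, new Versioned(value, next));
        return next;
    }

    /** Relaxed read: any stored value is acceptable. */
    public Versioned readAny(String key) {
        return table.get(key);
    }

    /** Stricter read: only return a value at least as new as minVersion. */
    public Versioned readCritical(String key, long minVersion) {
        Versioned v = table.get(key);
        if (v == null || v.version() < minVersion) {
            throw new IllegalStateException("replica not yet caught up to version " + minVersion);
        }
        return v;
    }

    public static void main(String[] args) {
        PerRecordStore store = new PerRecordStore();
        long v = store.put("user:42", "profile-v1");
        // A client that just wrote can insist on seeing its own write.
        System.out.println(store.readCritical("user:42", v));
    }
}
```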

1,142 citations

Proceedings ArticleDOI
22 Jun 2013
TL;DR: BigBench is an end-to-end big data benchmark proposal that addresses the variety, velocity, and volume aspects of big data systems containing structured, semi-structured, and unstructured data; it is implemented on the Teradata Aster Database.
Abstract: There is tremendous interest in big data from academia, industry, and a large user base. Several commercial and open-source providers have released a variety of products to support big data storage and processing. As these products mature, there is a need to evaluate and compare the performance of these systems. In this paper, we present BigBench, an end-to-end big data benchmark proposal. The underlying business model of BigBench is a product retailer. The proposal covers a data model and a synthetic data generator that address the variety, velocity, and volume aspects of big data systems containing structured, semi-structured, and unstructured data. The structured part of the BigBench data model is adopted from the TPC-DS benchmark and is enriched with semi-structured and unstructured data components. The semi-structured part captures registered and guest user clicks on the retailer's website. The unstructured data captures product reviews submitted online. The data generator designed for BigBench provides scalable volumes of raw data based on a scale factor. The BigBench workload is designed around a set of queries against the data model. From a business perspective, the queries cover the different categories of big data analytics proposed by McKinsey. From a technical perspective, the queries are designed to span three different dimensions based on data sources, query processing types, and analytic techniques. We illustrate the feasibility of BigBench by implementing it on the Teradata Aster Database. The test includes generating and loading a 200 Gigabyte BigBench data set and testing the workload by executing the BigBench queries (written using Teradata Aster SQL-MR) and reporting their response times.
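
A rough feel for scale-factor-driven generation of structured, semi-structured, and unstructured data can be given with the following sketch; the schema, field names, and scaling ratios are invented for illustration and are not the actual BigBench data generator.

```java
import java.util.Random;

/**
 * Illustrative sketch of scale-factor-driven data generation in the spirit of
 * BigBench: structured sales rows, semi-structured web-click log lines, and
 * unstructured review text, with volume proportional to the scale factor.
 * Field names and scaling rules are assumptions, not the actual generator.
 */
public class TinyBigDataGen {

    private static final Random RND = new Random(42);   // fixed seed for repeatability

    public static void main(String[] args) {
        int scaleFactor = args.length > 0 ? Integer.parseInt(args[0]) : 1;
        long sales = 1_000L * scaleFactor;      // structured part
        long clicks = 5_000L * scaleFactor;     // semi-structured part
        long reviews = 200L * scaleFactor;      // unstructured part

        for (long i = 0; i < sales; i++) {
            System.out.printf("SALE|%d|item_%d|%.2f%n", i, RND.nextInt(100), RND.nextDouble() * 50);
        }
        for (long i = 0; i < clicks; i++) {
            // Guest clicks carry no user id, mirroring registered vs. guest sessions.
            String user = RND.nextBoolean() ? "user_" + RND.nextInt(1000) : "guest";
            System.out.printf("{\"click\": %d, \"user\": \"%s\", \"item\": %d}%n", i, user, RND.nextInt(100));
        }
        for (long i = 0; i < reviews; i++) {
            System.out.printf("REVIEW %d: item_%d works as expected.%n", i, RND.nextInt(100));
        }
    }
}
```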

325 citations

Journal ArticleDOI
01 Aug 2012
TL;DR: In this article, the authors present their experience and a comprehensive performance evaluation of six modern (open-source) data stores in the context of application performance monitoring, carried out as part of a CA Technologies initiative.
Abstract: As the complexity of enterprise systems increases, the need for monitoring and analyzing such systems also grows. A number of companies have built sophisticated monitoring tools that go far beyond simple resource utilization reports. For example, based on instrumentation and specialized APIs, it is now possible to monitor single method invocations and trace individual transactions across geographically distributed systems. This high level of detail enables more precise forms of analysis and prediction but comes at the price of high data rates (i.e., big data). To maximize the benefit of data monitoring, the data has to be stored for an extended period of time for later analysis. This new wave of big data analytics imposes new challenges, especially for application performance monitoring systems. The monitoring data has to be stored in a system that can sustain the high data rates and at the same time enable an up-to-date view of the underlying infrastructure. With the advent of modern key-value stores, a variety of data storage systems have emerged that are built with a focus on scalability and high data rates, as are predominant in this monitoring use case. In this work, we present our experience and a comprehensive performance evaluation of six modern (open-source) data stores in the context of application performance monitoring, as part of a CA Technologies initiative. We evaluated these systems with data and workloads that can be found in application performance monitoring, as well as in online advertisement, power monitoring, and many other use cases. We present our insights not only as performance results but also as lessons learned and experience relating to the setup and configuration complexity of these data stores in an industry setting.
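
The monitoring workload described above amounts to sustained, high-rate inserts of timestamped measurements combined with up-to-date reads of recent windows. The sketch below illustrates that access pattern against a plain ordered map standing in for a key-value store; the interface and key layout are assumptions for illustration, not any of the six evaluated systems.

```java
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

/**
 * Sketch of an application-performance-monitoring workload: high-rate inserts of
 * timestamped metrics keyed by "metric#timestamp", plus recent-window scans that
 * keep dashboards up to date. The in-memory ordered map stands in for whichever
 * key-value store is under evaluation.
 */
public class ApmWorkloadSketch {

    private final NavigableMap<String, Double> store = new ConcurrentSkipListMap<>();

    /** Ingest one measurement; keys sort by metric name, then timestamp. */
    public void record(String metric, long timestampMillis, double value) {
        store.put(metric + "#" + String.format("%020d", timestampMillis), value);
    }

    /** Scan the last windowMillis of a metric, e.g. to refresh a dashboard. */
    public NavigableMap<String, Double> recentWindow(String metric, long nowMillis, long windowMillis) {
        String from = metric + "#" + String.format("%020d", nowMillis - windowMillis);
        String to = metric + "#" + String.format("%020d", nowMillis);
        return store.subMap(from, true, to, true);
    }

    public static void main(String[] args) {
        ApmWorkloadSketch apm = new ApmWorkloadSketch();
        long now = System.currentTimeMillis();
        for (int i = 0; i < 10_000; i++) {                  // write-heavy ingest phase
            apm.record("jvm.heap.used", now - i, Math.random() * 512);
        }
        System.out.println("points in last second: "
                + apm.recentWindow("jvm.heap.used", now, 1_000).size());
    }
}
```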

241 citations

Book ChapterDOI
01 Nov 2005
TL;DR: An overview of PADRES is presented, highlighting some of its novel features, including the composite subscription language, the coordination patterns, the composite event detection algorithms, the rule-based router design, and a detailed case study illustrating the decentralized processing of workflows.
Abstract: Distributed publish/subscribe systems are naturally suited for processing events in distributed systems. However, support for expressing patterns about distributed events and algorithms for detecting correlations among these events are still largely unexplored. Inspired by the requirements of decentralized, event-driven workflow processing, we design a subscription language for expressing correlations among distributed events. We illustrate the potential of our approach with a workflow management case study. The language is validated and implemented in PADRES. In this paper we present an overview of PADRES, highlighting some of its novel features, including the composite subscription language, the coordination patterns, the composite event detection algorithms, the rule-based router design, and a detailed case study illustrating the decentralized processing of workflows. Our experimental evaluation shows that rule-based brokers are a viable and powerful alternative to existing, special-purpose, content-based routing algorithms. The experiments also show that the use of composite subscriptions in PADRES significantly reduces the load on the network. Complex workflows can be processed in a decentralized fashion with a gain of 40% in message dissemination cost. All processing is realized entirely in the publish/subscribe paradigm.
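
To make "correlation among distributed events" concrete, the sketch below models a composite subscription as a conjunction of two atomic subscriptions that must both match before the composite event fires; the predicate style and class names are illustrative and do not reproduce the actual PADRES subscription language.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/**
 * Illustrative composite-event detection: a composite subscription fires only
 * once both of its atomic subscriptions have matched some publication.
 */
public class CompositeSubscriptionSketch {

    /** A publication is a flat attribute/value map, e.g. {class=order, total=120}. */
    record Publication(Map<String, Object> attrs) {}

    /** An atomic subscription is just a predicate over publications. */
    record Atomic(String name, Predicate<Publication> matches) {}

    /** AND-composition of two atomic subscriptions with simple state tracking. */
    static class Composite {
        private final Atomic left, right;
        private boolean leftSeen, rightSeen;

        Composite(Atomic left, Atomic right) { this.left = left; this.right = right; }

        /** Feed a publication; returns true when the composite event is detected. */
        boolean onPublication(Publication p) {
            leftSeen |= left.matches().test(p);
            rightSeen |= right.matches().test(p);
            return leftSeen && rightSeen;
        }
    }

    public static void main(String[] args) {
        Atomic orderPlaced = new Atomic("orderPlaced",
                p -> "order".equals(p.attrs().get("class")));
        Atomic paymentOk = new Atomic("paymentOk",
                p -> "payment".equals(p.attrs().get("class"))
                        && Boolean.TRUE.equals(p.attrs().get("approved")));
        Composite workflowStep = new Composite(orderPlaced, paymentOk);

        Map<String, Object> e1 = new HashMap<>();
        e1.put("class", "order");
        Map<String, Object> e2 = new HashMap<>();
        e2.put("class", "payment");
        e2.put("approved", true);

        System.out.println(workflowStep.onPublication(new Publication(e1))); // false: only one half matched
        System.out.println(workflowStep.onPublication(new Publication(e2))); // true: composite event detected
    }
}
```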

185 citations

Journal ArticleDOI
TL;DR: It is demonstrated that agent-based execution scales better than a non-distributed approach, with improvements of at least 70% and 120% in process execution time and throughput, respectively, even with a large number of concurrent process instances.
Abstract: The Business Process Execution Language (BPEL) standardizes the development of composite enterprise applications that make use of software components exposed as Web services. BPEL processes are currently executed by a centralized orchestration engine, in which issues such as scalability, platform heterogeneity, and division across administrative domains can be difficult to manage. We propose a distributed agent-based orchestration engine in which several lightweight agents execute a portion of the original business process and collaborate in order to execute the complete process. The complete set of standard BPEL activities is supported, and the transformations of several BPEL activities to the agent-based architecture are described. Evaluations of an implementation of this architecture demonstrate that agent-based execution scales better than a non-distributed approach, with at least 70% and 120% improvements in process execution time and throughput, respectively, even with a large number of concurrent process instances. In addition, the distributed architecture successfully executes large processes that are shown to be infeasible to execute with a non-distributed engine.
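
The partitioning idea, with each lightweight agent executing a slice of the process and handing control to the next agent, can be sketched as follows; the agent interface, queue-based handoff, and activity split are assumptions for illustration, not the engine described in the paper.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.SynchronousQueue;

/**
 * Sketch of distributing a sequential process across lightweight agents: each
 * agent runs its assigned activities, then hands the process state to the next
 * agent over a queue (a stand-in for a messaging layer).
 */
public class AgentOrchestrationSketch {

    record ProcessState(StringBuilder log) {}

    static class Agent implements Runnable {
        private final String name;
        private final List<String> activities;
        private final SynchronousQueue<ProcessState> in, out;

        Agent(String name, List<String> activities,
              SynchronousQueue<ProcessState> in, SynchronousQueue<ProcessState> out) {
            this.name = name; this.activities = activities; this.in = in; this.out = out;
        }

        @Override public void run() {
            try {
                ProcessState state = in.take();                 // receive the running instance
                for (String activity : activities) {
                    state.log().append(name).append(" executed ").append(activity).append('\n');
                }
                out.put(state);                                 // hand off to the next agent
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronousQueue<ProcessState> start = new SynchronousQueue<>();
        SynchronousQueue<ProcessState> handoff = new SynchronousQueue<>();
        SynchronousQueue<ProcessState> done = new SynchronousQueue<>();

        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(new Agent("agent-1", List.of("receiveOrder", "checkCredit"), start, handoff));
        pool.submit(new Agent("agent-2", List.of("shipOrder", "sendInvoice"), handoff, done));

        start.put(new ProcessState(new StringBuilder()));       // start one process instance
        System.out.print(done.take().log());                    // wait for completion
        pool.shutdown();
    }
}
```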

179 citations


Cited by
01 Jan 2002

9,314 citations

Proceedings ArticleDOI
22 Jan 2006
TL;DR: Some of the major results in random graphs and some of the more challenging open problems are reviewed, including those related to the WWW.
Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

7,116 citations

Journal Article
TL;DR: This research examines the interaction between demand and socioeconomic attributes through Mixed Logit models, and reviews the state of the art in automatic transport systems within the CityMobil project.
Abstract (the indexed abstract is the report's table of contents):
1 The innovative transport systems and the CityMobil project
1.1 The research questions
2 The state of the art in the field of automatic transport systems
2.1 Case studies and demand studies for innovative transport systems
3 The design and implementation of surveys
3.1 Definition of experimental design
3.2 Questionnaire design and delivery
3.3 First analyses on the collected sample
4 Calibration of Logit Multinomial demand models
4.1 Methodology
4.2 Calibration of the "full" model
4.3 Calibration of the "final" model
4.4 The demand analysis through the final Multinomial Logit model
5 The analysis of interaction between the demand and socioeconomic attributes
5.1 Methodology
5.2 Application of Mixed Logit models to the demand
5.3 Analysis of the interactions between demand and socioeconomic attributes through Mixed Logit models
5.4 Mixed Logit model and interaction between age and the demand for the CTS
5.5 Demand analysis with Mixed Logit model
6 Final analyses and conclusions
6.1 Comparison between the results of the analyses
6.2 Conclusions
6.3 Answers to the research questions and future developments
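
For orientation, the standard multinomial logit choice probability underlying the models named above is the textbook expression below (stated here as background, not taken from the report); the Mixed Logit variant additionally integrates this expression over a distribution of taste parameters across individuals.

```latex
% Probability that an individual chooses alternative i from choice set C,
% given systematic utilities V_j for each alternative j (multinomial logit).
P(i \mid C) = \frac{e^{V_i}}{\sum_{j \in C} e^{V_j}}
```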

4,784 citations

Proceedings ArticleDOI
Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, Russell Sears
10 Jun 2010
TL;DR: This work presents the "Yahoo! Cloud Serving Benchmark" (YCSB) framework, with the goal of facilitating performance comparisons of the new generation of cloud data serving systems, and defines a core set of benchmarks and reports results for four widely used systems.
Abstract: While the use of MapReduce systems (such as Hadoop) for large-scale data analysis has been widely recognized and studied, we have recently seen an explosion in the number of systems developed for cloud data serving. These newer systems address "cloud OLTP" applications, though they typically do not support ACID transactions. Examples of systems proposed for cloud serving use include BigTable, PNUTS, Cassandra, HBase, Azure, CouchDB, SimpleDB, Voldemort, and many others. Further, they are being applied to a diverse range of applications that differ considerably from traditional (e.g., TPC-C like) serving workloads. The number of emerging cloud serving systems and the wide range of proposed applications, coupled with a lack of apples-to-apples performance comparisons, make it difficult to understand the tradeoffs between systems and the workloads for which they are suited. We present the "Yahoo! Cloud Serving Benchmark" (YCSB) framework, with the goal of facilitating performance comparisons of the new generation of cloud data serving systems. We define a core set of benchmarks and report results for four widely used systems: Cassandra, HBase, Yahoo!'s PNUTS, and a simple sharded MySQL implementation. We also hope to foster the development of additional cloud benchmark suites that represent other classes of applications by making our benchmark tool available via open source. In this regard, a key feature of the YCSB framework/tool is that it is extensible: it supports easy definition of new workloads, in addition to making it easy to benchmark new systems.
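
The "define a core workload, run it against interchangeable stores, report throughput and latency" pattern can be sketched independently of the real YCSB code; the KeyValueStore interface and workload mix below are illustrative assumptions rather than the actual YCSB DB/Workload classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

/**
 * Minimal benchmark-client sketch in the spirit of YCSB: a fixed read/update mix
 * is issued against a pluggable store, and throughput plus mean latency are
 * reported. The interface below is illustrative; it is not the real YCSB API.
 */
public class CloudServingBenchSketch {

    /** Any store under test only needs these two operations for this sketch. */
    interface KeyValueStore {
        String read(String key);
        void update(String key, String value);
    }

    /** Trivial in-memory store so the sketch runs on its own. */
    static class InMemoryStore implements KeyValueStore {
        private final Map<String, String> data = new HashMap<>();
        public String read(String key) { return data.get(key); }
        public void update(String key, String value) { data.put(key, value); }
    }

    public static void main(String[] args) {
        KeyValueStore store = new InMemoryStore();
        Random rnd = new Random(7);
        int operations = 100_000;
        double readProportion = 0.95;          // read-heavy mix, in the style of a 95/5 workload

        long start = System.nanoTime();
        for (int i = 0; i < operations; i++) {
            String key = "user" + rnd.nextInt(10_000);
            if (rnd.nextDouble() < readProportion) {
                store.read(key);
            } else {
                store.update(key, "field0=" + i);
            }
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("throughput: %.0f ops/sec, avg latency: %.2f us%n",
                operations / seconds, seconds * 1e6 / operations);
    }
}
```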

3,276 citations

Journal Article
TL;DR: AspectJ, as mentioned in this paper, is a simple and practical aspect-oriented extension to Java. With just a few new constructs, AspectJ provides support for modular implementation of a range of crosscutting concerns.
Abstract: AspectJ is a simple and practical aspect-oriented extension to Java. With just a few new constructs, AspectJ provides support for modular implementation of a range of crosscutting concerns. In AspectJ's dynamic join point model, join points are well-defined points in the execution of the program; pointcuts are collections of join points; advice are special method-like constructs that can be attached to pointcuts; and aspects are modular units of crosscutting implementation, comprising pointcuts, advice, and ordinary Java member declarations. AspectJ code is compiled into standard Java bytecode. Simple extensions to existing Java development environments make it possible to browse the crosscutting structure of aspects in the same kind of way as one browses the inheritance structure of classes. Several examples show that AspectJ is powerful, and that programs written using it are easy to understand.
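
A minimal annotation-style AspectJ example, written in plain Java syntax, shows the constructs named above: a pointcut selecting join points and advice attached to it inside an aspect. The traced package and class names are invented for illustration.

```java
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Before;
import org.aspectj.lang.annotation.Pointcut;

/**
 * Annotation-style AspectJ sketch: the pointcut selects join points (here, every
 * public method execution in a hypothetical banking package) and the @Before
 * advice runs at each of them. Woven with the AspectJ compiler or load-time weaver.
 */
@Aspect
public class TraceAspect {

    /** Pointcut: all public method executions in the (hypothetical) banking package. */
    @Pointcut("execution(public * com.example.banking..*.*(..))")
    public void bankingOperation() {}

    /** Advice: runs before every join point matched by the pointcut above. */
    @Before("bankingOperation()")
    public void logEntry(JoinPoint jp) {
        System.out.println("entering " + jp.getSignature());
    }
}
```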

2,947 citations