Home
/
Authors
/
Ulrich Meyer

Author

Ulrich Meyer

Bio: Ulrich Meyer is an academic researcher from Goethe University Frankfurt. The author has contributed to research in topics: Shortest path problem & Time complexity. The author has an hindex of 27, co-authored 137 publications receiving 3036 citations. Previous affiliations of Ulrich Meyer include Max Planck Society.

Papers published on a yearly basis

2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997
1995
1994

Papers

PDF

Open Access

More filters

DOI•

Data Structures and Advanced Models of Computation on Big Data (Dagstuhl Seminar 16101)

[...]

Alejandro López-Ortiz, Ulrich Meyer, Markus E. Nebel, Robert Sedgewick

01 Jan 2016

TL;DR: The extended abstracts included in this report contain both recent state of the art advances and lay the foundation for new directions within data structures research.

...read moreread less

Abstract: This report documents the program and the outcomes of Dagstuhl Seminar 16101 "Data Structures and Advanced Models of Computation on Big Data". In today's computing environment vast amounts of data are processed, exchanged and analyzed. The manner in which information is stored profoundly influences the efficiency of these operations over the data. In spite of the maturity of the field many data structuring problems are still open, while new ones arise due to technological advances. The seminar covered both recent advances in the "classical" data structuring topics as well as new models of computation adapted to modern architectures, scientific studies that reveal the need for such models, applications where large data sets play a central role, modern computing platforms for very large data, and new data structures for large data in modern architectures. The extended abstracts included in this report contain both recent state of the art advances and lay the foundation for new directions within data structures research.

...read moreread less

1 citations

GPU Multisplit: an extended study of a parallel algorithm - eScholarship

[...]

Saman Ashkiani, Andrew A. Davidson, Ulrich Meyer, John D. Owens

01 Aug 2017

TL;DR: In this article, an extended study of multisplit for a small (up to 256) number of buckets is presented, where warp-synchronous programming is used to avoid branch divergence and reduce memory usage.

...read moreread less

Abstract: GPU Multisplit: an extended study of a parallel algorithm SAMAN ASHKIANI, University of California, Davis ANDREW DAVIDSON, University of California, Davis ULRICH MEYER, Goethe-Universitat Frankfurt am Main JOHN D. OWENS, University of California, Davis Multisplit is a broadly useful parallel primitive that permutes its input data into contiguous buckets or bins, where the function that categorizes an element into a bucket is provided by the programmer. Due to the lack of an efficient multisplit on GPUs, programmers often choose to implement multisplit with a sort. One way is to first generate an auxiliary array of bucket IDs and then sort input data based on it. In case smaller indexed buckets possess smaller valued keys, another way for multisplit is to directly sort input data. Both methods are inefficient and require more work than necessary: the former requires more expensive data movements while the latter spends unnecessary effort in sorting elements within each bucket. In this work, we provide a parallel model and multiple implementations for the multisplit problem. Our principal focus is multisplit for a small (up to 256) number of buckets. We use warp-synchronous programming models and emphasize warp-wide communications to avoid branch divergence and reduce memory usage. We also hierarchically reorder input elements to achieve better coalescing of global memory accesses. On a GeForce GTX 1080 GPU, we can reach a peak throughput of 18.93 Gkeys/s (or 11.68 Gpairs/s) for a key-only (or key-value) multisplit. Finally, we demonstrate how multisplit can be used as a building block for radix sort. In our multisplit-based sort implementation, we achieve comparable performance to the fastest GPU sort routines, sorting 32-bit keys (and key-value pairs) with a throughput of 3.0 G keys/s (and 2.1 Gpair/s). CCS Concepts: •Computing methodologies → Parallel algorithms; •Computer systems organization → Single instruction, multiple data; •Theory of computation → Shared memory algorithms; Additional Key Words and Phrases: Graphics Processing Unit (GPU), multisplit, bucketing, warp-synchronous programming, radix sort, histogram, shuffle, ballot ACM Reference format: Saman Ashkiani, Andrew Davidson, Ulrich Meyer, and John D. Owens. 2017. GPU Multisplit: an extended study of a parallel algorithm. ACM Trans. Parallel Comput. 9, 4, Article 39 (September 2017), 44 pages. DOI: 0000001.0000001 INTRODUCTION This paper studies the multisplit primitive for GPUs. 1 Multisplit divides a set of items (keys or key-value pairs) into contiguous buckets, where each bucket contains items whose keys satisfy a programmer-specified criterion (such as falling into a particular range). Multisplit is broadly useful in a wide range of applications, some of which we will cite later in this introduction. But we begin 1 This paper is an extended version of initial results published at PPoPP 2016 [3]. The source code is available at https: //github.com/owensgroup/GpuMultisplit. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. © 2016 ACM. 1539-9087/2017/9-ART39 $15.00 DOI: 0000001.0000001 ACM Transactions on Parallel Computing, Vol. 9, No. 4, Article 39. Publication date: September 2017.

...read moreread less

1 citations

Proceedings Article•DOI•

Parallel and I/O-Efficient Algorithms for Non-Linear Preferential Attachment

[...]

Daniel J. Allendorf, Ulrich Meyer, Manuel Penschuck, Hung Tran

13 Nov 2022

TL;DR: In this article , the authors proposed a simple yet optimal sequential algorithm for the polynomial model and parallelized the algorithm by identifying batches of independent samples and obtaining a near-optimal speedup when adding many nodes.

...read moreread less

Abstract: Preferential attachment lies at the heart of many network models aiming to replicate features of real world networks. To simulate the attachment process, conduct statistical tests, or obtain input data for benchmarks, efficient algorithms are required that are capable of generating large graphs according to these models. Existing graph generators are optimized for the most simple model, where new nodes that arrive in the network are connected to earlier nodes with a probability $P(h) \propto d$ that depends linearly on the degree $d$ of the earlier node $h$. Yet, some networks are better explained by a more general attachment probability $P(h) \propto f(d)$ for some function $f \colon \mathbb N~\to~\mathbb R$. Here, the polynomial case $f(d) = d^\alpha$ where $\alpha \in \mathbb R_{>0}$ is of particular interest. In this paper, we present efficient algorithms that generate graphs according to the more general models. We first design a simple yet optimal sequential algorithm for the polynomial model. We then parallelize the algorithm by identifying batches of independent samples and obtain a near-optimal speedup when adding many nodes. In addition, we present an I/O-efficient algorithm that can even be used for the fully general model. To showcase the efficiency and scalability of our algorithms, we conduct an experimental study and compare their performance to existing solutions.

...read moreread less

1 citations

Proceedings Article•DOI•

On Optimal Balance in B-Trees: What Does It Cost to Stay in Perfect Shape?

[...]

Rolf Fagerberg, David Hammer¹, Ulrich Meyer¹•Institutions (1)

Goethe University Frankfurt¹

01 Jan 2019

TL;DR: A lower bound is proved on the cost of maintaining optimal height ceil[log_B(n)], which shows that this cost must increase from Omega(1/B) to Omega(n/ B) rebalancing per update as n grows from one power of B to the next.

...read moreread less

Abstract: Any B-tree has height at least ceil[log_B(n)]. Static B-trees achieving this height are easy to build. In the dynamic case, however, standard B-tree rebalancing algorithms only maintain a height within a constant factor of this optimum. We investigate exactly how close to ceil[log_B(n)] the height of dynamic B-trees can be maintained as a function of the rebalancing cost. In this paper, we prove a lower bound on the cost of maintaining optimal height ceil[log_B(n)], which shows that this cost must increase from Omega(1/B) to Omega(n/B) rebalancing per update as n grows from one power of B to the next. We also provide an almost matching upper bound, demonstrating this lower bound to be essentially tight. We then give a variant upper bound which can maintain near-optimal height at low cost. As two special cases, we can maintain optimal height for all but a vanishing fraction of values of n using Theta(log_B(n)) amortized rebalancing cost per update and we can maintain a height of optimal plus one using O(1/B) amortized rebalancing cost per update. More generally, for any rebalancing budget, we can maintain (as n grows from one power of B to the next) optimal height essentially up to the point where the lower bound requires the budget to be exceeded, after which optimal height plus one is maintained. Finally, we prove that this balancing scheme gives B-trees with very good storage utilization.

...read moreread less

1 citations

External Memory Algorithms for Diameter and All-Pairs Shortest-Paths on Sparse Graphs

[...]

Lars Arge, Ulrich Meyer¹, Laura Toma, Josep Díaz, Juhani Karhumäki, Arto Lepistö, Donald Sannella - Show less +3 more•Institutions (1)

Max Planck Society¹

01 Jan 2004

TL;DR: I/O-efficient algorithms for diameter and all-pairs shortest-paths (APSP) and it is shown that for unweighted undirected graphs, APSP can be solved with just $O(V \cdot \textrm{sort}(E))$ I/Os.

...read moreread less

Abstract: We develop I/O-efficient algorithms for diameter and all-pairs shortest-paths (APSP). For general undirected graphs G(V,E) with non-negative edge weights and E/V = o(B/ log V) our approaches are the first to achieve o(V 2) I/Os. We also show that for unweighted undirected graphs, APSP can be solved with just $O(V \cdot \textrm{sort}(E))$ I/Os. Both our weighted and unweighted approaches require O(V 2) space. For diameter computations we provide I/O-space tradeoffs. Finally, we provide improved results for both diameter and APSP computation on directed planar graphs.

...read moreread less

1 citations

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
…
23
24
25
26
27
28

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Machine learning

[...]

Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

...read moreread less

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

...read moreread less

13,246 citations

Proceedings Article•DOI•

Random graphs

[...]

Alan Frieze¹•Institutions (1)

Carnegie Mellon University¹

22 Jan 2006

TL;DR: Some of the major results in random graphs and some of the more challenging open problems are reviewed, including those related to the WWW.

...read moreread less

Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

...read moreread less

7,116 citations

Proceedings Article•DOI•

Pregel: a system for large-scale graph processing

[...]

Grzegorz Malewicz, Matthew H. Austern¹, Aart J. C. Bik¹, James C. Dehnert¹, Ilan Horn¹, Naty Leiser¹, Grzegorz Czajkowski¹ - Show less +3 more•Institutions (1)

Google¹

06 Jun 2010

TL;DR: A model for processing large graphs that has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier.

...read moreread less

Abstract: Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs - in some cases billions of vertices, trillions of edges - poses challenges to their efficient processing. In this paper we present a computational model suitable for this task. Programs are expressed as a sequence of iterations, in each of which a vertex can receive messages sent in the previous iteration, send messages to other vertices, and modify its own state and that of its outgoing edges or mutate graph topology. This vertex-centric approach is flexible enough to express a broad set of algorithms. The model has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier. Distribution-related details are hidden behind an abstract API. The result is a framework for processing large graphs that is expressive and easy to program.

...read moreread less

3,840 citations

Journal Article•DOI•

Proceedings of the National Academy of Sciences

[...]

Omar Bagasra

18 Aug 1998-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: It is shown that the full set of hydromagnetic equations admit five more integrals, besides the energy integral, if dissipative processes are absent, which made it possible to formulate a variational principle for the force-free magnetic fields.

...read moreread less

Abstract: where A represents the magnetic vector potential, is an integral of the hydromagnetic equations. This -integral made it possible to formulate a variational principle for the force-free magnetic fields. The integral expresses the fact that motions cannot transform a given field in an entirely arbitrary different field, if the conductivity of the medium isconsidered infinite. In this paper we shall show that the full set of hydromagnetic equations admit five more integrals, besides the energy integral, if dissipative processes are absent. These integrals, as we shall presently verify, are I2 =fbHvdV, (2)

...read moreread less

1,858 citations

Book•

Computational geometry

[...]

F. Frances Yao

02 Jan 1991

1,377 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse