Milan Vojnovic

London School of Economics and Political Science

Other affiliations: Microsoft, University of Split, École Polytechnique Fédérale de Lausanne

Bio: Milan Vojnovic is an academic researcher at the London School of Economics and Political Science. The author has contributed to research on the topics Node (networking) and Scheduling (computing). The author has an h-index of 36 and has co-authored 122 publications receiving 6168 citations. Previous affiliations of Milan Vojnovic include Microsoft and University of Split.

Papers published on a yearly basis

[Chart: publications per year; years shown are 1998 to 2003 and 2005 to 2022]

Papers

Proceedings Article

QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding

Dan Alistarh¹, Demjan Grubic, Jerry Li², Ryota Tomioka³, Milan Vojnovic⁴

Institute of Science and Technology Austria¹, Massachusetts Institute of Technology², Microsoft³, London School of Economics and Political Science⁴

06 Dec 2017

TL;DR: Quantized SGD (QSGD) is a family of compression schemes for gradient updates that provides convergence guarantees for convex and non-convex objectives, including under asynchrony, and can be extended to stochastic variance-reduced techniques.

Abstract: Parallel implementations of stochastic gradient descent (SGD) have received significant research attention, thanks to its excellent scalability properties. A fundamental barrier when parallelizing SGD is the high bandwidth cost of communicating gradient updates between nodes; consequently, several lossy compression heuristics have been proposed, by which nodes only communicate quantized gradients. Although effective in practice, these heuristics do not always guarantee convergence, and it is not clear whether they can be improved. In this paper, we propose Quantized SGD (QSGD), a family of compression schemes for gradient updates which provides convergence guarantees. QSGD allows the user to smoothly trade off communication bandwidth and convergence time: nodes can adjust the number of bits sent per iteration, at the cost of possibly higher variance. We show that this trade-off is inherent, in the sense that improving it past some threshold would violate information-theoretic lower bounds. QSGD guarantees convergence for convex and non-convex objectives, under asynchrony, and can be extended to stochastic variance-reduced techniques. When applied to training deep neural networks for image classification and automated speech recognition, QSGD leads to significant reductions in end-to-end training time. For example, on 16 GPUs, we can train the ResNet152 network to full accuracy on ImageNet 1.8x faster than the full-precision variant.

759 citations
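
To illustrate the bandwidth/variance trade-off mentioned in the abstract above, here is a minimal sketch of an unbiased stochastic quantizer. It is an illustration under stated assumptions, not code from the paper: the function name `quantize`, the `levels` parameter, and the choice to scale by the Euclidean norm are assumptions made for this example, and the encoding step that turns the quantized values into bits is omitted.

```python
import math
import random


def quantize(vec, levels, rng):
    """Stochastically round each coordinate of vec onto a grid of levels + 1 points.

    Each output equals norm * sign * (k / levels) for an integer k in [0, levels].
    Rounding up happens with probability equal to the fractional part, so the
    expected output equals the input coordinate (the quantizer is unbiased).
    NOTE: names and the norm-scaling choice are assumptions for illustration.
    """
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0] * len(vec)
    out = []
    for x in vec:
        scaled = abs(x) / norm * levels  # position on the grid, in [0, levels]
        lower = math.floor(scaled)
        k = lower + (1 if rng.random() < scaled - lower else 0)
        out.append(math.copysign(norm * k / levels, x))
    return out


# Fewer levels means fewer distinct values to transmit but noisier gradients.
rng = random.Random(0)
gradient = [0.3, -1.2, 0.0, 2.5]
print(quantize(gradient, 1, rng))   # coarse grid
print(quantize(gradient, 16, rng))  # finer grid
```

With a small `levels` value each coordinate can take only a handful of values, which is cheap to send but adds variance; increasing `levels` moves the output closer to the original gradient at a higher communication cost.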

Proceedings Article

Power law and exponential decay of inter contact times between mobile devices

Thomas Karagiannis¹, Jean-Yves Le Boudec², Milan Vojnovic¹

Microsoft¹, École Polytechnique Fédérale de Lausanne²

09 Sep 2007

TL;DR: The fundamental properties that determine the basic performance metrics for opportunistic communications are examined, and empirical evidence is presented that the return time of a mobile device to its favorite location site may already explain the observed dichotomy.

Abstract: We examine the fundamental properties that determine the basic performance metrics for opportunistic communications. We first consider the distribution of inter-contact times between mobile devices. Using a diverse set of measured mobility traces, we find as an invariant property that there is a characteristic time, order of half a day, beyond which the distribution decays exponentially. Up to this value, the distribution in many cases follows a power law, as shown in recent work. This power-law finding was previously used to support the hypothesis that inter-contact time has a power law tail, and that common mobility models are not adequate. However, we observe that the time scale of interest for opportunistic forwarding may be of the same order as the characteristic time, and thus the exponential tail is important. We further show that already simple models such as random walk and random way point can exhibit the same dichotomy in the distribution of inter-contact time as in empirical traces. Finally, we perform an extensive analysis of several properties of human mobility patterns across several dimensions, and we present empirical evidence that the return time of a mobile device to its favorite location site may already explain the observed dichotomy. Our findings suggest that existing results on the performance of forwarding schemes based on power-law tails might be overly pessimistic.

687 citations

Proceedings Article

Perfect simulation and stationarity of a class of mobility models

J.-Y. Le Boudec¹, Milan Vojnovic²

École Normale Supérieure¹, Microsoft²

13 Mar 2005

TL;DR: A generic mobility model for independent mobiles that contains as special cases the random waypoint on convex or non convex domains, random walk with reflection or wrapping, city section, space graph and other models is defined.

Abstract: We define "random trip", a generic mobility model for independent mobiles that contains as special cases: the random waypoint on convex or non convex domains, random walk with reflection or wrapping, city section, space graph and other models. We use Palm calculus to study the model and give a necessary and sufficient condition for a stationary regime to exist. When this condition is satisfied, we compute the stationary regime and give an algorithm to start a simulation in steady state (perfect simulation). The algorithm does not require the knowledge of geometric constants. For the special case of random waypoint, we provide for the first time a proof and a sufficient and necessary condition of the existence of a stationary regime. Further, we extend its applicability to a broad class of non convex and multi-site examples, and provide a ready-to-use algorithm for perfect simulation. For the special case of random walks with reflection or wrapping, we show that, in the stationary regime, the mobile location is uniformly distributed and is independent of the speed vector, and that there is no speed decay. Our framework provides a rich set of well understood models that can be used to simulate mobile networks with independent node movements. Our perfect sampling is implemented to use with ns-2, and it is freely available to download from http://ica1www.epfl.ch/RandomTrip.

503 citations

Posted Content

QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding

Dan Alistarh¹, Demjan Grubic, Jerry Li², Ryota Tomioka³, Milan Vojnovic⁴

Institute of Science and Technology Austria¹, Massachusetts Institute of Technology², Microsoft³, London School of Economics and Political Science⁴

07 Oct 2016-arXiv: Learning

TL;DR: Quantized SGD is proposed, a family of compression schemes for gradient updates which provides convergence guarantees and leads to significant reductions in end-to-end training time, and can be extended to stochastic variance-reduced techniques.

Abstract: Parallel implementations of stochastic gradient descent (SGD) have received significant research attention, thanks to excellent scalability properties of this algorithm, and to its efficiency in the context of training deep neural networks. A fundamental barrier for parallelizing large-scale SGD is the fact that the cost of communicating the gradient updates between nodes can be very large. Consequently, lossy compression heuristics have been proposed, by which nodes only communicate quantized gradients. Although effective in practice, these heuristics do not always provably converge, and it is not clear whether they are optimal. In this paper, we propose Quantized SGD (QSGD), a family of compression schemes which allow the compression of gradient updates at each node, while guaranteeing convergence under standard assumptions. QSGD allows the user to trade off compression and convergence time: it can communicate a sublinear number of bits per iteration in the model dimension, and can achieve asymptotically optimal communication cost. We complement our theoretical results with empirical data, showing that QSGD can significantly reduce communication cost, while being competitive with standard uncompressed techniques on a variety of real tasks. In particular, experiments show that gradient quantization applied to training of deep neural networks for image classification and automated speech recognition can lead to significant reductions in communication cost, and end-to-end training time. For instance, on 16 GPUs, we are able to train a ResNet-152 network on ImageNet 1.8x faster to full accuracy. Of note, we show that there exist generic parameter settings under which all known network architectures preserve or slightly improve their full accuracy when using quantization.

419 citations

Proceedings Article

FENNEL: streaming graph partitioning for massive scale graphs

Charalampos E. Tsourakakis¹, Christos Gkantsidis², Bozidar Radunovic², Milan Vojnovic²

Aalto University¹, Microsoft²

24 Feb 2014

TL;DR: This work derives a novel one-pass, streaming graph partitioning algorithm and shows that it yields significant performance improvements over previous approaches using an extensive set of real-world and synthetic graphs.

Abstract: Balanced graph partitioning in the streaming setting is a key problem to enable scalable and efficient computations on massive graph data such as web graphs, knowledge graphs, and graphs arising in the context of online social networks. Two families of heuristics for graph partitioning in the streaming setting are in wide use: place the newly arrived vertex in the cluster with the largest number of neighbors or in the cluster with the least number of non-neighbors. In this work, we introduce a framework which unifies the two seemingly orthogonal heuristics and allows us to quantify the interpolation between them. More generally, the framework enables a well principled design of scalable, streaming graph partitioning algorithms that are amenable to distributed implementations. We derive a novel one-pass, streaming graph partitioning algorithm and show that it yields significant performance improvements over previous approaches using an extensive set of real-world and synthetic graphs. Surprisingly, despite the fact that our algorithm is a one-pass streaming algorithm, we found its performance to be in many cases comparable to the de-facto standard offline software METIS and in some cases even superior. For instance, for the Twitter graph with more than 1.4 billion edges, our method partitions the graph in about 40 minutes achieving a balanced partition that cuts as few as 6.8% of edges, whereas it took more than 8½ hours by METIS to produce a balanced partition that cuts 11.98% of edges. We also demonstrate the performance gains by using our graph partitioner while solving standard PageRank computation in a graph processing platform with respect to the communication cost and runtime.

324 citations
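
The first heuristic named in the abstract above (place each arriving vertex in the cluster that already holds most of its neighbors) can be sketched as follows. This is a minimal illustration and not the FENNEL algorithm itself: the hard `capacity` limit, the tie-breaking rule, and the names `stream_partition` and `cut_fraction` are assumptions made for this example.

```python
def stream_partition(vertex_stream, num_parts, capacity):
    """One-pass greedy partitioning: most already-placed neighbors wins.

    vertex_stream yields (vertex, neighbors) pairs in arrival order.
    Parts that reached `capacity` are skipped (an assumed balance rule);
    ties go to the smaller part, then to the lower part index.
    """
    assignment = {}
    sizes = [0] * num_parts
    for vertex, neighbors in vertex_stream:
        counts = [0] * num_parts
        for n in neighbors:
            if n in assignment:  # only neighbors seen so far can vote
                counts[assignment[n]] += 1
        open_parts = [p for p in range(num_parts) if sizes[p] < capacity]
        if not open_parts:
            raise ValueError("capacity too small for the stream")
        best = max(open_parts, key=lambda p: (counts[p], -sizes[p]))
        assignment[vertex] = best
        sizes[best] += 1
    return assignment


def cut_fraction(edges, assignment):
    """Fraction of edges whose endpoints landed in different parts."""
    cut = sum(1 for u, v in edges if assignment[u] != assignment[v])
    return cut / len(edges)


# Two triangles joined by a single edge, streamed in vertex order.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
stream = [(0, [1, 2]), (1, [0, 2]), (2, [0, 1, 3]),
          (3, [2, 4, 5]), (4, [3, 5]), (5, [3, 4])]
parts = stream_partition(stream, num_parts=2, capacity=3)
print(parts, cut_fraction(edges, parts))
```

The fraction of cut edges is the same quality measure the abstract reports for the Twitter graph; the result of a one-pass heuristic like this one generally depends on the order in which vertices arrive.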

Cited by

On a Few Developments in Convex Analysis (Convex Analysisの二,三の進展について)

Toru Maruyama (丸山徹)

01 Feb 1977

5,933 citations

Journal Article

Stable non-Gaussian random processes, by G. Samorodnitsky and M. S. Taqqu. Pp. 632. £49.50. 1994. ISBN 0-412-05171-0 (Chapman and Hall).

Dave Applebaum¹

Nottingham Trent University¹

01 Nov 1995-The Mathematical Gazette

2,345 citations

Journal Article

Fair end-to-end window-based congestion control

Jeonghoon Mo¹, Jean Walrand²

AT&T Labs¹, University of California, Berkeley²

01 Oct 2000-IEEE ACM Transactions on Networking

TL;DR: The existence of fair end-to-end window-based congestion control protocols for packet-switched networks with first come-first served routers is demonstrated using a Lyapunov function.

Abstract: In this paper, we demonstrate the existence of fair end-to-end window-based congestion control protocols for packet-switched networks with first come-first served routers. Our definition of fairness generalizes proportional fairness and includes arbitrarily close approximations of max-min fairness. The protocols use only information that is available to end hosts and are designed to converge reasonably fast. Our study is based on a multiclass fluid model of the network. The convergence of the protocols is proved using a Lyapunov function. The technical challenge is in the practical implementation of the protocols.

2,161 citations

Proceedings Article

The ONE simulator for DTN protocol evaluation

Ari Keränen¹, Jörg Ott¹, Teemu Kärkkäinen¹

Helsinki University of Technology¹

02 Mar 2009

TL;DR: This paper presents the Opportunistic Networking Environment (ONE) simulator specifically designed for evaluating DTN routing and application protocols, and shows sample simulations to demonstrate the simulator's flexible support for DTN protocol evaluation.

Abstract: Delay-tolerant Networking (DTN) enables communication in sparse mobile ad-hoc networks and other challenged environments where traditional networking fails and new routing and application protocols are required. Past experience with DTN routing and application protocols has shown that their performance is highly dependent on the underlying mobility and node characteristics. Evaluating DTN protocols across many scenarios requires suitable simulation tools. This paper presents the Opportunistic Networking Environment (ONE) simulator specifically designed for evaluating DTN routing and application protocols. It allows users to create scenarios based upon different synthetic movement models and real-world traces and offers a framework for implementing routing and application protocols (already including six well-known routing protocols). Interactive visualization and post-processing tools support evaluating experiments and an emulation mode allows the ONE simulator to become part of a real-world DTN testbed. We show sample simulations to demonstrate the simulator's flexible support for DTN protocol evaluation.

2,075 citations

Journal Article

Social capital: a theory of social structure and action

Alain Degenne¹

University of Caen Lower Normandy¹

01 Nov 2004-Tempo Social

1,995 citations
