Author

Vladimir Braverman

Bio: Vladimir Braverman is an academic researcher from Johns Hopkins University. The author has contributed to research in topics including computer science and coresets. The author has an h-index of 25 and has co-authored 158 publications receiving 2,475 citations. Previous affiliations of Vladimir Braverman include University of California, Los Angeles and Google.


Papers
Proceedings ArticleDOI
22 Aug 2016
TL;DR: This paper presents UnivMon, a framework for flow monitoring that leverages recent theoretical advances to achieve both generality and high accuracy; a range of trace-driven evaluations shows that it offers comparable (and sometimes better) accuracy relative to custom sketching solutions.
Abstract: Network management requires accurate estimates of metrics for traffic engineering (e.g., heavy hitters), anomaly detection (e.g., entropy of source addresses), and security (e.g., DDoS detection). Obtaining accurate estimates given router CPU and memory constraints is a challenging problem. Existing approaches fall in one of two undesirable extremes: (1) low fidelity general-purpose approaches such as sampling, or (2) high fidelity but complex algorithms customized to specific application-level metrics. Ideally, a solution should be both general (i.e., supports many applications) and provide accuracy comparable to custom algorithms. This paper presents UnivMon, a framework for flow monitoring which leverages recent theoretical advances and demonstrates that it is possible to achieve both generality and high accuracy. UnivMon uses an application-agnostic data plane monitoring primitive; different (and possibly unforeseen) estimation algorithms run in the control plane, and use the statistics from the data plane to compute application-level metrics. We present a proof-of-concept implementation of UnivMon using P4 and develop simple coordination techniques to provide a ``one-big-switch'' abstraction for network-wide monitoring. We evaluate the effectiveness of UnivMon using a range of trace-driven evaluations and show that it offers comparable (and sometimes better) accuracy relative to custom sketching solutions.
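To make the data-plane/control-plane split above concrete, here is a toy Python sketch in which a single application-agnostic counter structure stands in for the data plane and two independent control-plane routines derive different metrics from it. The Count-Min structure, parameter values, and candidate-key tracking are illustrative assumptions only; UnivMon's actual primitive is a universal sketch, which is more involved.

```python
import hashlib
import math

class CountMinSketch:
    """Toy application-agnostic data-plane primitive: a d x w counter
    array updated once per packet (not UnivMon's actual universal sketch)."""

    def __init__(self, d=4, w=4096):
        self.d, self.w = d, w
        self.table = [[0] * w for _ in range(d)]
        self.total = 0

    def _bucket(self, row, key):
        h = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(h, "big") % self.w

    def update(self, key, count=1):
        self.total += count
        for row in range(self.d):
            self.table[row][self._bucket(row, key)] += count

    def estimate(self, key):
        return min(self.table[row][self._bucket(row, key)]
                   for row in range(self.d))

# Control-plane estimators: different application-level metrics computed
# from the same data-plane statistics, mirroring the architecture above.
def heavy_hitters(sk, candidates, frac):
    return {k: sk.estimate(k) for k in candidates
            if sk.estimate(k) >= frac * sk.total}

def entropy_over(sk, candidates):
    # Crude empirical-entropy estimate restricted to tracked candidate keys.
    return -sum((f / sk.total) * math.log2(f / sk.total)
                for k in candidates if (f := sk.estimate(k)) > 0)
```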

440 citations

Proceedings Article
21 Nov 2020
TL;DR: This paper introduces a novel algorithm, called FetchSGD, which compresses model updates using a Count Sketch, and then takes advantage of the mergeability of sketches to combine model updates from many workers.
Abstract: Existing approaches to federated learning suffer from a communication bottleneck as well as convergence issues due to sparse client participation. In this paper we introduce a novel algorithm, called FetchSGD, to overcome these challenges. FetchSGD compresses model updates using a Count Sketch, and then takes advantage of the mergeability of sketches to combine model updates from many workers. A key insight in the design of FetchSGD is that, because the Count Sketch is linear, momentum and error accumulation can both be carried out within the sketch. This allows the algorithm to move momentum and error accumulation from clients to the central aggregator, overcoming the challenges of sparse client participation while still achieving high compression rates and good convergence. We prove that FetchSGD has favorable convergence guarantees, and we demonstrate its empirical effectiveness by training two residual networks and a transformer model.
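Since the Count Sketch is a linear map, sums and scalings (and therefore momentum and error accumulation) can be applied directly to sketch tables, which is the key insight the abstract highlights. Below is a minimal NumPy illustration of one aggregator round in that spirit; the dimensions, hyperparameters, median unsketching, and update ordering are assumptions for illustration, not the paper's reference implementation.

```python
import numpy as np

d, r, c = 10_000, 5, 500            # model dimension and sketch shape (assumed)
rng = np.random.default_rng(0)
bucket = rng.integers(0, c, size=(r, d))
sign = rng.choice([-1.0, 1.0], size=(r, d))

def sketch(vec):
    """Count Sketch is linear: sketch(a + b) == sketch(a) + sketch(b)."""
    T = np.zeros((r, c))
    for i in range(r):
        np.add.at(T[i], bucket[i], sign[i] * vec)
    return T

def unsketch(T):
    """Median-of-rows estimate of every coordinate."""
    rows = [sign[i] * T[i, bucket[i]] for i in range(r)]
    return np.median(np.stack(rows), axis=0)

rho, lr, k = 0.9, 0.1, 100          # momentum, step size, top-k (assumed)
S_m = np.zeros((r, c))              # momentum accumulator, kept in sketch space
S_e = np.zeros((r, c))              # error accumulator, kept in sketch space

def server_round(client_sketches, weights):
    """One aggregator step: merge worker sketches, run momentum and error
    accumulation on the sketches themselves, then extract a sparse update."""
    global S_m, S_e
    S_g = sum(client_sketches) / len(client_sketches)   # mergeability
    S_m = rho * S_m + S_g                               # momentum in the sketch
    S_e = S_e + lr * S_m                                # error feedback in the sketch
    est = unsketch(S_e)
    delta = np.zeros(d)
    top = np.argsort(np.abs(est))[-k:]                  # heavy coordinates
    delta[top] = est[top]
    S_e -= sketch(delta)                                # keep the residual as error
    return weights - delta
```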

169 citations

Proceedings ArticleDOI
19 Aug 2019
TL;DR: This paper presents the design and implementation of NitroSketch, a sketching framework that systematically addresses the performance bottlenecks of sketches without sacrificing robustness or generality, and implements it on three popular software platforms.
Abstract: Software switches are emerging as a vital measurement vantage point in many networked systems. Sketching algorithms or sketches, provide high-fidelity approximate measurements, and appear as a promising alternative to traditional approaches such as packet sampling. However, sketches incur significant computation overhead in software switches. Existing efforts in implementing sketches in virtual switches make sacrifices on one or more of the following dimensions: performance (handling 40 Gbps line-rate packet throughput with low CPU footprint), robustness (accuracy guarantees across diverse workloads), and generality (supporting various measurement tasks). In this work, we present the design and implementation of NitroSketch, a sketching framework that systematically addresses the performance bottlenecks of sketches without sacrificing robustness and generality. Our key contribution is the careful synthesis of rigorous, yet practical solutions to reduce the number of per-packet CPU and memory operations. We implement NitroSketch on three popular software platforms (Open vSwitch-DPDK, FD.io-VPP, and BESS) and evaluate the performance. We show that accuracy is comparable to unmodified sketches while attaining up to two orders of magnitude speedup, and up to 45% reduction in CPU usage.
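One per-packet bottleneck the abstract alludes to is touching every sketch row on every packet. The toy sketch below illustrates the general flavor of the fix: update each row only with probability p and scale the increment by 1/p, keeping estimates unbiased in expectation while cutting expected per-packet work by a factor of p. All names and parameters are assumed; NitroSketch itself uses geometric sampling and further optimizations rather than per-row coin flips.

```python
import hashlib
import random

class SampledCountMin:
    """Count-Min with probabilistic row updates: each packet updates a
    given row only with probability p, adding 1/p so counts stay unbiased
    in expectation. A rough illustration of the sampling idea only."""

    def __init__(self, d=4, w=4096, p=0.1):
        self.d, self.w, self.p = d, w, p
        self.table = [[0.0] * w for _ in range(d)]

    def _bucket(self, row, key):
        h = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(h, "big") % self.w

    def update(self, key):
        for row in range(self.d):
            if random.random() < self.p:      # expected p*d row touches per packet
                self.table[row][self._bucket(row, key)] += 1.0 / self.p

    def estimate(self, key):
        return min(self.table[row][self._bucket(row, key)]
                   for row in range(self.d))
```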

140 citations

Posted Content
TL;DR: This work introduces a new technique for converting an offline coreset construction to the streaming setting, and provides the first generalizations of such coresets for handling outliers.
Abstract: Let $P$ be a set (called points), $Q$ be a set (called queries) and a function $ f:P\times Q\to [0,\infty)$ (called cost). For an error parameter $\epsilon>0$, a set $S\subseteq P$ with a \emph{weight function} $w:P \rightarrow [0,\infty)$ is an $\epsilon$-coreset if $\sum_{s\in S}w(s) f(s,q)$ approximates $\sum_{p\in P} f(p,q)$ up to a multiplicative factor of $1\pm\epsilon$ for every given query $q\in Q$. We construct coresets for the $k$-means clustering of $n$ input points, both in an arbitrary metric space and $d$-dimensional Euclidean space. For Euclidean space, we present the first coreset whose size is simultaneously independent of both $d$ and $n$. In particular, this is the first coreset of size $o(n)$ for a stream of $n$ sparse points in a $d \ge n$ dimensional space (e.g. adjacency matrices of graphs). We also provide the first generalizations of such coresets for handling outliers. For arbitrary metric spaces, we improve the dependence on $k$ to $k \log k$ and present a matching lower bound. For $M$-estimator clustering (special cases include the well-known $k$-median and $k$-means clustering), we introduce a new technique for converting an offline coreset construction to the streaming setting. Our method yields streaming coreset algorithms requiring the storage of $O(S + k \log n)$ points, where $S$ is the size of the offline coreset. In comparison, the previous state-of-the-art was the merge-and-reduce technique that required $O(S \log^{2a+1} n)$ points, where $a$ is the exponent in the offline construction's dependence on $\epsilon^{-1}$. For example, combining our offline and streaming results, we produce a streaming metric $k$-means coreset algorithm using $O(\epsilon^{-2} k \log k \log n)$ points of storage. The previous state-of-the-art required $O(\epsilon^{-4} k \log k \log^{6} n)$ points.
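The $\epsilon$-coreset definition above can be checked numerically for any finite family of queries. Here is a small NumPy helper specialized to Euclidean k-means cost (function names and the query format are illustrative assumptions):

```python
import numpy as np

def kmeans_cost(points, weights, centers):
    """Weighted k-means cost: sum over p of w(p) * min_c ||p - c||^2."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return float(weights @ d2.min(axis=1))

def is_eps_coreset(P, S, w, queries, eps):
    """Check the definition: for every query q (a set of centers), the
    weighted cost on (S, w) is within a (1 +/- eps) factor of the cost on P."""
    unit = np.ones(len(P))
    return all(
        (1 - eps) * kmeans_cost(P, unit, q)
        <= kmeans_cost(S, w, q)
        <= (1 + eps) * kmeans_cost(P, unit, q)
        for q in queries
    )
```

A construction like the paper's would produce (S, w) via importance sampling with carefully chosen weights; plain uniform sampling will generally fail this check on skewed inputs.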

136 citations

Proceedings ArticleDOI
21 Oct 2007
TL;DR: This paper presents a new smooth histograms method that improves the approximation error rate obtained via exponential histograms and provides the first approximation algorithms for the following functions: $L_p$ norms for $p \notin [1,2]$, frequency moments, length of increasing subsequence, and geometric mean.
Abstract: In the streaming model elements arrive sequentially and can be observed only once. Maintaining statistics and aggregates is an important and non-trivial task in the model. This becomes even more challenging in the sliding windows model, where statistics must be maintained only over the most recent n elements. In their pioneering paper, Datar, Gionis, Indyk and Motwani [15] presented exponential histograms, an effective method for estimating statistics on sliding windows. In this paper we present a new smooth histograms method that improves the approximation error rate obtained via exponential histograms. Furthermore, our smooth histograms method not only captures and improves multiple previous results on sliding windows but also extends the class of functions that can be approximated on sliding windows. In particular, we provide the first approximation algorithms for the following functions: $L_p$ norms for $p \notin [1,2]$, frequency moments, length of increasing subsequence, and geometric mean.
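As a concrete toy instance of the method's mechanism, the skeleton below maintains smooth-histogram "instances" for the simplest smooth function, the count over a sliding window of size n: a middle instance is pruned once its successor's value is within a (1 - beta) factor of its predecessor's, so only a logarithmic number of instances survive. Class and variable names, and the expiry bookkeeping, are illustrative assumptions.

```python
class SmoothHistogram:
    """Toy smooth histogram for the count over the last n elements.
    Each 'instance' counts elements seen since its own start time."""

    def __init__(self, n, beta):
        self.n, self.beta = n, beta
        self.instances = []            # [start_time, value], oldest first

    def add(self, t):
        self.instances.append([t, 0])
        for inst in self.instances:    # every live instance sees element t
            inst[1] += 1
        # Prune: if instance i+2 is already within (1 - beta) of instance i,
        # the middle instance i+1 carries no extra information.
        i = 0
        while i + 2 < len(self.instances):
            if self.instances[i + 2][1] >= (1 - self.beta) * self.instances[i][1]:
                del self.instances[i + 1]
            else:
                i += 1
        # Expire: keep exactly one instance starting at or before the
        # window boundary, so the true count stays bracketed.
        while len(self.instances) > 1 and self.instances[1][0] <= t - self.n + 1:
            del self.instances[0]

    def query(self):
        # (1 + O(beta))-approximate count over the last n elements.
        return self.instances[0][1] if self.instances else 0
```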

130 citations


Cited by
Book ChapterDOI
01 Jan 1998
TL;DR: In this chapter, the authors explore questions of existence and uniqueness for solutions to stochastic differential equations and offer a study of their properties, using diffusion processes as a model of a Markov process with continuous sample paths.
Abstract: We explore in this chapter questions of existence and uniqueness for solutions to stochastic differential equations and offer a study of their properties. This endeavor is really a study of diffusion processes. Loosely speaking, the term diffusion is attributed to a Markov process which has continuous sample paths and can be characterized in terms of its infinitesimal generator.

2,446 citations

01 Jan 2009
TL;DR: The aim of the research presented in this thesis is to create new methods for design for manufacturing using several approaches of KE, and to find the beneficial and less beneficial aspects of these methods in comparison to each other and to earlier research.
Abstract: As companies strive to develop artefacts intended for services instead of traditional sell-off, new challenges in the product development process arise to promote continuous improvement and increasing market profits. This creates a focus on product life-cycle components as companies then make life-cycle commitments, where they are responsible for the function availability during the extent of the life-cycle, i.e. functional products. One of these life-cycle components is manufacturing; therefore, companies search for new approaches of success during manufacturability evaluation already in engineering design. Efforts have been made to support early engineering design, as this phase sets constraints and opportunities for manufacturing. These efforts have turned into design for manufacturing methods and guidelines. A further step to improve the life-cycle focus during early engineering design is to reuse results and use experience from earlier projects. However, because results and experiences created during project work are often not documented for reuse but only remembered by some people, there is a need for design support. Knowledge engineering (KE) is a methodology for creating knowledge-based systems, e.g. systems that enable reuse of earlier results and make available both explicit and tacit corporate knowledge, enabling the automated generation and evaluation of new engineering design solutions during early product development. There are a variety of KE approaches, such as knowledge-based engineering, case-based reasoning and programming, which have been used in research to develop design for manufacturing methods and applications. There are, however, opportunities for research that investigates several approaches and their interdependencies, to create a transparent picture of how KE can be used to support engineering design. The aim of the research presented in this thesis is to create new methods for design for manufacturing using several approaches of KE, and to find the beneficial and less beneficial aspects of these methods in comparison to each other and to earlier research. This thesis presents methods and applications for design for manufacturing using KE. KE has been employed in several ways, namely rule-based; rule-, programming- and finite element analysis (FEA)-based; and rule- and plan-based, which are tested and compared with each other. Results show that KE can be used to generate information about manufacturing in several ways. The rule-based way is suitable for supporting life-cycle commitments, as engineering design and manufacturing can be integrated with maintenance and performance predictions during early engineering design, though it is limited to the firing of production rules. The rule-, programming- and FEA-based way can be used to integrate computer-aided design tools and virtual manufacturing for non-linear stress and displacement analysis. This way may also bridge the gap between engineering designers and computational experts, even though it requires a larger programming effort than the rule-based way. The rule- and plan-based way can enable design for manufacturing in two fashions: based on earlier manufacturing plans and based on rules. Because earlier manufacturing plans, together with programming algorithms, can handle knowledge that may be more intricate to capture as rules, as opposed to the time-demanding routine work that is often automated by means of rules, several opportunities for designing for manufacturing exist.

727 citations

Proceedings ArticleDOI
25 Oct 2008
TL;DR: In this article, a stream cipher $S$ is constructed whose implementation is secure even if a bounded amount of arbitrary (adversarially chosen) information on the internal state of $S$ is leaked during computation.
Abstract: We construct a stream cipher $S$ whose implementation is secure even if a bounded amount of arbitrary (adversarially chosen) information on the internal state of $S$ is leaked during computation. This captures all possible side-channel attacks on $S$ where the amount of information leaked in a given period is bounded, but overall can be arbitrarily large. The only other assumption we make on the implementation of $S$ is that only data that is accessed during computation leaks information. The stream cipher $S$ generates its output in chunks $K_1, K_2, \ldots$, and arbitrary but bounded information leakage is modeled by allowing the adversary to adaptively choose a function $f_\ell : \{0,1\}^* \to \{0,1\}^\lambda$ before $K_\ell$ is computed; she then gets $f_\ell(\tau_\ell)$, where $\tau_\ell$ is the internal state of $S$ that is accessed during the computation of $K_\ell$. One notion of security we prove for $S$ is that $K_\ell$ is indistinguishable from random when given $K_1, \ldots, K_{\ell-1}$, $f_1(\tau_1), \ldots, f_{\ell-1}(\tau_{\ell-1})$, and also the complete internal state of $S$ after $K_\ell$ has been computed (i.e. $S$ is forward-secure). The construction is based on alternating extraction (used in the intrusion-resilient secret-sharing scheme from FOCS'07). We move this concept to the computational setting by proving a lemma that states that the output of any PRG has high HILL pseudoentropy (i.e. is indistinguishable from some distribution with high min-entropy) even if arbitrary information about the seed is leaked. The amount of leakage $\lambda$ that we can tolerate in each step depends on the strength of the underlying PRG; it is at least logarithmic, but can be as large as a constant fraction of the internal state of $S$ if the PRG is exponentially hard.
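The forward-security claim above (past chunks stay pseudorandom given the current state) has a simple shape that can be illustrated with a toy key-evolving generator; the hash-based PRG stand-in below is purely an assumption for illustration and does not implement the paper's alternating-extraction construction or model any leakage.

```python
import hashlib

def prg(state: bytes) -> tuple[bytes, bytes]:
    """Toy length-doubling PRG stand-in (a hash is used here only for
    illustration): expands the current state into (next_state, chunk)."""
    nxt = hashlib.sha256(state + b"\x00").digest()
    out = hashlib.sha256(state + b"\x01").digest()
    return nxt, out

state = bytes(32)                    # secret initial state (all zeros here)
chunks = []
for _ in range(3):
    state, K = prg(state)            # emit K_l, then overwrite the old state
    chunks.append(K)
# After the loop, 'state' alone reveals nothing about the earlier chunks,
# since each prior state was erased: that is the forward-security shape.
```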

519 citations

Journal ArticleDOI
13 May 2014
TL;DR: The techniques developed in this area are now finding applications in other areas including data structures for dynamic graphs, approximation algorithms, and distributed and parallel computation.
Abstract: Over the last decade, there has been considerable interest in designing algorithms for processing massive graphs in the data stream model. The original motivation was two-fold: a) in many applications, the dynamic graphs that arise are too large to be stored in the main memory of a single machine and b) considering graph problems yields new insights into the complexity of stream computation. However, the techniques developed in this area are now finding applications in other areas including data structures for dynamic graphs, approximation algorithms, and distributed and parallel computation. We survey the state-of-the-art results; identify general techniques; and highlight some simple algorithms that illustrate basic ideas.

405 citations