Home
/
Authors
/
David R. Karger

Author

David R. Karger

Other affiliations: Stanford University, Akamai Technologies, Bell Labs ...read more

Bio: David R. Karger is an academic researcher from Massachusetts Institute of Technology. The author has contributed to research in topics: Semantic Web & User interface. The author has an hindex of 95, co-authored 349 publications receiving 53806 citations. Previous affiliations of David R. Karger include Stanford University & Akamai Technologies.

Topics: Semantic Web, User interface, RDF, Web modeling, Linear network coding ...read more

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997
1996
1995
1994
1993
1992
1991

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Chord: A scalable peer-to-peer lookup service for internet applications

[...]

Ion Stoica¹, Robert Morris², David R. Karger², M. Frans Kaashoek², Hari Balakrishnan² - Show less +1 more•Institutions (2)

University of California, Berkeley¹, Massachusetts Institute of Technology²

27 Aug 2001

TL;DR: Results from theoretical analysis, simulations, and experiments show that Chord is scalable, with communication cost and the state maintained by each node scaling logarithmically with the number of Chord nodes.

...read moreread less

Abstract: A fundamental problem that confronts peer-to-peer applications is to efficiently locate the node that stores a particular data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data item pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and can answer queries even if the system is continuously changing. Results from theoretical analysis, simulations, and experiments show that Chord is scalable, with communication cost and the state maintained by each node scaling logarithmically with the number of Chord nodes.

...read moreread less

10,286 citations

Journal Article•DOI•

Chord: a scalable peer-to-peer lookup protocol for Internet applications

[...]

Ion Stoica¹, Robert Morris², David Liben-Nowell², David R. Karger², M. Frans Kaashoek², Frank Dabek², Hari Balakrishnan² - Show less +3 more•Institutions (2)

University of California, Berkeley¹, Massachusetts Institute of Technology²

01 Feb 2003-IEEE ACM Transactions on Networking

TL;DR: Results from theoretical analysis and simulations show that Chord is scalable: Communication cost and the state maintained by each node scale logarithmically with the number of Chord nodes.

...read moreread less

Abstract: A fundamental problem that confronts peer-to-peer applications is the efficient location of the node that stores a desired data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and can answer queries even if the system is continuously changing. Results from theoretical analysis and simulations show that Chord is scalable: Communication cost and the state maintained by each node scale logarithmically with the number of Chord nodes.

...read moreread less

3,518 citations

Journal Article•DOI•

A Random Linear Network Coding Approach to Multicast

[...]

Tracey Ho¹, Muriel Medard¹, Ralf Koetter², David R. Karger¹, Michelle Effros³, Jun Shi⁴, Ben Leong¹ - Show less +3 more•Institutions (4)

Massachusetts Institute of Technology¹, University of Illinois at Urbana–Champaign², California Institute of Technology³, University of California, Los Angeles⁴

01 Oct 2006-IEEE Transactions on Information Theory

TL;DR: This work presents a distributed random linear network coding approach for transmission and compression of information in general multisource multicast networks, and shows that this approach can take advantage of redundant network capacity for improved success probability and robustness.

...read moreread less

Abstract: We present a distributed random linear network coding approach for transmission and compression of information in general multisource multicast networks. Network nodes independently and randomly select linear mappings from inputs onto output links over some field. We show that this achieves capacity with probability exponentially approaching 1 with the code length. We also demonstrate that random linear coding performs compression when necessary in a network, generalizing error exponents for linear Slepian-Wolf coding in a natural way. Benefits of this approach are decentralized operation and robustness to network changes or link failures. We show that this approach can take advantage of redundant network capacity for improved success probability and robustness. We illustrate some potential advantages of random linear network coding over routing in two examples of practical scenarios: distributed network operation and networks with dynamically varying connections. Our derivation of these results also yields a new bound on required field size for centralized network coding on general multicast networks

...read moreread less

2,806 citations

Proceedings Article•DOI•

Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web

[...]

David R. Karger¹, Eric Lehman¹, Tom Leighton¹, Rina Panigrahy¹, Matthew S. Levine¹, Daniel M. Lewin¹ - Show less +2 more•Institutions (1)

Massachusetts Institute of Technology¹

04 May 1997

TL;DR: A family of caching protocols for distrib-uted networks that can be used to decrease or eliminate the occurrence of hot spots in the network, based on a special kind of hashing that is called consistent hashing.

...read moreread less

Abstract: We describe a family of caching protocols for distrib-uted networks that can be used to decrease or eliminate the occurrence of hot spots in the network. Our protocols are particularly designed for use with very large networks such as the Internet, where delays caused by hot spots can be severe, and where it is not feasible for every server to have complete information about the current state of the entire network. The protocols are easy to implement using existing network protocols such as TCP/IP, and require very little overhead. The protocols work with local control, make efficient use of existing resources, and scale gracefully as the network grows. Our caching protocols are based on a special kind of hashing that we call consistent hashing. Roughly speaking, a consistent hash function is one which changes minimally as the range of the function changes. Through the development of good consistent hash functions, we are able to develop caching protocols which do not require users to have a current or even consistent view of the network. We believe that consistent hash functions may eventually prove to be useful in other applications such as distributed name servers and/or quorum systems.

...read moreread less

2,179 citations

Proceedings Article•DOI•

A scalable location service for geographic ad hoc routing

[...]

Jinyang Li¹, John Jannotti¹, Douglas S. J. De Couto¹, David R. Karger¹, Robert Morris¹ - Show less +1 more•Institutions (1)

Massachusetts Institute of Technology¹

01 Aug 2000

TL;DR: GLS combined with geographic forwarding allows the construction of ad hoc mobile networks that scale to a larger number of nodes than possible with previous work, and compares favorably with Dynamic Source Routing.

...read moreread less

Abstract: GLS is a new distributed location service which tracks mobile node locations. GLS combined with geographic forwarding allows the construction of ad hoc mobile networks that scale to a larger number of nodes than possible with previous work. GLS is decentralized and runs on the mobile nodes themselves, requiring no fixed infrastructure. Each mobile node periodically updates a small set of other nodes (its location servers) with its current location. A node sends its position updates to its location servers without knowing their actual identities, assisted by a predefined ordering of node identifiers and a predefined geographic hierarchy. Queries for a mobile node's location also use the predefined identifier ordering and spatial hierarchy to find a location server for that node.Experiments using the ns simulator for up to 600 mobile nodes show that the storage and bandwidth requirements of GLS grow slowly with the size of the network. Furthermore, GLS tolerates node failures well: each failure has only a limited effect and query performance degrades gracefully as nodes fail and restart. The query performance of GLS is also relatively insensitive to node speeds. Simple geographic forwarding combined with GLS compares favorably with Dynamic Source Routing (DSR): in larger networks (over 200 nodes) our approach delivers more packets, but consumes fewer network resources.

...read moreread less

1,769 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Basics of qualitative research: Grounded theory procedures and techniques

[...]

Pamela Jordan

01 Jun 1994-Journal of Pediatric Nursing

TL;DR: Reading a book as this basics of qualitative research grounded theory procedures and techniques and other references can enrich your life quality.

...read moreread less

13,415 citations

Proceedings Article•DOI•

Chord: A scalable peer-to-peer lookup service for internet applications

[...]

Ion Stoica¹, Robert Morris², David R. Karger², M. Frans Kaashoek², Hari Balakrishnan² - Show less +1 more•Institutions (2)

University of California, Berkeley¹, Massachusetts Institute of Technology²

27 Aug 2001

...read moreread less

10,286 citations

Journal Article•DOI•

Authoritative sources in a hyperlinked environment

[...]

Jon Kleinberg¹•Institutions (1)

Cornell University¹

01 Sep 1999-Journal of the ACM

TL;DR: This work proposes and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of “hub pages” that join them together in the link structure, and has connections to the eigenvectors of certain matrices associated with the link graph.

...read moreread less

Abstract: The network structure of a hyperlinked environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. We develop a set of algorithmic tools for extracting information from the link structures of such environments, and report on experiments that demonstrate their effectiveness in a variety of context on the World Wide Web. The central issue we address within our framework is the distillation of broad search topics, through the discovery of “authorative” information sources on such topics. We propose and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of “hub pages” that join them together in the link structure. Our formulation has connections to the eigenvectors of certain matrices associated with the link graph; these connections in turn motivate additional heuristrics for link-based analysis.

...read moreread less

8,328 citations

Book•

Machine Learning : A Probabilistic Perspective

[...]

Kevin P. Murphy

24 Aug 2012

TL;DR: This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

...read moreread less

Abstract: Today's Web-enabled deluge of electronic data calls for automated methods of data analysis. Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package--PMTK (probabilistic modeling toolkit)--that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

...read moreread less

8,059 citations

Book Chapter•DOI•

Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems

[...]

Antony Rowstron¹, Peter Druschel²•Institutions (2)

Microsoft¹, Rice University²

12 Nov 2001-Lecture Notes in Computer Science

TL;DR: Pastry as mentioned in this paper is a scalable, distributed object location and routing substrate for wide-area peer-to-peer ap- plications, which performs application-level routing and object location in a po- tentially very large overlay network of nodes connected via the Internet.

...read moreread less

Abstract: This paper presents the design and evaluation of Pastry, a scalable, distributed object location and routing substrate for wide-area peer-to-peer ap- plications. Pastry performs application-level routing and object location in a po- tentially very large overlay network of nodes connected via the Internet. It can be used to support a variety of peer-to-peer applications, including global data storage, data sharing, group communication and naming. Each node in the Pastry network has a unique identifier (nodeId). When presented with a message and a key, a Pastry node efficiently routes the message to the node with a nodeId that is numerically closest to the key, among all currently live Pastry nodes. Each Pastry node keeps track of its immediate neighbors in the nodeId space, and notifies applications of new node arrivals, node failures and recoveries. Pastry takes into account network locality; it seeks to minimize the distance messages travel, according to a to scalar proximity metric like the number of IP routing hops. Pastry is completely decentralized, scalable, and self-organizing; it automatically adapts to the arrival, departure and failure of nodes. Experimental results obtained with a prototype implementation on an emulated network of up to 100,000 nodes confirm Pastry's scalability and efficiency, its ability to self-organize and adapt to node failures, and its good network locality properties.

...read moreread less

7,423 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse