scispace - formally typeset
Search or ask a question
Author

Eli Upfal

Bio: Eli Upfal is an academic researcher from Brown University. The author has contributed to research in topics: Equal-cost multi-path routing & Randomized algorithm. The author has an hindex of 60, co-authored 287 publications receiving 16234 citations. Previous affiliations of Eli Upfal include Stanford University & Hebrew University of Jerusalem.


Papers
More filters
Book
01 Jan 2005
TL;DR: Preface 1. Events and probability 2. Discrete random variables and expectation 3. Moments and deviations 4. Chernoff bounds 5. Balls, bins and random graphs 6. Probabilistic method 7. Markov chains and random walks 8. Continuous distributions and the Poisson process
Abstract: Preface 1. Events and probability 2. Discrete random variables and expectation 3. Moments and deviations 4. Chernoff bounds 5. Balls, bins and random graphs 6. The probabilistic method 7. Markov chains and random walks 8. Continuous distributions and the Poisson process 9. Entropy, randomness and information 10. The Monte Carlo method 11. Coupling of Markov chains 12. Martingales 13. Pairwise independence and universal hash functions 14. Balanced allocations References.

2,543 citations

Journal ArticleDOI
01 Sep 1999
TL;DR: It is shown that with high probability, the fullest box contains only ln ln n/ln 2 + O(1) balls---exponentially less than before and a similar gap exists in the infinite process, where at each step one ball, chosen uniformly at random, is deleted, and one ball is added in the manner above.
Abstract: Suppose that we sequentially place $n$ balls into n boxes by putting each ball into a randomly chosen box. It is well known that when we are done, the fullest box has with high probability (1 + o(1))ln n/ln ln n balls in it. Suppose instead that for each ball we choose two boxes at random and place the ball into the one which is less full at the time of placement. We show that with high probability, the fullest box contains only ln ln n/ln 2 + O(1) balls---exponentially less than before. Furthermore, we show that a similar gap exists in the infinite process, where at each step one ball, chosen uniformly at random, is deleted, and one ball is added in the manner above. We discuss consequences of this and related theorems for dynamic resource allocation, hashing, and on-line load balancing.

878 citations

Proceedings ArticleDOI
12 Nov 2000
TL;DR: The results are two fold: it is shown that graphs generated using the proposed random graph models exhibit the statistics observed on the Web graph, and additionally, that natural graph models proposed earlier do not exhibit them.
Abstract: The Web may be viewed as a directed graph each of whose vertices is a static HTML Web page, and each of whose edges corresponds to a hyperlink from one Web page to another. We propose and analyze random graph models inspired by a series of empirical observations on the Web. Our graph models differ from the traditional G/sub n,p/ models in two ways: 1. Independently chosen edges do not result in the statistics (degree distributions, clique multitudes) observed on the Web. Thus, edges in our model are statistically dependent on each other. 2. Our model introduces new vertices in the graph as time evolves. This captures the fact that the Web is changing with time. Our results are two fold: we show that graphs generated using our model exhibit the statistics observed on the Web graph, and additionally, that natural graph models proposed earlier do not exhibit them. This remains true even when these earlier models are generalized to account for the arrival of vertices over time. In particular, the sparse random graphs in our models exhibit properties that do not arise in far denser random graphs generated by Erdos-Renyi models.

768 citations

Journal ArticleDOI
TL;DR: This work uses a diffusion process on the interaction network to define a local neighborhood of "influence" for each mutated gene in the network, and derives a two-stage multiple hypothesis test to bound the false discovery rate (FDR) associated with the identified subnetworks.
Abstract: Recent genome sequencing studies have shown that the somatic mutations that drive cancer development are distributed across a large number of genes. This mutational heterogeneity complicates efforts to distinguish functional mutations from sporadic, passenger mutations. Since cancer mutations are hypothesized to target a relatively small number of cellular signaling and regulatory pathways, a common practice is to assess whether known pathways are enriched for mutated genes. We introduce an alternative approach that examines mutated genes in the context of a genome-scale gene interaction network. We present a computationally efficient strategy for de novo identification of subnetworks in an interaction network that are mutated in a statistically significant number of patients. This framework includes two major components. First, we use a diffusion process on the interaction network to define a local neighborhood of "influence" for each mutated gene in the network. Second, we derive a two-stage multiple hypothesis test to bound the false discovery rate (FDR) associated with the identified subnetworks. We test these algorithms on a large human protein-protein interaction network using somatic mutation data from glioblastoma and lung adenocarcinoma samples. We successfully recover pathways that are known to be important in these cancers and also identify additional pathways that have been implicated in other cancers but not previously reported as mutated in these samples. We anticipate that our approach will find increasing use as cancer genome studies increase in size and scope.

431 citations

Journal ArticleDOI
TL;DR: It is proved that any routing scheme for general networks that achieves a stretch factor k ≥ 1 must use a total of &OHgr; bits of routing information in the networks, which is a trade-off between the efficiency of a routing scheme and its space requirements.
Abstract: Two conflicting goals play a crucial role in the design of routing schemes for communication networks. A routing scheme should use paths that are as short as possible for routing messages in the network, while keeping the routing information stored in the processors' local memory as succinct as possible. The efficiency of a routing scheme is measured in terms of its stretch factor-the maximum ratio between the length of a route computed by the scheme and that of a shortest path connecting the same pair of vertices.Most previous work has concentrated on finding good routing schemes (with a small fixed stretch factor) for special classes of network topologies. In this paper the problem for general networks is studied, and the entire range of possible stretch factors is examined. The results exhibit a trade-off between the efficiency of a routing scheme and its space requirements. Almost tight upper and lower bounds for this trade-off are presented. Specifically, it is proved that any routing scheme for general n-vertex networks that achieves a stretch factor k ≥ 1 must use a total of O(n1+1/(2k+4)) bits of routing information in the networks. This lower bound is complemented by a family K(k) of hierarchical routing schemes (for every k ≥ l) for unit-cost general networks, which guarantee a stretch factor of O(k), require storing a total of O(k3n1+(1/h)logn)- bits of routing information in the network, name the vertices with O(log2n)-bit names and use O(logn)-bit headers.

402 citations


Cited by
More filters
Book
08 Sep 2000
TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Abstract: The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, it's still always evolving and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge. Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third of edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining stream, mining social networks, and mining spatial, multimedia and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges. * Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects. * Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields. *Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data

23,600 citations

28 Jul 2005
TL;DR: PfPMP1)与感染红细胞、树突状组胞以及胎盘的单个或多个受体作用,在黏附及免疫逃避中起关键的作�ly.
Abstract: 抗原变异可使得多种致病微生物易于逃避宿主免疫应答。表达在感染红细胞表面的恶性疟原虫红细胞表面蛋白1(PfPMP1)与感染红细胞、内皮细胞、树突状细胞以及胎盘的单个或多个受体作用,在黏附及免疫逃避中起关键的作用。每个单倍体基因组var基因家族编码约60种成员,通过启动转录不同的var基因变异体为抗原变异提供了分子基础。

18,940 citations

Journal ArticleDOI
TL;DR: Developments in this field are reviewed, including such concepts as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks.
Abstract: Inspired by empirical studies of networked systems such as the Internet, social networks, and biological networks, researchers have in recent years developed a variety of techniques and models to help us understand or predict the behavior of these systems. Here we review developments in this field, including such concepts as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks.

17,647 citations

01 Jan 2002

9,314 citations