Papers
20 Aug 2006
TL;DR: A new way of measuring and extracting proximity in networks, called "cycle free effective conductance" (CFEC), is proposed; it handles more than two endpoints and directed edges, is statistically well-behaved, and produces an effectiveness score for the computed subgraphs.
Abstract: Measuring distance or some other form of proximity between objects is a standard data mining tool. Connection subgraphs were recently proposed as a way to demonstrate proximity between nodes in networks. We propose a new way of measuring and extracting proximity in networks called "cycle free effective conductance" (CFEC). Our proximity measure can handle more than two endpoints, directed edges, is statistically well-behaved, and produces an effectiveness score for the computed subgraphs. We provide an efficient algorithm. Also, we report experimental results and show examples for three large network data sets: a telecommunications calling graph, the IMDB actors graph, and an academic co-authorship network.
153 citations
••
[...]
TL;DR: This tutorial explores the progress the data integration community has made on schema mapping, record linkage, and data fusion in addressing the novel challenges of big data integration, and identifies a range of open problems for the community.
Abstract: The Big Data era is upon us: data is being generated, collected and analyzed at an unprecedented scale, and data-driven decision making is sweeping through society. Since the value of data explodes when it can be linked and fused with other data, addressing the big data integration (BDI) challenge is critical to realizing the promise of Big Data. BDI differs from traditional data integration in many dimensions: (i) the number of data sources, even for a single domain, has grown to be in the tens of thousands, (ii) many of the data sources are very dynamic, as a huge amount of newly collected data are continuously made available, (iii) the data sources are extremely heterogeneous in their structure, with considerable variety even for substantially similar entities, and (iv) the data sources are of widely differing qualities, with significant differences in the coverage, accuracy and timeliness of data provided. This tutorial explores the progress that has been made by the data integration community on the topics of schema mapping, record linkage and data fusion in addressing these novel challenges faced by big data integration, and identifies a range of open problems for the community.
153 citations
••
12 Feb 2015
TL;DR: The authors believe that cellular operators and content providers can tremendously improve video QoE by predicting available bandwidth and sharing it through APIs.
Abstract: Existing video streaming algorithms use various estimation approaches to infer the inherently variable bandwidth in cellular networks, which often leads to reduced quality of experience (QoE). We ask the question: "If accurate bandwidth prediction were possible in a cellular network, how much can we improve video QoE?". Assuming we know the bandwidth for the entire video session, we show that existing streaming algorithms only achieve between 69%-86% of optimal quality. Since such knowledge may be impractical, we study algorithms that know the available bandwidth for a few seconds into the future. We observe that prediction alone is not sufficient and can in fact lead to degraded QoE. However, when combined with rate stabilization functions, prediction outperforms existing algorithms and reduces the gap with optimal to 4%. Our results lead us to believe that cellular operators and content providers can tremendously improve video QoE by predicting available bandwidth and sharing it through APIs.
153 citations
••
01 May 1997
TL;DR: It is shown that if the view definitions do not contain existential variables, then it is always possible to find a rewriting that is a union of conjunctive queries, and furthermore, this rewriting produces the maximal set of answers possible from the views.
Abstract: The problem of rewriting queries using views is to find a query expression that uses only a set of views V and is equivalent to (or maximally contained in) a given query Q. Rewriting queries using views is important for query optimization and for applications such as information integration and data warehousing. Description logics are a family of logics that were developed for modeling complex hierarchical structures, and can also be viewed as a query language with an interesting tradeoff between complexity and expressive power. We consider the problem of rewriting queries using views expressed in description logics and conjunctive queries over description logics. We show that if the view definitions do not contain existential variables, then it is always possible to find a rewriting that is a union of conjunctive queries, and furthermore, this rewriting produces the maximal set of answers possible from the views. If the views have existential variables, the rewriting may be recursive. We present an algorithm for producing a recursive rewriting that is guaranteed to be a maximal one when the underlying database forms a tree of constants. We show that in general, it is not always possible to find a maximal rewriting.
153 citations
Authors
Showing all 1881 results
Name | H-index | Papers | Citations |
---|---|---|---|
Yoshua Bengio | 202 | 1033 | 420313 |
Scott Shenker | 150 | 454 | 118017 |
Paul Shala Henry | 137 | 318 | 35971 |
Peter Stone | 130 | 1229 | 79713 |
Yann LeCun | 121 | 369 | 171211 |
Louis E. Brus | 113 | 347 | 63052 |
Jennifer Rexford | 102 | 394 | 45277 |
Andreas F. Molisch | 96 | 777 | 47530 |
Vern Paxson | 93 | 267 | 48382 |
Lorrie Faith Cranor | 92 | 326 | 28728 |
Ward Whitt | 89 | 424 | 29938 |
Lawrence R. Rabiner | 88 | 378 | 70445 |
Thomas E. Graedel | 86 | 348 | 27860 |
William W. Cohen | 85 | 384 | 31495 |
Michael K. Reiter | 84 | 380 | 30267 |