Institution
Yahoo!
Company•London, United Kingdom•
About: Yahoo! is a company based in London, United Kingdom. It is known for research contributions in the topics Population and Web search query. The organization has 26749 authors who have published 29915 publications receiving 732583 citations. The organization is also known as Yahoo! Inc. and Maudwen-Yahoo! Inc.
Papers
••
16 Mar 2010
TL;DR: Dedalus is presented, a foundation language for programming and reasoning about distributed systems that reduces to a subset of Datalog with negation, aggregate functions, successor and choice, and adds an explicit notion of logical time to the language.
Abstract: Recent research has explored using Datalog-based languages to express a distributed system as a set of logical invariants. Two properties of distributed systems proved difficult to model in Datalog. First, the state of any such system evolves with its execution. Second, deductions in these systems may be arbitrarily delayed, dropped, or reordered by the unreliable network links they must traverse. Previous efforts addressed the former by extending Datalog to include updates, key constraints, persistence and events, and the latter by assuming ordered and reliable delivery while ignoring delay. These details have a semantics outside Datalog, which increases the complexity of the language and its interpretation, and forces programmers to think operationally. We argue that the missing component from these previous languages is a notion of time.
In this paper we present Dedalus, a foundation language for programming and reasoning about distributed systems. Dedalus reduces to a subset of Datalog with negation, aggregate functions, successor and choice, and adds an explicit notion of logical time to the language. We show that Dedalus provides a declarative foundation for the two signature features of distributed systems: mutable state, and asynchronous processing and communication. Given these two features, we address two important properties of programs in a domain-specific manner: a notion of safety appropriate to non-terminating computations, and stratified monotonic reasoning with negation over time. We also provide conservative syntactic checks for our temporal notions of safety and stratification. Our experience implementing full-featured systems in variants of Datalog suggests that Dedalus is well-suited to the specification of rich distributed services and protocols, and provides both cleaner semantics and richer tests of correctness.
145 citations
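One way to picture the Dedalus abstract's claim that mutable state can be given a declarative reading is to index every fact by a logical timestamp and derive the facts at time t+1 from those at time t. The sketch below is plain Python, not Dedalus syntax, and the persistence rule it uses (a fact carries over unless it is explicitly deleted) is an assumption made for illustration. The names `step`, `run`, and the `events` layout are hypothetical.

```python
def step(state, inserts, deletes):
    """Derive the facts at time t+1 from the facts at time t.

    Assumed rule (illustration only): a fact persists unless deleted,
    and inserted facts become visible at the next timestamp.
    """
    return frozenset((state - deletes) | inserts)


def run(initial, events, horizon):
    """Build a history {timestamp: frozenset of facts} up to `horizon`.

    `events` maps a timestamp t to a pair (inserts, deletes) that take
    effect at t+1. Each timestamp's fact set is never modified in place.
    """
    history = {0: frozenset(initial)}
    for t in range(horizon):
        inserts, deletes = events.get(t, (frozenset(), frozenset()))
        history[t + 1] = step(history[t], frozenset(inserts), frozenset(deletes))
    return history


# Hypothetical usage: a lock moves from holder "a" to holder "b".
history = run(
    initial={("lock", "a")},
    events={1: ({("lock", "b")}, {("lock", "a")})},
    horizon=3,
)
```

Nothing is overwritten here: the "update" is a new set of facts at a later timestamp, so the whole history can still be reasoned about as a set of timestamped facts.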
••
01 Jan 2010
TL;DR: This chapter surveys the very recent research development on privacy-preserving publishing of graphs and social network data, and categorizes the state-of-the-art anonymization methods on simple graphs in three main categories: K-anonymity based privacy preservation via edge modification, probabilistic privacy preservation via edge randomization, and privacy preservation via generalization.
Abstract: Social networks have received dramatic interest in research and development. In this chapter, we survey the very recent research development on privacy-preserving publishing of graphs and social network data. We categorize the state-of-the-art anonymization methods on simple graphs in three main categories: K-anonymity based privacy preservation via edge modification, probabilistic privacy preservation via edge randomization, and privacy preservation via generalization. We then review anonymization methods on rich graphs. We finally discuss challenges and propose new research directions in this area.
144 citations
••
TL;DR: In this paper, the shear strength characteristics of a low lime class F fly ash modified with lime alone or in combination with gypsum were evaluated for both unsoaked and soaked specimens cured up to 90 days.
Abstract: This paper presents the shear strength characteristics of a low lime class F fly ash modified with lime alone or in combination with gypsum. Unconfined compression tests were conducted for both unsoaked and soaked specimens cured up to 90 days. Addition of a small percentage of gypsum (0.5 and 1.0%) along with lime (4–10%) enhanced the shear strength of modified fly ash within short curing periods (7 and 28 days). The gain in unsoaked unconfined compressive strength ($q_u$) of the fly ash was 2,853 and 3,567% at 28 and 90 days curing, respectively, for addition of 10% lime along with 1% gypsum to the fly ash. The effect of 24 h soaking showed reduction of $q_u$ varying from 30 to 2% depending on mix proportions and curing period. Unconsolidated undrained triaxial tests with pore-pressure measurements were conducted for 7 and 28 days cured specimens. The cohesion of the Class F fly ash increased up to 3,150% with addition of 10% lime along with 1% gypsum to the fly ash and cured for 28 days. The modified fly a...
144 citations
••
09 Feb 2009
TL;DR: This paper addresses the issue of integrating search results from a news vertical into web search results, and defines several click-based metrics which allow a system to be monitored and tuned without annotator effort.
Abstract: Aggregated search refers to the integration of content from specialized corpora or verticals into web search results. Aggregation improves search when the user has vertical intent but may not be aware of or desire vertical search. In this paper, we address the issue of integrating search results from a news vertical into web search results. News is particularly challenging because, given a query, the appropriate decision (to integrate news content or not) changes with time. Our system adapts to news intent in two ways. First, by inspecting the dynamics of the news collection and query volume, we can track development of and interest in topics. Second, by using click feedback, we can quickly recover from system errors. We define several click-based metrics which allow a system to be monitored and tuned without annotator effort.
144 citations
••
TL;DR: Lower bounds for determining the length of the shortest cycle and other graph properties are proved and two general techniques for speeding up the per-edge computation time of streaming algorithms while increasing the space by only a small factor are discussed.
Abstract: We explore problems related to computing graph distances in the data-stream model. The goal is to design algorithms that can process the edges of a graph in an arbitrary order given only a limited amount of working memory. We are motivated by both the practical challenge of processing massive graphs such as the web graph and the desire for a better theoretical understanding of the data-stream model. In particular, we are interested in the trade-offs between model parameters such as per-data-item processing time, total space, and the number of passes that may be taken over the stream. These trade-offs are more apparent when considering graph problems than they were in previous streaming work that solved problems of a statistical nature. Our results include the following:
(1) Spanner construction: There exists a single-pass, $\tilde{O}(tn^{1+1/t})$-space, $\tilde{O}(t^2n^{1/t})$-time-per-edge algorithm that constructs a $(2t+1)$-spanner. For $t=\Omega(\log n/{\log\log n})$, the algorithm satisfies the semistreaming space restriction of $O(n\operatorname{polylog}n)$ and has per-edge processing time $O(\operatorname{polylog}n)$. This resolves an open question from [J. Feigenbaum et al., Theoret. Comput. Sci., 348 (2005), pp. 207-216].
(2) Breadth-first-search (BFS) trees: For any even constant $k$, we show that any algorithm that computes the first $k$ layers of a BFS tree from a prescribed node with probability at least $2/3$ requires either greater than $k/2$ passes or $\tilde{\Omega}(n^{1+1/k})$ space. Since constructing BFS trees is an important subroutine in many traditional graph algorithms, this demonstrates the need for new algorithmic techniques when processing graphs in the data-stream model.
(3) Graph-distance lower bounds: Any $t$-approximation of the distance between two nodes requires $\Omega(n^{1+1/t})$ space. We also prove lower bounds for determining the length of the shortest cycle and other graph properties.
(4) Techniques for decreasing per-edge processing: We discuss two general techniques for speeding up the per-edge computation time of streaming algorithms while increasing the space by only a small factor.
144 citations
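To make the spanner result in the abstract above concrete, here is a minimal single-pass sketch of what a $(2t+1)$-spanner is. It uses the textbook greedy rule (keep an edge only if its endpoints are currently more than $2t+1$ hops apart in the edges kept so far), swapped in for illustration. It is not the paper's algorithm: the greedy rule runs a bounded breadth-first search per edge, whereas the paper's contribution is a much faster per-edge time. The graph is assumed unweighted and undirected, and the function names are hypothetical.

```python
from collections import deque


def bounded_distance(adj, src, dst, limit):
    """Return the hop distance from src to dst if it is <= limit, else None."""
    if src == dst:
        return 0
    seen = {src}
    frontier = deque([(src, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == limit:
            continue  # do not expand beyond the hop limit
        for nxt in adj.get(node, ()):
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return None


def greedy_stream_spanner(edge_stream, t):
    """Single pass over the edges; returns the kept edges of a (2t+1)-spanner."""
    stretch = 2 * t + 1
    adj = {}   # adjacency sets of the spanner built so far
    kept = []
    for u, v in edge_stream:
        if u == v:
            continue  # self-loops never affect distances
        # Drop the edge if the spanner already connects u and v within
        # `stretch` hops; otherwise keeping it is required.
        if bounded_distance(adj, u, v, stretch) is None:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
            kept.append((u, v))
    return kept


# Hypothetical usage with t = 1 (stretch 3) on a 4-cycle and a 5-cycle.
four_cycle = greedy_stream_spanner([(0, 1), (1, 2), (2, 3), (3, 0)], t=1)
five_cycle = greedy_stream_spanner([(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)], t=1)
```

On the 4-cycle the closing edge is dropped because its endpoints are already 3 hops apart; on the 5-cycle they are 4 hops apart, so every edge is kept.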
Authors
Showing all 26766 results
Name | H-index | Papers | Citations |
---|---|---|---|
Ashok Kumar | 151 | 5654 | 164086 |
Alexander J. Smola | 122 | 434 | 110222 |
Howard I. Maibach | 116 | 1821 | 60765 |
Sanjay Jain | 103 | 881 | 46880 |
Amirhossein Sahebkar | 100 | 1307 | 46132 |
Marc Davis | 99 | 412 | 50243 |
Wenjun Zhang | 96 | 976 | 38530 |
Jian Xu | 94 | 1366 | 52057 |
Fortunato Ciardiello | 94 | 695 | 47352 |
Tong Zhang | 93 | 414 | 36519 |
Michael E. J. Lean | 92 | 411 | 30939 |
Ashish K. Jha | 87 | 503 | 30020 |
Xin Zhang | 87 | 1714 | 40102 |
Theunis Piersma | 86 | 632 | 34201 |
George Varghese | 84 | 253 | 28598 |