Institution

Yahoo!

Company, London, United Kingdom
About: Yahoo! is a company based in London, United Kingdom. It is known for research contributions in the topics Population and Web search query. The organization has 26,749 authors who have published 29,915 publications receiving 732,583 citations. The organization is also known as Yahoo! Inc. and Maudwen-Yahoo! Inc.


Papers
Book Chapter
16 Mar 2010
TL;DR: This paper presents Dedalus, a foundation language for programming and reasoning about distributed systems. Dedalus reduces to a subset of Datalog with negation, aggregate functions, successor and choice, and adds an explicit notion of logical time to the language.
Abstract: Recent research has explored using Datalog-based languages to express a distributed system as a set of logical invariants. Two properties of distributed systems proved difficult to model in Datalog. First, the state of any such system evolves with its execution. Second, deductions in these systems may be arbitrarily delayed, dropped, or reordered by the unreliable network links they must traverse. Previous efforts addressed the former by extending Datalog to include updates, key constraints, persistence and events, and the latter by assuming ordered and reliable delivery while ignoring delay. These details have a semantics outside Datalog, which increases the complexity of the language and its interpretation, and forces programmers to think operationally. We argue that the missing component from these previous languages is a notion of time. In this paper we present Dedalus, a foundation language for programming and reasoning about distributed systems. Dedalus reduces to a subset of Datalog with negation, aggregate functions, successor and choice, and adds an explicit notion of logical time to the language. We show that Dedalus provides a declarative foundation for the two signature features of distributed systems: mutable state, and asynchronous processing and communication. Given these two features, we address two important properties of programs in a domain-specific manner: a notion of safety appropriate to non-terminating computations, and stratified monotonic reasoning with negation over time. We also provide conservative syntactic checks for our temporal notions of safety and stratification. Our experience implementing full-featured systems in variants of Datalog suggests that Dedalus is well-suited to the specification of rich distributed services and protocols, and provides both cleaner semantics and richer tests of correctness.

145 citations

Book Chapter
01 Jan 2010
TL;DR: This chapter surveys recent research on privacy-preserving publishing of graphs and social network data, and groups the state-of-the-art anonymization methods for simple graphs into three main categories: K-anonymity based privacy preservation via edge modification, probabilistic privacy preservation via edge randomization, and privacy preservation via generalization.
Abstract: Social networks have received dramatic interest in research and development. In this chapter, we survey the very recent research development on privacy-preserving publishing of graphs and social network data. We categorize the state-of-the-art anonymization methods on simple graphs into three main categories: K-anonymity based privacy preservation via edge modification, probabilistic privacy preservation via edge randomization, and privacy preservation via generalization. We then review anonymization methods on rich graphs. We finally discuss challenges and propose new research directions in this area.

144 citations

Journal Article
TL;DR: In this paper, the shear strength characteristics of a low lime class F fly ash modified with lime alone or in combination with gypsum were evaluated for both unsoaked and soaked specimens cured up to 90 days.
Abstract: This paper presents the shear strength characteristics of a low lime class F fly ash modified with lime alone or in combination with gypsum. Unconfined compression tests were conducted for both unsoaked and soaked specimens cured up to 90 days. Addition of a small percentage of gypsum (0.5 and 1.0%) along with lime (4–10%) enhanced the shear strength of modified fly ash within short curing periods (7 and 28 days). The gain in unsoaked unconfined compressive strength ($q_u$) of the fly ash was 2,853 and 3,567% at 28 and 90 days curing, respectively, for addition of 10% lime along with 1% gypsum to the fly ash. The effect of 24 h soaking showed reduction of $q_u$ varying from 30 to 2% depending on mix proportions and curing period. Unconsolidated undrained triaxial tests with pore-pressure measurements were conducted for 7 and 28 days cured specimens. The cohesion of the Class F fly ash increased up to 3,150% with addition of 10% lime along with 1% gypsum to the fly ash and cured for 28 days. The modified fly a...

144 citations

Proceedings Article
Fernando Diaz
09 Feb 2009
TL;DR: This paper addresses the issue of integrating search results from a news vertical into web search results, and defines several click-based metrics which allow a system to be monitored and tuned without annotator effort.
Abstract: Aggregated search refers to the integration of content from specialized corpora or verticals into web search results. Aggregation improves search when the user has vertical intent but may not be aware of or desire vertical search. In this paper, we address the issue of integrating search results from a news vertical into web search results. News is particularly challenging because, given a query, the appropriate decision---to integrate news content or not---changes with time. Our system adapts to news intent in two ways. First, by inspecting the dynamics of the news collection and query volume, we can track development of and interest in topics. Second, by using click feedback, we can quickly recover from system errors. We define several click-based metrics which allow a system to be monitored and tuned without annotator effort.

144 citations

Journal Article
TL;DR: Lower bounds for determining the length of the shortest cycle and other graph properties are proved and two general techniques for speeding up the per-edge computation time of streaming algorithms while increasing the space by only a small factor are discussed.
Abstract: We explore problems related to computing graph distances in the data-stream model. The goal is to design algorithms that can process the edges of a graph in an arbitrary order given only a limited amount of working memory. We are motivated by both the practical challenge of processing massive graphs such as the web graph and the desire for a better theoretical understanding of the data-stream model. In particular, we are interested in the trade-offs between model parameters such as per-data-item processing time, total space, and the number of passes that may be taken over the stream. These trade-offs are more apparent when considering graph problems than they were in previous streaming work that solved problems of a statistical nature. Our results include the following:
(1) Spanner construction: There exists a single-pass, $\tilde{O}(tn^{1+1/t})$-space, $\tilde{O}(t^2n^{1/t})$-time-per-edge algorithm that constructs a $(2t+1)$-spanner. For $t=\Omega(\log n/{\log\log n})$, the algorithm satisfies the semistreaming space restriction of $O(n\operatorname{polylog}n)$ and has per-edge processing time $O(\operatorname{polylog}n)$. This resolves an open question from [J. Feigenbaum et al., Theoret. Comput. Sci., 348 (2005), pp. 207-216].
(2) Breadth-first-search (BFS) trees: For any even constant $k$, we show that any algorithm that computes the first $k$ layers of a BFS tree from a prescribed node with probability at least $2/3$ requires either greater than $k/2$ passes or $\tilde{\Omega}(n^{1+1/k})$ space. Since constructing BFS trees is an important subroutine in many traditional graph algorithms, this demonstrates the need for new algorithmic techniques when processing graphs in the data-stream model.
(3) Graph-distance lower bounds: Any $t$-approximation of the distance between two nodes requires $\Omega(n^{1+1/t})$ space. We also prove lower bounds for determining the length of the shortest cycle and other graph properties.
(4) Techniques for decreasing per-edge processing: We discuss two general techniques for speeding up the per-edge computation time of streaming algorithms while increasing the space by only a small factor.

144 citations


Authors

Showing all 26766 results

Name | H-index | Papers | Citations
Ashok Kumar | 151 | 5654 | 164086
Alexander J. Smola | 122 | 434 | 110222
Howard I. Maibach | 116 | 1821 | 60765
Sanjay Jain | 103 | 881 | 46880
Amirhossein Sahebkar | 100 | 1307 | 46132
Marc Davis | 99 | 412 | 50243
Wenjun Zhang | 96 | 976 | 38530
Jian Xu | 94 | 1366 | 52057
Fortunato Ciardiello | 94 | 695 | 47352
Tong Zhang | 93 | 414 | 36519
Michael E. J. Lean | 92 | 411 | 30939
Ashish K. Jha | 87 | 503 | 30020
Xin Zhang | 87 | 1714 | 40102
Theunis Piersma | 86 | 632 | 34201
George Varghese | 84 | 253 | 28598
Network Information
Related Institutions (5)
University of Toronto
294.9K papers, 13.5M citations

85% related

University of California, San Diego
204.5K papers, 12.3M citations

85% related

University College London
210.6K papers, 9.8M citations

84% related

Cornell University
235.5K papers, 12.2M citations

84% related

University of Washington
305.5K papers, 17.7M citations

84% related

Performance
Metrics
No. of papers from the Institution in previous years
Year | Papers
2023 | 2
2022 | 47
2021 | 1,088
2020 | 1,074
2019 | 1,568
2018 | 1,352