Institution

Yahoo!

Company · London, United Kingdom
About: Yahoo! is a company based in London, United Kingdom. It is known for research contributions in the topics Population and Web search query. The organization has 26749 authors who have published 29915 publications receiving 732583 citations. It is also known as Yahoo! Inc. and Maudwen-Yahoo! Inc.


Papers
Patent
09 Sep 2003
TL;DR: A method and system is presented for improving the efficiency of a database processing system that evaluates candidate data items representing search listings submitted for inclusion in a search engine database.
Abstract: A method and system for improving the efficiency of a database processing system for evaluating candidate data items representing search listings that are submitted for inclusion into a search engine database. Candidate search listings are automatically assessed for quality, style, and relevance to evaluate risk of unfavorable reception by a user and of potential exposure volume. Search listings which are higher-risk or higher-volume are routed through manual editorial review while lower-risk, lower-volume search listings are routed for immediate inclusion in the search database without manual editorial evaluation. Accordingly, human editorial efforts can be devoted to manual review of high-risk or high-volume search listings while efficiency is simultaneously improved in the processing system as a whole.

219 citations
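
The routing idea in the abstract above can be sketched as follows. This is a minimal illustration only: the `Listing` fields, the two thresholds, and the `route` function are hypothetical names and values chosen for the example, since the abstract gives no scoring formula or cut-offs.

```python
from dataclasses import dataclass

# Hypothetical cut-offs; the abstract does not state numeric values.
RISK_THRESHOLD = 0.5
VOLUME_THRESHOLD = 1000


@dataclass
class Listing:
    title: str
    risk_score: float     # assumed result of automated quality/style/relevance checks, 0..1
    expected_volume: int  # assumed estimate of potential exposure volume


def route(listing: Listing) -> str:
    """Send higher-risk or higher-volume listings to manual editorial review."""
    if listing.risk_score >= RISK_THRESHOLD or listing.expected_volume >= VOLUME_THRESHOLD:
        return "manual_review"
    # Lower-risk, lower-volume listings go straight into the search database.
    return "auto_include"
```

Under these assumed thresholds, a listing with a low risk score and low expected volume is included immediately, while exceeding either threshold is enough to trigger manual review.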

Book Chapter
20 Aug 2017
TL;DR: As the miners' population evolves over time, so should the difficulty of Bitcoin's proofs of work. Bitcoin provides this adjustment mechanism, with empirical evidence of a constant block generation rate against such population changes.
Abstract: Bitcoin’s innovative and distributedly maintained blockchain data structure hinges on the adequate degree of difficulty of so-called “proofs of work,” which miners have to produce in order for transactions to be inserted. Importantly, these proofs of work have to be hard enough so that miners have an opportunity to unify their views in the presence of an adversary who interferes but has bounded computational power, but easy enough to be solvable regularly and enable the miners to make progress. As such, as the miners’ population evolves over time, so should the difficulty of these proofs. Bitcoin provides this adjustment mechanism, with empirical evidence of a constant block generation rate against such population changes.

219 citations
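
The adjustment mechanism mentioned in the abstract above can be illustrated with a simplified sketch. The abstract does not give the formula; the code below assumes the commonly documented Bitcoin rule (retargeting every 2016 blocks toward a 10-minute block time, with the adjustment limited to a factor of 4) and expresses it in terms of a floating-point difficulty rather than the integer target used in practice. The function name `retarget` is illustrative.

```python
TARGET_BLOCK_SECONDS = 600   # intended time between blocks (10 minutes)
RETARGET_INTERVAL = 2016     # number of blocks between difficulty adjustments
MAX_ADJUST = 4               # a single adjustment is limited to a factor of 4


def retarget(old_difficulty: float, actual_seconds: float) -> float:
    """Scale difficulty so the next interval is expected to take the target time."""
    expected = TARGET_BLOCK_SECONDS * RETARGET_INTERVAL
    # Clamp the measured timespan so one adjustment cannot exceed the factor-4 limit.
    clamped = min(max(actual_seconds, expected / MAX_ADJUST), expected * MAX_ADJUST)
    # Blocks found too quickly (more mining power) raise difficulty, and vice versa.
    return old_difficulty * expected / clamped
```

If the last interval took half the expected time, for example because the mining population doubled, this sketch doubles the difficulty, which pushes the block generation rate back toward its target.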

Proceedings Article
26 Apr 2018
TL;DR: HARP compresses the input graph prior to embedding it, effectively avoiding troublesome embedding configurations (i.e., local minima) which can pose problems to non-convex optimization.
Abstract: We present HARP, a novel method for learning low dimensional embeddings of a graph’s nodes which preserves higher-order structural features. Our proposed method achieves this by compressing the input graph prior to embedding it, effectively avoiding troublesome embedding configurations (i.e. local minima) which can pose problems to non-convex optimization. HARP works by finding a smaller graph which approximates the global structure of its input. This simplified graph is used to learn a set of initial representations, which serve as good initializations for learning representations in the original, detailed graph. We inductively extend this idea, by decomposing a graph in a series of levels, and then embed the hierarchy of graphs from the coarsest one to the original graph. HARP is a general meta-strategy to improve all of the state-of-the-art neural algorithms for embedding graphs, including DeepWalk, LINE, and Node2vec. Indeed, we demonstrate that applying HARP’s hierarchical paradigm yields improved implementations for all three of these methods, as evaluated on classification tasks on real-world graphs such as DBLP, BlogCatalog, and CiteSeer, where we achieve a performance gain over the original implementations by up to 14% Macro F1.

218 citations

Proceedings Article
07 Dec 2015
TL;DR: Multi-label Canonical Correlation Analysis (ml-CCA), an extension of CCA, is introduced for learning shared subspaces taking into account high level semantic information in the form of multi-label annotations, which results in a discriminative subspace which is better suited for cross-modal retrieval tasks.
Abstract: In this work, we address the problem of cross-modal retrieval in presence of multi-label annotations. In particular, we introduce multi-label Canonical Correlation Analysis (ml-CCA), an extension of CCA, for learning shared subspaces taking into account high level semantic information in the form of multi-label annotations. Unlike CCA, ml-CCA does not rely on explicit pairing between modalities, instead it uses the multi-label information to establish correspondences. This results in a discriminative subspace which is better suited for cross-modal retrieval tasks. We also present Fast ml-CCA, a computationally efficient version of ml-CCA, which is able to handle large scale datasets. We show the efficacy of our approach by conducting extensive cross-modal retrieval experiments on three standard benchmark datasets. The results show that the proposed approach achieves state of the art retrieval performance on the three datasets.

218 citations

Proceedings Article
23 Jul 2007
TL;DR: Using a query log spanning a whole year, a new algorithm is proposed for static caching of posting lists, which outperforms previous methods and can achieve higher hit rates than caching query answers.
Abstract: In this paper we study the trade-offs in designing efficient caching systems for Web search engines. We explore the impact of different approaches, such as static vs. dynamic caching, and caching query results vs. caching posting lists. Using a query log spanning a whole year, we explore the limitations of caching and we demonstrate that caching posting lists can achieve higher hit rates than caching query answers. We propose a new algorithm for static caching of posting lists, which outperforms previous methods. We also study the problem of finding the optimal way to split the static cache between answers and posting lists. Finally, we measure how the changes in the query log affect the effectiveness of static caching, given our observation that the distribution of the queries changes slowly over time. Our results and observations are applicable to different levels of the data-access hierarchy, for instance, for a memory/disk layer or a broker/remote server layer.

217 citations
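
To make the idea of a static posting-list cache concrete, here is a minimal sketch. The abstract above does not describe the paper's algorithm, so the selection rule used here (greedily preferring terms with a high ratio of query frequency to posting-list size) is an assumed knapsack-style heuristic for illustration, and `select_static_cache` and `hit_rate` are hypothetical names.

```python
def select_static_cache(term_stats, capacity):
    """Choose terms whose posting lists are kept in a fixed-size static cache.

    term_stats maps term -> (query_frequency, posting_list_size), size > 0.
    Terms are taken greedily by frequency-to-size ratio (an assumed heuristic).
    """
    ranked = sorted(
        term_stats.items(),
        key=lambda kv: kv[1][0] / kv[1][1],
        reverse=True,
    )
    cached, used = set(), 0
    for term, (_freq, size) in ranked:
        if used + size <= capacity:  # skip lists that no longer fit
            cached.add(term)
            used += size
    return cached


def hit_rate(cached, query_terms):
    """Fraction of query-term lookups served from the static cache."""
    if not query_terms:
        return 0.0
    return sum(term in cached for term in query_terms) / len(query_terms)
```

Because the cache contents are fixed ahead of time from past query statistics, this kind of scheme depends on the query distribution changing slowly, which is the observation the abstract reports.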


Authors

Showing all 26766 results

Name | H-index | Papers | Citations
Ashok Kumar | 151 | 5654 | 164086
Alexander J. Smola | 122 | 434 | 110222
Howard I. Maibach | 116 | 1821 | 60765
Sanjay Jain | 103 | 881 | 46880
Amirhossein Sahebkar | 100 | 1307 | 46132
Marc Davis | 99 | 412 | 50243
Wenjun Zhang | 96 | 976 | 38530
Jian Xu | 94 | 1366 | 52057
Fortunato Ciardiello | 94 | 695 | 47352
Tong Zhang | 93 | 414 | 36519
Michael E. J. Lean | 92 | 411 | 30939
Ashish K. Jha | 87 | 503 | 30020
Xin Zhang | 87 | 1714 | 40102
Theunis Piersma | 86 | 632 | 34201
George Varghese | 84 | 253 | 28598
Network Information
Related Institutions (5)
University of Toronto
294.9K papers, 13.5M citations

85% related

University of California, San Diego
204.5K papers, 12.3M citations

85% related

University College London
210.6K papers, 9.8M citations

84% related

Cornell University
235.5K papers, 12.2M citations

84% related

University of Washington
305.5K papers, 17.7M citations

84% related

Performance
Metrics
No. of papers from the Institution in previous years
Year | Papers
2023 | 2
2022 | 47
2021 | 1,088
2020 | 1,074
2019 | 1,568
2018 | 1,352