Institution

Yahoo!

Company, London, United Kingdom
About: Yahoo! is a company based in London, United Kingdom. It is known for research contributions on the topics Population and Web search query. The organization has 26749 authors who have published 29915 publications receiving 732583 citations. The organization is also known as Yahoo! Inc. and Maudwen-Yahoo! Inc.


Papers
Proceedings Article
12 Aug 2007
TL;DR: This paper formulates the problem of constraint-guided feature projection, which integrates with semi-supervised clustering algorithms and effectively reduces data dimension. It shows that the SCREEN method can deal with high-dimensional data and provides an appealing clustering performance.
Abstract: Semi-supervised clustering employs limited supervision in the form of labeled instances or pairwise instance constraints to aid unsupervised clustering and often significantly improves the clustering performance. Despite the vast amount of expert knowledge spent on this problem, most existing work is not designed for handling high-dimensional sparse data. This paper thus fills this crucial void by developing a Semi-supervised Clustering method based on spheRical K-mEans via fEature projectioN (SCREEN). Specifically, we formulate the problem of constraint-guided feature projection, which can be nicely integrated with semi-supervised clustering algorithms and has the ability to effectively reduce data dimension. Indeed, our experimental results on several real-world data sets show that the SCREEN method can effectively deal with high-dimensional data and provides an appealing clustering performance.

128 citations

Proceedings Article
13 Aug 2016
TL;DR: This paper introduces three key techniques for base relevance -- ranking functions, semantic matching features and query rewriting, and describes solutions for recency sensitive relevance and location sensitive relevance.
Abstract: Search engines play a crucial role in our daily lives. Relevance is the core problem of a commercial search engine. It has attracted thousands of researchers from both academia and industry and has been studied for decades. Relevance in a modern search engine has gone far beyond text matching, and now involves tremendous challenges. The semantic gap between queries and URLs is the main barrier for improving base relevance. Clicks help provide hints to improve relevance, but unfortunately for most tail queries, the click information is too sparse, noisy, or missing entirely. For comprehensive relevance, the recency and location sensitivity of results is also critical. In this paper, we give an overview of the solutions for relevance in the Yahoo search engine. We introduce three key techniques for base relevance -- ranking functions, semantic matching features and query rewriting. We also describe solutions for recency sensitive relevance and location sensitive relevance. This work builds upon 20 years of existing efforts on Yahoo search, summarizes the most recent advances and provides a series of practical relevance solutions. The performance reported is based on Yahoo's commercial search engine, where tens of billions of urls are indexed and served by the ranking system.

128 citations

Journal Article
TL;DR: The results provide further evidence for the continuity of MDD from childhood to adolescence. Compared with children, adolescents had significantly more substance abuse and less comorbid separation anxiety disorder and attention-deficit/hyperactivity disorder.
Abstract: OBJECTIVE: Very few studies have compared the symptoms of major depressive disorder (MDD) and rates of comorbid psychiatric disorders between depressed children and adolescents. The aim of this study was to reproduce and extend these findings. METHOD: The Kiddie Schedule for Affective Disorders and Schizophrenia, present version (KSADS-P) was administered to parents (about their children) and in face-to-face interviews with 916 subjects aged 5.6 to 17.9 years with MDD (DSM criteria) (715 adolescents and 201 children; 348 male and 568 female). The subjects were consecutive referrals to an outpatient mood and anxiety disorders clinic. RESULTS: Depressed adolescents had significantly more hopelessness/helplessness, lack of energy/tiredness, hypersomnia, weight loss, and suicidality compared with children (p values ≤ .001). In comparison with children, adolescents had significantly more substance abuse and less comorbid separation anxiety disorder and attention-deficit/hyperactivity disorder (p values ≤ .001). Depressed female adolescents had significantly more suicidality than depressed male adolescents (p ≤ .001). There were no other sex differences between males and females. The symptoms of depressed adolescents grouped into 3 factors (endogenous, negative cognitions/suicidality, and appetite/weight), whereas the symptoms in children grouped into 2 factors (endogenous/negative cognitions/suicidality and appetite/weight). CONCLUSIONS: These results provide further evidence for the continuity of MDD from childhood to adolescence.

128 citations

Proceedings Article
16 Jun 2012
TL;DR: A novel framework to learn similarity metrics using the class taxonomy is proposed and it is shown that a nearest neighbor classifier using the learned metrics gets improved performance over the best discriminative methods.
Abstract: Categories in multi-class data are often part of an underlying semantic taxonomy. Recent work in object classification has found interesting ways to use this taxonomy structure to develop better recognition algorithms. Here we propose a novel framework to learn similarity metrics using the class taxonomy. We show that a nearest neighbor classifier using the learned metrics gets improved performance over the best discriminative methods. Moreover, by incorporating the taxonomy, our learned metrics can also help in some taxonomy specific applications. We show that the metrics can help determine the correct placement of a new category that was not part of the original taxonomy, and can provide effective classification amongst categories local to specific subtrees of the taxonomy.

127 citations

Proceedings Article
16 Apr 2012
TL;DR: This work invokes statistical order estimation tests for Markov chains to establish that Web users are not Markovian, and derives a number of avenues for further research.
Abstract: User modeling on the Web has rested on the fundamental assumption of Markovian behavior --- a user's next action depends only on her current state, and not the history leading up to the current state. This forms the underpinning of PageRank web ranking, as well as a number of techniques for targeting advertising to users. In this work we examine the validity of this assumption, using data from a number of Web settings. Our main result invokes statistical order estimation tests for Markov chains to establish that Web users are not, in fact, Markovian. We study the extent to which the Markovian assumption is invalid, and derive a number of avenues for further research.

127 citations


Authors


Name                   H-index  Papers  Citations
Ashok Kumar            151      5654    164086
Alexander J. Smola     122      434     110222
Howard I. Maibach      116      1821    60765
Sanjay Jain            103      881     46880
Amirhossein Sahebkar   100      1307    46132
Marc Davis             99       412     50243
Wenjun Zhang           96       976     38530
Jian Xu                94       1366    52057
Fortunato Ciardiello   94       695     47352
Tong Zhang             93       414     36519
Michael E. J. Lean     92       411     30939
Ashish K. Jha          87       503     30020
Xin Zhang              87       1714    40102
Theunis Piersma        86       632     34201
George Varghese        84       253     28598
Network Information
Related Institutions (5)
University of Toronto
294.9K papers, 13.5M citations

85% related

University of California, San Diego
204.5K papers, 12.3M citations

85% related

University College London
210.6K papers, 9.8M citations

84% related

Cornell University
235.5K papers, 12.2M citations

84% related

University of Washington
305.5K papers, 17.7M citations

84% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2023  2
2022  47
2021  1,088
2020  1,074
2019  1,568
2018  1,352