Institution
Yahoo!
Company • London, United Kingdom
About: Yahoo! is a company based in London, United Kingdom. It is known for research contributions in the topics Population and Web search query. The organization has 26749 authors who have published 29915 publications receiving 732583 citations. The organization is also known as Yahoo! Inc. and Maudwen-Yahoo! Inc.
Papers
12 Aug 2007
TL;DR: This paper formulates the problem of constraint-guided feature projection, which integrates well with semi-supervised clustering algorithms and effectively reduces data dimensionality. It shows that the SCREEN method can handle high-dimensional data and provides appealing clustering performance.
Abstract: Semi-supervised clustering employs limited supervision in the form of labeled instances or pairwise instance constraints to aid unsupervised clustering and often significantly improves the clustering performance. Despite the vast amount of expert knowledge spent on this problem, most existing work is not designed for handling high-dimensional sparse data. This paper thus fills this crucial void by developing a Semi-supervised Clustering method based on spheRical K-mEans via fEature projectioN (SCREEN). Specifically, we formulate the problem of constraint-guided feature projection, which can be nicely integrated with semi-supervised clustering algorithms and has the ability to effectively reduce data dimension. Indeed, our experimental results on several real-world data sets show that the SCREEN method can effectively deal with high-dimensional data and provides an appealing clustering performance.
128 citations
13 Aug 2016
TL;DR: This paper introduces three key techniques for base relevance (ranking functions, semantic matching features, and query rewriting) and describes solutions for recency-sensitive and location-sensitive relevance.
Abstract: Search engines play a crucial role in our daily lives. Relevance is the core problem of a commercial search engine. It has attracted thousands of researchers from both academia and industry and has been studied for decades. Relevance in a modern search engine has gone far beyond text matching, and now involves tremendous challenges. The semantic gap between queries and URLs is the main barrier for improving base relevance. Clicks help provide hints to improve relevance, but unfortunately for most tail queries, the click information is too sparse, noisy, or missing entirely. For comprehensive relevance, the recency and location sensitivity of results is also critical. In this paper, we give an overview of the solutions for relevance in the Yahoo search engine. We introduce three key techniques for base relevance -- ranking functions, semantic matching features and query rewriting. We also describe solutions for recency sensitive relevance and location sensitive relevance. This work builds upon 20 years of existing efforts on Yahoo search, summarizes the most recent advances and provides a series of practical relevance solutions. The performance reported is based on Yahoo's commercial search engine, where tens of billions of urls are indexed and served by the ranking system.
128 citations
TL;DR: The results provide further evidence for the continuity of MDD from childhood to adolescence. Compared with children, adolescents had significantly more substance abuse and less comorbid separation anxiety disorder and attention-deficit/hyperactivity disorder.
Abstract: OBJECTIVE: Very few studies have compared the symptoms of major depressive disorder (MDD) and rates of comorbid psychiatric disorders between depressed children and adolescents. The aim of this study was to reproduce and extend these findings. METHOD: The Kiddie Schedule for Affective Disorders and Schizophrenia, present version (KSADS-P) was administered to parents (about their children) and in face-to-face interviews with 916 subjects aged 5.6 to 17.9 years with MDD (DSM criteria) (715 adolescents and 201 children; 348 male and 568 female). The subjects were consecutive referrals to an outpatient mood and anxiety disorders clinic. RESULTS: Depressed adolescents had significantly more hopelessness/helplessness, lack of energy/tiredness, hypersomnia, weight loss, and suicidality compared with children (p values ≤ .001). In comparison with children, adolescents had significantly more substance abuse and less comorbid separation anxiety disorder and attention-deficit/hyperactivity disorder (p values ≤ .001). Depressed female adolescents had significantly more suicidality than depressed male adolescents (p ≤ .001). There were no other sex differences between males and females. The symptoms of depressed adolescents grouped into 3 factors (endogenous, negative cognitions/suicidality, and appetite/weight), whereas the symptoms in children grouped into 2 factors (endogenous/negative cognitions/suicidality and appetite/weight). CONCLUSIONS: These results provide further evidence for the continuity of MDD from childhood to adolescence.
128 citations
16 Jun 2012
TL;DR: A novel framework to learn similarity metrics using the class taxonomy is proposed, and it is shown that a nearest neighbor classifier using the learned metrics improves on the best discriminative methods.
Abstract: Categories in multi-class data are often part of an underlying semantic taxonomy. Recent work in object classification has found interesting ways to use this taxonomy structure to develop better recognition algorithms. Here we propose a novel framework to learn similarity metrics using the class taxonomy. We show that a nearest neighbor classifier using the learned metrics gets improved performance over the best discriminative methods. Moreover, by incorporating the taxonomy, our learned metrics can also help in some taxonomy specific applications. We show that the metrics can help determine the correct placement of a new category that was not part of the original taxonomy, and can provide effective classification amongst categories local to specific subtrees of the taxonomy.
127 citations
16 Apr 2012
TL;DR: This work invokes statistical order estimation tests for Markov chains to establish that Web users are not Markovian, and derives a number of avenues for further research.
Abstract: User modeling on the Web has rested on the fundamental assumption of Markovian behavior --- a user's next action depends only on her current state, and not the history leading up to the current state. This forms the underpinning of PageRank web ranking, as well as a number of techniques for targeting advertising to users. In this work we examine the validity of this assumption, using data from a number of Web settings. Our main result invokes statistical order estimation tests for Markov chains to establish that Web users are not, in fact, Markovian. We study the extent to which the Markovian assumption is invalid, and derive a number of avenues for further research.
127 citations
Authors
Name | H-index | Papers | Citations |
---|---|---|---|
Ashok Kumar | 151 | 5654 | 164086 |
Alexander J. Smola | 122 | 434 | 110222 |
Howard I. Maibach | 116 | 1821 | 60765 |
Sanjay Jain | 103 | 881 | 46880 |
Amirhossein Sahebkar | 100 | 1307 | 46132 |
Marc Davis | 99 | 412 | 50243 |
Wenjun Zhang | 96 | 976 | 38530 |
Jian Xu | 94 | 1366 | 52057 |
Fortunato Ciardiello | 94 | 695 | 47352 |
Tong Zhang | 93 | 414 | 36519 |
Michael E. J. Lean | 92 | 411 | 30939 |
Ashish K. Jha | 87 | 503 | 30020 |
Xin Zhang | 87 | 1714 | 40102 |
Theunis Piersma | 86 | 632 | 34201 |
George Varghese | 84 | 253 | 28598 |