Institution
Yahoo!
Company • London, United Kingdom
About: Yahoo! is a company based in London, United Kingdom. It is known for research contributions on the topics Population and Web search query. The organization has 26749 authors who have published 29915 publications receiving 732583 citations. The organization is also known as Yahoo! Inc. and Maudwen-Yahoo! Inc.
Papers
03 Dec 2012
TL;DR: This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA).
Abstract: Topic modeling is a generalization of clustering that posits that observations (words in a document) are generated by multiple latent factors (topics), as opposed to just one. The increased representational power comes at the cost of a more challenging unsupervised learning problem for estimating the topic-word distributions when only words are observed, and the topics are hidden. This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA). For LDA, the procedure correctly recovers both the topic-word distributions and the parameters of the Dirichlet prior over the topic mixtures, using only trigram statistics (i.e., third order moments, which may be estimated with documents containing just three words). The method is based on an efficiently computable orthogonal tensor decomposition of low-order moments.
271 citations
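The orthogonal tensor decomposition mentioned in the abstract above can be illustrated with a tensor power iteration followed by deflation. This is a minimal sketch on a synthetic tensor, not the authors' full procedure: it assumes the third-order moment tensor has already been whitened into orthogonally decomposable form, and the function names `tensor_power_iteration` and `decompose` and the toy weights are hypothetical.

```python
import numpy as np

def tensor_power_iteration(T, n_iter=100, seed=0):
    """Find one (eigenvalue, eigenvector) pair of a symmetric 3-way tensor."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=T.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        # Contract T with v along two modes: u_i = sum_jk T[i, j, k] v_j v_k
        u = np.einsum("ijk,j,k->i", T, v, v)
        v = u / np.linalg.norm(u)
    lam = np.einsum("ijk,i,j,k->", T, v, v, v)
    return lam, v

def decompose(T, rank, n_iter=100):
    """Recover `rank` components by repeated power iteration and deflation."""
    T = T.copy()
    pairs = []
    for r in range(rank):
        lam, v = tensor_power_iteration(T, n_iter, seed=r)
        pairs.append((lam, v))
        # Deflation: subtract the rank-one component that was just found.
        T -= lam * np.einsum("i,j,k->ijk", v, v, v)
    return pairs

# Toy input (assumed): three orthonormal directions with positive weights.
weights = np.array([3.0, 2.0, 1.0])
basis, _ = np.linalg.qr(np.random.default_rng(42).normal(size=(3, 3)))
T = sum(w * np.einsum("i,j,k->ijk", b, b, b) for w, b in zip(weights, basis.T))

pairs = decompose(T, rank=3)
recovered = sorted(float(lam) for lam, _ in pairs)
print(recovered)
```

In the LDA setting described in the abstract, the recovered eigenvectors would correspond (after un-whitening) to topic-word distributions; that mapping is omitted here.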
20 Jul 2008
TL;DR: Experiments on large-scale tagging datasets of scientific documents (CiteULike) and web pages (del.icio.us) indicate that the proposed framework for real-time tag recommendation is capable of making tag recommendations efficiently and effectively.
Abstract: Tags are user-generated labels for entities. Existing research on tag recommendation either focuses on improving its accuracy or on automating the process, while ignoring the efficiency issue. We propose a highly automated novel framework for real-time tag recommendation. The tagged training documents are treated as triplets of (words, docs, tags), and represented in two bipartite graphs, which are partitioned into clusters by Spectral Recursive Embedding (SRE). Tags in each topical cluster are ranked by our novel ranking algorithm. A two-way Poisson Mixture Model (PMM) is proposed to model the document distribution into mixture components within each cluster and aggregate words into word clusters simultaneously. A new document is classified by the mixture model based on its posterior probabilities so that tags are recommended according to their ranks. Experiments on large-scale tagging datasets of scientific documents (CiteULike) and web pages (del.icio.us) indicate that our framework is capable of making tag recommendations efficiently and effectively. The average tagging time for testing a document is around 1 second, with over 88% of test documents correctly labeled with the top nine tags we suggested.
271 citations
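The classification step in the abstract above (assigning a new document to a mixture component by posterior probability, then recommending that component's ranked tags) can be sketched as follows. This is an illustrative simplification, not the paper's two-way PMM: it assumes the component priors, Poisson rates per word cluster, and ranked tag lists have already been learned, and all names and numbers below are hypothetical.

```python
import math

def log_poisson(k, rate):
    """Log-probability of observing count k under a Poisson with the given rate."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)

def posterior(counts, priors, rates):
    """Posterior over mixture components given word-cluster counts."""
    log_joint = [
        math.log(p) + sum(log_poisson(k, r) for k, r in zip(counts, comp))
        for p, comp in zip(priors, rates)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_joint)
    unnorm = [math.exp(x - m) for x in log_joint]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def recommend_tags(counts, priors, rates, ranked_tags, top_n=3):
    """Pick the most probable component and return its top-ranked tags."""
    post = posterior(counts, priors, rates)
    best = max(range(len(post)), key=post.__getitem__)
    return ranked_tags[best][:top_n]

# Hypothetical learned parameters: 2 components over 3 word clusters.
priors = [0.5, 0.5]
rates = [[5.0, 0.5, 0.5], [0.5, 4.0, 3.0]]
ranked_tags = [["python", "programming", "tutorial"],
               ["biology", "genetics", "dna"]]

new_doc_counts = [0, 3, 4]  # word-cluster counts for an unseen document
post = posterior(new_doc_counts, priors, rates)
tags = recommend_tags(new_doc_counts, priors, rates, ranked_tags, top_n=2)
print(tags)
```

Because the ranked tag lists are precomputed per component, the per-document cost is a single posterior evaluation, which fits the efficiency focus the abstract describes.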
15 Nov 2007
TL;DR: In this paper, the authors present a system for collecting data and presenting media to a user, which generally includes a data gathering module associated with an electronic device and a management module, which manages at least one user profile based on the gathered data.
Abstract: Systems, methods and apparatus for collecting data and presenting media to a user are provided. The systems generally include a data gathering module associated with an electronic device. The data gathering module communicates gathered data to a management module, which manages at least one user profile based on the gathered data. The management module may select media for presentation to a user based on the user profile, and the selected media may be displayed to the user via a media output device co-located with the user, such as a display of the user's mobile electronic device or a television, computer, billboard or other display co-located with the user. Related methods are also provided.
271 citations
TL;DR: The review brings out the complexity of the host-pathogen interaction and underlines the importance of identifying the mechanisms involved in protection, in order to design vaccine strategies and find out surrogate markers to be measured as in vitro correlate of protective immunity.
Abstract: Tuberculosis is a major health problem throughout the world, causing a large number of deaths, more than any other single infectious disease. The review attempts to summarize the information available on the host immune response to Mycobacterium tuberculosis. Since the main route of entry of the causative agent is the respiratory route, alveolar macrophages are the important cell type that combats the pathogen. Various aspects of macrophage-mycobacterium interactions and the role of the macrophage in the host response are described, such as binding of M. tuberculosis to macrophages via surface receptors; phagosome-lysosome fusion; mycobacterial growth inhibition or killing through free-radical-based mechanisms such as reactive oxygen and nitrogen intermediates; cytokine-mediated mechanisms; recruitment of accessory immune cells for the local inflammatory response; and presentation of antigens to T cells for development of acquired immunity. The role of macrophage apoptosis in containing the growth of the bacilli is also discussed. The role of other components of the innate immune response, such as natural resistance associated macrophage protein (Nramp), neutrophils, and natural killer cells, is discussed. The specific acquired immune response through CD4 T cells, mainly responsible for protective Th1 cytokines, and through CD8 cells, bringing about cytotoxicity, is also described. The role of CD-1 restricted CD8(+) T cells and non-MHC restricted gamma/delta T cells is described, although it is incompletely understood at the present time. A humoral immune response is seen, though it is not implicated in protection. The value of cytokine therapy is also reviewed. The influence of the host human leucocyte antigens (HLA) on susceptibility to disease is discussed. Mycobacteria are endowed with mechanisms through which they can evade the onslaught of the host defense response. These mechanisms are discussed, including diminishing the ability of antigen presenting cells to present antigens to CD4(+) T cells; production of suppressive cytokines; escape from fused phagosomes; and inducing T cell apoptosis. The review brings out the complexity of the host-pathogen interaction and underlines the importance of identifying the mechanisms involved in protection, in order to design vaccine strategies and find out surrogate markers to be measured as in vitro correlate of protective immunity.
270 citations
12 Aug 2012
TL;DR: A large-scale data mining approach to learning word-word relatedness, where known pairs of related words impose constraints on the learning process, and learns for each word a low-dimensional representation, which strives to maximize the likelihood of a word given the contexts in which it appears.
Abstract: Prior work on computing semantic relatedness of words focused on representing their meaning in isolation, effectively disregarding inter-word affinities. We propose a large-scale data mining approach to learning word-word relatedness, where known pairs of related words impose constraints on the learning process. We learn for each word a low-dimensional representation, which strives to maximize the likelihood of a word given the contexts in which it appears. Our method, called CLEAR, is shown to significantly outperform previously published approaches. The proposed method is based on first principles, and is generic enough to exploit diverse types of text corpora, while having the flexibility to impose constraints on the derived word similarities. We also make publicly available a new labeled dataset for evaluating word relatedness algorithms, which we believe to be the largest such dataset to date.
270 citations
Authors
Showing all 26766 results
Name | H-index | Papers | Citations |
---|---|---|---|
Ashok Kumar | 151 | 5654 | 164086 |
Alexander J. Smola | 122 | 434 | 110222 |
Howard I. Maibach | 116 | 1821 | 60765 |
Sanjay Jain | 103 | 881 | 46880 |
Amirhossein Sahebkar | 100 | 1307 | 46132 |
Marc Davis | 99 | 412 | 50243 |
Wenjun Zhang | 96 | 976 | 38530 |
Jian Xu | 94 | 1366 | 52057 |
Fortunato Ciardiello | 94 | 695 | 47352 |
Tong Zhang | 93 | 414 | 36519 |
Michael E. J. Lean | 92 | 411 | 30939 |
Ashish K. Jha | 87 | 503 | 30020 |
Xin Zhang | 87 | 1714 | 40102 |
Theunis Piersma | 86 | 632 | 34201 |
George Varghese | 84 | 253 | 28598 |